# SWIFT-HEP / GridPP Workshop - April 2025

## Caching

This section introduces the `DiracClient`.
Functionally it works the same as `dask.distributed.Client`, but it additionally allows for persistent caching.
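
To illustrate what persistent caching means in practice, the sketch below stores a function's result on disk, keyed by a hash of its inputs, so that a later call can reuse it instead of recomputing. This is not the `dask_dirac` implementation; `cached_call`, `slow_square` and the temporary cache directory are hypothetical names used only for illustration.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_call(func, *args, cache_dir):
    """Return func(*args), reusing a pickled result from cache_dir if present."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache entry on the function name and its arguments.
    key = hashlib.sha256(pickle.dumps((func.__name__, args))).hexdigest()
    cache_file = cache_dir / f"{key}.pkl"
    if cache_file.exists():
        # Cache hit: load the stored result instead of recomputing.
        return pickle.loads(cache_file.read_bytes())
    result = func(*args)
    cache_file.write_bytes(pickle.dumps(result))
    return result


calls = []


def slow_square(x):
    calls.append(x)  # record each real evaluation
    return x * x


demo_dir = tempfile.mkdtemp(prefix="cache-sketch-")
first = cached_call(slow_square, 12, cache_dir=demo_dir)   # computed
second = cached_call(slow_square, 12, cache_dir=demo_dir)  # read back from disk
```

Because the result lives in a file rather than in memory, it would also survive a restart of the Python session, which is the property that distinguishes persistent caching from in-memory caching.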

The following cache locations are supported:
- `local`: to set the directory use `file:///path/to/cache`

Caching options in development:
- `rucio`: to set the directory use `rucio:///path/to/cache`
- `dirac`: to set the directory use `dirac:///path/to/cache`
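
Each cache location is written as a URL whose scheme selects the backend. As a minimal sketch (the helper name `split_cache_location` is hypothetical and not part of `dask_dirac`), the scheme and path can be separated with the standard library rather than by slicing off a fixed number of characters:

```python
from urllib.parse import urlparse


def split_cache_location(cache_location):
    """Split a cache URL such as file:///path/to/cache into (scheme, path)."""
    parsed = urlparse(cache_location)
    return parsed.scheme, parsed.path


scheme, path = split_cache_location("file:///tmp/dask-cache_05022025")
print(scheme, path)
```

The same call works unchanged for the `rucio://` and `dirac://` forms listed above, since only the scheme differs.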

In [None]:
from dask_dirac import DiracClient, DiracCluster
from dask.distributed import LocalCluster, Client
import dask.array as da

In [None]:
cluster = DiracCluster(scheduler_options={"port": 8786},)

In [None]:
client = Client(cluster)

In [None]:
client.scheduler_info()

In [None]:
cluster = LocalCluster(n_workers=1)

In [None]:
client = DiracClient(cluster, cache_location="file:///tmp/dask-cache_05022025")
# client = Client(cluster)

In [None]:
# Check the cache location and show what files are there
print(client.cache_location)
!ls {client.cache_location[7:]} # remove file:// at the beginning

In [None]:
client

In [None]:
# Create a Dask array directly (the shape must be given as integers, so 10_000 rather than 1e4)
dask_array = da.ones((10_000, 1), chunks=1) + 20231
#dask_array.visualize()
dask_array

In [None]:
result = client.compute(dask_array)

In [None]:
result.result()

In [None]:
# Check the cache location and show what files are there
print(client.cache_location)
!ls {client.cache_location[7:]} # remove file:// at the beginning

## GPU vs CPU

This is a LUX-ZEPLIN (LZ) analysis that builds a model of multi-scatter-single-ionisation (MSSI) events from simulated events.
These simulated events come from detector components.
In this analysis, the simulations (ROOT files) are read using `uproot`, and the events are then looped over to select MSSI events.
The simulated events have already been pre-processed, so only events classified as single-scatter events are considered.

A more detailed step-by-step description of the analysis is as follows:
1. Simulations of detector components are stored as ROOT files.
2. These files are read using `uproot` into `awkward` arrays.
3. A selection is applied to the data to select MSSI events.
4. A normalization is applied to get the expected rate of these events.
5. Something about building the model.
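
The selection in step 3 can be illustrated on a single toy event. The sketch below mirrors the counting logic of `loop_over_events` in this notebook: truth vertices in volumes whose name contains `Skin` or `Scint` are skipped, and an event is tagged as MSSI when more vertices produced S1 photons than produced S2 photons. The function name `is_mssi_event`, the volume names and the photon counts are made up for illustration.

```python
def is_mssi_event(vertices):
    """vertices: list of (volume_name, detected_s1_photons, detected_s2_photons)."""
    n_s1 = 0
    n_s2 = 0
    for volume_name, s1_photons, s2_photons in vertices:
        # Ignore energy deposits in the skin and scintillator volumes.
        if "Skin" in volume_name or "Scint" in volume_name:
            continue
        if s1_photons > 0:
            n_s1 += 1
        if s2_photons > 0:
            n_s2 += 1
    # More light-producing vertices than charge-producing vertices.
    return n_s1 > n_s2


# Hypothetical events: two scatters with only one S2, a plain single scatter,
# and a single scatter accompanied by a skin deposit.
event_a = [("Target", 120, 3500), ("BelowCathode", 40, 0)]
event_b = [("Target", 120, 3500)]
event_c = [("Target", 120, 3500), ("SkinRegion", 40, 0)]
flags = [is_mssi_event(e) for e in (event_a, event_b, event_c)]
```

Only the first toy event is tagged: its second vertex contributes light but no ionisation signal.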


In addition to the above, this analysis also shows how to decorate functions with `numba` for CPU and GPU acceleration.
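
As a minimal sketch of the CPU case, the polynomial helper used later in this notebook can be compiled with `numba.njit`. The `try`/`except` fallback is an assumption added here so the example also runs where `numba` is not installed; it is not part of the analysis code.

```python
try:
    from numba import njit
except ImportError:
    # numba is not installed: fall back to an identity decorator (plain Python).
    def njit(func):
        return func


@njit
def evaluate_poly(coeffs, x):
    """Evaluate a polynomial with Horner's method; coeffs run from highest to lowest order."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# 2*x**2 - 3*x + 1 evaluated at x = 4
value = evaluate_poly((2.0, -3.0, 1.0), 4.0)
print(value)  # 21.0
```

The first call to a function decorated with `njit` includes the compilation time, so timing comparisons are fairest on a second call or on a large input.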

In [1]:
import awkward as ak
import numpy as np
from dask.distributed import LocalCluster, Client, progress
import glob
import pandas as pd
import uproot as up
import numba
import dask

Define the processing

In [2]:
USE_NUMBA_GPU = False
if USE_NUMBA_GPU:
    import numba.cuda
    import math
    
    @numba.cuda.jit(device=True)
    def evaluate_poly(coeffs, x):
        result = 0.0
        for c in coeffs:
            result = result * x + c
        return result
    
    @numba.cuda.jit()
    def loop_over_events(ss_x, ss_y, ss_driftTime_ns, ss_correctedS1Area, ss_correctedS2Area, 
                        mc_nRQMCTruthVertices, mc_volumeName, mc_detectedS1Photons, mc_detectedS2Photons, 
                        is_mssi, is_FV_mssi, is_FV_ROI_mssi, is_FV_ss, is_FV_ROI_ss):
        i = numba.cuda.grid(1)  # get thread index
        if i >= ss_x.shape[0]:  # boundary check
            return

        # A tuple is used here because Python lists are not supported inside CUDA kernels
        wall_poly_coeffs = (-8.14589334e-14, 2.09181587e-10, -2.06758029e-07,
                            1.01366014e-04, -2.69048354e-02, 7.24276394e+01)

        nS1 = 0
        nS2 = 0
        r = math.sqrt(ss_x[i] ** 2 + ss_y[i] ** 2)
        drift_time = ss_driftTime_ns[i] / 1000.0
        boundary_r = evaluate_poly(wall_poly_coeffs, drift_time) - 3

        for j in range(mc_nRQMCTruthVertices[i]):
            if mc_volumeName[i][j] == 0:  # Placeholder check, as string comparison isn't allowed in CUDA
                continue
            if mc_detectedS1Photons[i][j] > 0.:
                nS1 += 1
            if mc_detectedS2Photons[i][j] > 0.:
                nS2 += 1

        if nS1 > nS2:
            is_mssi[i] = 1
            if r < boundary_r and 71. < drift_time < 1030.:
                is_FV_mssi[i] = 1
                if 3 < ss_correctedS1Area[i] < 600 and math.log10(ss_correctedS2Area[i]) < 4.5 and ss_correctedS2Area[i] > 14.5 * 44.5:
                    is_FV_ROI_mssi[i] = 1

        if r < boundary_r and 71. < drift_time < 1030.:
            is_FV_ss[i] = 1
            if 3 < ss_correctedS1Area[i] < 600 and math.log10(ss_correctedS2Area[i]) < 4.5 and ss_correctedS2Area[i] > 14.5 * 44.5:
                is_FV_ROI_ss[i] =  1

    def process_file(file):
        branches = ['ss.correctedS1Area_phd', 'ss.correctedS2Area_phd', 'ss.s1Area_phd', 'ss.s2Area_phd', 
                    'ss.x_cm', 'ss.y_cm', 'ss.driftTime_ns']
        mcBranches = ['mcTruthVertices.nRQMCTruthVertices', 'mcTruthVertices.volumeName', 
                    'mcTruthVertices.detectedS1Photons', 'mcTruthVertices.detectedS2Photons', 
                    'mcTruthEvent.eventWeight']

        tfile = up.open(file)
        t = tfile['Scatters']
        mct = tfile['RQMCTruth']

        ss = t.arrays(branches, library="np")
        mc = mct.arrays(mcBranches, library="np")

        num_events = ss['ss.correctedS1Area_phd'].shape[0]

        # Allocate zero-initialised output flags on the device.
        # device_array() leaves memory uninitialised (like np.empty), and the kernel only ever writes 1.
        is_mssi = numba.cuda.to_device(np.zeros(num_events, dtype=np.int32))
        is_FV_mssi = numba.cuda.to_device(np.zeros(num_events, dtype=np.int32))
        is_FV_ROI_mssi = numba.cuda.to_device(np.zeros(num_events, dtype=np.int32))
        is_FV_ss = numba.cuda.to_device(np.zeros(num_events, dtype=np.int32))
        is_FV_ROI_ss = numba.cuda.to_device(np.zeros(num_events, dtype=np.int32))

        # Copy the input NumPy arrays to the device
        ss_x = numba.cuda.to_device(ss['ss.x_cm'])
        ss_y = numba.cuda.to_device(ss['ss.y_cm'])
        ss_driftTime_ns = numba.cuda.to_device(ss['ss.driftTime_ns'])
        ss_correctedS1Area = numba.cuda.to_device(ss['ss.correctedS1Area_phd'])
        ss_correctedS2Area = numba.cuda.to_device(ss['ss.correctedS2Area_phd'])
        mc_nRQMCTruthVertices = numba.cuda.to_device(mc['mcTruthVertices.nRQMCTruthVertices'])
        mc_detectedS1Photons = numba.cuda.to_device(mc['mcTruthVertices.detectedS1Photons'])
        mc_detectedS2Photons = numba.cuda.to_device(mc['mcTruthVertices.detectedS2Photons'])

        # Placeholder: volume names are strings, which cannot be compared inside a CUDA kernel,
        # so they would first need to be encoded as integers. Here every entry is simply set to 0.
        mc_volumeName = numba.cuda.to_device(np.zeros_like(mc_nRQMCTruthVertices, dtype=np.int32))

        threads_per_block = 256
        blocks_per_grid = (num_events + threads_per_block - 1) // threads_per_block

        # Launch kernel
        loop_over_events[blocks_per_grid, threads_per_block](
            ss_x, ss_y, ss_driftTime_ns, ss_correctedS1Area, ss_correctedS2Area,
            mc_nRQMCTruthVertices, mc_volumeName, mc_detectedS1Photons, mc_detectedS2Photons,
            is_mssi, is_FV_mssi, is_FV_ROI_mssi, is_FV_ss, is_FV_ROI_ss
        )

        # Copy results back to host
        is_mssi_host = is_mssi.copy_to_host()
        is_FV_mssi_host = is_FV_mssi.copy_to_host()
        is_FV_ROI_mssi_host = is_FV_ROI_mssi.copy_to_host()
        is_FV_ss_host = is_FV_ss.copy_to_host()
        is_FV_ROI_ss_host = is_FV_ROI_ss.copy_to_host()

        eventWeight = mc['mcTruthEvent.eventWeight'][0]
        f_name = file.split('/SS_skim_')[1][:-5]  # remove .root

        # Return the same eight fields as the CPU version so that they match the results table columns
        return f_name, len(ss['ss.s1Area_phd']), sum(is_FV_ss_host), sum(is_FV_ROI_ss_host), sum(is_mssi_host), sum(is_FV_mssi_host), sum(is_FV_ROI_mssi_host), eventWeight
    
else:
    #@numba.njit
    def evaluate_poly(coeffs, x):
        result = 0.0
        for c in coeffs:
            result = result * x + c
        return result

    #@numba.njit
    def loop_over_events(ss, mc):
        is_mssi = np.zeros(len(ss['ss.correctedS1Area_phd']))
        is_FV_mssi = np.zeros(len(ss['ss.correctedS1Area_phd']))
        is_FV_ROI_mssi = np.zeros(len(ss['ss.correctedS1Area_phd']))
        is_FV_ss = np.zeros(len(ss['ss.correctedS1Area_phd']))
        is_FV_ROI_ss = np.zeros(len(ss['ss.correctedS1Area_phd']))


        wall_poly_coeffs = np.array([-8.14589334e-14, 2.09181587e-10, -2.06758029e-07,
                                    1.01366014e-04, -2.69048354e-02, 7.24276394e+01])

        for i in range(len(is_mssi)):
            nS1 = 0
            nS2 = 0
            r = np.sqrt(ss['ss.x_cm'][i] ** 2 + ss['ss.y_cm'][i] ** 2)
            drift_time = ss['ss.driftTime_ns'][i] / 1000.
            boundary_r = evaluate_poly(wall_poly_coeffs, drift_time) - 3
            # Loop over truth vertices
            for j in range(mc['mcTruthVertices.nRQMCTruthVertices'][i]):
                if 'Skin' in str(mc['mcTruthVertices.volumeName'][i][j]) or 'Scint' in str(mc['mcTruthVertices.volumeName'][i][j]):
                    continue
                if mc['mcTruthVertices.detectedS1Photons'][i][j] > 0.:
                    nS1 += 1
                if mc['mcTruthVertices.detectedS2Photons'][i][j] > 0.:
                    nS2 += 1
            if nS1 > nS2:
                is_mssi[i] = 1
                # Apply FV cut
                if r < boundary_r and drift_time < 1030. and drift_time > 71.:
                    is_FV_mssi[i] = 1
                    # Apply ROI
                    if ss['ss.correctedS1Area_phd'][i] < 600 and ss['ss.correctedS1Area_phd'][i] > 3 and np.log10(ss['ss.correctedS2Area_phd'][i]) < 4.5 and ss['ss.s2Area_phd'][i] > 14.5 * 44.5:
                        is_FV_ROI_mssi[i] = 1
            # single scatter rate
            if r < boundary_r and drift_time < 1030. and drift_time > 71.:
                is_FV_ss[i] = 1
                # Apply ROI
                if ss['ss.correctedS1Area_phd'][i] < 600 and ss['ss.correctedS1Area_phd'][i] > 3 and np.log10(ss['ss.correctedS2Area_phd'][i]) < 4.5 and ss['ss.s2Area_phd'][i] > 14.5 * 44.5:
                    is_FV_ROI_ss[i] = 1

        return is_mssi, is_FV_mssi, is_FV_ROI_mssi, is_FV_ss, is_FV_ROI_ss
    

    def process_file(file):
        # Read the file
        branches = ['ss.correctedS1Area_phd', 'ss.correctedS2Area_phd', 'ss.s1Area_phd', 'ss.s2Area_phd', 'ss.x_cm', 'ss.y_cm', 'ss.driftTime_ns']
        mcBranches = ['mcTruthVertices.nRQMCTruthVertices', 'mcTruthVertices.volumeName', 'mcTruthVertices.detectedS1Photons', 'mcTruthVertices.detectedS2Photons', 'mcTruthEvent.eventWeight']

        tfile = up.open(file)
        t = tfile['Scatters']
        mct = tfile['RQMCTruth']

        ss = t.arrays(branches)
        mc = mct.arrays(mcBranches)

        # Now calculate the number of MSSI events
        is_mssi, is_FV_mssi, is_FV_ROI_mssi, is_FV_ss, is_FV_ROI_ss = loop_over_events(ss, mc)
        eventWeight = mc['mcTruthEvent.eventWeight'][0]

        f_name = file.split('/SS_skim_')[1][:-5] # remove .root from the end of the file name

        return f_name, len(ss['ss.s1Area_phd']),  sum(is_FV_ss), sum(is_FV_ROI_ss), sum(is_mssi), sum(is_FV_mssi), sum(is_FV_ROI_mssi), eventWeight

Set up the Dask cluster

In [3]:
cluster = LocalCluster()
client = Client(cluster)
client

Client: connection method: Cluster object, cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status
Cluster: 16 workers, 128 threads in total, 502.63 GiB memory in total (status: running, using processes)
Scheduler comm: tcp://127.0.0.1:37259
Workers: 16 × (8 threads, 31.41 GiB memory), local directories under /tmp/dask-scratch-space/

Select the files to be used.
In this example, the files are stored locally under `/shared/scratch/ak18773/lz/mssi/`.
Each file is a ROOT file containing the output of an `LZLAMA` simulation (the `NEST` handler); more details can be found in [arXiv:2001.09363](https://arxiv.org/abs/2001.09363).
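
The `process_file` function above derives a source label from each file name by splitting on `/SS_skim_` and dropping the `.root` suffix. A slightly more defensive sketch of the same idea is shown here; the helper name `source_name` and the example file name are assumptions, chosen to match the `SS_skim_` prefix used in `process_file`.

```python
from pathlib import PurePosixPath


def source_name(path):
    """Return the source label encoded in an SS_skim_<source>.root file name."""
    stem = PurePosixPath(path).stem  # drops the directory and the .root suffix
    prefix = "SS_skim_"
    if not stem.startswith(prefix):
        raise ValueError(f"unexpected file name: {path}")
    return stem[len(prefix):]


# Hypothetical file name following the SS_skim_<source>.root pattern
name = source_name("/shared/scratch/ak18773/lz/mssi/SS_skim_K40_DomePMTs.root")
print(name)
```

Raising an error on an unexpected name makes a stray file in the directory fail loudly rather than produce a label that silently matches no source.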

In [4]:
files = glob.glob("/shared/scratch/ak18773/lz/mssi/*.root")
print(f'N. files to process: {len(files)}')

N. files to process: 10


In [5]:
delayed_results = [dask.delayed(process_file)(file) for file in files]
futures = client.compute(delayed_results)

In [6]:
# monitor the progress
progress(futures)


In [None]:
# Once complete, retrieve the results
results = client.gather(futures)

In [None]:
results

In [None]:
results_df = pd.DataFrame(results, columns=['Source', 'nSS', 'nSS FV', 'nSS FV ROI', 'nMSSI', 'nMSSI FV', 'nMSSI FV ROI', 'eventWeight'])
results_df

### Post processing
Now that we have the number of events in each region, we can calculate the rates using the known decay rate of each source (in decays/day).
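
As a worked example of this normalisation, the expected number of events per day from one source is the number of selected events multiplied by the per-event weight and by the source's decay rate. The sketch below uses the `K40_BottomTruss` decay rate from this notebook; the event count and the event weight are made-up values, and `events_per_day` is a hypothetical helper.

```python
def events_per_day(n_selected, event_weight, decays_per_day):
    """Expected events/day = selected events x weight per event x decays/day."""
    return n_selected * event_weight * decays_per_day


rate = events_per_day(
    n_selected=50,               # hypothetical number of selected events
    event_weight=1e-6,           # hypothetical per-event weight
    decays_per_day=28927.99798,  # K40_BottomTruss rate used in this notebook
)
print(f"{rate:.4f} events/day")
```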

In [None]:
rates = {
    "Co60_CalibrationSourceTubes": 4690.57902,
    "Co60_DomePMTs": 3885.410702,
    "K40_BottomTruss": 28927.99798,
    "K40_DomePMTs": 88935.50817,
    "Th232-early_BottomTPCPMTBodies": 38003.65201,
    "Th232-late_BottomTPCPMTBases": 20626.61384,
    "Th232-late_BottomTPCPMTBodies": 51716.2229,
    "Th232-late_ForwardFieldResistors": 77545.76613,
    "Th232-late_HVInnerCone": 363483.6619,
    "U238-late_AnodeGridWires": 4316.423461
}
rates_df = pd.DataFrame(list(rates.items()), columns=["Source", "Rate (Decays/day)"])
rates_df

In [None]:
# match up where 'Source' is the same in both dataframes, and combine them
df = pd.merge(results_df, rates_df, on='Source')
df

In [None]:
df['SS/day'] =  df['nSS'] * df['eventWeight'] * df['Rate (Decays/day)']
df['SS/day FV'] = df['nSS FV'] * df['eventWeight'] * df['Rate (Decays/day)']
df['SS/day FV ROI'] = df['nSS FV ROI'] * df['eventWeight'] * df['Rate (Decays/day)']
df['MSSI/day'] = df['nMSSI'] * df['eventWeight'] * df['Rate (Decays/day)']
df['MSSI/day FV'] = df['nMSSI FV'] * df['eventWeight'] * df['Rate (Decays/day)']
df['MSSI/day FV ROI'] = df['nMSSI FV ROI'] * df['eventWeight'] * df['Rate (Decays/day)']

In [None]:
# Calculate the number of events per day from each source
print('Number of SS events expected')
total = df['SS/day'].sum()  # named `total` to avoid shadowing the built-in all()
in_fv = df['SS/day FV'].sum()
in_fv_roi = df['SS/day FV ROI'].sum()
in_dataset = in_fv_roi * 220
print(f'N. SS / day:     {total}')
print(f'In FV / day:     {in_fv}')
print(f'N. FV ROI / day: {in_fv_roi}')
print(f'In Dataset:      {in_dataset}')
print('----------------------------')
print('Number of MSSI events expected')
total = df['MSSI/day'].sum()
in_fv = df['MSSI/day FV'].sum()
in_fv_roi = df['MSSI/day FV ROI'].sum()
in_dataset = in_fv_roi * 220
print(f'N. MSSI / day:   {total}')
print(f'In FV / day:     {in_fv}')
print(f'N. FV ROI / day: {in_fv_roi}')
print(f'In Dataset:      {in_dataset}')
print('----------------------------')
print('Fraction of events that are MSSI')
fraction = df['MSSI/day FV ROI'].sum() / df['SS/day FV ROI'].sum()
print(f'fraction: {fraction:.2f}')

### How does processing time compare?

On GPU00:
* 14.6 s with `numba.njit`
* 45+ min with regular Python
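
To put the two timings side by side, the short calculation below converts them to a speed-up factor. Since the pure-Python run is only quoted as "45+" minutes, the result is a lower bound.

```python
numba_seconds = 14.6
python_seconds = 45 * 60  # "45+ mins" taken as a lower bound, in seconds

speedup = python_seconds / numba_seconds
print(f"speed-up: at least ~{speedup:.0f}x")
```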