# Distributed Dask tips for coffea-casa

Dask works well at many scales, from a single machine to clusters of many machines. On coffea-casa, each user gets a preconfigured Dask cluster that is ready to scale.

![dask distributed](https://docs.dask.org/en/latest/_images/dask-cluster-manager.svg)

The Dask `Client` is the primary entry point for users of `dask.distributed`.

coffea-casa configures a Dask cluster for you automatically, so you only need to create a `Client` that points to the address of the scheduler (in coffea-casa it is always `tls://localhost:8786`):

In [1]:
from dask.distributed import Client

client = Client("tls://localhost:8786")
client

Client:

| Connection method | Dashboard |
| --- | --- |
| Direct | /user/oksana.shadura@cern.ch/proxy/8787/status |

Scheduler:

| Comm | Workers | Total threads | Total memory | Started | Dashboard |
| --- | --- | --- | --- | --- | --- |
| tls://192.168.202.25:8786 | 5 | 5 | 14.31 GiB | 6 minutes ago | /user/oksana.shadura@cern.ch/proxy/8787/status |

Workers:

| Comm | Total threads | Memory | Dashboard | Nanny | Local directory |
| --- | --- | --- | --- | --- | --- |
| tls://red-c7122.unl.edu:32813 | 1 | 2.86 GiB | /user/oksana.shadura@cern.ch/proxy/36383/status | tls://172.19.0.7:46431 | /var/lib/condor/execute/dir_149518/dask-scratch-space/worker-je8xvn_u |
| tls://red-c7122.unl.edu:42765 | 1 | 2.86 GiB | /user/oksana.shadura@cern.ch/proxy/46543/status | tls://172.19.0.8:40303 | /var/lib/condor/execute/dir_149519/dask-scratch-space/worker-qd590ng2 |
| tls://red-c7122.unl.edu:43661 | 1 | 2.86 GiB | /user/oksana.shadura@cern.ch/proxy/39193/status | tls://172.19.0.10:42263 | /var/lib/condor/execute/dir_149521/dask-scratch-space/worker-9ojv1s9e |
| tls://red-c7122.unl.edu:34261 | 1 | 2.86 GiB | /user/oksana.shadura@cern.ch/proxy/41881/status | tls://172.19.0.9:36963 | /var/lib/condor/execute/dir_149523/dask-scratch-space/worker-12kk_ymv |
| tls://red-c7122.unl.edu:46041 | 1 | 2.86 GiB | /user/oksana.shadura@cern.ch/proxy/45137/status | tls://172.19.0.11:36221 | /var/lib/condor/execute/dir_149525/dask-scratch-space/worker-77je7jdt |

| Comm | CPU usage | Memory usage | Spilled bytes | Read bytes | Write bytes | Last seen |
| --- | --- | --- | --- | --- | --- | --- |
| tls://red-c7122.unl.edu:32813 | 2.0% | 171.09 MiB | 0 B | 330.29 B | 1.50 kiB | Just now |
| tls://red-c7122.unl.edu:42765 | 2.0% | 174.72 MiB | 0 B | 330.97 B | 1.50 kiB | Just now |
| tls://red-c7122.unl.edu:43661 | 2.0% | 172.07 MiB | 0 B | 329.13 B | 1.49 kiB | Just now |
| tls://red-c7122.unl.edu:34261 | 4.0% | 170.77 MiB | 0 B | 330.79 B | 1.50 kiB | Just now |
| tls://red-c7122.unl.edu:46041 | 2.0% | 171.89 MiB | 0 B | 330.19 B | 1.50 kiB | Just now |

The task counters (executing, in memory, ready, in flight) were empty for all workers.


coffea-casa uses adaptive scaling to optimise resource usage, so the number of workers grows and shrinks with the workload.
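Because adaptive scaling changes the number of workers over time, it can help to check what is connected before submitting work. The sketch below summarises the `workers` entry of `client.scheduler_info()`. The helper name `summarize_workers` is hypothetical, and the `nthreads` and `memory_limit` keys are assumed to match what your installed `distributed` release reports.

```python
def summarize_workers(client):
    """Return {worker address: (threads, memory limit in GiB)} for connected workers."""
    workers = client.scheduler_info().get("workers", {})
    summary = {}
    for address, info in workers.items():
        memory_limit = info.get("memory_limit")  # bytes; may be missing or None
        memory_gib = round(memory_limit / 2**30, 2) if memory_limit else None
        summary[address] = (info.get("nthreads"), memory_gib)
    return summary


# Usage on coffea-casa (needs the live cluster, so it is left commented out):
# client.wait_for_workers(1)  # block until at least one worker has joined
# print(summarize_workers(client))
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before pointing it at the real scheduler.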

## Dask dependency management plugins

Dask’s plugin system lets you run custom Python code when certain events occur. Plugins can target schedulers, workers, or nannies. A worker plugin, for example, runs custom Python code on all your workers at a given point in the worker’s lifecycle (e.g. when the worker process starts). The built-in dependency management plugins let you install packages on workers, for example with `PipInstall`:

In [2]:
from dask.distributed import PipInstall

plugin = PipInstall(packages=["hepconvert"], pip_options=["--upgrade"])

client.register_plugin(plugin)
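
To confirm that the plugin has done its job, one option is to ask every worker whether the package can be imported. The sketch below relies on `Client.run`, which runs a function on each worker and returns a dict keyed by worker address. The helper name `package_available` is hypothetical.

```python
import importlib.util


def package_available(name):
    """Return True if a top-level package called `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# On the cluster (commented out because it needs the live client):
# client.run(package_available, "hepconvert")
# The result maps each worker address to True or False.
```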

Alternatively, you can run a custom function on every worker:

In [3]:
def worker_setup(dask_worker):
    # Install ROOT from conda-forge into the worker's environment.
    import os
    install_root_packages_cmd = "mamba install -y -c conda-forge root"
    os.system(install_root_packages_cmd)

In [4]:
client.register_worker_callbacks(worker_setup)

{'tls://red-c7122.unl.edu:32813': {'status': 'OK'},
 'tls://red-c7122.unl.edu:34261': {'status': 'OK'},
 'tls://red-c7122.unl.edu:42765': {'status': 'OK'},
 'tls://red-c7122.unl.edu:43661': {'status': 'OK'},
 'tls://red-c7122.unl.edu:46041': {'status': 'OK'}}

Or to enable the CMSSW environment:

In [5]:
def worker_cmssw_setup(dask_worker):
    # Source the CMS setup script, move to the CMSSW_12_6_5 release area and run cmsenv.
    import os
    install_cmssw_packages_cmd = "source /cvmfs/cms.cern.ch/cmsset_default.sh; cd /cvmfs/cms.cern.ch/${SCRAM_ARCH}/cms/cmssw/CMSSW_12_6_5; cmsenv"
    os.system(install_cmssw_packages_cmd)

In [6]:
client.register_worker_callbacks(worker_cmssw_setup)

{'tls://red-c7122.unl.edu:32813': {'status': 'OK'},
 'tls://red-c7122.unl.edu:34261': {'status': 'OK'},
 'tls://red-c7122.unl.edu:42765': {'status': 'OK'},
 'tls://red-c7122.unl.edu:43661': {'status': 'OK'},
 'tls://red-c7122.unl.edu:46041': {'status': 'OK'}}
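
One caveat with the shell-based callbacks above: `os.system` runs its command in a child shell, so variables exported there (for example by `cmsenv`) disappear when that shell exits and never reach the worker process. A possible workaround, sketched below, is to run the setup in `bash`, dump the resulting environment, and copy it into `os.environ`. The helper names `env_after` and `worker_cmssw_env` are hypothetical, `eval $(scramv1 runtime -sh)` is assumed to be the non-interactive equivalent of the `cmsenv` alias, and the sketch assumes `bash` and an `env` that supports `-0` are available on the workers.

```python
import os
import subprocess


def env_after(shell_snippet):
    """Run `shell_snippet` in bash and return the environment it leaves behind."""
    # Group the snippet so its output is discarded, then print the final
    # environment as NUL-separated KEY=VALUE entries.
    script = "{ " + shell_snippet + "\n} >/dev/null 2>&1; env -0"
    result = subprocess.run(["bash", "-c", script], check=True, capture_output=True)
    env = {}
    for entry in result.stdout.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:
            env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def worker_cmssw_env(dask_worker):
    """Copy the CMSSW runtime environment into the worker process (assumed paths)."""
    snippet = (
        "source /cvmfs/cms.cern.ch/cmsset_default.sh; "
        "cd /cvmfs/cms.cern.ch/${SCRAM_ARCH}/cms/cmssw/CMSSW_12_6_5; "
        "eval $(scramv1 runtime -sh)"  # assumption: what the cmsenv alias expands to
    )
    os.environ.update(env_after(snippet))


# client.register_worker_callbacks(worker_cmssw_env)  # needs the live cluster
```

Changes to `LD_LIBRARY_PATH` made this way affect subprocesses started by the worker, not libraries already loaded into the worker process itself.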

Or set an environment variable on each worker:

In [7]:
def set_env(dask_worker):
    # Expose the worker's local scratch directory as HOME_DIR.
    import pathlib, os
    path = str(pathlib.Path(dask_worker.local_directory))
    os.environ["HOME_DIR"] = path

In [8]:
client.register_worker_callbacks(set_env)

{'tls://red-c7122.unl.edu:32813': {'status': 'OK'},
 'tls://red-c7122.unl.edu:34261': {'status': 'OK'},
 'tls://red-c7122.unl.edu:42765': {'status': 'OK'},
 'tls://red-c7122.unl.edu:43661': {'status': 'OK'},
 'tls://red-c7122.unl.edu:46041': {'status': 'OK'}}
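
Before registering a callback on the cluster, it can be handy to dry-run it locally. The sketch below repeats `set_env` and calls it with a stand-in object that only carries a `local_directory` attribute. The stand-in and the temporary directory are assumptions for illustration, not part of the Dask API.

```python
import os
import pathlib
import tempfile
from types import SimpleNamespace


def set_env(dask_worker):
    """Point HOME_DIR at the worker's local scratch directory."""
    os.environ["HOME_DIR"] = str(pathlib.Path(dask_worker.local_directory))


# Stand-in for a real worker: only the attribute used by set_env is provided.
fake_worker = SimpleNamespace(local_directory=tempfile.gettempdir())
set_env(fake_worker)
print(os.environ["HOME_DIR"])

# On the cluster, the value seen by each worker can be read back with:
# client.run(lambda: os.environ.get("HOME_DIR"))
```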

You can also write your own plugins:

In [25]:
from dask.distributed import WorkerPlugin
class ErrorLogger(WorkerPlugin):
    def __init__(self, logger):
        self.logger = logger

    def setup(self, worker):
        self.worker = worker

    def transition(self, key, start, finish, *args, **kwargs):
        if finish == 'error':
            ts = self.worker.tasks[key]
            exc_info = (type(ts.exception), ts.exception, ts.traceback)
            self.logger.error(
                "Error during computation of '%s'.", key,
                exc_info=exc_info
            )

In [26]:
import logging
plugin = ErrorLogger(logging)
client.register_plugin(plugin) 

{'tls://red-c7123.unl.edu:35675': {'status': 'OK'},
 'tls://red-c7123.unl.edu:37019': {'status': 'OK'}}
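
As a second illustration of the same `transition` hook, the sketch below counts how many task transitions end in each state on a worker. The class name `TransitionCounter` and the plugin name `transition-counter` are hypothetical. The `try`/`except` fallback exists only so the snippet can be exercised on a machine without `distributed` installed.

```python
from collections import Counter

try:
    from dask.distributed import WorkerPlugin
except ImportError:  # fallback so the sketch runs without distributed installed
    WorkerPlugin = object


class TransitionCounter(WorkerPlugin):
    """Count task transitions on a worker, grouped by the state they end in."""

    name = "transition-counter"  # hypothetical name, used to look the plugin up later

    def setup(self, worker):
        self.counts = Counter()

    def transition(self, key, start, finish, **kwargs):
        self.counts[finish] += 1


# Local dry run, calling the hooks by hand:
counter = TransitionCounter()
counter.setup(worker=None)
counter.transition("task-1", "executing", "memory")
counter.transition("task-2", "executing", "error")
print(dict(counter.counts))

# On the cluster (needs the live client; assumes workers expose a `plugins` dict):
# client.register_plugin(TransitionCounter())
# client.run(lambda dask_worker: dict(dask_worker.plugins["transition-counter"].counts))
```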

You can also upload a file to the workers with the `UploadFile` plugin (`UploadDirectory` does not work on coffea-casa for now; we are fixing it):

In [46]:
from distributed.diagnostics.plugin import UploadFile

client.register_plugin(UploadFile("test_cc.py"))  

{'tls://red-c7123.unl.edu:35675': {'status': 'OK'},
 'tls://red-c7123.unl.edu:37019': {'status': 'OK'}}
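
Once the file is uploaded, tasks should be able to import it like any other module, since `UploadFile` copies it to each worker and makes it importable there. A minimal sketch, assuming `test_cc.py` defines a function called `run` (a hypothetical name, as the contents of the file are not shown here):

```python
import importlib


def call_uploaded(module_name, func_name, *args, **kwargs):
    """Import `module_name` where the task runs and call one of its functions."""
    module = importlib.import_module(module_name)
    return getattr(module, func_name)(*args, **kwargs)


# On the cluster (needs the live client and the uploaded file):
# future = client.submit(call_uploaded, "test_cc", "run")
# print(future.result())
```

Importing inside the task, rather than at the top of the notebook, means the module only has to exist on the workers.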