Suppose you have custom modules named `pipeline` and `external_fns` that you want to use on a Coiled cluster, and suppose they live in folders within your current Python working directory.

These can be installed on the workers of a Dask cluster with the built-in Dask Distributed Nanny plugin `UploadDirectory` (http://distributed.dask.org/en/stable/plugins.html). The workers must also know where to find the uploaded modules.

In principle, the kwarg `update_path=True` should take care of that, but at present it does not seem to be sufficient, at least when not working with a LocalCluster.

For a LocalCluster, it appears that the system path entry pointing to the current working directory is enough for workers to find the modules. On a Coiled cluster, for now at least, it is necessary to programmatically ensure that the worker paths are updated.
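
The role of the system path can be illustrated locally, without any cluster. The following is a minimal sketch using only the standard library; the package name `demo_pipeline_pkg` is hypothetical and stands in for an uploaded folder such as `pipeline`. It shows that a package on disk cannot be imported until its parent directory is on `sys.path`, which is the situation the workers are assumed to be in after the upload.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical package name, used only for this demonstration.
PACKAGE = "demo_pipeline_pkg"

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny package on disk, similar to a folder uploaded to a worker.
    pkg_dir = pathlib.Path(tmp) / PACKAGE
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("VALUE = 42\n")

    # The parent directory is not on sys.path yet, so the import fails.
    try:
        importlib.import_module(PACKAGE)
        found_before = True
    except ModuleNotFoundError:
        found_before = False

    # Same idea as the worker-side fix: put the parent directory on sys.path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    module = importlib.import_module(PACKAGE)
    found_after = module.VALUE == 42

    # Tidy up so the demonstration leaves the interpreter unchanged.
    sys.path.remove(tmp)
    sys.modules.pop(PACKAGE, None)

print(found_before, found_after)
```

The upload copies the files, and the path update is what makes them importable; both steps are needed.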

This notebook shows how to use the `UploadDirectory` Nanny plugin and how to update the worker paths.

Code based on input from Kelsey Skvoretz (https://github.com/skvorekn; https://github.com/skvorekn/repr-coiled-upl-dir) and James Bourbeau (https://github.com/jrbourbeau)

In [1]:
# Imports you will need
# In addition to your custom modules, your current environment needs
# dask and distributed, with all of their dependencies. These are already
# included in a default Coiled environment.

# Standard Packages
import os

# Specialty Packages
from dask.distributed import Client, LocalCluster
from distributed.diagnostics.plugin import UploadDirectory

# Coiled
import coiled

# Your custom modules.
from external_fns.misc import get_prefix
from pipeline.functions.item_level import runner

In [2]:
# Create a Cluster

get_prefix()

cluster = coiled.Cluster(
            name='upload-directory-test',
            n_workers=4,
            worker_cpu=1,
            # worker_class='distributed.Nanny'   
        )
client = Client(cluster)

client.wait_for_workers(n_workers=4)

print("Created client")


Created fw rule: inbound [8786-8787] [0.0.0.0/0] []
Created FW rules: coiled-dask-greg-sm27-100462-firewall
Created fw rule: cluster [0-65535] [None] [coiled-dask-greg-sm27-100462-firewall -> coiled-dask-greg-sm27-100462-firewall]
Created FW rules: coiled-dask-greg-sm27-100462-cluster-firewall
Created fw rule: cluster [0-65535] [None] [coiled-dask-greg-sm27-100462-cluster-firewall -> coiled-dask-greg-sm27-100462-cluster-firewall]
Created scheduler VM: coiled-dask-greg-sm27-100462-scheduler (type: t3a.medium, ip: ['44.192.5.182'])



+-------------+-----------+-----------+---------+
| Package     | client    | scheduler | workers |
+-------------+-----------+-----------+---------+
| dask        | 2021.12.0 | 2022.01.0 | None    |
| distributed | 2021.12.0 | 2022.01.0 | None    |
| msgpack     | 1.0.2     | 1.0.3     | None    |
+-------------+-----------+-----------+---------+
Notes: 
-  msgpack: Variation is ok, as long as everything is above 0.6


Created client


In [3]:
# Show the parent of each worker's local directory (where uploaded modules land)

def show_path(dask_worker):
    import pathlib
    path = str(pathlib.Path(dask_worker.local_directory).parent)   
    return path

In [4]:
client.run(show_path)

{'tls://10.4.4.5:38177': '/dask-worker-space',
 'tls://10.4.6.228:43067': '/dask-worker-space',
 'tls://10.4.8.214:36103': '/dask-worker-space',
 'tls://10.4.9.139:37375': '/dask-worker-space'}

In [5]:
# Function to update paths on workers & code to upload modules

def update_path(dask_worker):
    import pathlib
    import sys
    path = str(pathlib.Path(dask_worker.local_directory).parent)
    if path not in sys.path:
        sys.path.insert(0, path)

client.run(update_path)

# update_path=False because the worker paths were already updated by hand above.
plugin = UploadDirectory('pipeline', update_path=False, restart=False)
client.register_worker_plugin(plugin)

print("Client Path Updated")

Client Path Updated
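
Because `update_path` returns nothing, `client.run(update_path)` gives no feedback about what happened on each worker. The sketch below is a hypothetical variant, `update_path_report`, that returns the path and whether it was newly added. It is exercised here against a stand-in object rather than a real worker; the only assumption is that a worker exposes the `local_directory` attribute already used in this notebook.

```python
import sys
import types


def update_path_report(dask_worker):
    """Like update_path, but return what happened so client.run can display it."""
    import pathlib
    import sys

    path = str(pathlib.Path(dask_worker.local_directory).parent)
    added = path not in sys.path
    if added:
        sys.path.insert(0, path)
    return {"path": path, "added": added}


# Local dry run with a stand-in object (not a real Dask worker).
fake_worker = types.SimpleNamespace(
    local_directory="/tmp/fake-worker-space/worker-abc"
)
first = update_path_report(fake_worker)
second = update_path_report(fake_worker)

# Undo the dry run so the local interpreter is left unchanged.
sys.path.remove(first["path"])

print(first["added"], second["added"])

# Assumed usage against the cluster in this notebook:
# client.run(update_path_report)
```

On a cluster, the returned dictionary would appear once per worker address, in the same shape as the `show_path` output.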


In [6]:
client.run(show_path)

{'tls://10.4.4.5:38177': '/dask-worker-space',
 'tls://10.4.6.228:43067': '/dask-worker-space',
 'tls://10.4.8.214:36103': '/dask-worker-space',
 'tls://10.4.9.139:37375': '/dask-worker-space'}
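
Note that `show_path` reports the directory itself, not whether it is on `sys.path`. A more direct check is to ask each worker whether it can locate the module. The helper below, `module_is_importable`, is a hypothetical name and a sketch only; it relies on the standard library's `importlib.util.find_spec` and is tried locally here on modules that are unrelated to this notebook.

```python
def module_is_importable(module_name):
    """Return True if `module_name` can be located by the current process."""
    # Imported inside the function so it is self-contained when sent to a worker.
    import importlib.util

    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises instead of returning None.
        return False


# Local check with a standard-library module and a name that should not exist.
has_json = module_is_importable("json")
has_missing = module_is_importable("no_such_module_abc123")
has_missing_child = module_is_importable("no_such_module_abc123.child")
print(has_json, has_missing, has_missing_child)

# Assumed usage against the cluster in this notebook:
# client.run(module_is_importable, "pipeline.functions.item_level")
```

For dotted names, `find_spec` imports the parent packages, so the check also runs their `__init__.py` files on the worker.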

In [7]:
# See what the directory structure looks like
def test_func():
    dirs = []
    for d in os.walk('dask-worker-space'):
        dirs.append(d)
    return dirs

# job = client.submit(test_func)
# print(job.result())

client.run(test_func)

# Example output:
# [
#     (
#         'dask-worker-space',
#         ['worker-pvqyc2yh', 'pipeline'],
#         ['worker-pvqyc2yh.dirlock', 'purge.lock', 'global.lock']
#     ),
#     ('dask-worker-space/worker-pvqyc2yh', ['storage'], []),
#     ('dask-worker-space/worker-pvqyc2yh/storage', [], []),
#     ('dask-worker-space/pipeline', ['functions'], ['__init__.py', 'errors.py']),
#     ('dask-worker-space/pipeline/functions', [], ['__init__.py', 'item_level.py'])
# ]

{'tls://10.4.4.5:38177': [('dask-worker-space',
   ['pipeline', 'worker-c92zdhe6'],
   ['purge.lock', 'global.lock', 'worker-c92zdhe6.dirlock']),
  ('dask-worker-space/pipeline', ['functions'], ['__init__.py', 'errors.py']),
  ('dask-worker-space/pipeline/functions',
   ['.ipynb_checkpoints'],
   ['__init__.py', 'item_level.py']),
  ('dask-worker-space/pipeline/functions/.ipynb_checkpoints',
   [],
   ['__init__-checkpoint.py', 'item_level-checkpoint.py']),
  ('dask-worker-space/worker-c92zdhe6', ['storage'], []),
  ('dask-worker-space/worker-c92zdhe6/storage', [], [])],
 'tls://10.4.6.228:43067': [('dask-worker-space',
   ['pipeline', 'worker-kr5pxl7s'],
   ['purge.lock', 'worker-kr5pxl7s.dirlock', 'global.lock']),
  ('dask-worker-space/pipeline', ['functions'], ['__init__.py', 'errors.py']),
  ('dask-worker-space/pipeline/functions',
   ['.ipynb_checkpoints'],
   ['__init__.py', 'item_level.py']),
  ('dask-worker-space/pipeline/functions/.ipynb_checkpoints',
   [],
   ['__init__-chec
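
The raw `os.walk` output is hard to read, and it shows that `.ipynb_checkpoints` folders were uploaded along with the package. The following sketch defines a hypothetical helper, `list_files`, that returns a flat, sorted list of relative file paths and skips such folders when listing. It only changes what is displayed, not what is uploaded; the plugin documentation linked above is the place to check for options that control the upload itself.

```python
import pathlib
import tempfile


def list_files(root, skip_dirs=(".ipynb_checkpoints", "__pycache__")):
    """Return sorted file paths under `root`, relative to it, skipping `skip_dirs`."""
    # Imported inside the function so it is self-contained when sent to a worker.
    import os

    found = []
    for current, dirs, files in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            rel = os.path.relpath(os.path.join(current, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


# Local demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp) / "pipeline"
    (base / ".ipynb_checkpoints").mkdir(parents=True)
    (base / "__init__.py").write_text("")
    (base / ".ipynb_checkpoints" / "__init__-checkpoint.py").write_text("")
    listing = list_files(tmp)

print(listing)

# Assumed usage against the cluster in this notebook:
# client.run(list_files, "dask-worker-space")
```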

In [8]:
# Show that everything works.
runner(client)

['test', 'list']
['test', 'list']
['test', 'list']
['test', 'list']
['test', 'list']


In [9]:
coiled.list_clusters()

{'upload-directory-test': {'id': 100462,
  'status': 'running',
  'account': 'greg-smith',
  'private_address': 'tls://10.4.14.205:8786',
  'dashboard_address': 'http://44.192.5.182:8787',
  'configuration': 434,
  'vm_type': 't3a.medium',
  'workers': [{'name': 'coiled-dask-greg-sm27-100462-worker-d3f7045e4c',
    'status': 'running',
    'vm_type': 'm5zn.large'},
   {'name': 'coiled-dask-greg-sm27-100462-worker-038a7bd11e',
    'status': 'running',
    'vm_type': 'm5zn.large'},
   {'name': 'coiled-dask-greg-sm27-100462-worker-c5f0eba3a8',
    'status': 'running',
    'vm_type': 'm5zn.large'},
   {'name': 'coiled-dask-greg-sm27-100462-worker-a945bf5e6a',
    'status': 'running',
    'vm_type': 'm5zn.large'}],
  'address': 'tls://44.192.5.182:8786'}}

In [10]:
def get_status(dask_worker):
    return dask_worker.status

client.run(get_status)

{'tls://10.4.4.5:38177': <Status.running: 'running'>,
 'tls://10.4.6.228:43067': <Status.running: 'running'>,
 'tls://10.4.8.214:36103': <Status.running: 'running'>,
 'tls://10.4.9.139:37375': <Status.running: 'running'>}

In [11]:
# Clean up

client.close()
cluster.close()