dbalabka/dask-module-upload-plugin-demo

Dask Module Upload Plugin Demo

This repository demonstrates a solution to the problem of distributing local Python packages to a Dask cluster.

The Problem

Code that runs on a Dask cluster often depends on your own modules and packages. Dask's built-in upload mechanism, Client.upload_file, sends one file at a time (a .py module, or a pre-built .egg or .zip archive); it does not package a local package directory for you. This makes it hard to distribute anything larger than a single-file module to the Dask scheduler and workers.

This limitation is discussed in the Dask community here:

The Solution

This demo showcases a Dask plugin that allows you to upload entire Python packages to your Dask cluster. This simplifies dependency management and makes it easier to work with your own libraries in a distributed environment.

The implementation of this solution is proposed in this pull request:

Demo

The demo.ipynb notebook in this repository provides a hands-on example of the plugin in action. It walks you through the following steps:

  1. Setting up a Dask cluster: A Dask cluster is created on Google Cloud Platform.
  2. Demonstrating the problem: It shows that trying to use a local package (dask_module_upload_plugin_demo) in a distributed task fails with a ModuleNotFoundError.
  3. Using the plugin: The UploadModule and SchedulerUploadModule plugins are registered with the Dask client.
  4. Success! The same distributed task is executed again, and this time it succeeds because the plugin has uploaded the necessary package to the cluster.
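
The failure in step 2 and the success in step 4 can be reproduced in a single process without a cluster. The sketch below is not the plugin's implementation; it only illustrates one general way to ship a whole package, which is to zip the package directory and put the archive on the import path. The package name hypothetical_pkg and the helper zip_package are made up for illustration.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def zip_package(package_dir: Path, archive_path: Path) -> Path:
    """Write every .py file of a package directory into a zip archive."""
    package_dir = package_dir.resolve()
    with zipfile.ZipFile(archive_path, "w") as archive:
        for source in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the package's parent directory so the
            # archive root contains the package directory itself.
            archive.write(source, source.relative_to(package_dir.parent).as_posix())
    return archive_path


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical stand-in for a local package that the "worker" lacks.
        pkg = root / "src" / "hypothetical_pkg"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "tasks.py").write_text("def greet():\n    return 'hello from package'\n")

        # Before "upload": the package is not importable.
        try:
            importlib.import_module("hypothetical_pkg.tasks")
        except ModuleNotFoundError:
            pass
        else:
            raise AssertionError("package was unexpectedly importable")

        # After "upload": the archive is on sys.path, so the import succeeds.
        archive = zip_package(pkg, root / "hypothetical_pkg.zip")
        sys.path.insert(0, str(archive))
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("hypothetical_pkg.tasks")
            return module.greet()
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(str(archive))
            sys.modules.pop("hypothetical_pkg.tasks", None)
            sys.modules.pop("hypothetical_pkg", None)


if __name__ == "__main__":
    print(demo())
```

On a real cluster the archive would also have to reach every worker and the scheduler, which is the part the plugins in this demo take care of.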

Usage

To run the demo, you need Python, Jupyter Notebook, and Poetry, which installs the dependencies listed in pyproject.toml.

  1. Clone this repository.
  2. Install the dependencies: poetry install
  3. Set up your GCP credentials. You can do this by creating a .env file with the following content:
    GCP_PROJECT_ID=...
    GCP_ZONE=...
    
  4. Open and run the demo.ipynb notebook.
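
If the notebook cannot find your GCP settings, it may help to check the .env file on its own. The following is a minimal standard-library sketch that parses KEY=VALUE lines and verifies that both variables are set. How the notebook itself loads the file is not shown here, and the helper names parse_env and require_gcp_settings are hypothetical.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_gcp_settings(values: dict[str, str]) -> tuple[str, str]:
    """Return (project id, zone) or raise if either one is missing or empty."""
    missing = [name for name in ("GCP_PROJECT_ID", "GCP_ZONE") if not values.get(name)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return values["GCP_PROJECT_ID"], values["GCP_ZONE"]


if __name__ == "__main__":
    from pathlib import Path

    # Reads the .env file from the current working directory.
    project_id, zone = require_gcp_settings(parse_env(Path(".env").read_text()))
    print(f"project={project_id} zone={zone}")
```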
