
Update parallelism example to work with latest sklearn and dask

dnouri committed Sep 13, 2019
1 parent a361bc1 commit 10c3a38c6a2f5bbec38622f61039702a44617d2b
Showing with 17 additions and 15 deletions.
  1. +13 −10 docs/user/parallelism.rst
  2. +4 −5 examples/rnn_classifer/
@@ -32,25 +32,28 @@ packages required to do the work::

     CUDA_VISIBLE_DEVICES=0 dask-worker --nthreads 1
     CUDA_VISIBLE_DEVICES=1 dask-worker --nthreads 1

-In your code, use joblib's ``parallel_backend`` to choose the Dask
-backend for grid searches and the like. Remember to also import the
-``distributed.joblib`` module, as that will register the joblib
-backend. Let's see how this could look like:
+In your code, use joblib's :func:`~joblib.parallel_backend` context
+manager to activate the Dask backend when you run grid searches and
+the like. Also instantiate a :class:`dask.distributed.Client` to
+point to the Dask scheduler that you want to use. Let's see what this
+could look like:

 .. code:: python

-    import distributed.joblib  # imported for side effects
-    from sklearn.externals.joblib import parallel_backend
+    from dask.distributed import Client
+    from joblib import parallel_backend
+
+    client = Client('')

     X, y = load_my_data()
-    model = get_that_model()
+    net = get_that_net()

     gs = GridSearchCV(
-        param_grid={'net__lr': [0.01, 0.03]},
+        param_grid={'lr': [0.01, 0.03]},

-    with parallel_backend('dask.distributed', scheduler_host=''):
+    with parallel_backend('dask'):, y)
@@ -7,12 +7,11 @@
         'param_grid': {'__copy__': 'grid_search.param_grid'},
         'scoring': {'__copy__': 'scoring'},
-        'backend': 'dask.distributed',
-        'scheduler_host': '',
+        'backend': 'dask',

-    '_init_distributed': {
-        '__factory__': 'palladium.util.resolve_dotted_name',
-        'dotted_name': 'distributed.joblib.joblib',
+    '_init_client': {
+        '__factory__': 'dask.distributed.Client',
+        'address': '',
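The config half of the change relies on Palladium's ``__factory__`` mechanism: a dotted name is resolved to a callable, which is then invoked with the remaining keys as keyword arguments, so the new ``'_init_client'`` entry simply constructs a ``dask.distributed.Client``. A rough sketch of that resolution logic, assuming stdlib ``importlib`` and a hypothetical ``build_from_config`` helper (not Palladium's actual implementation), with a stdlib factory standing in for ``Client``:

```python
from importlib import import_module

def resolve_dotted_name(dotted_name):
    """Resolve e.g. 'dask.distributed.Client' to the Client class."""
    module_name, _, attr = dotted_name.rpartition('.')
    return getattr(import_module(module_name), attr)

def build_from_config(config):
    """Call the '__factory__' callable with the remaining keys as kwargs."""
    config = dict(config)  # don't mutate the caller's dict
    factory = resolve_dotted_name(config.pop('__factory__'))
    return factory(**config)

# Demo with a stdlib factory instead of dask.distributed.Client:
spec = {'__factory__': 'datetime.timedelta', 'seconds': 30}
delta = build_from_config(spec)
print(delta.total_seconds())  # 30.0
```

Under this reading, the old config resolved ``distributed.joblib`` purely for its import side effects, while the new one produces a live ``Client`` object pointed at the scheduler address.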
