Add PipInstall WorkerPlugin #3216
Conversation
Examples
--------

```python
>>> from dask.distributed import PipInstall
>>> plugin = PipInstall(packages=["dask", "scikit-learn"])
>>> client.register_worker_plugin(plugin)
```
distributed/diagnostics/plugin.py

```python
name = "pip"

def __init__(self, packages=[], restart=False):
```
Possibly would want to accept an argument for options passed to pip, things like `--upgrade`, `--pre`, etc. So perhaps a `pip_options=None` that's a list of strings, just like in Popen.
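A hypothetical sketch of how such a `pip_options` argument might be forwarded to pip (the helper name and signature here are assumptions for illustration, not the plugin's actual API):

```python
import sys


def build_pip_command(packages, pip_options=None):
    # Hypothetical helper: assemble the pip invocation the plugin might run.
    # `pip_options` is a list of strings passed through verbatim, just like
    # the argument list handed to subprocess.Popen.
    pip_options = pip_options or []
    return [sys.executable, "-m", "pip", "install", *pip_options, *packages]


cmd = build_pip_command(["dask", "scikit-learn"], pip_options=["--upgrade"])
```

Keeping the options as a flat list of strings avoids having to model every pip flag explicitly.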
Pushed a few changes here. @jcrist do you have any objections to the overall idea of doing this in a worker plugin?
Conceptually there are issues here around restarting. We want to restart things if we have different versions than what we just installed. However it's also possible that another process just updated our versions, and we should restart, even though when we call `pip install` everything seems like it is already installed.
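One conceivable way to detect that situation, sketched here purely as an assumption (not what the plugin actually does): compare the version a package reports in-process against the version recorded in its on-disk distribution metadata.

```python
import importlib
from importlib import metadata


def needs_restart(packages):
    # Sketch: if a package's __version__ loaded in this process differs from
    # the distribution metadata currently on disk, another process likely
    # upgraded it, so a restart is warranted even though `pip install` would
    # report everything as already satisfied.
    for name in packages:
        try:
            mod = importlib.import_module(name)
            on_disk = metadata.version(name)
        except (ImportError, metadata.PackageNotFoundError):
            continue  # not importable or no dist metadata; nothing stale in memory
        loaded = getattr(mod, "__version__", None)
        if loaded is not None and loaded != on_disk:
            return True
    return False
```

This only catches packages already imported into the running worker; distributions whose import name differs from their project name would need a mapping.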
What's the status of this effort? I just ran into an issue running dask-gateway with s3fs missing on the workers, and I'm trying to find the path of least resistance.
I pushed a couple updates to
I think this should be good to go.
I think that there are likely still issues when you have multiple workers on the same machine pip/conda installing things. However, this may be an uncommon enough case among those likely to use this that we shouldn't care. I'm not sure. I think that it's totally ok to merge anyway.
@mrocklin, to clarify, are you saying there are issues with multiple clients trying to issue (possibly contradictory or overlapping) commands to the workers of the same cluster to update packages? @TomAugspurger I am trying to test this (rudimentarily) by doing this. Am I doing it wrong? I am still getting a "package not found" error for s3fs from the workers:
I'm saying that if multiple workers are on the same filesystem (such as happens when you say `dask-worker --nprocs 2`) then they will both try to pip/conda install and both try to restart, sort of. Things can fail easily in that situation.
What you're suggesting is also an issue, but one that seems much less likely to occur.
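One conceivable mitigation for that race, offered here only as a sketch and not part of this PR: workers sharing a filesystem could serialize their installs with an advisory file lock (Unix-only, via `fcntl`).

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def install_lock(name="dask-worker-install.lock"):
    # Workers on the same machine contend for an exclusive advisory lock on
    # a shared temp file, so only one pip/conda install runs at a time.
    path = os.path.join(tempfile.gettempdir(), name)
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with install_lock():
    installed = True  # placeholder for the actual pip/conda invocation
```

This serializes the installs but does not by itself coordinate the subsequent restarts, which is the harder half of the problem described above.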
I wonder if the installation / restart fails to complete before you start your computation requiring s3fs? The issue with multiple workers sharing a filesystem is a good one I hadn't considered. I'm OK with just documenting that shortcoming for now.
I had wondered the same thing, but if that were the case, running the same operation twice would result in a positive response. That being said, it didn't appear that the workers restarted in Kubernetes, so maybe I am doing something else wrong. I think maybe there is also a chance something weird with conda is happening; the command that finally worked when run on each worker was:
So I had to use conda and downgrade s3fs (I was getting an error from fsspec about "other_paths" with 0.5.0 and 0.5.1). For context, I'm trying to avoid forking the helm chart for daskhub, which is why I'm spending more time trying to figure out how to do this via plugin. It's possible the issue is actually related to conda environments.
Update: I was able to get my issue resolved by building my own conda install worker plugin.
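That conda variant isn't part of this PR; below is only a rough sketch of what such a plugin could look like. The class, its `name`, and the conda invocation are all assumptions; for real use it would subclass distributed's `WorkerPlugin` and be registered with `client.register_worker_plugin(...)`, but it is shown as a plain class so the sketch stays self-contained.

```python
import subprocess


class CondaInstall:
    # Sketch only: a plain class standing in for a distributed WorkerPlugin.
    name = "conda"

    def __init__(self, packages, channel="conda-forge"):
        self.packages = list(packages)
        self.channel = channel

    def command(self):
        # Assumed conda invocation; adjust channel and flags as needed.
        return ["conda", "install", "-y", "-c", self.channel, *self.packages]

    def setup(self, worker=None):
        # Would be called once per worker; runs the install synchronously.
        subprocess.check_call(self.command())


plugin = CondaInstall(["s3fs"])
```

Pinning a version (e.g. `["s3fs=0.4.2"]`) in the package spec would reproduce the downgrade described above.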
What do you mean by this? I don't believe this would be visible at the Kubernetes level (other than via logs), since no pods / processes are going away.
Yeah, I meant via logs; I was running
Gotcha, thanks for clarifying.
I'm planning to merge this later today.
+1