Add PipInstall WorkerPlugin #3216
Conversation
Examples
--------

```python
>>> from dask.distributed import PipInstall
>>> plugin = PipInstall(packages=["dask", "scikit-learn"])
>>> client.register_worker_plugin(plugin)
```
distributed/diagnostics/plugin.py

```python
name = "pip"

def __init__(self, packages=[], restart=False):
```
Possibly would want to accept an argument for options passed to pip, things like `--upgrade`, `--pre`, etc. So perhaps a `pip_options=None` that's a list of strings, just like in Popen.
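A hypothetical sketch of how such a `pip_options` argument might be forwarded to pip (the helper name and signature here are assumptions for illustration, not the plugin's actual API):

```python
import sys


def build_pip_command(packages, pip_options=None):
    # Hypothetical helper: assemble the pip invocation the plugin might run.
    # `pip_options` is a list of strings passed through verbatim, just like
    # the argument list handed to subprocess.Popen.
    pip_options = pip_options or []
    return [sys.executable, "-m", "pip", "install", *pip_options, *packages]


cmd = build_pip_command(["dask", "scikit-learn"], pip_options=["--upgrade"])
```

Keeping the options as a flat list of strings avoids having to model every pip flag explicitly.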
Pushed a few changes here. @jcrist do you have any objections to the overall idea of doing this in a worker plugin?
Conceptually there are issues here around restarting. We want to restart things if we have different versions than what we just installed. However it's also possible that another process just updated our versions, and we should restart, even though when we call `pip install` everything seems like it is already installed.
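One conceivable way to detect that situation, sketched here purely as an assumption (not what the plugin actually does): compare the version a package reports in-process against the version recorded in its on-disk distribution metadata.

```python
import importlib
from importlib import metadata


def needs_restart(packages):
    # Sketch: if a package's __version__ loaded in this process differs from
    # the distribution metadata currently on disk, another process likely
    # upgraded it, so a restart is warranted even though `pip install` would
    # report everything as already satisfied.
    for name in packages:
        try:
            mod = importlib.import_module(name)
            on_disk = metadata.version(name)
        except (ImportError, metadata.PackageNotFoundError):
            continue  # not importable or no dist metadata; nothing stale in memory
        loaded = getattr(mod, "__version__", None)
        if loaded is not None and loaded != on_disk:
            return True
    return False
```

This only catches packages already imported into the running worker; distributions whose import name differs from their project name would need a mapping.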
What's the status of this effort? I just ran into an issue running dask-gateway with s3fs missing on the workers, and I'm trying to find the path of least resistance.
I pushed a couple updates to
I think this should be good to go.
I think that there are likely still issues when you have multiple workers on the same machine pip/conda installing things. However, this may be an uncommon enough case among those likely to use this that we shouldn't care. I'm not sure. I think that it's totally ok to merge anyway.
@mrocklin, to clarify, are you saying there are issues with multiple clients trying to issue (possibly contradictory or overlapping) commands to the workers of the same cluster to update packages? @TomAugspurger I am trying to test this (rudimentarily) by doing this. Am I doing it wrong? I am still getting a "package not found" error for s3fs from the workers:
I'm saying that if multiple workers are on the same filesystem (such as happens when you say `dask-worker --nprocs 2`) then they will both try to pip/conda install and both try to restart, sort of. Things can fail easily in that situation.
What you're suggesting is also an issue, but one that seems much less likely to occur.
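One conceivable mitigation for that race, offered here only as a sketch and not part of this PR: workers sharing a filesystem could serialize their installs with an advisory file lock (Unix-only, via `fcntl`).

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def install_lock(name="dask-worker-install.lock"):
    # Workers on the same machine contend for an exclusive advisory lock on
    # a shared temp file, so only one pip/conda install runs at a time.
    path = os.path.join(tempfile.gettempdir(), name)
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with install_lock():
    installed = True  # placeholder for the actual pip/conda invocation
```

This serializes the installs but does not by itself coordinate the subsequent restarts, which is the harder half of the problem described above.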
I wonder if the installation / restart fails to complete before you start your computation requiring s3fs? The issue with multiple workers sharing a filesystem is a good one I hadn't considered. I'm OK with just documenting that shortcoming for now.
I had wondered the same thing, but if that were the case, running the same operation twice would result in a positive response. That being said, it didn't appear that the workers restarted in Kubernetes, so maybe I am doing something else wrong. I think maybe there is also a chance something weird with conda is happening; the command that finally worked when run on each worker was:
So I had to use conda and downgrade s3fs (I was getting an error from fsspec about "other_paths" with 0.5.0 and 0.5.1). For context, I'm trying to avoid forking the helm chart for daskhub, which is why I'm spending more time trying to figure out how to do this via plugin. It's possible the issue is actually related to conda environments.
Update: I was able to get my issue resolved by building my own conda install worker plugin.
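That conda variant isn't part of this PR; below is only a rough sketch of what such a plugin could look like. The class, its `name`, and the conda invocation are all assumptions; for real use it would subclass distributed's `WorkerPlugin` and be registered with `client.register_worker_plugin(...)`, but it is shown as a plain class so the sketch stays self-contained.

```python
import subprocess


class CondaInstall:
    # Sketch only: a plain class standing in for a distributed WorkerPlugin.
    name = "conda"

    def __init__(self, packages, channel="conda-forge"):
        self.packages = list(packages)
        self.channel = channel

    def command(self):
        # Assumed conda invocation; adjust channel and flags as needed.
        return ["conda", "install", "-y", "-c", self.channel, *self.packages]

    def setup(self, worker=None):
        # Would be called once per worker; runs the install synchronously.
        subprocess.check_call(self.command())


plugin = CondaInstall(["s3fs"])
```

Pinning a version (e.g. `["s3fs=0.4.2"]`) in the package spec would reproduce the downgrade described above.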
What do you mean by this? I don't believe this would be visible at the Kubernetes level (other than via logs), since no pods / processes are going away.
Yeah, I meant via logs; I was running
Gotcha, thanks for clarifying.
I'm planning to merge this later today.
+1