Greedy cluster client #36
Fixes #35 |
Confirmed that this also works nicely when I restart. Pretty slick @ian-r-rose ! My guess is that we don't want this behavior on by default at first. Thoughts on the right way to manage configuration? |
For context: @ian-r-rose and I were talking about ways to enable administrators to create clusters for users that they never have to interact with. For example, I think that the combination of this PR along with #37 will allow for Pangeo or example.dask.org users to just start with |
I agree that this behavior should not be on by default. I think the best way to approach it would be via the JupyterLab settings system. |
a client for the currently active Dask cluster.
selected if available.
Okay, I think this is in reasonable shape. The current behavior:
|
@ian-r-rose - this looks really cool. I'll try to give it a test spin next week. Also pinging @jacobtomlinson and @niallrobinson for comment since this is something they have been interested in. |
@jhamman Did you get a chance to play with this? |
I suggest the name auto-start rather than greedy. Is there a way for us to control this in a Jupyter config file? If so, what does that config file look like? |
Also, we can maybe drop the term |
Thanks for the suggestions @mrocklin. If I understand correctly, you are suggesting something like
Is that right? This is driven by the client-side, so it can't be controlled in the server-side |
I'm questioning the use of both the terms cluster and client. These seem to be specific to how dask works. I suspect that from a novice user's perspective they just want "Dask" to start. They may not know about clusters or clients or whatnot. |
Sure, that makes sense to me. The setting name is a bit more targeted towards experts and admins, so I might advocate being more specific there and retaining "Client." The menu item is much more user-facing, so removing "Client" works for me. Regarding including "dask": the setting name is already namespaced by being in the Dask settings, but the menu label should have "Dask" in it, as you suggest. |
OK. That makes sense to me.
|
Ping @jbusecke for a review of this. Julius is a postdoc at Princeton and a heavy user of pangeo on HPC. He mentioned that he was interested in this extension. Perhaps he can take this PR for a spin and give some feedback. |
Oh, this looks super sweet! I will definitely try to take this for a spin on the HPCs I am working on in the next few days! |
Hi @jhamman, you can toggle it in the Settings menu under "Auto-Start Dask". |
Although I am having a hard time with that binder link because clusters are not starting for some reason. |
The logs I'm getting on the binder are:
Should I be working off development versions of dask/distributed/dask-kubernetes? |
OK, it looks like we'll have to make dask-kubernetes more async friendly.
This is probably on me (or maybe @yuvipanda depending on when he gets to
some async kubernetes work)
…On Thu, Jan 10, 2019 at 8:14 PM Joe Hamman ***@***.***> wrote:
The logs I'm getting on the binder is:
[W 04:12:24.249 LabApp] object KubeCluster can't be used in 'await' expression
[I 04:12:25.435 LabApp] Client sent subprotocols: ['']
[I 04:12:25.436 LabApp] Trying to establish websocket connection to ws://127.0.0.1:8787/individual-progress/ws?bokeh-protocol-version=1.0&bokeh-session-id=CVEoIFIvycC2DEGuFlS9jd9MsdPjBoirUFNwSAjkrT6n
[I 04:12:25.440 LabApp] Websocket connection established to ws://127.0.0.1:8787/individual-progress/ws?bokeh-protocol-version=1.0&bokeh-session-id=CVEoIFIvycC2DEGuFlS9jd9MsdPjBoirUFNwSAjkrT6n
[I 04:12:25.725 LabApp] Client sent subprotocols: ['']
[I 04:12:25.726 LabApp] Trying to establish websocket connection to ws://127.0.0.1:8787/individual-task-stream/ws?bokeh-protocol-version=1.0&bokeh-session-id=Ku90IbfLbhki6h46zNHhJCXdTA1Ge0Jvjzc1uVUSDxgL
[I 04:12:25.729 LabApp] Websocket connection established to ws://127.0.0.1:8787/individual-task-stream/ws?bokeh-protocol-version=1.0&bokeh-session-id=Ku90IbfLbhki6h46zNHhJCXdTA1Ge0Jvjzc1uVUSDxgL
should I be working off development versions of
dask/distributed/dask-kubernetes?
|
dask/dask-kubernetes#116 may work? Untested |
You'll also need distributed master |
I've switched the above binder to use the local cluster (for now) and to use distributed/master. Things still don't seem to be working so I'll keep investigating as time allows.
Is it possible to set this from outside the lab environment? Ideally, I could set the default value from the |
@jhamman Yes, you can set it from outside of lab. The settings file is a JSON file on disk, so you can seed the docker image with that in place (I don't think the |
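As a rough illustration of seeding that JSON file at image-build time, here is a minimal sketch. It makes several assumptions that this thread does not confirm: that the file is named `overrides.json`, that the plugin id is `dask-labextension:plugin`, and that the setting key is `autoStartClient`. The directory to write into is also left to the caller, since the right location depends on the JupyterLab version and install prefix.

```python
import json
from pathlib import Path


def seed_dask_settings(settings_dir, auto_start=True):
    """Write a JSON settings file that turns auto-start on; return its path."""
    settings_dir = Path(settings_dir)
    settings_dir.mkdir(parents=True, exist_ok=True)
    target = settings_dir / "overrides.json"  # ASSUMPTION: file name

    # Preserve any overrides that already exist for other plugins.
    existing = json.loads(target.read_text()) if target.exists() else {}

    # ASSUMPTION: plugin id and setting key are illustrative, not confirmed here.
    plugin = existing.setdefault("dask-labextension:plugin", {})
    plugin["autoStartClient"] = auto_start

    target.write_text(json.dumps(existing, indent=2))
    return target
```

In a Dockerfile this could be called from a small build script, passing whichever settings directory the installed JupyterLab actually reads.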
Here is a commit that auto-sets the setting in your binder example. It seems to work well, but there is still something broken with the cluster config... |
Just to update this PR I think that we should get async-kubernetes working in dask-kubernetes, and then revisit this. It would also be good to test this with dask-jobqueue and dask-yarn. Mostly this makes the request that clusters can be started and stopped asynchronously. I suspect that this is already true for Dask-Jobqueue, but it would be good to test in practice (cc @guillaumeeb). I don't know the current status in dask-yarn (cc @jcrist ) |
No rush on this, happy to let people kick the wheels. That being said, isn't the async cluster starting already in master (#37)? |
Hrm, indeed...
|
Hi, Not sure what is meant by
Unfortunately I have not had the time to test dask-labextension at all yet. I'm a little behind on this: we're still using the standard Notebook with our JupyterHub at CNES (due to a probably not really complicated problem with JLab I need to work on). This looks interesting though. I'm often launching several clusters from several Notebooks, so if I could use only one, and automatically, that would be great! I'm putting this on my todo-list, but don't wait on me to merge... |
Hi @guillaumeeb, thanks for sharing your use-case! I'd like to hear how this feature works for you when you get the chance to try it.
This extension expects that different cluster implementations (e.g. We got a little bit ahead of ourselves by expecting that, however, since there is not a critical mass of implementers who have made cluster startup async yet. So now there is a bit of catch-up going on to make sure that works. |
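To make the async expectation concrete, here is a small self-contained sketch of why a log line like `object KubeCluster can't be used in 'await' expression` appears. The two classes are hypothetical stand-ins, not the real dask-kubernetes or dask-labextension code, and the `await factory()` pattern is an assumption inferred from that log message.

```python
import asyncio


class SyncCluster:
    """Hypothetical cluster that starts in its constructor and is not awaitable."""

    def __init__(self):
        self.status = "running"


class AsyncCluster:
    """Hypothetical cluster that only finishes starting when awaited."""

    def __init__(self):
        self.status = "created"

    async def _start(self):
        await asyncio.sleep(0)  # stand-in for non-blocking startup work
        self.status = "running"
        return self

    def __await__(self):
        # Defining __await__ is what makes `await AsyncCluster()` legal.
        return self._start().__await__()


async def start(cluster_factory):
    # ASSUMPTION: the caller awaits the freshly constructed cluster object.
    return await cluster_factory()


def try_start(cluster_factory):
    """Return the cluster status, or the TypeError text if it cannot be awaited."""
    try:
        return asyncio.run(start(cluster_factory)).status
    except TypeError as err:
        return f"TypeError: {err}"
```

Calling `try_start(AsyncCluster)` completes startup, while `try_start(SyncCluster)` reports a `TypeError`, which is the failure mode a cluster class hits until it supports being awaited.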
Note that I got dask-kubernetes mostly working asynchronously if anyone wants to give it a shot
|
Where by "I" I actually mean "Yuvi and I" |
So I finally managed to deploy extensions into Jupyterlab behind my corporate proxy. I'm just beginning testing dask-labextension. Before I try this PR, I just wanted to know if my environment was working as intended. Typical steps I take right now are:
Using Pangeo on binder, the notebook was started with an already defined layout: how do I do this? |
Hi @guillaumeeb, this functionality is intended to work with clusters managed by the extension (rather than ones launched in your notebook). It currently works with If those preconditions are met, then all new notebooks should get a client for that cluster auto-injected into the current Python kernel session (when the option is turned on). |
Okay, that helps a lot to know what is needed here.
Why possibly? I personally maintain dask-jobqueue, which provides implementations of Cluster, but I'm not sure about the asynchronous part. Are these the only two requirements? How do we launch a cluster with the extension? I guess I should use the New button, but nothing happens when I try to click on it on my setup. Maybe I should rather open another issue to discuss the questions from my previous comment on standard dask-labextension behavior, since they are not really related to this PR? |
I say possibly because @mrocklin and @yuvipanda just put a bunch of work into updating it, but I don't think it has been tested in this context yet.
The function to launch a new cluster is defined here: dask-labextension/dask_labextension/manager.py Lines 23 to 41 in 910f5ff
The intention is that those are the only two requirements (implements
That would be great, thanks @guillaumeeb. |
This is in. My apologies for the long delay @ian-r-rose ! |
It is unclear to me where the seemingly hashed Dask Dashboard URL visible in the above screenshots comes from. I tried to simply set it to http://127.0.0.1:8787/status, after which all the buttons lit up, and clicking on them opened a new jlab view, but no content was displayed. I guess I'm missing something? My default browser display of the status page works fine. |
Work towards having all active notebooks and consoles get references to a client for the currently active Dask cluster. The current code injected into a kernel is