
Greedy cluster client #36

Merged (7 commits, Feb 26, 2019)
Conversation

@ian-r-rose (Collaborator)

Work towards having all active notebooks and consoles get references to a client for the currently active Dask cluster. The current code injected into a kernel is

import dask; from dask.distributed import Client
dask.config.set({'scheduler-address': '${model.scheduler_address}'})
client = Client()

@ian-r-rose (Collaborator, Author)

Fixes #35

@mrocklin (Member)

Oooh, nice

[screenshot]

@mrocklin (Member)

Confirmed that this also works nicely when I restart. Pretty slick @ian-r-rose !

My guess is that we don't want this behavior on by default at first. Thoughts on the right way to manage configuration?

@mrocklin (Member)

For context: @ian-r-rose and I were talking about ways to enable administrators to create clusters that users never have to interact with directly.

For example, I think that the combination of this PR along with #37 will allow Pangeo or example.dask.org users to just start with import xarray or import iris and not think about starting Dask clusters or Dask clients at all. Hopefully we can hide that boilerplate from them.

cc @jacobtomlinson @yuvipanda @jhamman

@mrocklin (Member)

To be more explicit

[screenshot]

@ian-r-rose (Collaborator, Author)

I agree that this behavior should not be on by default. I think the best way to approach it would be via the JupyterLab settings system.

@ian-r-rose (Collaborator, Author)

Okay, I think this is in reasonable shape. The current behavior:

  • There is a setting in the JupyterLab settings system called greedyClusterClient, which defaults to false. If it is true, the behavior in this PR is enabled. It should handle live changes to that setting. I am open to suggestions for better names/descriptions for this setting.
  • There is an item in the settings menu and command palette to toggle this new behavior (Greedy Dask Client). Again, I am looking for ideas for a better short, descriptive name for this behavior.
  • When a new notebook or console is created, the code for connecting to the active cluster is injected.
  • When a kernel restarts, the client connection code is reinjected for all open notebooks and consoles.
  • When a new cluster is selected, the client connection code is reinjected for all open notebooks and consoles.
  • I have made it so that at least one cluster in the listing is always selected.
  • The currently active cluster in the listing is stored in the state database, so it should be remembered upon page refresh.

@ian-r-rose changed the title from "[WIP] Greedy cluster client" to "Greedy cluster client" on Dec 29, 2018
@jhamman (Member)

jhamman commented Dec 29, 2018

@ian-r-rose - this looks really cool. I'll try to give it a test spin next week. Also pinging @jacobtomlinson and @niallrobinson for comment since this is something they have been interested in.

@ian-r-rose (Collaborator, Author)

@jhamman Did you get a chance to play with this?

@mrocklin (Member)

mrocklin commented Jan 9, 2019

I suggest the name auto-start rather than greedy.

Is there a way for us to control this in a Jupyter config file? If so, what does that config file look like?

@mrocklin (Member)

mrocklin commented Jan 9, 2019

Also, can we maybe drop the term Cluster from the name? Maybe add Dask (if things aren't already namespaced).

@ian-r-rose (Collaborator, Author)

ian-r-rose commented Jan 9, 2019

Thanks for the suggestions @mrocklin. If I understand correctly, you are suggesting something like

  1. Name the setting autoStartClient
  2. Label the menu item Auto-Start Dask Client

Is that right?

This is driven by the client-side, so it can't be controlled in the server-side jupyter_notebook_config.py. That setting, however, is stored as a JSON file on disk, so it can be seeded into a docker image if needed.
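For instance, seeding that file could look something like the following shell sketch. The settings directory and file name are assumptions based on JupyterLab's user-settings layout, and the setting name follows the autoStartClient rename proposed above; adjust both for your deployment.

```shell
# Sketch only: the settings path and file name are assumptions based on
# JupyterLab's user-settings layout; check your deployment's actual paths.
SETTINGS_DIR="$HOME/.jupyter/lab/user-settings/dask-labextension"
mkdir -p "$SETTINGS_DIR"
cat > "$SETTINGS_DIR/plugin.jupyterlab-settings" <<'EOF'
{
    "autoStartClient": true
}
EOF
cat "$SETTINGS_DIR/plugin.jupyterlab-settings"
```

A RUN line doing the same in a Dockerfile would bake the default into the image without touching the start script.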

@mrocklin (Member)

mrocklin commented Jan 9, 2019

I'm questioning the use of both the terms cluster and client. These seem to be specific to how dask works. I suspect that from a novice user's perspective they just want "Dask" to start. They may not know about clusters or clients or whatnot.

@ian-r-rose (Collaborator, Author)

Sure, that makes sense to me. The setting name is a bit more targeted towards experts and admins, so I might advocate being a bit more specific there and retaining "Client." The menu item is much more user-facing, so removing "Client" works for me.

Regarding including "dask": the setting name is already namespaced by being in the Dask settings, but the menu label should have "Dask" in it, as you suggest.

@mrocklin (Member)

mrocklin commented Jan 9, 2019 via email

@rabernat

Ping @jbusecke for a review of this. Julius is a postdoc at Princeton and a heavy user of pangeo on HPC. He mentioned that he was interested in this extension. Perhaps he can take this PR for a spin and give some feedback.

@jbusecke

Oh, this looks super sweet! I will definitely try to take this for a spin on the HPCs I am working on in the next few days!

@jhamman (Member)

jhamman commented Jan 11, 2019

I set up a test binder with this branch today: Binder

Question: it sounds like this feature is meant to be off by default. How do I turn it on?

@ian-r-rose (Collaborator, Author)

Hi @jhamman, you can toggle it in the Settings menu under "Auto-Start Dask".

@ian-r-rose (Collaborator, Author)

Although I am having a hard time with that binder link because clusters are not starting for some reason.

@jhamman (Member)

jhamman commented Jan 11, 2019

The logs I'm getting on the binder are:

[W 04:12:24.249 LabApp] object KubeCluster can't be used in 'await' expression
[I 04:12:25.435 LabApp] Client sent subprotocols: ['']
[I 04:12:25.436 LabApp] Trying to establish websocket connection to ws://127.0.0.1:8787/individual-progress/ws?bokeh-protocol-version=1.0&bokeh-session-id=CVEoIFIvycC2DEGuFlS9jd9MsdPjBoirUFNwSAjkrT6n
[I 04:12:25.440 LabApp] Websocket connection established to ws://127.0.0.1:8787/individual-progress/ws?bokeh-protocol-version=1.0&bokeh-session-id=CVEoIFIvycC2DEGuFlS9jd9MsdPjBoirUFNwSAjkrT6n
[I 04:12:25.725 LabApp] Client sent subprotocols: ['']
[I 04:12:25.726 LabApp] Trying to establish websocket connection to ws://127.0.0.1:8787/individual-task-stream/ws?bokeh-protocol-version=1.0&bokeh-session-id=Ku90IbfLbhki6h46zNHhJCXdTA1Ge0Jvjzc1uVUSDxgL
[I 04:12:25.729 LabApp] Websocket connection established to ws://127.0.0.1:8787/individual-task-stream/ws?bokeh-protocol-version=1.0&bokeh-session-id=Ku90IbfLbhki6h46zNHhJCXdTA1Ge0Jvjzc1uVUSDxgL

should I be working off development versions of dask/distributed/dask-kubernetes?
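For what it's worth, that first log line is just the TypeError CPython raises when await is applied to an object that doesn't implement the awaitable protocol. A minimal reproduction with a stand-in class (hypothetical, not the real KubeCluster):

```python
import asyncio

class NotAwaitable:
    """Stand-in for a cluster class that does not support ``await``."""

async def main():
    try:
        await NotAwaitable()  # neither a coroutine nor defines __await__
    except TypeError as exc:
        return str(exc)

print(asyncio.run(main()))
# → object NotAwaitable can't be used in 'await' expression
```

The fix on the cluster side is to make construction awaitable, which is what the dask-kubernetes work discussed below is about.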

@mrocklin (Member)

mrocklin commented Jan 11, 2019 via email

@mrocklin (Member)

dask/dask-kubernetes#116 may work? Untested

@mrocklin (Member)

You'll also need distributed master

@jhamman (Member)

jhamman commented Jan 11, 2019

I've switched the above binder to use the local cluster (for now) and to use distributed/master. Things still don't seem to be working so I'll keep investigating as time allows.

you can toggle it in the Settings menu under "Auto-Start Dask"

Is it possible to set this from outside the lab environment? Ideally, I could set the default value from the start script in my binder config.

@ian-r-rose (Collaborator, Author)

@jhamman Yes, you can set it from outside of lab. The settings file is a JSON file on disk, so you can seed the docker image with that in place (I don't think the start script should even be necessary). I'll take a crack at it in your test branch today sometime.

@ian-r-rose (Collaborator, Author)

Here is a commit that auto-sets the setting in your binder example. It seems to work well, but there is still something broken with the cluster config...

@mrocklin (Member)

Just to update this PR I think that we should get async-kubernetes working in dask-kubernetes, and then revisit this.

It would also be good to test this with dask-jobqueue and dask-yarn. Mostly this makes the request that clusters can be started and stopped asynchronously. I suspect that this is already true for Dask-Jobqueue, but it would be good to test in practice (cc @guillaumeeb). I don't know the current status in dask-yarn (cc @jcrist )

@ian-r-rose (Collaborator, Author)

No rush on this, happy to let people kick the tires.

That being said, isn't the async cluster starting already in master (#37)?

@mrocklin (Member)

mrocklin commented Jan 23, 2019 via email

@guillaumeeb (Member)

Hi,

Not sure what is meant by

Mostly this makes the request that clusters can be started and stopped asynchronously

Unfortunately I have not had the time to test dask-labextension at all yet. I'm a little behind on this; we're still using the standard Notebook with our JupyterHub at CNES (due to a probably-not-complicated problem with JLab that I need to work on).

This looks interesting, though. I'm often launching several clusters from several notebooks, so if I could use only one, and automatically, that would be great! I'm putting this on my to-do list, but don't wait on me to merge...

@ian-r-rose (Collaborator, Author)

Hi @guillaumeeb, thanks for sharing your use-case! I'd like to hear how this feature works for you when you get the chance to try it.

Not sure what is meant by

Mostly this makes the request that clusters can be started and stopped asynchronously

This extension expects that different cluster implementations (e.g. LocalCluster or KubeCluster) can be started without blocking the main event loop. This means, for instance, that JupyterLab startup time won't be delayed by setting up any initial clusters: they can finish in their own time and be populated when they are ready.

We got a little bit ahead of ourselves by expecting that, however, since there is not a critical mass of implementers who have made cluster startup async yet. So now there is a bit of catch-up going on to make sure that works.
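The awaitable-startup pattern being asked of cluster implementations can be sketched with a toy class. This is purely illustrative (real Dask clusters do it via an asynchronous=True flag, and the internals differ), but it shows how a constructor call can become awaitable so the event loop stays free during slow startup:

```python
import asyncio

class ToyCluster:
    """Toy sketch of an awaitable cluster: ``cluster = await ToyCluster()``.

    Illustrative only -- not dask-labextension's actual code.
    """

    def __init__(self):
        self.started = False

    async def _start(self):
        # Simulate slow startup work without blocking the event loop.
        await asyncio.sleep(0.01)
        self.started = True
        return self

    def __await__(self):
        return self._start().__await__()

async def main():
    cluster = await ToyCluster()   # event loop stays free during startup
    return cluster.started

print(asyncio.run(main()))
# → True
```

A synchronous constructor that blocks until workers are up would instead stall everything sharing the loop, which is why the extension asks for the awaitable form.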

@mrocklin (Member)

mrocklin commented Feb 9, 2019

Note that I got dask-kubernetes mostly working asynchronously if anyone wants to give it a shot

pip install git+https://github.com/dask/dask-kubernetes@dev

@mrocklin (Member)

mrocklin commented Feb 9, 2019

Where by "I" I actually mean "Yuvi and I"

@guillaumeeb (Member)

guillaumeeb commented Feb 11, 2019

So I finally managed to deploy extensions into JupyterLab behind my corporate proxy. I'm just beginning to test dask-labextension. Before I try this PR, I just wanted to know if my environment was working as intended. Typical steps I take right now are:

  1. Start a cluster inside my notebook
  2. Indicate scheduler URL in dask-labextension left panel
  3. Open views on my lab environment

Using Pangeo on binder, the notebook was started with an already defined layout: how do I do that?
I was under the impression that once a cluster was started, the dask-labextension views were automatically connected to it. Isn't that true? Is the second step above always needed?

@ian-r-rose (Collaborator, Author)

Hi @guillaumeeb, this functionality is intended to work with clusters managed by the extension (rather than ones launched in your notebook). It currently works with LocalCluster, and (possibly) KubeCluster. Basically, the cluster implementation needs to satisfy the Cluster interface and be startable asynchronously (i.e. cluster = await MyCluster(*args)). We got a bit ahead of ourselves in requiring the async startup here, so it's possible that your deployment doesn't yet fit that usage.

If those preconditions are met, then all new notebooks should get a client for that cluster auto-injected into the current python kernel session (when the option is turned on).

@guillaumeeb (Member)

and be possible to start asynchronously (i.e. cluster = await MyCluster(*args))

Okay, that helps a lot to know what is needed here.

and (possibly) KubeCluster

Why possibly? I personally maintain dask-jobqueue, which provides implementations of Cluster; I'm not sure about the asynchronous part. Are these the only two requirements? How do we launch a cluster with the extension? I guess I should use the New button, but nothing happens when I click it in my setup.

Maybe I should rather open another issue to discuss the questions from my previous comment on standard dask-labextension behavior, not really related to this PR?

@ian-r-rose (Collaborator, Author)

and be possible to start asynchronously (i.e. cluster = await MyCluster(*args))

Okay, that helps a lot to know what is needed here.

and (possibly) KubeCluster

Why possibly?

I say possibly because @mrocklin and @yuvipanda just put a bunch of work into updating it, but I don't think it has been tested in this context yet.

I personally maintain dask-jobqueue, which provides implementations of Cluster; I'm not sure about the asynchronous part. Are these the only two requirements? How do we launch a cluster with the extension? I guess I should use the New button, but nothing happens when I click it in my setup.

The function to launch a new cluster is defined here:

import importlib

import dask
# (Excerpt from the extension's server code; Cluster is imported elsewhere
# in that module.)

async def make_cluster(configuration: dict) -> Cluster:
    # Instantiate the cluster class named in the dask config, awaiting its startup.
    module = importlib.import_module(dask.config.get('labextension.factory.module'))
    Cluster = getattr(module, dask.config.get('labextension.factory.class'))
    cluster = await Cluster(*dask.config.get('labextension.factory.args'),
                            **dask.config.get('labextension.factory.kwargs'),
                            asynchronous=True)
    configuration = dask.config.merge(
        dask.config.get('labextension.default'),
        configuration,
    )
    adaptive = None
    if configuration.get('adapt'):
        adaptive = cluster.adapt(**configuration.get('adapt'))
    elif configuration.get('workers') is not None:
        cluster.scale(configuration.get('workers'))
    return cluster, adaptive
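Based on the config keys this function reads, the cluster class could presumably be selected via dask's config system; a YAML fragment along these lines (schema inferred from the getters above, so treat it as a sketch) would point the extension at a different cluster implementation:

```yaml
# Sketch inferred from the dask.config.get calls above -- verify against
# the extension's shipped defaults before relying on it.
labextension:
  factory:
    module: "dask.distributed"
    class: "LocalCluster"
    args: []
    kwargs: {}
  default:
    workers: null
    adapt: null
```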

The intention is that those are the only two requirements (implements Cluster and is awaitable), but we likely have some kinks to work out to make it widely usable. Are there any errors in the notebook logs when you click the "New" button?

Maybe I should rather open another issue to discuss the questions from my previous comment on standard dask-labextension behavior, not really related to this PR?

That would be great, thanks @guillaumeeb.

@mrocklin (Member)

This is in. My apologies for the long delay @ian-r-rose !

@michaelaye

It is unclear to me where to get the seemingly hashed Dask dashboard URL that is visible in the screenshots above. I tried simply setting it to http://127.0.0.1:8787/status, after which all the buttons lit up, and clicking on them opened a new jlab view, but no content was displayed. I guess I'm missing something? The default browser display of the status page works fine for me.
