Dask lab extension unable to launch new clusters (2019.05.19) #48

scottyhq opened this issue May 21, 2019 · 21 comments

The most recent images did away with pinning most versions:

Unfortunately, when running these new images, the dask labextension can no longer launch new KubeClusters (you can select the latest image to run on this hub:

Seeing messages such as these:

Failed to load resource: the server responded with a status of 500 ()
clusters.js:146 Uncaught (in promise) Error: Failed to start Dask cluster
    at DaskClusterManager.<anonymous> (clusters.js:146)
    at (<anonymous>)
    at fulfilled (clusters.js:4)
serverconnection.js:192 PUT 500
handleRequest @ serverconnection.js:192
makeRequest @ serverconnection.js:75
(anonymous) @ clusters.js:144
(anonymous) @ clusters.js:7
push.eY2S.__awaiter @ clusters.js:3
_launchCluster @ clusters.js:143
onClick @ clusters.js:66
handleMouseDown @ toolbar.js:332
callCallback @ react-dom.development.js:100
invokeGuardedCallbackDev @ react-dom.development.js:138
invokeGuardedCallback @ react-dom.development.js:187
invokeGuardedCallbackAndCatchFirstError @ react-dom.development.js:201
executeDispatch @ react-dom.development.js:461
executeDispatchesInOrder @ react-dom.development.js:483
executeDispatchesAndRelease @ react-dom.development.js:581
executeDispatchesAndReleaseTopLevel @ react-dom.development.js:592
forEachAccumulated @ react-dom.development.js:562
runEventsInBatch @ react-dom.development.js:723
runExtractedEventsInBatch @ react-dom.development.js:732
handleTopLevel @ react-dom.development.js:4477
batchedUpdates$1 @ react-dom.development.js:16660
batchedUpdates @ react-dom.development.js:2131
dispatchEvent @ react-dom.development.js:4556
interactiveUpdates$1 @ react-dom.development.js:16715
interactiveUpdates @ react-dom.development.js:2150
dispatchInteractiveEvent @ react-dom.development.js:4533
clusters.js:146 Uncaught (in promise) Error: Failed to start Dask cluster
    at DaskClusterManager.<anonymous> (clusters.js:146)
    at (<anonymous>)
    at fulfilled (clusters.js:4)

And here is a copy of the full conda environment installed:

JupyterLab v0.35.6
Known labextensions:
   app dir: /srv/conda/envs/notebook/share/jupyter/lab
        @jupyter-widgets/jupyterlab-manager v0.38.1  enabled  OK
        @jupyterlab/hub-extension v0.12.0  enabled  OK
        @pyviz/jupyterlab_pyviz v0.7.2  enabled  OK
        dask-labextension v0.3.0  enabled  OK
        jupyter-leaflet v0.10.2  enabled  OK

pinging @ian-r-rose @jhamman

@scottyhq Do you have access to the server logs? Do they have any interesting information in them?

I'd still be wary about tornado 6...

We've pinned to tornado 5.1.1. I see the following message in the log: [W 2019-05-21 23:04:04.105 SingleUserLabApp handlers:620] object KubeCluster can't be used in 'await' expression
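For anyone unfamiliar with that log message: Python raises this kind of `TypeError` whenever `await` is applied to an object that does not define `__await__`. The sketch below reproduces it with standard-library code only. `SyncCluster`, `AsyncCluster`, `launch`, and `try_launch` are hypothetical stand-ins for illustration and are not the actual dask-kubernetes or dask-labextension code.

```python
import asyncio


class SyncCluster:
    """Hypothetical stand-in for a cluster class that is not awaitable."""


class AsyncCluster:
    """Hypothetical stand-in for an async-aware cluster class."""

    def __init__(self):
        self.started = False

    async def _start(self):
        # Pretend to do asynchronous startup work.
        await asyncio.sleep(0)
        self.started = True
        return self

    def __await__(self):
        # Defining __await__ is what makes `await AsyncCluster()` legal.
        return self._start().__await__()


async def launch(cluster):
    # Assumption: the caller expects an awaitable cluster object.
    return await cluster


def try_launch(cluster):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        asyncio.run(launch(cluster))
        return "ok"
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_launch(SyncCluster()))   # a TypeError message naming SyncCluster
    print(try_launch(AsyncCluster()))
```

The exact wording of the error differs between Python versions, but in each case it names the class that could not be awaited.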


Hmm, I'm not sure. @mrocklin does this look familiar? I wonder if it's a regression in dask-kubernetes.


Actually... I suspect this has something to do with a new version of repo2docker and some mixing of environments (#47). If I list the packages in the 'base' environment, we have the following (including tornado 6.0.2):

My first guess would be JupyterLab + Tornado 6 conflicts. I don't think that dask-kubernetes has changed a ton recently, but @jhamman might know more.

dask-kubernetes had a release four days ago. Worth checking, I think.

Fair point

@ian-r-rose - given multiple conda environments on a jupyterhub, which does dask labextension use by default?


Whichever one is used to launch JupyterLab, I think.
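One way to check which environment a given Python process is running in, and which versions it sees, is a small report like the sketch below. It uses only the standard library. The function name `environment_report` and the default package list are assumptions chosen for this thread. Note that running it in a notebook reports the kernel's environment, which may or may not be the one that launched JupyterLab.

```python
import sys
from importlib import metadata


def environment_report(packages=("tornado", "dask-kubernetes", "dask-labextension")):
    """Collect the interpreter location and installed versions of a few packages."""
    report = {"executable": sys.executable, "prefix": sys.prefix}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Comparing the `prefix` and the tornado version between the 'base' and 'notebook' environments should show whether the two are being mixed.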


jhamman commented May 21, 2019

@scottyhq - can you try with a LocalCluster and see if that works? That will help determine whether the problem is in KubeCluster or not.

Just to clarify, launching a KubeCluster programmatically works, and I can use the 'search' glass to find it and activate all buttons. It is the 'clusters +new' part that is non-responsive. If anyone wants to enter the hub and explore further, see the hub link in the first comment.

import dask
from dask_kubernetes import KubeCluster
from dask.distributed import Client
from dask.distributed import wait, progress
cluster = KubeCluster(n_workers=2)


mrocklin commented May 21, 2019 via email


Has that not been merged/published yet?


jhamman commented May 22, 2019

Has that not been merged/published yet?

No, it's still sitting in the dev branch.


This binder does a few things:

  • points to the dev branch of dask_kubernetes
  • pins tornado to version 5
  • pins dask-labextension to 0.3.0

Full specs here:


Is there anything that should be done to move forwards with an async-aware dask-kubernetes? I hadn't realized we were still blocking on that.


There are a couple things to do, yes. Mostly it needs to be used and bugs need to be found and fixed. I plan to write up the state of things and a few possible plans as an issue later this week.


@mrocklin I should have time to push on async KubeCluster (not this week, but probably next week). Is that something you'd like me to take on?


mrocklin commented Jun 4, 2019 via email


If it will take a significant amount of time, we can also back out the changes that make dask-labextension expect an asynchronously started cluster.
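As an illustration of what tolerating both kinds of cluster could look like on the calling side, here is a minimal sketch that awaits the object only when it is awaitable. This is an assumption about one possible approach, not a description of what dask-labextension does or did. `ensure_started`, `PlainCluster`, and `AwaitableCluster` are hypothetical names.

```python
import asyncio
import inspect


class PlainCluster:
    """Hypothetical synchronous cluster: ready as soon as it is constructed."""


class AwaitableCluster:
    """Hypothetical async-aware cluster: must be awaited to finish starting."""

    def __init__(self):
        self.started = False

    async def _start(self):
        await asyncio.sleep(0)
        self.started = True
        return self

    def __await__(self):
        return self._start().__await__()


async def ensure_started(cluster):
    # Await only objects that support it; pass synchronous ones through.
    if inspect.isawaitable(cluster):
        return await cluster
    return cluster


if __name__ == "__main__":
    print(type(asyncio.run(ensure_started(PlainCluster()))).__name__)
    print(type(asyncio.run(ensure_started(AwaitableCluster()))).__name__)
```

The trade-off is that a synchronous constructor would still block the event loop while it starts, so this only avoids the `TypeError` and does not make startup asynchronous.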


quasiben commented Jun 4, 2019

@TomAugspurger there are two PRs for async kube if you are interested:

I would suggest looking at @mrocklin's before mine.


Noting that once the upcoming async changes to dask-kubernetes are merged and released, we should bump to dask-labextension > 1.0.


