Helm dask notes seem to have a typo in setting the environment variable #4253

Open
twiecki opened this issue Nov 27, 2018 · 16 comments
Labels
documentation Improve or add to documentation

Comments

@twiecki

twiecki commented Nov 27, 2018

I followed the guide at: http://docs.dask.org/en/latest/setup/kubernetes-helm.html

Running helm install stable/dask prints notes instructing me to run the following to set the right environment variables:
[image]

Note that the variable is called DASK_SCHEDULER, while Dask seems to look for DASK_SCHEDULER_ADDRESS, so
[image]
was not working for me until I fixed that.

@twiecki
Author

twiecki commented Nov 27, 2018

Actually, that doesn't seem to be right either, as it seems to require tcp:// in front. Is that environment variable supposed to be set automatically? Because it was not.
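The fix described in the two comments above can be sketched in Python. This is a minimal sketch, assuming DASK_SCHEDULER holds a bare host name or IP and that the scheduler listens on port 8786 (the port that appears elsewhere in this thread). The helper name `normalize_scheduler_address` is made up for illustration and is not part of Dask.

```python
import os


def normalize_scheduler_address(host, port=8786):
    """Return a tcp:// address for a scheduler host (hypothetical helper).

    `host` may be a bare host name or IPv4 address, "host:port", or an
    address that already carries a protocol such as "tcp://host:port".
    """
    host = host.strip()
    if "://" in host:
        # Already has a protocol prefix; leave it alone.
        return host
    if ":" in host:
        # Already has a port; only the protocol is missing.
        return "tcp://" + host
    return "tcp://{}:{}".format(host, port)


# Assumption: DASK_SCHEDULER was exported by hand beforehand.
raw = os.environ.get("DASK_SCHEDULER")
if raw:
    # Re-export it under the name Dask appears to look for.
    os.environ["DASK_SCHEDULER_ADDRESS"] = normalize_scheduler_address(raw)
```

Passing the normalized string directly to `Client(...)` avoids relying on the environment variable being picked up at all.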

@mrocklin
Member

mrocklin commented Nov 27, 2018 via email

@mrocklin
Member

mrocklin commented Nov 27, 2018 via email

@mrocklin
Member

mrocklin commented Nov 27, 2018 via email

@twiecki
Author

twiecki commented Nov 27, 2018

Actually, I'm not quite sure I'm doing it right:
"You can create a notebook and create a Dask client from there. The DASK_SCHEDULER_ADDRESS environment variable has been populated with the address of the Dask scheduler. This is available in Python in the config dictionary."

I ran ipython locally instead. It works fine when I connect to the Jupyter Lab server and run it there. Or is it supposed to work locally as well?

@mrocklin
Member

You need to provide the Client object with a network address that connects to the dask-scheduler pod. Within Kubernetes the bald-eel-scheduler:8786 address should work. Outside of Kubernetes you'll need to find an address that is externally visible. This should be accessible with the helm or kubectl CLI, though I admit that the exact command escapes me at the moment.
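The inside-versus-outside distinction above could look roughly like this in code. It is only a sketch: `pick_scheduler_address` and `connect` are hypothetical helpers, and the presence of the KUBERNETES_SERVICE_HOST variable (which Kubernetes sets inside pods) is used as a rough guess at whether the code is running inside the cluster. The externally visible address still has to be found separately and passed in.

```python
import os


def pick_scheduler_address(in_cluster_address, external_address=None, environ=None):
    """Choose which scheduler address a Client should connect to.

    Hypothetical helper: KUBERNETES_SERVICE_HOST being set is treated as
    "running inside a pod", which is a heuristic, not a guarantee.
    """
    environ = os.environ if environ is None else environ
    if "KUBERNETES_SERVICE_HOST" in environ:
        return in_cluster_address
    if external_address is None:
        raise ValueError(
            "Not running inside Kubernetes; an externally visible scheduler "
            "address is required."
        )
    return external_address


def connect(in_cluster_address, external_address=None):
    # Imported lazily so the helper above works without distributed installed.
    from dask.distributed import Client

    return Client(pick_scheduler_address(in_cluster_address, external_address))
```

For example, `connect("bald-eel-scheduler:8786", external_address=...)` would use the service name inside the cluster and whatever external address is supplied otherwise.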

@twiecki
Author

twiecki commented Nov 27, 2018

If I run export DASK_SCHEDULER_ADDRESS=tcp://$DASK_SCHEDULER:8786, then just instantiating Client works fine, although running something locally then fails with:

In [3]: >>> def square(x):
   ...:         return x ** 2
   ...:
   ...: >>> def neg(x):
   ...:         return -x
   ...:
   ...: >>> A = client.map(square, range(10))
   ...: >>> B = client.map(neg, A)
   ...: >>> total = client.submit(sum, B)
   ...: >>> total.result()
   ...:
   ...:
---------------------------------------------------------------------------
CancelledError                            Traceback (most recent call last)
<ipython-input-3-0b952c058b1d> in <module>()
      8 B = client.map(neg, A)
      9 total = client.submit(sum, B)
---> 10 total.result()

~/anaconda3/lib/python3.6/site-packages/distributed/client.py in result(self, timeout)
    195             six.reraise(*result)
    196         elif self.status == 'cancelled':
--> 197             raise result
    198         else:
    199             return result

CancelledError: sum-5c00f435c56b61583526e7a19aa3a2bd

@mrocklin
Member

You should check worker logs. But my guess is that you have mismatched versions between your local client library and the version used in the helm chart. You might consider running client.get_versions(check=True). If I'm right then you'll get some exception.

@twiecki
Author

twiecki commented Nov 27, 2018

TypeError: versions() got an unexpected keyword argument 'packages', I guess that means yes? :)

@mrocklin
Member

Yes. If you're within a decent range of versions then you'll get a nicely printed summary of which packages are out of sync. If you're mismatched enough then you get that :/

The docker image in the stable helm chart is pretty old. You might consider bumping it up with a small config file. Something like the following:

scheduler:
  image: "daskdev/dask"
  tag: 0.19.4

worker:
  image: "daskdev/dask"
  tag: 0.19.4

notebook:
  image: "daskdev/dask-notebook"
  tag: 0.19.4

Then run helm upgrade bald-eagle stable/dask -f my-config.yaml (substituting your own release name).

It looks like there isn't a docker image newer than that up on docker hub.
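Since the point of the override is to keep all three images on one version, the config file can also be generated from a single tag. The following is a rough sketch only: `render_image_overrides` is a made-up helper, and the keys simply mirror the snippet in the comment above without being checked against the chart's actual values schema.

```python
def render_image_overrides(tag):
    """Render a values file pinning all three images to the same tag.

    Assumption: the key layout (component -> image/tag) is copied from the
    snippet in this thread, not verified against the chart.
    """
    images = [
        ("scheduler", "daskdev/dask"),
        ("worker", "daskdev/dask"),
        ("notebook", "daskdev/dask-notebook"),
    ]
    sections = []
    for component, image in images:
        # Quote the tag so YAML always reads it as a string.
        sections.append(
            '{}:\n  image: "{}"\n  tag: "{}"\n'.format(component, image, tag)
        )
    return "\n".join(sections)


if __name__ == "__main__":
    print(render_image_overrides("0.19.4"))
```

Quoting the tag is a defensive choice: an unquoted two-part tag such as 0.20 would be parsed by YAML as a number rather than a string.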

@mrocklin
Member

I've tagged a 0.20.2 on docker hub. Should be built and up in a while.

@martindurant
Member

"If you're mismatched enough then you get that :/"

Presumably check_versions should be resilient to this (for the future) and give a sensible error if it fails to fetch the version information.
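Until something like that exists in the library, a caller-side wrapper can approximate the idea. This is a sketch, not a patch to distributed: `safe_get_versions` and `VersionCheckFailed` are hypothetical names, and the only library call assumed is `client.get_versions(check=True)` as suggested earlier in the thread.

```python
class VersionCheckFailed(RuntimeError):
    """Raised when version information could not be collected (hypothetical)."""


def safe_get_versions(client):
    """Call client.get_versions(check=True) with a clearer failure mode."""
    try:
        return client.get_versions(check=True)
    except TypeError as exc:
        # This is the failure reported in this thread: the version-collection
        # call itself was incompatible between client and cluster.
        raise VersionCheckFailed(
            "Could not collect version information from the cluster; the "
            "client and scheduler/worker versions are probably too far "
            "apart to compare. Original error: {}".format(exc)
        ) from exc
```

Any other exception raised by the check propagates unchanged, so an ordinary mismatch report is not hidden by the wrapper.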

@mrocklin
Member

mrocklin commented Nov 27, 2018 via email

@twiecki
Author

twiecki commented Nov 28, 2018

If I run helm repo update and helm install stable/dask, should I get Dask containers with 0.20.2? Because the version check still fails.

@mrocklin
Member

mrocklin commented Nov 28, 2018 via email

@mrocklin
Member

mrocklin commented Nov 28, 2018 via email

4 participants