
Conversation

@zonca (Contributor) commented Apr 28, 2022

Numpy is not a requirement of either dask or dask-gateway; however, dask.array won't work without numpy. It would be really convenient to have it available in the default dask-gateway image.

@martindurant (Member) commented Apr 28, 2022

From the head of that file:

# The admin installing the dask-gateway Helm chart or its end users are meant to
# specify an image for the scheduler and worker pods to use that meets their
# needs for the Dask clusters they startup. Please build your own according to
# the documentation if this very limited image doesn't meet your needs.

The current Dockerfile is, on purpose, a very limited one for demonstration and testing only. It would be rather difficult to find a "best" image to suit all needs.

@jcrist (Member) commented Apr 28, 2022

I agree that we should keep this image minimal, but I do think having a working dask array (and probably dask dataframe?) would be useful for demos and testing things out. A new user is likely to try the default images to start; having a functional demo without requiring further changes is something I think users can expect.

@zonca (Contributor, Author) commented Apr 28, 2022

> I agree that we should keep this image minimal, but I do think having a working dask array (and probably dask dataframe?) would be useful for demos and testing things out. A new user is likely to try with the default images to start, having a functional demo without requiring further changes is something I think users can expect.

yes, in fact this is related to my tutorial about deploying Dask Gateway on Jetstream 2: https://zonca.dev/2022/04/dask-gateway-jupyterhub.html

Related to this, it would be nice to have bokeh as well, dask without the dashboard is not the same thing! ;)

@consideRatio (Collaborator) commented

I agree with the value of providing an image with slightly more than our test suite requires, if it means that an end user with a fresh Helm chart installation can do a basic test involving:

  • Starting a cluster via the dask-gateway client
  • Running some kind of very primitive work, for example using numpy only or perhaps numpy and pandas if that is sufficient for a basic example
  • Looking at the dashboard provided by the scheduler, which requires bokeh.

I think it's crucial that we define this kind of test very specifically. Without it, there is no clear line about what we are to provide in the image, and the image will grow without a clear boundary and become vague in its purpose. I fear that is problematic both for maintainers and for anyone aware of this image.

I'm not an active user of dask workers etc.; if we can couple this change with a change adding a very primitive example, I'd be happy to merge this change as well.

End-to-end test with JupyterHub + dask-gateway

Note that if someone wanted to make an end-to-end test of a dask-gateway + JupyterHub setup, I would recommend instead configuring the user image for JupyterHub and then relying on the dask-gateway client's configuration to pass that image on to the scheduler and worker pods.

Example configuration for re-using the JupyterHub image:
jupyterhub:
  singleuser:
    image:
      name: pangeo/pangeo-notebook
      tag: 2022.04.20
    extraEnv:
      # About DASK_ prefixed variables we set:
      #
      # 1. k8s native variable expansion is applied with $(MY_ENV) syntax. The
      #    order variables are defined matters though and we are under the
      #    mercy of how KubeSpawner renders our passed dictionaries.
      #
      # 2. Dask loads local YAML config.
      #
      # 3. Dask loads environment variables prefixed DASK_.
      #    - DASK_ is stripped
      #    - Capitalization is ignored
      #    - Double underscore means a nested configuration
      #    - `ast.literal_eval` is used to parse values
      #
      # 4. dask-gateway and dask-distributed look at their config and expand
      #    expressions in {} again, sometimes only with the environment
      #    variables as context but sometimes also with additional variables.
      #
      # References:
      # - K8s expansion:     https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/
      # - KubeSpawner issue: https://github.com/jupyterhub/kubespawner/issues/491
      # - Dask config:       https://docs.dask.org/en/latest/configuration.html
      # - Exploration issue: https://github.com/2i2c-org/infrastructure/issues/442
      #
      # DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE makes the default worker image
      # match the singleuser image.
      DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: "{JUPYTER_IMAGE_SPEC}"
      # DASK_DISTRIBUTED__DASHBOARD_LINK makes the suggested link to the
      # dashboard account for the /user/<username> prefix in the path. Note
      # that this still misbehaves if you have a named server, but now it's at
      # least functional for non-named servers.
      DASK_DISTRIBUTED__DASHBOARD_LINK: "/user/{JUPYTERHUB_USER}/proxy/{port}/status"

dask-gateway:
  gateway:
    extraConfig:
      optionHandler: |
        from dask_gateway_server.options import Options, Integer, Float, String, Mapping

        def cluster_options(user):
            def option_handler(options):
                if ":" not in options.image:
                    raise ValueError("When specifying an image you must also provide a tag")
                return {
                    "worker_cores_limit": options.worker_cores,
                    "worker_cores": min(options.worker_cores / 2, 1),
                    "worker_memory": "%fG" % options.worker_memory,
                    "image": options.image,
                    "environment": options.environment,
                }
            return Options(
                Integer("worker_cores", 2, min=1, label="Worker Cores"),
                Float("worker_memory", 4, min=1, label="Worker Memory (GiB)"),
                # The default image is set via DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE env variable
                String("image", label="Image"),
                Mapping("environment", {}, label="Environment Variables"),
                handler=option_handler,
            )
        c.Backend.cluster_options = cluster_options
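
As an aside, the DASK_ environment-variable rules listed in the comments above can be illustrated with a short stdlib-only sketch. This is an approximation for illustration only; the real logic lives in dask.config and handles more cases (key normalization, merging with YAML config, etc.):

```python
import ast

def dask_env_to_config(environ):
    """Roughly mimic how Dask turns DASK_-prefixed environment variables
    into nested configuration, per the rules quoted above: strip DASK_,
    ignore capitalization, treat double underscore as nesting, and parse
    values with ast.literal_eval (falling back to the raw string)."""
    config = {}
    for key, value in environ.items():
        if not key.startswith("DASK_"):
            continue
        path = key[len("DASK_"):].lower().split("__")
        try:
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            parsed = value
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = parsed
    return config

# Example: the variable set in the Helm chart config above.
env = {"DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE": "pangeo/pangeo-notebook:2022.04.20"}
print(dask_env_to_config(env))
# → {'gateway': {'cluster': {'options': {'image': 'pangeo/pangeo-notebook:2022.04.20'}}}}
```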

@martindurant (Member) commented

I think the answer is "no", but it's still probably worth asking: is there any benefit to having both the barebones image as an example and for testing and also a more fully-fledged version that builds off it, containing the most commonly used extra dependencies? There are a couple of examples in https://github.com/dask/dask-docker and even bigger ones in e.g., https://github.com/pangeo-data/pangeo-docker-images . It would be (slightly) nice to have some kind of consistency.

@delgadom (Contributor) commented May 20, 2022

My 2¢ as another user: my biggest fear with increasing the scope of the "default" is tests/features starting to leak into the package that would break in environments without numpy/pandas/etc. For example, we've used dask-gateway to run Octave models by building a minimal Python environment that can support a dask worker, installing additional modules like octave and oct2py, then using client.map to run the models on the workers. So that might point to your idea, @martindurant? If numpy/pandas became an explicit dependency I don't think this would be the end of the world either, but it's definitely helpful to be able to install something extremely lightweight and then have the flexibility to install radically different things (e.g. with different compiler flags than those required by, say, geopandas or other elements of the normal pangeo stack).

@consideRatio (Collaborator) commented

@delgadom note that the choices here are not related to dask-gateway the Python package.

This change solely concerns what is installed in the Docker image referenced by the Helm chart by default for the scheduler/worker containers. The image is supposed to be configured explicitly by the user of the Helm chart, so what it includes relates mainly to testing of the Helm chart; with this PR, it would also be something usable for a primitive demo of some kind.

I'd like such a primitive demo example to be provided along with this PR, for example one executed from within a container based on the same image, as that would ensure matching versions across the dask-gateway client, scheduler, and workers.

@zonca (Contributor, Author) commented May 20, 2022

@consideRatio what about the same test shown in the "Usage" docs? https://gateway.dask.org/usage.html#run-computations-on-the-cluster

# First create a dask cluster and get its Client instance https://gateway.dask.org/usage.html
import dask.array as da
a = da.random.normal(size=(1000, 1000), chunks=(500, 500))
a.mean().compute()

should we have another test with pandas?
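
For what it's worth, the snippet above can be sanity-checked end to end even without a running gateway by letting dask fall back to its local threaded scheduler; only the cluster/client creation differs from the gateway case. A minimal sketch, assuming numpy and dask (with dask.array) are installed in the image:

```python
# Sanity check of the proposed demo computation. Against a real
# deployment you would first create a cluster and client, e.g.:
#   from dask_gateway import Gateway
#   cluster = Gateway().new_cluster()
#   client = cluster.get_client()
# Here we simply rely on dask's default local threaded scheduler.
import dask.array as da

a = da.random.normal(size=(1000, 1000), chunks=(500, 500))
m = a.mean().compute()

# The mean of 10^6 standard-normal samples is very close to 0.
print(m)
```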

zonca and others added 3 commits May 20, 2022 15:49

  • Numpy is not a requirement of either `dask` or `dask-gateway`; however, `dask.array` won't work without `numpy`
  • bokeh for the dashboard, pandas for dask.dataframe
  • Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
@zonca (Contributor, Author) commented May 20, 2022

rebased on current main

@consideRatio (Collaborator) commented

> @consideRatio what about the same test shown in the "Usage" docs? https://gateway.dask.org/usage.html#run-computations-on-the-cluster
>
> # First create a dask cluster and get its Client instance https://gateway.dask.org/usage.html
> import dask.array as da
> a = da.random.normal(size=(1000, 1000), chunks=(500, 500))
> a.mean().compute()

If bokeh and numpy are what's required, beyond what's already installed, to get this test computation functional together with a scheduler dashboard, I suggest we don't expand to include pandas as well. I think referencing that example is perfect!

@consideRatio changed the title from "Install numpy in dask gateway docker image" to "Install bokeh and numpy in the Helm chart's scheduler and worker sample image" on Jun 10, 2022
@consideRatio left a review comment:

This is acceptable to me, what do you think @zonca?

Tests are broken for unrelated reasons, but fixed in #573.

@zonca (Contributor, Author) commented Jun 10, 2022

Agreed, thanks!

@consideRatio merged commit 329a82d into dask:main on Jun 10, 2022