-
Couldn't load subscription status.
- Fork 10
Separate dockerfiles for dask & distributed #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @raydouglass! This looks great. I left a few questions, but they're not meant to be blocking
Since we're now using separate images, we might want to update the dask dockerfile to use https://raw.githubusercontent.com/dask/dask/main/continuous_integration/environment-$PYTHON_VER.yaml (e.g. point to *-3.8.yaml instead of *-3.8-dev.yaml)
| RUN gpuci_mamba_retry env create -n dask --file /dask_environment.yaml | ||
|
|
||
| RUN mamba install -y -n dask -c rapidsai -c rapidsai-nightly -c nvidia -c conda-forge \ | ||
| RUN gpuci_mamba_retry install -y -n dask -c rapidsai -c rapidsai-nightly -c nvidia -c conda-forge \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to add dask-cuda here too (xref dask/dask#7966 (review))? I'm not sure if the latest dask-cuda release is what we'd like, or if we need the latest dev version. cc @pentschev @charlesbluca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the error you described, it sounds like the package we would want here is dask_cudf, I'm assuming whatever version corresponds with the version of cudf we have in the image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank @charlesbluca, my mistake, I meant dask-cudf
I'm assuming whatever version corresponds with the version of cudf we have in the image.
Alright, in that case I think we'd want to add
dask-cudf=$RAPIDS_VER \
here then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does distributed also need dask-cudf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it needs it right now. But we might start using that in the future. At moment we mostly test cudf's serializers. cc @pentschev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, for now cudf should suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with one question:
|
Nice, thanks @raydouglass! Does merging this PR kick off a new docker build, or are the nightly builds the only time we rebuild? |
Separate out the dockerfiles for both dask & distributed.
Also fixes passing
UCX_PY_VERto the docker build.