create and maintain ray images in AICoE #2

Open
pacospace opened this issue Mar 17, 2021 · 11 comments
Comments

@pacospace

pacospace commented Mar 17, 2021

Hi @erikerlandson,

What do you think about Thoth automatically creating and maintaining these images, along with their dependencies, for you using Thoth pipelines and bots?

Similar to what we do for ODH images, for example: https://quay.io/repository/thoth-station/s2i-lab-elyra?tab=tags, which is created from: https://github.com/opendatahub-io/s2i-lab-elyra.

@erikerlandson
Owner

@pacospace yes, building them with thoth was on my roadmap. I had a couple questions.

  1. currently the ray libraries are installed from nightly-build wheels (not pypi) - will thoth work around that?
  2. two of my 3 images are NOT notebook images - they are the ray worker-node image and the ray operator image: they are not part of the s2i-xxx image lineage. Does thoth also build these? (would be analogous to building spark worker images)

@pacospace
Author

pacospace commented Mar 17, 2021

> @pacospace yes, building them with thoth was on my roadmap. I had a couple questions.
>
> 1. currently the ray libraries are installed from nightly-build wheels (not pypi) - will thoth work around that?

can I ask why nightly-build wheels are used? are there some specific features in https://github.com/erikerlandson/ray-odh-jupyter/blob/44b055935c965219559b7840b23c02c9438bd385/images/ray-minimal-notebook/requirements.txt#L1 not available in the stable release of ray from PyPI?

> 2. two of my 3 images are NOT notebook images - they are the ray worker-node image and the ray operator image: they are not part of the s2i-xxx image lineage. Does thoth also build these? (would be analogous to building spark worker images)

aicoe pipelines can also take care of these cases, but thoth cannot advise on these specifically currently.

Do you want to migrate to an AICoE repo (all Thoth services are already set up there)? We can start with one repo, restructuring the Dockerfile so that Thoth can provide services for you (using Pipfile/Pipfile.lock and micropipenv). wdyt?

cc @harshad16

@erikerlandson
Owner

> can I ask why nightly-build wheels are used?

Previous versions of Ray would only allow a connection to the ray cluster from the physical head node. Ray 2.0 allows remote connections, which are what allow a jupyter notebook to connect to a ray cluster. However, Ray 2.0 is under development, and is purely head-of-dev-branch.
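For readers unfamiliar with the remote-connection feature mentioned above, here is a minimal sketch of how a notebook might connect to a remote Ray cluster. It assumes a Ray release that ships Ray Client, where `ray.init` accepts a `ray://host:port` address and the client server listens on port 10001 by default; the nightly builds discussed in this thread may have exposed a different entry point. The helper names and the host name are hypothetical.

```python
def ray_client_address(host: str, port: int = 10001) -> str:
    """Build a Ray Client address string of the form ray://host:port.

    10001 is assumed here as the default Ray Client server port.
    """
    if not host:
        raise ValueError("host must be a non-empty string")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"ray://{host}:{port}"


def connect_from_notebook(host: str, port: int = 10001):
    """Connect the current process (e.g. a Jupyter kernel) to a remote cluster.

    Ray is imported lazily so the address helper above works without Ray installed.
    """
    import ray  # requires a Ray version that includes Ray Client

    return ray.init(address=ray_client_address(host, port))


if __name__ == "__main__":
    # "ray-head.example.svc" is a hypothetical head-node service name.
    print(ray_client_address("ray-head.example.svc"))
```

In a notebook, `connect_from_notebook("ray-head.example.svc")` would be called once before submitting remote tasks; the Ray versions on the notebook side and the cluster side generally need to match.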

@erikerlandson
Owner

cross-reference: the repo where I build the ray notebook images (ray-ml-notebook is what is currently installed on MOC)
https://github.com/erikerlandson/ray-odh-jupyter

@erikerlandson
Owner

Elyra seems to be getting a lot of traction. Would it be useful to create ray-enabled elyra images?

@pacospace
Author

pacospace commented Mar 18, 2021

> Elyra seems to be getting a lot of traction. Would it be useful to create ray-enabled elyra images?

With that image, would I be using Ray for all my notebooks? As a data scientist, I might want to use Ray in just one of my steps, not by default for all notebooks. wdyt?

As a data scientist,

I want to run hyperparameter tuning in an AI pipeline.

Currently, there is no way to do that directly from Elyra (elyra-ai/elyra#646). You can run a pipeline from within a pipeline using the kfp libraries, for example, but that is not yet well integrated with Elyra. I'm trying to get that feature upstream.

An AI pipeline in Elyra uses the Kubeflow engine (Argo or Tekton) or the Airflow engine, and each step requires a base image, resources, environment variables, and a notebook.

I think that, more than having a Ray-enabled Elyra image, the question is: can we run one step in a pipeline whose notebook requires Ray (hyperparameter tuning, distributed training, or RL, for example)?
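The per-step requirements described above (a base image, resources, environment variables, and a notebook) can be sketched as a small data model in which only the steps that need Ray get a Ray-enabled base image. This is an illustrative sketch, not Elyra's API: `PipelineStep`, `plan_steps`, the image names, and the `RAY_CLUSTER_HOST` variable are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class PipelineStep:
    """One pipeline node: a notebook plus the runtime it needs."""
    notebook: str
    base_image: str
    cpu: str = "1"
    memory: str = "2Gi"
    env: Dict[str, str] = field(default_factory=dict)


def plan_steps(
    notebooks: Iterable[str],
    ray_notebooks: Set[str],
    default_image: str,
    ray_image: str,
    ray_host: str,
) -> List[PipelineStep]:
    """Assign the Ray-enabled image only to the notebooks that ask for it."""
    steps = []
    for nb in notebooks:
        if nb in ray_notebooks:
            # Hypothetical variable telling the notebook where the cluster lives.
            steps.append(
                PipelineStep(nb, ray_image, env={"RAY_CLUSTER_HOST": ray_host})
            )
        else:
            steps.append(PipelineStep(nb, default_image))
    return steps


if __name__ == "__main__":
    plan = plan_steps(
        ["prepare.ipynb", "tune.ipynb", "report.ipynb"],
        {"tune.ipynb"},
        default_image="example.io/notebook:latest",   # hypothetical image
        ray_image="example.io/ray-notebook:latest",   # hypothetical image
        ray_host="ray-head.example.svc",              # hypothetical host
    )
    for step in plan:
        print(step.notebook, "->", step.base_image)
```

Keeping the Ray image as a per-step choice rather than a pipeline-wide default matches the concern raised here: only the hyperparameter tuning step pays for the heavier image.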

@erikerlandson
Owner

It depends on how one wants to work. If elyra is being used only for managing pipelines, then Ray might not be very useful. If it is being used for data science explorations, and the data scientist wants ray available as backing compute, then maybe. I think a workflow where granular pipelines are created and each node in the pipeline is a notebook is relevant - some such nodes might want ray but not others. A third possible modality is standing up a single ray cluster and having multiple nodes connect to it.

@erikerlandson
Owner

Yet another image variation would be jupyter-lab images, as opposed to "traditional" jupyter-hub images. Currently, I'm most interested in stand-alone explorations, but if ray gets traction I would expect it to appear in larger pipelining contexts.

@pacospace
Author

> It depends on how one wants to work.

Agree.

> If elyra is being used only for managing pipelines, then Ray might not be very useful. If it is being used for data science explorations, and the data scientist wants ray available as backing compute, then maybe.

In theory it can be used for both, because Elyra is just an extension to Jupyterlab.

> I think a workflow where granular pipelines are created and each node in the pipeline is a notebook is relevant - some such nodes might want ray but not others.
>
> A third possible modality is standing up a single ray cluster and having multiple nodes connect to it.

@pacospace
Author

> It depends on how one wants to work.

Agree.

> If elyra is being used only for managing pipelines, then Ray might not be very useful. If it is being used for data science explorations, and the data scientist wants ray available as backing compute, then maybe.

In theory it can be used for both, because Elyra is just an extension to Jupyterlab.

One correction here: Ray as backing compute may be the more interesting case from the Elyra image point of view, since you can submit a single notebook with the base image you want, not only a complete AI pipeline.

> I think a workflow where granular pipelines are created and each node in the pipeline is a notebook is relevant - some such nodes might want ray but not others.
>
> A third possible modality is standing up a single ray cluster and having multiple nodes connect to it.

@pacospace
Author

Related-To: thoth-station/core#283
