[RFC] Introducing Ray AI Runtime #22488
In my opinion this is a great development, and one that has been needed for some time as the possibilities with Ray keep growing. I mostly use RLlib, which I think is a great workhorse in RL with nothing comparable on the market so far. There, it is often unclear what the standard (or intended) way to solve things in code is. Combining the APIs in a standardized manner would bring clarity to developers who have to think about the architecture of their solutions on top of Ray. Furthermore, I think it is a good idea to move much functionality into dedicated parts of Ray (e.g., Train/Tune for optimization) so that each module can focus on its own core concerns. RLlib can then focus on the RL parts, collecting experiences and building modular RL algorithms, while model training is maintained in Tune/Train.
cc @gjoliver @sven1977 @avnishn on @simonsays1980's feedback!
Update: Initial documentation can be found here.
Hi all!
I'd like to gather some feedback on a proposal to create the "Ray AI Runtime."
Ray AI Runtime (Ray AIR) features a scalable and unified toolkit for building end-to-end ML applications. By leveraging Ray and its library ecosystem, it brings scalability and programmability to ML platforms.
Our long-term vision with AIR is to own the compute story for ML and AI applications: to be the one-stop shop for AI compute. AIR is designed to interoperate with other systems for storage and metadata needs, and to provide standard integration points for third-party libraries, so that libraries integrating with Ray benefit from a network effect.
Overview
Ray AIR consists of 5 key components that already exist in Ray today -- data processing (Ray Data), model training (Ray Train), reinforcement learning (Ray RLlib), hyperparameter tuning (Ray Tune), and model serving (Ray Serve). Users can compose these libraries to scale different parts of their ML workflows.
Ray AIR introduces a unified API for seamless integration across the ecosystem of Ray libraries -- enabling you to pass data and models between data processing, training, tuning, and inference (online and offline). If you are already using Ray, this will not break backwards compatibility.
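To make the "pass data and models between stages" idea concrete, here is a toy sketch in plain Python of the kind of flow the unified API aims to enable: a preprocessor is fit during training, captured in a checkpoint alongside the model, and reused for offline inference. All names here (Preprocessor, Trainer, BatchPredictor) are illustrative stand-ins, not confirmed Ray AIR classes, and the "model" is deliberately trivial.

```python
# Toy illustration of the AIR-style pipeline flow (NOT the real Ray API):
# preprocess -> train -> checkpoint -> batch inference.

class Preprocessor:
    """Fits a statistic on a dataset, then transforms it (e.g., centering)."""
    def fit(self, rows):
        values = [r["x"] for r in rows]
        self.mean = sum(values) / len(values)
        return self

    def transform(self, rows):
        return [{"x": r["x"] - self.mean, "y": r["y"]} for r in rows]


class Trainer:
    """Trains a trivial model (predict the mean label) on preprocessed data."""
    def __init__(self, preprocessor):
        self.preprocessor = preprocessor

    def fit(self, rows):
        rows = self.preprocessor.fit(rows).transform(rows)
        labels = [r["y"] for r in rows]
        # The returned dict stands in for a checkpoint: it bundles the
        # model together with the fitted preprocessor, so downstream
        # stages apply the same transformations.
        return {"model": sum(labels) / len(labels),
                "preprocessor": self.preprocessor}


class BatchPredictor:
    """Applies a checkpointed model to new data (offline inference)."""
    def __init__(self, checkpoint):
        self.checkpoint = checkpoint

    def predict(self, rows):
        rows = self.checkpoint["preprocessor"].transform(rows)
        return [self.checkpoint["model"] for _ in rows]


train_data = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
checkpoint = Trainer(Preprocessor()).fit(train_data)    # training stage
preds = BatchPredictor(checkpoint).predict(train_data)  # inference stage
print(preds)  # [3.0, 3.0]
```

The point of the sketch is the hand-off pattern: each stage consumes the previous stage's output directly (dataset in, checkpoint out, predictions out), rather than each library defining its own incompatible formats.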
You can run applications that use these components on your laptop, and scale out to K8s/AWS/GCP/Azure without any changes to your code.
Note that we continue to invest in making each of Ray's libraries best in class on their own (e.g., Serve, RLlib, Tune). Ray AIR improves API compatibility between Ray's existing libraries and provides a reference architecture for ML platform use cases.
What is Ray AI Runtime not? As its name implies, the focus is on the compute-intensive portions of the stack, not on storage and metadata services. However, we will provide integrations with data sources and metadata registries such as MLflow and WandB.
Please provide feedback below or on the linked proposal!
Proposal Link