Tip
AIR is currently in beta. Fill out this short form to get involved. We'll be holding office hours, development sprints, and other activities as we get closer to the GA release. Join us!
Ray AI Runtime (AIR) is a scalable and unified toolkit for ML applications. AIR enables simple scaling of individual workloads, end-to-end workflows, and popular ecosystem frameworks, all in just Python.
AIR builds on Ray's best-in-class libraries for Preprocessing <datasets>, Training <train-docs>, Tuning <tune-main>, Scoring <air-predictors>, Serving <rayserve>, and Reinforcement Learning <rllib-index> to bring together an ecosystem of integrations.
Ray AIR aims to simplify the ecosystem of machine learning frameworks, platforms, and tools. It does this by leveraging Ray to provide a seamless, unified, and open experience for scalable ML:
1. Seamless Dev to Prod: AIR reduces friction going from development to production. With Ray and AIR, the same Python code scales seamlessly from a laptop to a large cluster.
2. Unified ML API: AIR's unified ML API enables swapping between popular frameworks, such as XGBoost, PyTorch, and Hugging Face, with just a single class change in your code.
3. Open and Extensible: AIR and Ray are fully open-source and can run on any cluster, cloud, or Kubernetes. Build custom components and integrations on top of scalable developer APIs.
AIR is for data scientists and ML engineers alike.
For data scientists, AIR can be used to scale individual workloads as well as end-to-end ML applications. For ML engineers, AIR provides scalable platform abstractions that make it easy to onboard and integrate tooling from the broader ML ecosystem.
Below, we walk through how AIR's unified ML API enables scaling of end-to-end ML workflows, focusing on a few of the popular frameworks AIR integrates with (XGBoost, PyTorch, and TensorFlow). The ML workflow we're going to build is summarized by the following diagram:
AIR provides a unified API for the ML ecosystem. This diagram shows how AIR enables an ecosystem of libraries to be run at scale in just a few lines of code.

Get started by installing Ray AIR:
pip install -U "ray[air]"
# The Ray AIR tutorial below was written with the following library versions.
# Consider running the following to ensure that the code below runs properly:
pip install -U "pandas>=1.3.5"
pip install -U "torch>=1.12"
pip install -U "numpy>=1.19.5"
pip install -U "tensorflow>=2.6.2"
pip install -U "pyarrow>=6.0.1"
First, let's load a dataset from storage:
examples/xgboost_starter.py
Then, we define a Preprocessor pipeline for our task:
XGBoost
examples/xgboost_starter.py
PyTorch
examples/pytorch_tabular_starter.py
TensorFlow
examples/tf_tabular_starter.py
Train a model with a Trainer, using common ML frameworks:
XGBoost
examples/xgboost_starter.py
PyTorch
examples/pytorch_tabular_starter.py
TensorFlow
examples/tf_tabular_starter.py
You can specify a hyperparameter space to search over for each trainer:
XGBoost
examples/xgboost_starter.py
PyTorch
examples/pytorch_tabular_starter.py
TensorFlow
examples/tf_tabular_starter.py
Then use the Tuner to run the search:
examples/pytorch_tabular_starter.py
Use the trained model for scalable batch prediction with a BatchPredictor.
XGBoost
examples/xgboost_starter.py
PyTorch
examples/pytorch_tabular_starter.py
TensorFlow
examples/tf_tabular_starter.py
AIR is currently in beta. If you have questions for the team or are interested in getting involved in the development process, fill out this short form.
For an overview of the AIR libraries, ecosystem integrations, and their readiness, check out the latest AIR ecosystem map <air-ecosystem-map>.
air-key-concepts
air-examples-ref
API reference <air-api-ref>
Technical whitepaper <whitepaper>