Here, we cover the main concepts in AIR.
Ray Data <data> is the standard way to load and exchange data in Ray AIR. It provides a Dataset <dataset_concept> abstraction that is used extensively for data loading, preprocessing, and batch inference.
Preprocessors are primitives that transform input data into features. Preprocessors operate on Datasets <dataset_concept>, which makes them scalable and compatible with a variety of datasources and dataframe libraries.
A Preprocessor is fitted during training and then applied in the same way to data batches at runtime, in both training and serving. AIR comes with a collection of built-in preprocessors, and you can also define your own with simple templates.
See the documentation on Preprocessors <air-preprocessor-ref>.
doc_code/air_key_concepts.py
Trainers are wrapper classes around third-party training frameworks such as XGBoost and PyTorch. They are built to integrate with core Ray actors (for distribution), Ray Tune, and Ray Data.
See the documentation on Trainers <air-trainers>.
doc_code/air_key_concepts.py
Trainer objects produce a Result <air-results-ref> object after calling .fit(). These objects contain training metrics as well as checkpoints to retrieve the best model.
doc_code/air_key_concepts.py
Tuners <air-tuner-ref> offer scalable hyperparameter tuning as part of Ray Tune <tune-main>. Tuners work seamlessly with any Trainer and also support arbitrary training functions.
doc_code/air_key_concepts.py
AIR trainers, tuners, and custom pretrained models generate a framework-specific Checkpoint <ray.air.Checkpoint> object. Checkpoints are a common interface for models that are used across different AIR components and libraries.
There are two main ways to generate a checkpoint.
Checkpoint objects can be retrieved from the Result object returned by a Trainer or Tuner .fit() call.
doc_code/air_key_concepts.py
You can also generate a checkpoint from a pretrained model. Each AIR-supported machine learning (ML) framework has a Checkpoint object that can be used to generate an AIR checkpoint:
doc_code/air_key_concepts.py
Checkpoints can be used to instantiate a Predictor, BatchPredictor, or PredictorDeployment class, as shown below.
You can take a checkpoint and do batch inference using the BatchPredictor object.
doc_code/air_key_concepts.py
Deploy the model as an inference service by using Ray Serve and the PredictorDeployment class.
doc_code/air_key_concepts.py
After deploying the service, you can send requests to it.
doc_code/air_key_concepts.py