# Introduction to Ray AI Runtime (AIR)

<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Generic/ray_logo.png" width="20%" loading="lazy">

## About this notebook

### Is this module right for you?

In this module, you will learn how to use Ray AIR to build an end-to-end machine learning application. It covers the entire process from data loading to training and hyperparameter tuning to prediction and serving. Along the way, each section will introduce you to key components of Ray AIR and provide hands-on coding exercises to demonstrate usage.

The ideal learner will have the following attributes:

* A basic understanding of the Ray project.
* An interest in learning about Ray AIR, the unified API for scalable ML applications.
* An existing ML application or workload that requires a simple scaling solution.

### Prerequisites

For this notebook, you should satisfy the following requirements:

* Practical Python and machine learning experience.
* Basic familiarity with Ray, equivalent to completing this module:
    * [Overview of Ray](https://github.com/ray-project/ray-educational-materials/blob/main/Introductory_modules/Overview_of_Ray.ipynb)

### Learning objectives

* Understand the high-level ML libraries that make up Ray AIR.
    * Data
    * Train
    * Tune
    * Serve
    * RLlib (not covered in-depth in this module)
* Use Ray AIR as a unified toolkit to write end-to-end ML applications.
* Extend mini-examples presented here to scale individual workloads in Python with Ray AIR.
* Identify the problems and challenges that Ray AIR attempts to solve.

### What will you do?

Through hands-on exercises, you will practice the key concepts from each stage of an example ML workflow.

|ML workflow stage|Ray AIR key concept|
|:--|:--|
|Data loading and preprocessing|`Preprocessor` to load and transform data|
|Model training|`Trainer` for supported ML frameworks (Keras, PyTorch, and more)|
|Hyperparameter tuning|`Tuner` for hyperparameter search|
|Batch prediction|`BatchPredictor` to load model from best checkpoint for batch inference|
|Model serving|`PredictorDeployment` for online inference|

## Part 1: Overview of Ray AI Runtime (AIR)

<div class="alert alert-info">
  <strong><a href="https://docs.ray.io/en/latest/ray-air/getting-started.html" target="_blank">Ray AI Runtime (AIR)</a></strong> is an open-source, Python-based, domain-specific library that equips ML engineers, data scientists, and researchers with a scalable and unified toolkit for ML applications.
</div>

Built on top of Ray Core, Ray AIR inherits all the performance and scalability benefits offered by Core while providing a convenient abstraction layer for machine learning. Ray AIR's Python-first native libraries allow ML practitioners to distribute individual workloads, scale end-to-end applications, and build custom use cases in a unified framework.

### Machine learning workflow with Ray AIR

Ray AIR wraps five native Ray libraries, each of which scales a specific stage of the ML workflow. In addition, Ray AIR brings together an ever-growing ecosystem of integrations with popular machine learning frameworks to create a common interface for development.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/e2e_air.png" width="70%" loading="lazy">|
|:--|
|Ray AIR enables end-to-end ML development and provides multiple options for integrating with other tools and libraries from the MLOps ecosystem.|

1. [**Ray Data**](https://docs.ray.io/en/latest/data/dataset.html)  
Scalable, framework-agnostic data loading and transformation across training, tuning, and prediction.

2. [**Ray Train**](https://docs.ray.io/en/latest/train/train.html)  
Distributed multi-node and multi-core model training with fault tolerance that integrates with popular machine learning training libraries.

3. [**Ray Tune**](https://docs.ray.io/en/latest/tune/index.html)  
Scales hyperparameter tuning to optimize model performance.

4. [**Ray Serve**](https://docs.ray.io/en/latest/serve/index.html)  
Deploys a model or ensemble of models for online inference.

5. [**Ray RLlib**](https://docs.ray.io/en/latest/rllib/index.html)  
Scales reinforcement learning workloads that integrate with other Ray AIR libraries; not covered in-depth in this module.

## Part 2: End-to-end workflow with Ray AI Runtime

To illustrate Ray AIR's capabilities, you will implement an end-to-end machine learning application that predicts big tips using New York City taxi data. Each section will introduce the relevant Ray AIR library or component and demonstrate its functionality through code examples.

|Ray AIR Component|NYC Taxi Use Case|
|:--|:--|
|Ray Data|Use `Preprocessor` to load and transform input data.|
|Ray Train|Use `Trainer` to scale XGBoost model training.|
|Ray Tune|Use `Tuner` for hyperparameter search.|
|Ray AIR Predictor|Use `BatchPredictor` to load model from best checkpoint for batch inference; part of Ray Train.|
|Ray Serve|Use `PredictorDeployment` for online inference.|

For this classification task, you will apply a simple [XGBoost](https://xgboost.readthedocs.io/en/stable/) (a gradient boosted trees framework) model to the June 2021 [New York City Taxi & Limousine Commission's Trip Record Data](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page). This dataset contains over 2 million samples of yellow cab rides, and the goal is to predict whether a trip will result in a tip greater than 20% or not.

**Dataset features**
* **`passenger_count`**
    * Float (whole number) representing number of passengers.
* **`trip_distance`** 
    * Float representing trip distance in miles.
* **`fare_amount`**
    * Float representing total price including tax, tip, fees, etc.
* **`trip_duration`**
    * Integer representing seconds elapsed.
* **`hour`**
    * Hour that the trip started.
    * Integer in the range `[0, 23]`.
* **`day_of_week`**
    * Integer in the range `[1, 7]`.
* **`is_big_tip`**
    * Whether the tip amount was greater than 20%.
    * Boolean `[True, False]`.

## Part 3: Ray Data
***

First, you need to load in the taxi data and transform the raw input into cleaned features that will be passed to the XGBoost model.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/data_highlight.png" width="70%" loading="lazy">|
|:--|
|Ray AIR wraps Ray Data to provide distributed data ingestion and transformation during training, tuning, and inference.|

### Introduction to Ray Datasets

Backed by PyArrow, [Ray Datasets](https://docs.ray.io/en/latest/data/user-guide.html) parallelize the loading and transforming of data and provide a standard way to pass references to data across Ray libraries and applications. Datasets are not intended to replace more general data processing systems. Instead, they serve as a last-mile bridge from ETL pipeline outputs to distributed applications and libraries in Ray.

#### Key features

- **Flexibility**

    Datasets are compatible with a variety of file formats, data sources, and distributed frameworks. They work seamlessly with library integrations like Dask on Ray and can be passed between Ray tasks and actors without copying data.

- **Performance for ML Workloads**

    Datasets offer important features like accelerator support, pipelining, and global random shuffles that accelerate ML training and inference workloads. They also support basic distributed data transformations such as map, filter, sort, groupby, and repartition.

- **Persistent Preprocessor**

    The `Preprocessor` primitive captures and stores the transformations applied to convert inputs into features. It is applied during training, tuning, batch prediction, and serving to keep the preprocessing consistent across the pipeline.
    
- **Built on Ray Core**

    Datasets inherit scalability to hundreds of nodes, efficient memory usage, object spilling, and failure handling from Ray Core. Because Datasets are just lists of object references, they can be passed between tasks and actors without needing to make a copy of the data, which is crucial for making data-intensive applications and libraries scalable.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/data_code.png" width="70%" loading="lazy">|
|:--|
|A general pattern for creating a `Dataset`, configuring a `Preprocessor`, and passing these into the `Trainer` for consistent data handling throughout the pipeline.|

With this generic structure in mind, you will see how it is applied with the tip prediction task.

### Start Ray runtime

In [None]:
import ray

In [None]:
if ray.is_initialized():  # call the function; the bare attribute is always truthy
    ray.shutdown()

ray.init()

Start a Ray cluster (see: installation [instructions](https://docs.ray.io/en/latest/ray-overview/installation.html)) so that Ray can utilize all the cores available to you as workers. 

- Check `ray.is_initialized()` to ensure that you start with a fresh cluster.
- Use `ray.init()` to initialize a Ray context.

### Create Ray Datasets

In [None]:
# Read Parquet file to Ray Dataset.
dataset = ray.data.read_parquet(
    "s3://anyscale-training-data/intro-to-ray-air/nyc_taxi_2021.parquet"
)

In [None]:
# Split data into training and validation subsets.
train_dataset, valid_dataset = dataset.train_test_split(test_size=0.3)

In [None]:
# Split datasets into blocks for parallel preprocessing.
# `num_blocks` should be lower than number of cores in the cluster.
train_dataset = train_dataset.repartition(num_blocks=5)
valid_dataset = valid_dataset.repartition(num_blocks=5)
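
To build intuition for the split and repartition steps above, here is a minimal plain-Python sketch that does not use Ray. It assumes an ordered split (the last `test_size` fraction of rows becomes the validation set) and near-equal block sizes. The helper names `split_rows` and `make_blocks` are hypothetical and are not part of the Ray API.

```python
def split_rows(rows, test_size):
    """Split rows in order: the last `test_size` fraction becomes the validation set."""
    n_valid = round(len(rows) * test_size)
    split_at = len(rows) - n_valid
    return rows[:split_at], rows[split_at:]


def make_blocks(rows, num_blocks):
    """Divide rows into `num_blocks` contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(rows), num_blocks)
    blocks, start = [], 0
    for i in range(num_blocks):
        size = base + (1 if i < extra else 0)
        blocks.append(rows[start:start + size])
        start += size
    return blocks


rows = [{"trip_id": i} for i in range(20)]  # stand-in for taxi records
train_rows, valid_rows = split_rows(rows, test_size=0.3)
train_blocks = make_blocks(train_rows, num_blocks=5)

print(len(train_rows), len(valid_rows))        # 14 6
print([len(block) for block in train_blocks])  # [3, 3, 3, 3, 2]
```

Each block can then be processed independently, which is why the number of blocks relates to the number of cores available for parallel preprocessing.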

**Coding exercise**

There are many [`Dataset` API elements](https://docs.ray.io/en/latest/data/api/dataset.html#) available for common transformations and operations. Try completing the following tasks:

1. Inspect [the schema](https://docs.ray.io/en/latest/data/api/dataset.html#inspecting-metadata) of the underlying Parquet metadata.
2. [Count](https://docs.ray.io/en/latest/data/api/dataset.html#inspecting-metadata) the number of rows in the training and validation datasets.
3. [Show](https://docs.ray.io/en/latest/data/api/dataset.html#consuming-datasets) the first five samples from either dataset.
4. Calculate the average `fare_amount` [grouped by](https://docs.ray.io/en/latest/data/api/dataset.html#grouped-and-global-aggregations) `passenger_count`.

In [None]:
### YOUR CODE HERE ###

**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###

print(f"Schema of training dataset: \n {train_dataset.schema()}")

In [None]:
### SAMPLE IMPLEMENTATION ###

print(f"Number of samples in training dataset: \n {train_dataset.count()}")
print(f"Number of samples in validation dataset: \n {valid_dataset.count()}")

In [None]:
### SAMPLE IMPLEMENTATION ###

train_dataset.show(5)

In [None]:
### SAMPLE IMPLEMENTATION ###

train_dataset.groupby("passenger_count").mean("fare_amount").show()

### Preprocess the dataset
To transform the raw data into features, you will define a `Preprocessor`. [Ray AIR's `Preprocessor`](https://docs.ray.io/en/latest/ray-air/package-ref.html#preprocessor) captures the data transformations you apply and persists them across the pipeline:

- **During training**

    `Preprocessor` is passed into a `Trainer` to `fit` and `transform` input `Datasets`.
- **During tuning**

    Each `Trial` will create its own copy of the `Preprocessor`, and the fitting and transformation logic will occur once per `Trial`.
- **During checkpointing**

    The `Preprocessor` is saved in the `Checkpoint` if it was passed into the `Trainer`.
- **During predicting**

    If the `Checkpoint` contains a `Preprocessor`, then it will be used to call `transform_batch` on input batches prior to performing inference.

In [None]:
from ray.data.preprocessors import MinMaxScaler

In [None]:
# Define a preprocessor to normalize the columns by their range.
preprocessor = MinMaxScaler(columns=["trip_distance", "trip_duration"])
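
The idea of fitting a preprocessor once and reusing it later can be sketched in plain Python. The example below is a simplified illustration that assumes min-max scaling is computed as `(x - min) / (max - min)` per column; `ToyMinMaxScaler` is a hypothetical class and is not Ray's implementation.

```python
class ToyMinMaxScaler:
    """Illustrative stand-in for a fitted preprocessor (not Ray's implementation)."""

    def __init__(self, columns):
        self.columns = columns
        self.stats = {}  # column -> (min, max), learned in fit()

    def fit(self, rows):
        for col in self.columns:
            values = [row[col] for row in rows]
            self.stats[col] = (min(values), max(values))
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            scaled = dict(row)
            for col in self.columns:
                lo, hi = self.stats[col]
                span = hi - lo
                # Guard against constant columns to avoid division by zero.
                scaled[col] = (row[col] - lo) / span if span else 0.0
            out.append(scaled)
        return out


train_rows = [
    {"trip_distance": 1.0, "trip_duration": 300},
    {"trip_distance": 3.0, "trip_duration": 600},
    {"trip_distance": 5.0, "trip_duration": 1200},
]
scaler = ToyMinMaxScaler(columns=["trip_distance", "trip_duration"]).fit(train_rows)

# At inference time the stored training statistics are reused, not recomputed.
new_rows = [{"trip_distance": 2.0, "trip_duration": 750}]
print(scaler.transform(new_rows))
# [{'trip_distance': 0.25, 'trip_duration': 0.5}]
```

Reusing the statistics learned during training is what keeps features consistent between training, batch prediction, and serving.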

**Coding exercise**

Ray AIR provides several [preprocessors out of the box](https://docs.ray.io/en/latest/ray-air/preprocessors.html#) and also supports the implementation of [custom preprocessors](https://docs.ray.io/en/latest/ray-air/preprocessors.html#implementing-custom-preprocessors). Later on, you can compare model performance between the given preprocessor and your custom configuration.

Select a [built-in](https://docs.ray.io/en/latest/ray-air/preprocessors.html#types-of-preprocessors) `Preprocessor` and use `fit_transform()` to [apply it](https://docs.ray.io/en/latest/ray-air/package-ref.html#preprocessor) to the dataset. Visualize the results (perhaps using the [integration with pandas](https://docs.ray.io/en/latest/data/api/input_output.html#ray.data.from_pandas) to generate a histogram view).

Note: You may want to create a sample dataset to transform, as the original data and preprocessor will be passed to the `Trainer` in the next step for transformation.

In [None]:
### YOUR CODE HERE ###

**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###

# Import only the preprocessor that is used, rather than a wildcard import.
from ray.data.preprocessors import PowerTransformer

sample_data = ray.data.read_parquet(
    "s3://anyscale-training-data/intro-to-ray-air/nyc_taxi_2021.parquet"
)

# create new preprocessor
sample_preprocessor = PowerTransformer(
    columns=["trip_distance", "trip_duration"], power=0.5
)

# apply the transformation
transformed_data = sample_preprocessor.fit_transform(sample_data)

In [None]:
### SAMPLE IMPLEMENTATION ###

# Visualize original data.
sample_df = sample_data.to_pandas(limit=2704905)
sample_df.hist("trip_distance")
sample_df.hist("trip_duration")

# Visualize transformed data.
transformed_df = transformed_data.to_pandas(limit=2704905)
transformed_df.hist("trip_distance")
transformed_df.hist("trip_duration")

Notice the positively-skewed distributions for `trip_distance` and `trip_duration`. For these numerical features, you can choose an appropriate AIR `Preprocessor` depending on your data's properties:

- `PowerTransformer`  
Your data isn't normal, but you need it to be.
- `Normalizer`  
You need unit norm rows.
- `MinMaxScaler`  
You are unsure of the distribution of your data.

Feature scaling can often improve performance during training, so experimenting with the choice of `Preprocessor` is worthwhile when you have a small number of features that are not already unit normalized.
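
As a rough illustration of why a power transform helps with positively skewed features, the sketch below applies the Yeo-Johnson formula for non-negative inputs with `power=0.5` and compares skewness before and after. Whether this matches `PowerTransformer` exactly is an assumption. The sample values and the helper names (`yeo_johnson_nonneg`, `skewness`, `l2_normalize`) are hypothetical.

```python
import math


def yeo_johnson_nonneg(x, power):
    """Yeo-Johnson transform for non-negative x (the negative branch is omitted)."""
    if x < 0:
        raise ValueError("this sketch only handles non-negative values")
    if power == 0:
        return math.log(x + 1)
    return ((x + 1) ** power - 1) / power


def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def l2_normalize(row):
    """Scale a row of numbers to unit Euclidean norm, as a row normalizer would."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row] if norm else list(row)


# Made-up, positively skewed trip distances in miles.
trip_distance = [0.5, 1.0, 1.2, 2.0, 2.5, 3.0, 8.0, 20.0, 45.0]
transformed = [yeo_johnson_nonneg(x, power=0.5) for x in trip_distance]

print(f"skewness before: {skewness(trip_distance):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
print(l2_normalize([3.0, 4.0]))
```

The transformed values are still ordered the same way, but the long right tail is compressed, so the skewness drops.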

### Summary

#### Key concepts

**`Dataset`**

The standard way to load and exchange data in Ray AIR. In AIR, Datasets are used extensively for data loading and transformation. They are meant as a last-mile bridge from ETL pipeline outputs to distributed applications and libraries in Ray.

**`Preprocessor`**

Preprocessors are primitives that transform input data into features. They operate on Datasets, making them scalable and compatible with a variety of datasources and dataframe libraries.

Preprocessors persist through various stages of the pipeline:

- During training to fit and transform input data
- In each trial of hyperparameter tuning
- Within a checkpoint
- On input batches for inference

AIR comes with a collection of built-in preprocessors, and you can also define your own with simple templates (see the [user guide](https://docs.ray.io/en/latest/ray-air/preprocessors.html) for more information).

## Part 4: Ray Train
***

In this section, you will set up an XGBoost model to classify taxi rides based on predicted tip amounts using the dataset and preprocessor from the previous step.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/train_highlight.png" width="70%" loading="lazy">|
|:--|
|Ray AIR wraps Ray Train to provide distributed model training.|

### Introduction to Ray Train

#### Common challenges with training

ML practitioners tend to run into a few common problems with training models that prompt them to consider distributed solutions:

1. Training time is [too long](https://www.anyscale.com/blog/how-anastasia-implements-ray-and-anyscale-to-speed-up-ml-processes-9x) to be practical.
2. The [data is too large](https://www.anyscale.com/blog/how-ray-and-anyscale-make-it-easy-to-do-massive-scale-machine-learning-on) to fit on one machine.
3. [Training many models](https://www.anyscale.com/blog/training-one-million-machine-learning-models-in-record-time-with-ray) sequentially doesn't utilize resources efficiently.
4. The [model itself is too large](https://www.uber.com/blog/horovod-ray/) to fit on a single machine.

[Ray Train](https://docs.ray.io/en/latest/ray-air/trainer.html) addresses these issues by improving performance through distributed multi-node training.

#### Integration with Ray ecosystem

Ray Train's `Trainers` integrate well with the rest of the Ray ecosystem:

* **Ray Data**  
    * Enables scalable data loading and preprocessing with Ray `Datasets` and `Preprocessors`.
* **Ray Tune**
    * Composes with `Tuners` for distributed hyperparameter tuning.
* **Ray AIR Predictor**
    * Provides a checkpointed, trained model that can be applied during inference.
* **Popular ML training frameworks**
    * [PyTorch](https://docs.ray.io/en/latest/ray-air/package-ref.html#pytorch)
    * [TensorFlow](https://docs.ray.io/en/latest/ray-air/package-ref.html#tensorflow)
    * [Horovod](https://docs.ray.io/en/latest/ray-air/package-ref.html#horovod)
    * [XGBoost](https://docs.ray.io/en/latest/ray-air/package-ref.html#xgboost)
    * [HuggingFace Transformers](https://docs.ray.io/en/latest/ray-air/package-ref.html#huggingface)
    * [Scikit-Learn](https://docs.ray.io/en/latest/ray-air/package-ref.html#scikit-learn)
    * and more!

#### Useful features

* Callbacks for early stopping
* Checkpointing
* Integration with experiment tracking tools like TensorBoard, Weights & Biases, and MLflow
* Export mechanisms for models

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/train_code.png" width="70%" loading="lazy">|
|:--|
|Define the `Trainer` object and then fit it to the training dataset. This snippet uses a `TorchTrainer`, but it may be swapped out for any [integration](https://docs.ray.io/en/latest/ray-air/package-ref.html#trainer-and-predictor-integrations) or custom-defined `Trainer`.|

In the next section, you will define an `XGBoostTrainer` and fit it to the NYC taxi data.

### Define AIR `Trainer`

Ray AIR provides a variety of built-in [`Trainers`](https://docs.ray.io/en/latest/ray-air/trainer.html) (PyTorch, TensorFlow, HuggingFace, etc.). In the example below, you will use a Ray `XGBoostTrainer`, which [offers support](https://docs.ray.io/en/latest/train/gbdt.html) for XGBoost models.

In [None]:
from ray.air.config import ScalingConfig
from ray.train.xgboost import XGBoostTrainer

In [None]:
trainer = XGBoostTrainer(
    label_column="is_big_tip",
    num_boost_round=50,
    scaling_config=ScalingConfig(
        num_workers=5,
        use_gpu=False,
    ),
    params={
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "tree_method": "approx",
    },
    datasets={"train": train_dataset, "valid": valid_dataset},
    preprocessor=preprocessor,
)

To construct a `Trainer`, you provide three base components:

- A `ScalingConfig` which [specifies](https://docs.ray.io/en/releases-2.0.0rc0/ray-air/config-scaling.html) how many parallel training workers and what type of resources (CPUs/GPUs) to use per worker during training; supports seamless scaling across heterogeneous hardware.
- A dictionary of training and validation sets.
- The `Preprocessor` used to transform the `Datasets`.

Optionally, you can choose to add `resume_from_checkpoint` which allows you to continue training from a [saved checkpoint](https://docs.ray.io/en/latest/ray-air/package-ref.html#ray.air.checkpoint.Checkpoint) should the run be interrupted.
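
The idea behind resuming from a checkpoint can be illustrated without Ray. The sketch below is a toy training loop that writes its state to a JSON file after every step and picks up from that file if it exists. The `train` function, the update rule, and the file names are all hypothetical; they only mimic the concept and say nothing about how Ray stores checkpoints.

```python
import json
import os
import tempfile


def train(total_steps, checkpoint_path, interrupt_at=None):
    """Toy training loop that checkpoints after every step and resumes if a checkpoint exists."""
    state = {"step": 0, "weight": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the saved state

    while state["step"] < total_steps:
        if interrupt_at is not None and state["step"] == interrupt_at:
            return state  # simulate an interrupted run
        # Toy update: move the weight halfway toward a target of 1.0.
        state["weight"] += 0.5 * (1.0 - state["weight"])
        state["step"] += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
    return state


with tempfile.TemporaryDirectory() as tmp_dir:
    # Run A: interrupted after 3 steps, then resumed from its checkpoint.
    path_a = os.path.join(tmp_dir, "run_a.json")
    partial = train(10, path_a, interrupt_at=3)
    resumed = train(10, path_a)

    # Run B: never interrupted.
    path_b = os.path.join(tmp_dir, "run_b.json")
    uninterrupted = train(10, path_b)

print(partial["step"], resumed["step"])  # 3 10
print(resumed == uninterrupted)          # True
```

The resumed run reaches the same final state as the uninterrupted one while only repeating the steps that had not yet been completed.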

### Fit the Trainer

In [None]:
# Invoke training. 
# The resulting object grants access to metrics, checkpoints, and errors.
result = trainer.fit()

**Coding exercise**

You can check out the training results from the `Result` object with the following calls:

```python
# returns last saved checkpoint
result.checkpoint

# returns the `n` best saved checkpoints as configured in `RunConfig.CheckpointConfig`
result.best_checkpoints

# returns the final metrics as reported
result.metrics

# returns an Exception if training failed
result.error
```

Inspect your training result below. What is the reported accuracy for the training and validation runs?

Note: `result.error` holds an exception if training failed. The binary classification error rate, calculated as `#(wrong cases)/#(all cases)`, is reported in `result.metrics` under the keys `train-error` and `valid-error`.
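
To make the relationship between error rate and accuracy concrete, here is a small worked example in plain Python. It assumes the convention documented for XGBoost's `error` metric, where predicted probabilities above 0.5 count as the positive class; the labels and probabilities are made up for illustration.

```python
import math

# Made-up ground truth labels and predicted probabilities of a big tip.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]

# Threshold the probabilities at 0.5 to obtain class predictions.
predictions = [1 if p > 0.5 else 0 for p in probabilities]

# Error rate: #(wrong cases) / #(all cases).
wrong = sum(1 for y, y_hat in zip(labels, predictions) if y != y_hat)
error_rate = wrong / len(labels)
accuracy = 1 - error_rate

# Log loss: average negative log-likelihood of the true labels.
log_loss = -sum(
    y * math.log(p) + (1 - y) * math.log(1 - p)
    for y, p in zip(labels, probabilities)
) / len(labels)

print(error_rate, accuracy)  # 0.25 0.75
print(f"log loss: {log_loss:.4f}")
```

Two of the eight samples fall on the wrong side of the threshold, so the error rate is 0.25 and the accuracy is 0.75.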

In [None]:
### YOUR CODE HERE ###

**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###

print(f"Result metrics: \n {result.metrics} \n")

In [None]:
### SAMPLE IMPLEMENTATION ###

print(f"Training accuracy: {1 - result.metrics['train-error']:.4f}")
print(f"Validation accuracy: {1 - result.metrics['valid-error']:.4f}")

### Summary

#### Key concepts

**`Checkpoint`**

A `Checkpoint` stores the full state of the model periodically so that partially trained models are available and training can resume from an intermediate point instead of starting from scratch. It also allows the best model to be saved for batch inference later on.

**`Trainer`**

Trainers are wrapper classes around third-party training frameworks such as XGBoost, PyTorch, and TensorFlow. They are built to help integrate with Ray Actors (for distribution), Ray Datasets, and Ray Tune.

## Part 5: Ray Tune
***

Now that you have a baseline XGBoost model trained, you can try to improve performance by running hyperparameter tuning experiments.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/tune_highlight.png" width="70%" loading="lazy">|
|:--|
|Ray AIR wraps Ray Tune to provide distributed hyperparameter optimization.|

### Introduction to Ray Tune

<div class="alert alert-info">
  <strong><a href="https://en.wikipedia.org/wiki/Hyperparameter_optimization" target="_blank">Hyperparameter tuning, or optimization, (HPO)</a></strong> is the process of choosing optimal hyperparameters for a machine learning model. Hyperparameters, in contrast to weights learned by the model, are parameters that you set to influence training.
</div>


Setting up and executing hyperparameter optimization (HPO) can be expensive in terms of compute resources and runtime with several complexities including:

- **Vast Search Space**

    Your model could have several hyperparameters, each with different data types, ranges, and possible correlations.
    Sampling good candidates from high-dimensional spaces is difficult.
- **Search Algorithms**

    Choosing hyperparameters strategically requires testing complex search algorithms to achieve good results.
- **Long Runtime**

    Even if you distribute tuning, training complex models can take a long time per run, so efficiency matters at every stage of the pipeline.
- **Resource Allocation**

    You must have enough compute resources available during each trial so that scheduling mismatches do not slow down the search.
- **User Experience**

    Observability tooling for developers like stopping bad runs early, saving intermediate results, restarting from checkpoints, or pausing/resuming runs makes HPO easier.

Ray Tune is a distributed HPO library that addresses these challenges, provides a simplified interface for running trials, and integrates with popular frameworks such as HyperOpt and Optuna.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/tune_code.png" width="70%" loading="lazy">|
|:--|
|General pattern for using AIR `Tuners` which involves taking in a `Trainer`, defining a search space, establishing a search algorithm, scheduling trials, and analyzing results.|

In the next section, you will learn about how to apply these steps to optimize the baseline XGBoost model created in the previous section.

### Use AIR `Tuner` for hyperparameter search

As an aside, Ray Tune provides a default checkpointing system for hyperparameter tuning. For particularly large models, it may be advisable to set up a `CheckpointConfig`, which defines a [checkpointing strategy](https://docs.ray.io/en/latest/ray-air/package-ref.html#ray.air.config.CheckpointConfig). In particular, you can set `num_to_keep` to avoid saving fruitless trials to disk.

In [None]:
from ray import tune
from ray.tune.tuner import Tuner, TuneConfig

In [None]:
# Define a search space of hyperparameters.
param_space = {
    "params": {
        "eta": tune.uniform(0.2, 0.4), # learning rate
        "max_depth": tune.randint(1, 6), # default=6; higher value means more complex tree
        "min_child_weight": tune.uniform(0.8, 1.0), # min sum of weights of all data in a child
    }
}

tuner = Tuner(
    trainer,
    param_space=param_space,
    tune_config=TuneConfig(num_samples=3, metric="train-logloss", mode="min"),
)

To set up an AIR `Tuner`, you must specify:

- **`Trainer`**  
The training loop from before; support for heterogeneous hardware built-in with each Trainer's `ScalingConfig`.

- **`param_space`**  
A set of hyperparameters you wish to tune.

- **`TuneConfig`**  
Sets the number of experiments, metrics, and whether to minimize or maximize.

- **`search_algorithm`**  
Optimizes parameter search (optional).

- **`scheduler`**  
Stops searches early and speeds up experiments (optional).
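
The components above can be illustrated with a plain-Python random search over the same ranges as the `param_space` defined earlier. This is only a conceptual sketch: `sample_config` and `toy_loss` are hypothetical, the loss function is an arbitrary stand-in for a real training run, and trials run sequentially here rather than in parallel.

```python
import random


def sample_config(rng):
    """Draw one hyperparameter configuration from the search space."""
    return {
        "eta": rng.uniform(0.2, 0.4),
        "max_depth": rng.randrange(1, 6),  # integers 1..5; upper bound excluded
        "min_child_weight": rng.uniform(0.8, 1.0),
    }


def toy_loss(config):
    """Hypothetical stand-in for the metric a real trial would report."""
    return (config["eta"] - 0.3) ** 2 + 0.01 * abs(config["max_depth"] - 4)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
num_samples = 3

# Each trial samples a configuration and reports a loss.
trials = [sample_config(rng) for _ in range(num_samples)]
results = [(toy_loss(config), config) for config in trials]

# mode="min": the best trial is the one with the lowest metric value.
best_loss, best_config = min(results, key=lambda item: item[0])
print(f"best loss: {best_loss:.4f}")
print(f"best config: {best_config}")
```

A search algorithm would replace the purely random `sample_config`, and a scheduler would stop unpromising trials before they finish.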

### Execute hyperparameter search and analyze results

In [None]:
# Execute tuning on `num_samples` of trials.
# You can query the ResultGrid for metrics, results, and checkpoints by trial.
result_grid = tuner.fit()

**Coding exercise**

You can probe the `ResultGrid` [for metrics](https://docs.ray.io/en/latest/tune/api_docs/result_grid.html) using these calls:

```python

# checks if there have been errors
result_grid.errors

# gets the best result
best_result = result_grid.get_best_result()

# gets the best checkpoint
best_checkpoint = best_result.checkpoint

# gets the best metrics
best_metrics = best_result.metrics

```

Inspect your tuning results. What is the best result from these experiments? Is it better than the baseline model from the training step in the previous section?

In [None]:
### YOUR CODE HERE ###

**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###

best_result = result_grid.get_best_result()
print(f"Best result: \n {best_result} \n")

In [None]:
### SAMPLE IMPLEMENTATION ###

print(f"Training accuracy: {1 - best_result.metrics['train-error']:.4f}")
print(f"Validation accuracy: {1 - best_result.metrics['valid-error']:.4f}")

### Summary

#### Key concepts

**`Tuner`**

Provides an interface that works with AIR `Trainers` to perform distributed hyperparameter tuning. You define a set of hyperparameters you wish to tune in a search space, specify a search algorithm, and the `Tuner` returns its results in a `ResultGrid` that contains metrics, results, and checkpoints for each `trial`.

## Part 6: Ray AIR Predictors
***

[Ray AIR Predictors](https://docs.ray.io/en/latest/ray-air/predictors.html) load models from [checkpoints](https://docs.ray.io/en/latest/ray-air/key-concepts.html#air-checkpoints-doc) generated during training or tuning to perform distributed inference.

`BatchPredictor` is a utility for large scale [batch inference](https://docs.ray.io/en/latest/ray-air/predictors.html#batch-prediction) that takes in a few components:

1. **`Checkpoint`**  
A saved model from training or tuning.

2. **`Preprocessor`**  
The preprocessor defined earlier to transform input data for training, which can be reapplied to incoming batches (optional).

3. **`Predictor`**  
Loads model from checkpoint to perform inference.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/batchpredict_code.png" width="70%" loading="lazy">|
|:--|
|`Checkpoint` and `Predictor` are passed into each instance of `BatchPredictor`.|

### Use AIR `BatchPredictor` for Batch Prediction

Previously, you have trained and tuned the XGBoost model on data from June 2021. You will now take the best checkpoint from tuning and perform offline, or batch, inference on taxi tip data from June 2022.

In [None]:
from ray.train.batch_predictor import BatchPredictor
from ray.train.xgboost import XGBoostPredictor

In [None]:
test_dataset = ray.data.read_parquet(
    "s3://anyscale-training-data/intro-to-ray-air/nyc_taxi_2022.parquet"
).drop_columns("is_big_tip")

test_dataset = test_dataset.repartition(num_blocks=5)

### Create `BatchPredictor` from best trial of HPO

In [None]:
# Obtain the best checkpointed result from the tuning step.
best_result = result_grid.get_best_result()

# Create a BatchPredictor from the best result and specify a Predictor class.
batch_predictor = BatchPredictor.from_checkpoint(
    checkpoint=best_result.checkpoint, predictor_cls=XGBoostPredictor
)

In [None]:
# Run batch inference.
# Prediction scales across heterogeneous hardware if specified in the ScalingConfig in the Trainer.
predicted_probabilities = batch_predictor.predict(test_dataset)

**Coding exercise**

Now that you have the predictions generated from the testing set, how did the model perform? 

Inspect the predictions outputted by `BatchPredictor` by using `predicted_probabilities.show()`.

For more practice working with Ray `Datasets`, see if you can compare the results from `predicted_probabilities` with the ground truth labels made available in the raw June 2022 Parquet file.

In [None]:
### YOUR CODE HERE ###

**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###
predicted_probabilities.show(5)
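
For the comparison against ground truth labels, one possible approach is to threshold the predicted probabilities and count the four outcome types. The sketch below shows that bookkeeping in plain Python on made-up values; it assumes a 0.5 decision threshold and does not read from the actual June 2022 file.

```python
# Made-up labels and predicted probabilities standing in for real results.
labels = [True, False, True, False, True]
probabilities = [0.8, 0.3, 0.45, 0.6, 0.9]

threshold = 0.5  # assumed decision threshold
predictions = [p > threshold for p in probabilities]

# Count each outcome type of the confusion matrix.
pairs = list(zip(labels, predictions))
true_positives = sum(1 for y, y_hat in pairs if y and y_hat)
false_positives = sum(1 for y, y_hat in pairs if not y and y_hat)
true_negatives = sum(1 for y, y_hat in pairs if not y and not y_hat)
false_negatives = sum(1 for y, y_hat in pairs if y and not y_hat)

accuracy = (true_positives + true_negatives) / len(labels)

print(true_positives, false_positives, true_negatives, false_negatives)  # 2 1 1 1
print(accuracy)  # 0.6
```

Looking at false positives and false negatives separately shows which kind of mistake the model makes more often, which accuracy alone does not reveal.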

### Summary

#### Key concepts

**`BatchPredictor`**

Loads the best model from a checkpoint to perform large-scale batch (offline) inference.

## Part 7: Ray Serve
***

Finally, you may want to serve this taxi tip prediction application to end users, hopefully with a low latency to be maximally useful to drivers on the job. This poses a challenge since machine learning models are compute intensive, and ideally, this model wouldn't be served in isolation, but rather adjacent to business logic or even other ML models.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/serve_highlight.png" width="70%" loading="lazy">|
|:--|
|Ray AIR wraps Ray Serve to provide distributed model serving.|

### Introduction to Ray Serve

Ray Serve is a scalable compute layer for serving machine learning models that enables serving individual models or creating composite model pipelines where you can independently deploy, update, and scale individual components. 

Serve isn't tied to a specific machine learning library, but rather treats models as ordinary Python code. 

Additionally, it allows you to flexibly combine normal Python business logic alongside machine learning models. This makes it possible to build online inference services completely end-to-end: 

- Validate user input.
- Query a database.
- Perform inference scalably across multiple ML models.
- Combine, filter, and validate the output all in the process of handling a single inference request.
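
The four steps above can be sketched as an ordinary Python function that mixes business logic with a model call. Everything in this example is hypothetical: `RUSH_HOURS` stands in for a database lookup, `toy_model` is a hand-written scoring function rather than the trained XGBoost model, and nothing here uses the Ray Serve API.

```python
import math

REQUIRED_FIELDS = (
    "passenger_count", "trip_distance", "fare_amount",
    "trip_duration", "hour", "day_of_week",
)

# Hypothetical lookup table standing in for a database query.
RUSH_HOURS = {7, 8, 9, 16, 17, 18}


def validate(payload):
    """Reject requests with missing fields or an out-of-range hour."""
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not 0 <= payload["hour"] <= 23:
        raise ValueError("hour must be in [0, 23]")
    return payload


def toy_model(features, is_rush_hour):
    """Hand-written logistic score used in place of a trained model."""
    score = 0.02 * features["fare_amount"] - 0.05 * features["trip_distance"]
    if is_rush_hour:
        score += 0.3
    return 1.0 / (1.0 + math.exp(-score))


def handle_request(payload):
    features = validate(payload)                     # 1. validate user input
    is_rush_hour = features["hour"] in RUSH_HOURS    # 2. "query a database"
    probability = toy_model(features, is_rush_hour)  # 3. perform inference
    # 4. combine and shape the output for the caller
    return {"probability": round(probability, 4), "is_big_tip": probability > 0.5}


request = {
    "passenger_count": 1, "trip_distance": 10, "fare_amount": 45,
    "trip_duration": 1200, "hour": 18, "day_of_week": 6,
}
response = handle_request(request)
print(response)
```

Because each step is plain Python, validation and post-processing can evolve independently of the model that sits in the middle.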

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Introduction_to_Ray_AIR/tune_code.png" width="70%" loading="lazy">|
|:--|
|Pattern for deploying a `Predictor` from a `Checkpoint` with Ray Serve.|

### Use `PredictorDeployment` for Online Inference

### Deploy the XGBoost model from checkpoint

In [None]:
from ray import serve
from ray.serve import PredictorDeployment
from ray.serve.http_adapters import pandas_read_json

In [None]:
# Deploy the best checkpoint as a live endpoint using PredictorDeployment.
serve.run(
    PredictorDeployment.options(
        name="XGBoostService", num_replicas=2, route_prefix="/rayair"
    ).bind(XGBoostPredictor, best_result.checkpoint, http_adapter=pandas_read_json)
)

### Send some test traffic

In [None]:
import requests

In [None]:
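# Take one row from the test dataset and convert it to a plain dictionary.
sample_input = test_dataset.take(1)
sample_input = dict(sample_input[0])

# Send the row to the deployment over HTTP as a JSON list of records.
output = requests.post("http://localhost:8000/rayair", json=[sample_input]).json()
print(output)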
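
The `json=[sample_input]` argument serializes a list of records, with one JSON object per row. The sketch below uses only the standard library to show what that request body looks like and how several rows could be batched into one body. `build_payload` is a hypothetical helper, the column list assumes the six feature names used for the custom request later in this notebook, and the values are illustrative rather than taken from the dataset.

```python
import json

# Assumed feature columns, matching the custom request in this notebook.
FEATURE_COLUMNS = [
    "passenger_count",
    "trip_distance",
    "fare_amount",
    "trip_duration",
    "hour",
    "day_of_week",
]


def build_payload(rows: list) -> str:
    """Serialize rows as a JSON array of records, one object per row."""
    for row in rows:
        missing = [column for column in FEATURE_COLUMNS if column not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
    return json.dumps(rows)


sample_row = {
    "passenger_count": 1,
    "trip_distance": 10,
    "fare_amount": 45,
    "trip_duration": 1200,
    "hour": 18,
    "day_of_week": 6,
}

# Two rows in one body: a single request can carry a small batch.
payload = build_payload([sample_row, {**sample_row, "hour": 8}])
print(payload)
```

Checking the columns on the client side before sending gives a clearer error than a failed prediction on the server.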

**Coding exercise**

You've just served a prediction for a single sample input from an item in the test dataset.

1. Tweak a custom input in the cell below to send a personalized request. 
2. Modify the code below `### YOUR CODE HERE ###` to send your request.
3. Print the output.

Note: Deployments offer much more than this introductory module covers. For details, see the [user guide](https://docs.ray.io/en/latest/ray-air/examples/serving_guide.html) for Ray Serve.

In [None]:
# Play with these inputs.
passenger_count = 1
trip_distance = 10  # in miles
fare_amount = 45
trip_duration = 1200  # in seconds
hour = 18
day_of_week = 6

sample_input = {
    "passenger_count": passenger_count,
    "trip_distance": trip_distance,
    "fare_amount": fare_amount,
    "trip_duration": trip_duration,
    "hour": hour,
    "day_of_week": day_of_week,
}

### YOUR CODE HERE ###


**Solution**

In [None]:
### SAMPLE IMPLEMENTATION ###

# play with these inputs to receive predictions
passenger_count = 1
trip_distance = 10  # in miles
fare_amount = 45
trip_duration = 1200  # in seconds
hour = 18
day_of_week = 6

sample_input = {
    "passenger_count": passenger_count,
    "trip_distance": trip_distance,
    "fare_amount": fare_amount,
    "trip_duration": trip_duration,
    "hour": hour,
    "day_of_week": day_of_week,
}

output = requests.post("http://localhost:8000/rayair", json=[sample_input]).json()
print(output)

### Summary

#### Key concepts

**`Deployments`**

A deployment is a managed group of Ray actors that can be addressed together. Requests sent to the deployment are load-balanced across these actors.
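
To make the load-balancing idea concrete, here is a toy model in plain Python. It's only a sketch: `ToyReplica` and `ToyDeployment` are hypothetical classes, the replicas are ordinary objects rather than Ray actors, and round-robin routing is an assumption chosen for simplicity, so Ray Serve's actual routing may behave differently.

```python
from itertools import cycle


class ToyReplica:
    """Stand-in for one actor that serves requests."""

    def __init__(self, replica_id: int):
        self.replica_id = replica_id
        self.handled = 0

    def handle(self, request: dict) -> dict:
        self.handled += 1
        return {"replica": self.replica_id, "echo": request}


class ToyDeployment:
    """A named group of replicas addressed through a single entry point."""

    def __init__(self, name: str, num_replicas: int):
        self.name = name
        self.replicas = [ToyReplica(i) for i in range(num_replicas)]
        # Round-robin iterator over the replicas (an assumed policy).
        self._next_replica = cycle(self.replicas)

    def handle(self, request: dict) -> dict:
        return next(self._next_replica).handle(request)


deployment = ToyDeployment("XGBoostService", num_replicas=2)
responses = [deployment.handle({"request_id": i}) for i in range(4)]
print([response["replica"] for response in responses])  # prints [0, 1, 0, 1]
```

Callers only see `deployment.handle`; how many replicas exist and which one answers is hidden behind the deployment, which is what allows the replica count to change without touching client code.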

### Shutdown Ray runtime

In [None]:
# Disconnect the worker, and terminate processes started by ray.init().
ray.shutdown()

## Part 8: Summary
***
You have now created a Ray Dataset, preprocessed some features, built a model with XGBoost, searched a hyperparameter space for the best configuration, loaded the best model from a checkpoint to perform batch inference, and served that model for online inference.

Through this end-to-end example, you explored how to use Ray AIR to distribute an entire ML pipeline.

### Key concepts

- **`Datasets`**

    The standard way to load and exchange data in Ray AIR. In AIR, Datasets are used extensively for data loading and transformation. They are meant as a last-mile bridge from ETL pipeline outputs to distributed applications and libraries in Ray.

- **`Preprocessors`**

    Preprocessors are primitives that transform input data into features. They operate on Datasets, making them scalable and compatible with a variety of data sources and dataframe libraries.

- **`Checkpoints`**

    Checkpoints store the full state of the model periodically, so that partially trained models are available and training can resume from an intermediate point instead of starting from scratch. They also allow the best model to be saved for batch inference later on.

- **`Trainers`**

    Trainers are wrapper classes around third-party training frameworks such as XGBoost, PyTorch, and TensorFlow. They are built to help integrate with Ray Actors (for distribution), Ray Datasets, and Ray Tune.

- **`Tuner`**

    Provides an interface that works with AIR Trainers to perform distributed hyperparameter tuning. You define a set of hyperparameters you wish to tune in a search space, specify a search algorithm, and the Tuner returns its results in a ResultGrid that contains metrics, results, and checkpoints for each trial.

- **`BatchPredictor`**

    Loads the best model from a checkpoint to perform batch inference at scale. The same checkpoint can also back online inference through a Serve deployment.

- **`Deployments`**

    A deployment is a managed group of Ray actors that can be addressed together. Requests sent to the deployment are load-balanced across these actors.
    


# Connect with the Ray community

You can learn and get more involved with the Ray community of developers and researchers:

* [**Ray documentation**](https://docs.ray.io/en/latest)

* [**Official Ray site**](https://www.ray.io/)  
Browse the ecosystem and use this site as a hub to get the information that you need to get going and building with Ray.

* [**Join the community on Slack**](https://forms.gle/9TSdDYUgxYs8SA9e8)  
Find friends to discuss your new learnings in our Slack space.

* [**Use the discussion board**](https://discuss.ray.io/)  
Ask questions, follow topics, and view announcements on this community forum.

* [**Join a meetup group**](https://www.meetup.com/Bay-Area-Ray-Meetup/)  
Tune in to meetups to listen to compelling talks, get to know other users, and meet the team behind Ray.

* [**Open an issue**](https://github.com/ray-project/ray/issues/new/choose)  
Ray is constantly evolving to improve the developer experience. Submit feature requests and bug reports, and get help via GitHub issues.

* [**Become a Ray contributor**](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html)  
We welcome community contributions to improve our documentation and the Ray framework.

<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Generic/ray_logo.png" width="20%" loading="lazy">