# Optional: Batch inference with Ray Datasets

<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Generic/ray_logo.png" width="20%" loading="lazy">

## About this notebook

### Is this module right for you?

This module is an extension of [Scaling Batch Inference](https://github.com/ray-project/ray-educational-materials/blob/main/Computer_vision_workloads/Semantic_segmentation/Scaling_batch_inference.ipynb) and presents another approach for distributed batch inference on Ray. Through this short exploration of Ray Datasets' `map_batches` functionality, you will approach the same semantic segmentation task as before and use only Datasets to generate predictions.

This module is for readers who want to explore other use cases for Ray Data or deepen their understanding of scalable batch inference on Ray.

### Prerequisites

For this notebook you should satisfy the following requirements:

* Practical Python and machine learning experience.
* Familiarity with batch inference in ML.
* Familiarity with Ray and Ray AIR equivalent to completing these training modules:
  * [Overview of Ray](https://github.com/ray-project/ray-educational-materials/blob/main/Introductory_modules/Overview_of_Ray.ipynb)
  * [Introduction to Ray AIR](https://github.com/ray-project/ray-educational-materials/blob/main/Introductory_modules/Introduction_to_Ray_AIR.ipynb)
  * [Ray Core](https://github.com/ray-project/ray-educational-materials/tree/main/Ray_Core)

Most importantly, you should complete the [Scaling Batch Inference](https://github.com/ray-project/ray-educational-materials/blob/main/Computer_vision_workloads/Semantic_segmentation/Scaling_batch_inference.ipynb) module before working through this one.

### Learning objectives

* Implement batch inference on a semantic segmentation task with Ray Datasets by using `map_batches`.
* Customize the compute resources for inference by experimenting with the `ActorPoolStrategy`.

### What will you do?

* Set up environment from "Scaling Batch Inference" module.
* Distributed batch inference with Ray Datasets.
  * Create a Ray Dataset.
  * Create a prediction class with inference logic.
  * (Optional) Specify compute by defining an actor pool strategy.
  * Use `map_batches` to apply the prediction class on batches to perform inference.
* Experiment with compute resources and observability in the coding exercise.

## Set up environment from "Scaling Batch Inference" module

This notebook provides another approach to batch inference on semantic segmentation tasks. In order to extend the solution posed in the main module, you must re-establish the context of the original example. For more context regarding these steps, refer to the root [Scaling Batch Inference](https://github.com/ray-project/ray-educational-materials/blob/main/Computer_vision_workloads/Semantic_segmentation/Scaling_batch_inference.ipynb) notebook.

In this section, you will import and load in the necessary task components:

* Set up necessary imports and utilities.
* Load label mappings.
* Load SegFormer.
* Load the feature extractor.
* Load the dataset.

In addition, you will port over some Ray-specific actions:

* Initialize Ray runtime.
* Put the model and feature extractor in the object store.

### Set up necessary imports and utilities

In [None]:
import torch
import numpy as np
import pandas as pd
from PIL import Image
from PIL.JpegImagePlugin import JpegImageFile

# Set the seed to a fixed value for reproducibility.
torch.manual_seed(201)

### Load label mappings

In [None]:
from utils import get_labels

In [None]:
id2label, label2id = get_labels()

### Load SegFormer

In [None]:
from transformers import SegformerForSemanticSegmentation

In [None]:
MODEL_NAME = "nvidia/segformer-b0-finetuned-ade-512-512"

segformer = SegformerForSemanticSegmentation.from_pretrained(
    MODEL_NAME, id2label=id2label, label2id=label2id
)

print(f"Number of model parameters: {segformer.num_parameters()/(10**6):.2f} M")

### Load the feature extractor

In [None]:
from transformers import SegformerFeatureExtractor

In [None]:
segformer_feature_extractor = SegformerFeatureExtractor.from_pretrained(
    MODEL_NAME, reduce_labels=True
)
segformer_feature_extractor

### Load dataset

In [None]:
from datasets import load_dataset
from utils import convert_image_to_rgb

In [None]:
SMALL_DATA = True

<div class="alert alert-warning">
  <strong>SMALL_DATA</strong>: defaults to `True`, which downloads only the first 160 images of the test split. Set it to `False` (recommended) to work with the full test dataset (3,352 images).
</div>

In [None]:
DATASET_NAME = "scene_parse_150"

# Load data from the Hugging Face datasets repository.
# Only the size of the test split depends on SMALL_DATA.
train_dataset = load_dataset(DATASET_NAME, split="train[:10]")
test_split = "test[:160]" if SMALL_DATA else "test"
test_dataset = load_dataset(DATASET_NAME, split=test_split)

In [None]:
test_dataset = test_dataset.map(convert_image_to_rgb)

### Initialize Ray runtime

In [None]:
import ray

In [None]:
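# Shut down any Ray runtime left over from a previous run.
# Note that `ray.is_initialized` is a function and must be called.
if ray.is_initialized():
    ray.shutdown()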

ray.init()

### Put the model and feature extractor in the object store

In [None]:
segformer_ref = ray.put(segformer)
segformer_feature_extractor_ref = ray.put(segformer_feature_extractor)

## Distributed batch inference with Ray Datasets

[Ray Datasets](https://docs.ray.io/en/latest/data/dataset.html) are the standard way to load and exchange data in Ray libraries and applications. They are designed to provide distributed loading, preprocessing, and transformations such as [maps](https://docs.ray.io/en/latest/data/api/dataset.html#ray.data.Dataset.map_batches), [global and grouped aggregations](https://docs.ray.io/en/latest/data/api/grouped_dataset.html#ray.data.grouped_dataset.GroupedDataset), and [shuffling operations](https://docs.ray.io/en/latest/data/api/dataset.html#ray.data.Dataset.random_shuffle). In this bonus notebook, you will use Ray Datasets' [`map_batches`](https://docs.ray.io/en/latest/data/api/dataset.html#ray.data.Dataset.map_batches) method to perform batch inference.

The main [Scaling Batch Inference](https://github.com/ray-project/ray-educational-materials/blob/main/Computer_vision_workloads/Semantic_segmentation/Scaling_batch_inference.ipynb) module presented three architectures for performing distributed batch inference on Ray: stateless inference with Ray Tasks, stateful inference with Ray Actors, and inference with Ray AIR. In the third approach, you used `BatchPredictor`, which took in a Checkpoint (saved trained model) and a Predictor (class that defined inference logic) to generate predictions on a Ray Dataset.

`BatchPredictor` calls a Ray Datasets method, `map_batches`, under the hood. In this section, you will peel away that layer of abstraction and perform inference using only Ray Datasets. You will work through the following steps:

1. Create a Ray Dataset.
2. Create a prediction class with inference logic.
3. (Optional) Specify compute by defining an actor pool strategy.
4. Use `map_batches` to apply the prediction class on batches to perform inference.

|<img src="https://technical-training-assets.s3.us-west-2.amazonaws.com/Scaling_inference/ray_datasets.png" width="70%" loading="lazy">|
|:--|
|Ray Datasets parallelize data loading, preprocessing, and batching. To perform inference, Datasets are able to call `map_batches` to apply a function (the prediction logic) to batches in parallel.|
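
To make the idea in the caption concrete, here is a minimal, Ray-free sketch of what "apply a function to batches in parallel" means. It is a conceptual stand-in that uses a thread pool from the standard library, not how Ray Datasets is implemented; the names `split_into_batches`, `toy_map_batches`, and `double_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_into_batches(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Group items into consecutive batches; the last batch may be smaller."""
    return [list(items[i : i + batch_size]) for i in range(0, len(items), batch_size)]


def toy_map_batches(
    items: Sequence[T],
    fn: Callable[[List[T]], List[R]],
    batch_size: int,
    max_workers: int = 2,
) -> List[R]:
    """Apply fn to each batch in parallel and flatten the results in order."""
    batches = split_into_batches(items, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input batches.
        per_batch_results = list(pool.map(fn, batches))
    return [result for batch in per_batch_results for result in batch]


def double_all(batch: List[int]) -> List[int]:
    # Stand-in for the prediction logic (assumption: one output per input).
    return [2 * x for x in batch]


doubled = toy_map_batches(list(range(10)), double_all, batch_size=4)
```

In the notebook, the role of `double_all` is played by the prediction class, and Ray takes care of distributing the batches across workers.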

### Create a Ray Dataset with 160 images

In [None]:
from utils import get_image_indices

In [None]:
BATCH_SIZE = 16
N_BATCHES = 10

# Get BATCH_SIZE * N_BATCHES randomly shuffled image IDs from the test dataset.
image_indices = get_image_indices(dataset=test_dataset, n=BATCH_SIZE * N_BATCHES)

# Create a list of images for the indices sampled from the test dataset.
data = [test_dataset[i]["image"] for i in image_indices]

In [None]:
# Create a Ray Dataset from the list of images.
dataset = ray.data.from_items(data)
dataset.show(limit=3)
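
The helper `get_image_indices` comes from the accompanying `utils` module and is not shown in this notebook. As a rough sketch of what such a helper might do, assuming it samples distinct indices without replacement, the hypothetical `sample_indices` below picks `BATCH_SIZE * N_BATCHES = 160` indices from a dataset of 3,352 images. The actual helper may differ.

```python
import random
from typing import List, Optional


def sample_indices(dataset_size: int, n: int, seed: Optional[int] = None) -> List[int]:
    """Return n distinct, randomly ordered indices in range(dataset_size)."""
    if n > dataset_size:
        raise ValueError(f"Cannot sample {n} indices from {dataset_size} items.")
    rng = random.Random(seed)  # Local generator, so the global seed is untouched.
    return rng.sample(range(dataset_size), n)


BATCH_SIZE = 16
N_BATCHES = 10

# 3352 is the size of the full test split; the seed value is arbitrary.
indices = sample_indices(dataset_size=3352, n=BATCH_SIZE * N_BATCHES, seed=201)
```

With `SMALL_DATA = True` the test split holds exactly 160 images, so sampling 160 indices amounts to shuffling the whole split.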

### Create a prediction class with inference logic

In [None]:
class PredictionClass:
    # The constructor method initializes the class to load/cache the model and feature extractor.
    def __init__(
        self,
        model: SegformerForSemanticSegmentation,
        feature_extractor: SegformerFeatureExtractor,
    ):
        self.model = model
        self.feature_extractor = feature_extractor

    def __call__(self, batch: list[JpegImageFile]) -> list[np.ndarray]:

        # Set the device on which PyTorch will run.
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(device)  # Move the model to specified device.
        self.model.eval()  # Set the model in evaluation mode on test data.

        # The feature extractor processes raw images.
        inputs = self.feature_extractor(images=batch, return_tensors="pt")

        # The model is applied to input images in the inference step.
        with torch.no_grad():
            outputs = self.model(pixel_values=inputs.pixel_values.to(device))

        # Post-process the output for display.
        image_sizes = [image.size[::-1] for image in batch]
        segmentation_maps_postprocessed = (
            self.feature_extractor.post_process_semantic_segmentation(
                outputs=outputs, target_sizes=image_sizes
            )
        )

        # Return list of segmentation maps detached from the computation graph.
        return [j.detach().cpu().numpy() for j in segmentation_maps_postprocessed]
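
The class above follows a common pattern: `__init__` stores state that is expensive to set up, and `__call__` does the per-batch work. The sketch below illustrates that pattern without Ray or a model. `UppercasePredictor` and `run_with_callable_class` are hypothetical stand-ins, and the driver loop only approximates what a single worker does.

```python
from typing import Any, Callable, List, Sequence, Tuple


class UppercasePredictor:
    """Stand-in for PredictionClass: setup in __init__, work in __call__."""

    instances_created = 0  # Class-level counter, for illustration only.

    def __init__(self, suffix: str):
        # In the real class, this is where the model and feature extractor are stored.
        UppercasePredictor.instances_created += 1
        self.suffix = suffix

    def __call__(self, batch: List[str]) -> List[str]:
        return [item.upper() + self.suffix for item in batch]


def run_with_callable_class(
    cls: Callable[..., Callable[[List[Any]], List[Any]]],
    batches: Sequence[List[Any]],
    constructor_args: Tuple[Any, ...],
) -> List[Any]:
    """Build the callable once, then reuse it for every batch."""
    worker = cls(*constructor_args)
    results: List[Any] = []
    for batch in batches:
        results.extend(worker(batch))
    return results


outputs = run_with_callable_class(
    UppercasePredictor, [["a", "b"], ["c"]], constructor_args=("!",)
)
```

The constructor runs once even though two batches are processed, which is why state such as a loaded model belongs in `__init__`.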

### Specify compute by defining an actor pool strategy

In [None]:
from ray.data import ActorPoolStrategy

In Ray Datasets, transformations can be carried out by either Ray Tasks or Ray Actors. The default compute strategy uses Ray Tasks, but you can instead specify an `ActorPoolStrategy`, which dynamically [autoscales](https://docs.ray.io/en/latest/data/transforming-datasets.html#compute-strategy) the number of actors between a `min_size` and a `max_size` to carry out the transforms.
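
To build intuition for what "between a minimum and a maximum size" means, here is a toy sizing rule. It is not Ray's autoscaling policy, which depends on the Ray version and on cluster load; the function `toy_pool_size` and its `batches_per_actor` parameter are assumptions made up for this illustration.

```python
import math


def toy_pool_size(
    pending_batches: int, batches_per_actor: int, min_size: int, max_size: int
) -> int:
    """Toy rule: one actor per `batches_per_actor` pending batches, clamped to the bounds."""
    if min_size < 1 or max_size < min_size:
        raise ValueError("Require 1 <= min_size <= max_size.")
    wanted = math.ceil(pending_batches / batches_per_actor)
    # Never drop below min_size and never grow beyond max_size.
    return max(min_size, min(max_size, wanted))


idle_size = toy_pool_size(pending_batches=0, batches_per_actor=4, min_size=1, max_size=2)
busy_size = toy_pool_size(pending_batches=40, batches_per_actor=4, min_size=1, max_size=2)
```

Under this rule an idle pool stays at the lower bound and a heavily loaded pool is capped at the upper bound, whatever the backlog.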

### Run parallel batch inference on a Ray Dataset

In [None]:
predictions_dataset = dataset.map_batches(
    PredictionClass,
    batch_size=1,
    num_gpus=0,
    num_cpus=1,
    compute=ActorPoolStrategy(min_size=1, max_size=2),
    fn_constructor_args=(segformer, segformer_feature_extractor),
)

Use the Dataset [`map_batches()`](https://docs.ray.io/en/latest/data/api/dataset.html#ray.data.Dataset.map_batches) method to apply the prediction class to the Dataset in parallel. You can specify the batch size, the resources each worker requests (`num_cpus`, `num_gpus`), and the autoscaling bounds of the actor pool.

Note: don't forget to pass `fn_constructor_args` to construct `PredictionClass`.
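
The call above uses `batch_size=1`, so the prediction class is invoked once per image. The back-of-the-envelope sketch below shows how the number of invocations changes with batch size for the 160 images in this notebook. It assumes batches are cut from one contiguous sequence of rows; the real count in Ray Datasets can differ depending on how the data is split into blocks. `num_calls` is a hypothetical helper.

```python
import math


def num_calls(num_rows: int, batch_size: int) -> int:
    """Number of batches needed to cover num_rows with batches of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1.")
    return math.ceil(num_rows / batch_size)


NUM_IMAGES = 160
calls_by_batch_size = {size: num_calls(NUM_IMAGES, size) for size in (1, 16, 64)}
# calls_by_batch_size == {1: 160, 16: 10, 64: 3}
```

Larger batches mean fewer calls and less per-call overhead, at the cost of more memory per call.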

In [None]:
predictions_dataset.take(limit=1)

After running inference, you can inspect the predictions to examine the resulting segmentation arrays. Notice that the predictions dataset is itself a Ray Dataset.

**Coding Exercise**

In this approach using Ray Datasets, you used an `ActorPoolStrategy` to set an upper and lower bound on the autoscaling of the actor pool.

A natural experiment is to vary the `min_size` and `max_size` of the actor pool in `map_batches` and observe the effect on runtime performance.

To extend this exercise further, open the Ray Dashboard (linked in the output of `ray.init()`) and watch the actor pool autoscale live.
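
One way to structure the experiment is a small timing harness that runs the same workload under several pool sizes and records the wall-clock time. The sketch below uses a thread pool and a sleeping stand-in workload so that it runs without Ray; `fake_inference` and `time_pool` are hypothetical names. In the notebook you would time the `map_batches` call instead, together with whatever call materializes the results in your Ray version.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def fake_inference(batch_id: int) -> int:
    # Stand-in workload: sleep briefly instead of running a model.
    time.sleep(0.01)
    return batch_id


def time_pool(max_workers: int, n_batches: int = 8) -> float:
    """Return the wall-clock seconds needed to process n_batches with max_workers threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(fake_inference, range(n_batches)))
    return time.perf_counter() - start


# Map each pool size to its measured runtime in seconds.
timings: Dict[int, float] = {workers: time_pool(workers) for workers in (1, 2, 4)}
```

Running each configuration several times and comparing the medians gives a more reliable picture than a single measurement.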

### Summary: Distributed batch inference with Ray Datasets

#### Key API elements
* **`Datasets`**
    * These parallelize data loading and preprocessing, and are the standard way to exchange data in Ray AIR.

* **`map_batches`**
    * This method applies a transformation or a model class to every batch. You can use it to perform batch inference with only Ray Datasets, without introducing other components of Ray AIR.

# Connect with the Ray community

You can learn and get more involved with the Ray community of developers and researchers:

* [**Ray documentation**](https://docs.ray.io/en/latest)

* [**Official Ray Website**](https://www.ray.io/)  
Browse the ecosystem and use this site as a hub to get the information that you need to get going and building with Ray.

* [**Join the Community on Slack**](https://forms.gle/9TSdDYUgxYs8SA9e8)  
Find friends to discuss your new learnings in our Slack space.

* [**Use the Discussion Board**](https://discuss.ray.io/)  
Ask questions, follow topics, and view announcements on this community forum.

* [**Join a Meetup Group**](https://www.meetup.com/Bay-Area-Ray-Meetup/)  
Tune in to meetups to listen to compelling talks, get to know other users, and meet the team behind Ray.

* [**Open an Issue**](https://github.com/ray-project/ray/issues/new/choose)  
Ray is constantly evolving to improve the developer experience. Submit feature requests and bug reports, and get help via GitHub issues.

* [**Become a Ray contributor**](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html)  
We welcome community contributions that improve our documentation and the Ray framework.
