## A gentle 10-minute introduction to Ray AI Runtime (Ray AIR)

Introduced as part of Ray 2.0, Ray AI Runtime (AIR) is an open-source toolkit for building simple, scalable, end-to-end ML applications. By leveraging Ray, its distributed compute capabilities, and its library ecosystem, Ray AIR brings scalability and programmability to ML platforms.

Ray AI Runtime focuses on two functional aspects:
 * It provides scalability by leveraging Ray’s distributed compute layer for ML workloads.
 * It is designed to interoperate with other systems for storage and metadata needs.

Ray AIR consists of five key components:

 * Data processing ([Ray Data](https://docs.ray.io/en/latest/data/dataset.html))
 * Model Training ([Ray Train](https://docs.ray.io/en/latest/train/train.html))
 * Reinforcement Learning ([Ray RLlib](https://docs.ray.io/en/latest/rllib/index.html))
 * Hyperparameter Tuning ([Ray Tune](https://docs.ray.io/en/latest/tune/index.html))
 * Model Serving ([Ray Serve](https://docs.ray.io/en/latest/serve/index.html)).
 
 <img src = "images/ai_runtime.jpeg" width="60%" height="30%">
 
📖 [Back to Table of Contents](./ex_00_tutorial_overview.ipynb)<br>
⬅️ [Previous notebook](./ex_07_ray_data.ipynb) <br>
 
### Learning objectives:
  * Get introduced to Ray AIR as a unified toolkit for writing an end-to-end ML application in a single Python script
  * Use Ray Data for data ingestion
  * Use out-of-the-box Preprocessors
  * Load the model from the best checkpoint and use it for batch inference
  * Deploy the best checkpoint model and use it for online inference

In [8]:
import logging, os, random, warnings
import ray
import pandas as pd

In [9]:
warnings.filterwarnings("ignore")
os.environ["PYTHONWARNINGS"] = "ignore"

In [10]:
if ray.is_initialized():
    ray.shutdown()
ray.init(logging_level=logging.ERROR)

Python version: 3.8.13
Ray version: 3.0.0.dev0
Dashboard: http://127.0.0.1:8265


### Create Ray data from an S3 CSV datasource

In [11]:
dataset = ray.data.read_csv("s3://anonymous@air-example-data/breast_cancer.csv")

# Split data into train and validation.
train_dataset, valid_dataset = dataset.train_test_split(test_size=0.3)
test_dataset = valid_dataset.drop_columns(["target"])
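
To make the split concrete, here is a minimal plain-Python sketch of a 70/30 ordered split followed by dropping the label column. It is not Ray's implementation, which operates on distributed blocks. The helper name `split_in_order`, the stand-in records, and the row count of 569 are assumptions; 569 was chosen because it is consistent with the `N=398` training rows reported in the training log.

```python
def split_in_order(rows, test_size):
    """Keep the first (1 - test_size) fraction for training, the rest for validation."""
    split_at = int(len(rows) * (1 - test_size))
    return rows[:split_at], rows[split_at:]


# Hypothetical stand-in records: one feature plus the label column.
rows = [{"mean radius": 10.0 + i * 0.01, "target": i % 2} for i in range(569)]

train_rows, valid_rows = split_in_order(rows, test_size=0.3)

# Dropping the label column leaves unlabeled rows suitable for inference.
test_rows = [
    {key: value for key, value in row.items() if key != "target"}
    for row in valid_rows
]

print(len(train_rows), len(valid_rows))
```

The validation rows keep their labels for evaluation during training, while the test rows mirror what an inference request would contain.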



### Create Preprocessors
This preprocessor is automatically used by the trainer to `fit` and `transform` your datasets for training and validation. You don't
have to explicitly call the preprocessor before training or inference; the Ray AIR toolkit does that for you.

We are going to scale a few features, such as `mean radius` and `mean texture`.

In [12]:
from ray.data.preprocessors import StandardScaler

# Create a preprocessor to scale some columns
columns_to_scale = ["mean radius", "mean texture"]
preprocessor = StandardScaler(columns=columns_to_scale)
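
As a rough illustration of what standard scaling does to a column, the plain-Python sketch below fits a mean and standard deviation on training values and reuses them for validation values. The helper names and sample values are hypothetical, and the use of the population standard deviation is an assumption rather than a statement about Ray's `StandardScaler` internals.

```python
import math


def fit_scaler(values):
    """Compute the mean and (population) standard deviation of a column."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def apply_scaler(values, mean, std):
    """Shift and rescale values using statistics fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical "mean radius" values for illustration only.
train_radius = [11.06, 14.2, 17.99, 12.5]
valid_radius = [13.0, 20.1]

# Fit on the training column only, then transform both splits with the same statistics.
mean, std = fit_scaler(train_radius)
scaled_train = apply_scaler(train_radius, mean, std)
scaled_valid = apply_scaler(valid_radius, mean, std)

print(scaled_train)
print(scaled_valid)
```

Fitting on the training split alone and reusing those statistics elsewhere is the usual way to avoid leaking validation data into preprocessing.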

### Create Trainers
Use the Ray AIR trainer `XGBoostTrainer` in a few simple steps:
 1. Define the parallelism for Ray compute
 2. Define the XGBoost parameters for training
 3. Supply the preprocessor for fitting and transforming the datasets during training and validation
 4. Provide the datasets for training and validation
 5. Invoke `trainer.fit()`

It is a simple API that does a lot behind the scenes for you!

In [13]:
from ray.air.config import ScalingConfig
from ray.train.xgboost import XGBoostTrainer

trainer = XGBoostTrainer(
    scaling_config=ScalingConfig(
        # Number of workers to use for data parallelism.
        num_workers=2,
        # Whether to use GPU acceleration.
        use_gpu=False),
    label_column="target",
    num_boost_round=20,
    params={
        # XGBoost specific params
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
    },
    # our train and validation dataset and preprocessor
    datasets={"train": train_dataset, "valid": valid_dataset},
    preprocessor=preprocessor,
)
result = trainer.fit()
# print(result.metrics)

| Trial name | status | loc | iter | total time (s) | train-logloss | train-error | valid-logloss |
|---|---|---|---|---|---|---|---|
| XGBoostTrainer_01b02_00000 | TERMINATED | 127.0.0.1:24646 | 21 | 5.20223 | 0.0184957 | 0 | 0.0893879 |


(XGBoostTrainer pid=24646) 2022-07-29 15:37:22,915 INFO main.py:980 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
(_RemoteRayXGBoostActor pid=24663)   File "/Users/jules/git-repos/ray/python/ray/_private/workers/default_worker.py", line 237, in <module>
(_RemoteRayXGBoostActor pid=24663)     ray._private.worker.global_worker.main_loop()
(_RemoteRayXGBoostActor pid=24663)   File "/Users/jules/git-repos/ray/python/ray/_private/worker.py", line 754, in main_loop
(_RemoteRayXGBoostActor pid=24663)     self.core_worker.run_task_loop()
(_RemoteRayXGBoostActor pid=24663)   File "/Users/jules/git-repos/ray/python/ray/_private/function_manager.py", line 674, in actor_method_executor
(_RemoteRayXGBoostActor pid=24663)     return method(__ray_actor, *args, **kwargs)
(_RemoteRayXGBoostActor pid=24663)   File "/Users/jules/git-repos/ray/python

Result for XGBoostTrainer_01b02_00000:
  date: 2022-07-29_15-37-25
  done: false
  experiment_id: ef7af65595214d21a294633db8b27cb9
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 1
  node_ip: 127.0.0.1
  pid: 24646
  time_since_restore: 4.369057893753052
  time_this_iter_s: 4.369057893753052
  time_total_s: 4.369057893753052
  timestamp: 1659134245
  timesteps_since_restore: 0
  train-error: 0.02261306532663317
  train-logloss: 0.464117960489575
  training_iteration: 1
  trial_id: 01b02_00000
  valid-error: 0.11695906432748537
  valid-logloss: 0.5025240946234318
  warmup_time: 0.0025720596313476562



(XGBoostTrainer pid=24646) 2022-07-29 15:37:25,809 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 2.91 seconds (1.77 pure XGBoost training time).


Result for XGBoostTrainer_01b02_00000:
  date: 2022-07-29_15-37-26
  done: true
  experiment_id: ef7af65595214d21a294633db8b27cb9
  experiment_tag: '0'
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24646
  time_since_restore: 5.202230930328369
  time_this_iter_s: 0.7396941184997559
  time_total_s: 5.202230930328369
  timestamp: 1659134246
  timesteps_since_restore: 0
  train-error: 0.0
  train-logloss: 0.01849572773292735
  training_iteration: 21
  trial_id: 01b02_00000
  valid-error: 0.04093567251461988
  valid-logloss: 0.08938791319913073
  warmup_time: 0.0025720596313476562
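
The `logloss` and `error` columns above come from the `eval_metric` setting. As a hedged sketch of what those two metrics measure for binary classification, the plain-Python functions below compute them by hand. The labels and probabilities are hypothetical, and the 0.5 cutoff follows the usual definition of XGBoost's `error` metric.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Average negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def error_rate(labels, probabilities, threshold=0.5):
    """Fraction of rows whose thresholded prediction disagrees with the label."""
    wrong = sum(
        1 for y, p in zip(labels, probabilities) if (p > threshold) != (y == 1)
    )
    return wrong / len(labels)


# Hypothetical labels and predicted probabilities.
labels = [1, 1, 0, 1]
probabilities = [0.9, 0.8, 0.3, 0.4]

print(log_loss(labels, probabilities))
print(error_rate(labels, probabilities))
```

Log loss penalizes confident wrong predictions heavily, while the error rate only counts misclassifications, which is why the two columns can move differently between iterations.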
  


### Create Tuner for hyperparameter search

What if you want to do hyperparameter optimization during training and use the best config for the model?
You can use a `Tuner`: pass it your `Trainer` as an argument, along
with the other Tuner configuration.

Again, simple steps:
 1. define your hyperparameter space
 2. define `TuneConfig` for number of trials and parallelism 
 3. invoke `tuner.fit()`

In [14]:
from ray import tune

param_space = {"params": {"max_depth": tune.randint(1, 9)}}
metric = "train-logloss"
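
For intuition, the search space above can be read as "draw an integer `max_depth` from 1 up to, but not including, 9 for each trial". The sketch below mimics that with the standard library only. The `toy_objective` function is a hypothetical stand-in for a real training run, and nothing here uses Ray Tune itself.

```python
import random


def sample_max_depth(rng, lower=1, upper=9):
    """Draw an integer in [lower, upper), i.e. the upper bound is exclusive."""
    return rng.randrange(lower, upper)


def toy_objective(max_depth):
    """Hypothetical stand-in for a training run that returns a loss to minimize."""
    return abs(max_depth - 5) * 0.01 + 0.02


rng = random.Random(0)  # seeded so the sketch is repeatable
trials = []
for trial_id in range(5):  # five samples, like num_samples=5
    config = {"params": {"max_depth": sample_max_depth(rng)}}
    loss = toy_objective(config["params"]["max_depth"])
    trials.append((trial_id, config, loss))

# mode="min" means the trial with the smallest metric value wins.
best_trial = min(trials, key=lambda trial: trial[2])
print(best_trial)
```

Because sampling is random, several trials may land on the same `max_depth`, which is also visible in the results table of the actual tuning run.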

In [15]:
from ray.tune.tuner import Tuner, TuneConfig
from ray.air.config import RunConfig

tuner = Tuner(
    trainer,
    param_space=param_space,
    tune_config=TuneConfig(num_samples=5, metric=metric, mode="min"),
)
# Execute tuning.
result_grid = tuner.fit()

| Trial name | status | loc | params/max_depth | iter | total time (s) | train-logloss | train-error | valid-logloss |
|---|---|---|---|---|---|---|---|---|
| XGBoostTrainer_4efed_00000 | TERMINATED | 127.0.0.1:24839 | 7 | 21 | 3.67231 | 0.0184957 | 0 | 0.0893879 |
| XGBoostTrainer_4efed_00001 | TERMINATED | 127.0.0.1:24847 | 7 | 21 | 4.73495 | 0.0184957 | 0 | 0.0893879 |
| XGBoostTrainer_4efed_00002 | TERMINATED | 127.0.0.1:24848 | 5 | 21 | 4.76684 | 0.0184163 | 0 | 0.105782 |
| XGBoostTrainer_4efed_00003 | TERMINATED | 127.0.0.1:24892 | 3 | 21 | 5.31764 | 0.0215151 | 0 | 0.0765915 |
| XGBoostTrainer_4efed_00004 | TERMINATED | 127.0.0.1:24901 | 7 | 21 | 3.693 | 0.0184957 | 0 | 0.0893879 |


(XGBoostTrainer pid=24839) 2022-07-29 15:39:31,104 INFO main.py:980 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
(_RemoteRayXGBoostActor pid=24853)   File "/Users/jules/git-repos/ray/python/ray/_private/workers/default_worker.py", line 237, in <module>
(_RemoteRayXGBoostActor pid=24853)     ray._private.worker.global_worker.main_loop()
(_RemoteRayXGBoostActor pid=24853)   File "/Users/jules/git-repos/ray/python/ray/_private/worker.py", line 754, in main_loop
(_RemoteRayXGBoostActor pid=24853)     self.core_worker.run_task_loop()
(_RemoteRayXGBoostActor pid=24853)   File "/Users/jules/git-repos/ray/python/ray/_private/function_manager.py", line 674, in actor_method_executor
(_RemoteRayXGBoostActor pid=24853)     return method(__ray_actor, *args, **kwargs)
(_RemoteRayXGBoostActor pid=24853)   File "/Users/jules/git-repos/ray/python

Result for XGBoostTrainer_4efed_00000:
  date: 2022-07-29_15-39-34
  done: false
  experiment_id: 430e241315c44b33b220517e70d236d8
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 1
  node_ip: 127.0.0.1
  pid: 24839
  time_since_restore: 3.078613042831421
  time_this_iter_s: 3.078613042831421
  time_total_s: 3.078613042831421
  timestamp: 1659134374
  timesteps_since_restore: 0
  train-error: 0.02261306532663317
  train-logloss: 0.464117960489575
  training_iteration: 1
  trial_id: 4efed_00000
  valid-error: 0.11695906432748537
  valid-logloss: 0.5025240946234318
  warmup_time: 0.0026171207427978516



(XGBoostTrainer pid=24839) 2022-07-29 15:39:34,184 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 3.09 seconds (1.94 pure XGBoost training time).


Result for XGBoostTrainer_4efed_00000:
  date: 2022-07-29_15-39-34
  done: true
  experiment_id: 430e241315c44b33b220517e70d236d8
  experiment_tag: 0_max_depth=7
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24839
  time_since_restore: 3.6723129749298096
  time_this_iter_s: 0.5321929454803467
  time_total_s: 3.6723129749298096
  timestamp: 1659134374
  timesteps_since_restore: 0
  train-error: 0.0
  train-logloss: 0.01849572773292735
  training_iteration: 21
  trial_id: 4efed_00000
  valid-error: 0.04093567251461988
  valid-logloss: 0.08938791319913073
  warmup_time: 0.0026171207427978516
  
Result for XGBoostTrainer_4efed_00002:
  date: 2022-07-29_15-39-35
  done: false
  experiment_id: f9c3675940df41e9924690d60e770365
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 1
  node_ip: 127.0.0.1
  pid: 24848
  time_since_restore: 3.1937267780303955
  time_this_iter_s: 3.1937267780303955
  time_total_s: 3.1937267780303955
  timestam

(XGBoostTrainer pid=24847) 2022-07-29 15:39:36,375 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 4.12 seconds (2.91 pure XGBoost training time).
(XGBoostTrainer pid=24848) 2022-07-29 15:39:36,373 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 4.07 seconds (2.79 pure XGBoost training time).


Result for XGBoostTrainer_4efed_00001:
  date: 2022-07-29_15-39-36
  done: true
  experiment_id: 5931079299c64cf797f08a4c14a7e2b6
  experiment_tag: 1_max_depth=7
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24847
  time_since_restore: 4.734950304031372
  time_this_iter_s: 0.5636541843414307
  time_total_s: 4.734950304031372
  timestamp: 1659134376
  timesteps_since_restore: 0
  train-error: 0.0
  train-logloss: 0.01849572773292735
  training_iteration: 21
  trial_id: 4efed_00001
  valid-error: 0.04093567251461988
  valid-logloss: 0.08938791319913073
  warmup_time: 0.0030601024627685547
  
Result for XGBoostTrainer_4efed_00002:
  date: 2022-07-29_15-39-37
  done: true
  experiment_id: f9c3675940df41e9924690d60e770365
  experiment_tag: 2_max_depth=5
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24848
  time_since_restore: 4.766835927963257
  time_this_iter_s: 0.649569034576416
  time_total_s: 4

(XGBoostTrainer pid=24892) 2022-07-29 15:39:37,915 INFO main.py:980 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
(XGBoostTrainer pid=24901) 2022-07-29 15:39:38,374 INFO main.py:980 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
(_RemoteRayXGBoostActor pid=24908)   File "/Users/jules/git-repos/ray/python/ray/_private/workers/default_worker.py", line 237, in <module>
(_RemoteRayXGBoostActor pid=24908)     ray._private.worker.global_worker.main_loop()
(_RemoteRayXGBoostActor pid=24908)   File "/Users/jules/git-repos/ray/python/ray/_private/worker.py", line 754, in main_loop
(_RemoteRayXGBoostActor pid=24908)     self.core_worker.run_task_loop()
(_RemoteRayXGBoostActor pid=24908)   File "/Users/jules/git-repos/ray/python/ray/_private/function_manager.py", line 674, in actor_method_executor

Result for XGBoostTrainer_4efed_00003:
  date: 2022-07-29_15-39-40
  done: false
  experiment_id: 88f7b4fbde954578b49fabf04e61d42a
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 1
  node_ip: 127.0.0.1
  pid: 24892
  time_since_restore: 4.609990119934082
  time_this_iter_s: 4.609990119934082
  time_total_s: 4.609990119934082
  timestamp: 1659134380
  timesteps_since_restore: 0
  train-error: 0.03517587939698492
  train-logloss: 0.47431553248784053
  training_iteration: 1
  trial_id: 4efed_00003
  valid-error: 0.09941520467836257
  valid-logloss: 0.5004687657830311
  warmup_time: 0.002665996551513672



(XGBoostTrainer pid=24892) 2022-07-29 15:39:40,941 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 3.04 seconds (1.86 pure XGBoost training time).


Result for XGBoostTrainer_4efed_00004:
  date: 2022-07-29_15-39-41
  done: false
  experiment_id: 7b28c52716604b88b28553c85d739166
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 1
  node_ip: 127.0.0.1
  pid: 24901
  time_since_restore: 2.888662099838257
  time_this_iter_s: 2.888662099838257
  time_total_s: 2.888662099838257
  timestamp: 1659134381
  timesteps_since_restore: 0
  train-error: 0.02261306532663317
  train-logloss: 0.464117960489575
  training_iteration: 1
  trial_id: 4efed_00004
  valid-error: 0.11695906432748537
  valid-logloss: 0.5025240946234318
  warmup_time: 0.004069089889526367



(XGBoostTrainer pid=24901) 2022-07-29 15:39:41,274 INFO main.py:1516 -- [RayXGBoost] Finished XGBoost training on training data with total N=398 in 2.91 seconds (1.74 pure XGBoost training time).


Result for XGBoostTrainer_4efed_00003:
  date: 2022-07-29_15-39-41
  done: true
  experiment_id: 88f7b4fbde954578b49fabf04e61d42a
  experiment_tag: 3_max_depth=3
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24892
  time_since_restore: 5.317641973495483
  time_this_iter_s: 0.6196691989898682
  time_total_s: 5.317641973495483
  timestamp: 1659134381
  timesteps_since_restore: 0
  train-error: 0.0
  train-logloss: 0.02151511543566108
  training_iteration: 21
  trial_id: 4efed_00003
  valid-error: 0.03508771929824561
  valid-logloss: 0.07659151291540056
  warmup_time: 0.002665996551513672
  
Result for XGBoostTrainer_4efed_00004:
  date: 2022-07-29_15-39-42
  done: true
  experiment_id: 7b28c52716604b88b28553c85d739166
  experiment_tag: 4_max_depth=7
  hostname: Juless-MacBook-Pro-16
  iterations_since_restore: 21
  node_ip: 127.0.0.1
  pid: 24901
  time_since_restore: 3.693004846572876
  time_this_iter_s: 0.7313690185546875
  time_total_s: 3

In [16]:
# Fetch the best result with its best hyperparameter config 
best_result = result_grid.get_best_result()
print("Best Result:", best_result)

Best Result: Result(metrics={'train-logloss': 0.01841634292981527, 'train-error': 0.0, 'valid-logloss': 0.10578184703239703, 'valid-error': 0.05263157894736842, 'done': True, 'trial_id': '4efed_00002', 'experiment_tag': '2_max_depth=5'}, error=None, log_dir=PosixPath('/Users/jules/ray_results/XGBoostTrainer_2022-07-29_15-39-29/XGBoostTrainer_4efed_00002_2_max_depth=5_2022-07-29_15-39-31'))
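
To see why trial `4efed_00002` is reported as best, the sketch below redoes the selection by hand on the values from the results table. The helper `pick_best` is a hypothetical plain-Python stand-in, not the `ResultGrid` implementation; it simply takes the minimum or maximum of a chosen metric.

```python
# Final metrics copied from the tuning results table above.
trials = [
    {"trial_id": "4efed_00000", "max_depth": 7, "train-logloss": 0.0184957, "valid-logloss": 0.0893879},
    {"trial_id": "4efed_00001", "max_depth": 7, "train-logloss": 0.0184957, "valid-logloss": 0.0893879},
    {"trial_id": "4efed_00002", "max_depth": 5, "train-logloss": 0.0184163, "valid-logloss": 0.105782},
    {"trial_id": "4efed_00003", "max_depth": 3, "train-logloss": 0.0215151, "valid-logloss": 0.0765915},
    {"trial_id": "4efed_00004", "max_depth": 7, "train-logloss": 0.0184957, "valid-logloss": 0.0893879},
]


def pick_best(trials, metric, mode):
    """Return the trial with the smallest (mode='min') or largest (mode='max') metric."""
    chooser = min if mode == "min" else max
    return chooser(trials, key=lambda trial: trial[metric])


best_by_train = pick_best(trials, "train-logloss", "min")
best_by_valid = pick_best(trials, "valid-logloss", "min")

print(best_by_train["trial_id"], best_by_train["max_depth"])
print(best_by_valid["trial_id"], best_by_valid["max_depth"])
```

With `metric = "train-logloss"` the winner is the `max_depth=5` trial, whereas ranking the same table by `valid-logloss` would favor the `max_depth=3` trial, so the choice of metric matters.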


### Ray AIR Checkpoints

The AIR trainers, tuners, and custom pretrained models generate Checkpoints. An AIR Checkpoint is a common format for models that is used across the different components of the Ray AI Runtime. This common format allows easy interoperability among AIR components and seamless integration with supported external machine learning frameworks. Read more
about Checkpoints in the Ray AIR documentation.

<img src="images/checkpoints.jpeg" height="25%" width="50%"> 

### Create a `BatchPredictor` for batch prediction
Once you have trained and tuned your model, create a batch predictor from the best model using `best_result.checkpoint`, and use it for batch inference.

In [17]:
from ray.train.batch_predictor import BatchPredictor
from ray.train.xgboost import XGBoostPredictor

batch_predictor = BatchPredictor.from_checkpoint(best_result.checkpoint, XGBoostPredictor)

predicted_probabilities = batch_predictor.predict(test_dataset)
print("PREDICTED PROBABILITIES")
predicted_probabilities.show()


PREDICTED PROBABILITIES
{'predictions': 0.9960426092147827}
{'predictions': 0.9957077503204346}
{'predictions': 0.0034389763604849577}
{'predictions': 0.9962536096572876}
{'predictions': 0.9968380928039551}
{'predictions': 0.9957551956176758}
{'predictions': 0.9920042157173157}
{'predictions': 0.994161069393158}
{'predictions': 0.2891101539134979}
{'predictions': 0.974367082118988}
{'predictions': 0.0034389763604849577}
{'predictions': 0.9959942102432251}
{'predictions': 0.9474029541015625}
{'predictions': 0.9923243522644043}
{'predictions': 0.9941523671150208}
{'predictions': 0.1239369809627533}
{'predictions': 0.5043733716011047}
{'predictions': 0.9935414791107178}
{'predictions': 0.9832899570465088}
{'predictions': 0.0034389763604849577}
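
The predictor returns probabilities rather than class labels. A minimal sketch of turning them into 0/1 labels follows, using a few of the values printed above. The 0.5 threshold and the `to_label` helper are assumptions for illustration; the right cutoff depends on the application.

```python
# A few rows copied from the batch prediction output above.
predictions = [
    {"predictions": 0.9960426092147827},
    {"predictions": 0.0034389763604849577},
    {"predictions": 0.2891101539134979},
    {"predictions": 0.5043733716011047},
]


def to_label(probability, threshold=0.5):
    """Map a predicted probability to class 1 if it exceeds the threshold, else 0."""
    return 1 if probability > threshold else 0


labels = [to_label(row["predictions"]) for row in predictions]
print(labels)  # [1, 0, 0, 1]
```

Note how the last row sits just above the cutoff, so a slightly different threshold would flip its label.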





### Create `PredictorDeployment` for Online Inference

Deploy the best model as an inference service by using Ray Serve and the `PredictorDeployment` class.

In [None]:
from ray import serve
from fastapi import Request
from ray.serve import PredictorDeployment
from ray.serve.http_adapters import json_request


async def adapter(request: Request):
    content = await request.json()
    print(content)
    return pd.DataFrame.from_dict(content)


serve.start(detached=True)
deployment = PredictorDeployment.options(name="XGBoostService", num_replicas=2, route_prefix="/rayair")

deployment.deploy(
    XGBoostPredictor, best_result.checkpoint, batching_params=False, http_adapter=adapter
)

print(deployment.url)

Started a local Ray instance. View the dashboard at http://127.0.0.1:8265.


After deploying the service, you can send requests to it.
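
The `adapter` defined above expects the request body to be a JSON list with one object per row, which it turns into a DataFrame. The sketch below shows that payload shape using only the standard library. The two-feature record is a shortened, illustrative version of a real row, and the column-oriented dictionary merely approximates what the DataFrame conversion produces.

```python
import json

# Shortened example row; a real request carries all feature columns.
sample_input = {"mean radius": 11.06, "mean texture": 14.83}

# Client side: wrap the row in a list and serialize it as the request body.
payload = json.dumps([sample_input])

# Server side: the adapter parses the body back into a list of row dictionaries.
content = json.loads(payload)

# Rough equivalent of a one-row table: column name -> list of values.
columns = {key: [row[key] for row in content] for key in content[0]}

print(payload)
print(columns)
```

Sending a list even for a single row keeps the same payload shape usable for multi-row requests.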

In [19]:
import requests

sample_input = test_dataset.take(1)
sample_input = dict(sample_input[0])

output = requests.post(deployment.url, json=[sample_input]).json()
print(output)

[{'predictions': 0.9960426092147827}]
(ServeReplica:XGBoostService pid=26204) [{'mean radius': 11.06, 'mean texture': 14.83, 'mean perimeter': 70.31, 'mean area': 378.2, 'mean smoothness': 0.07741, 'mean compactness': 0.04768, 'mean concavity': 0.02712, 'mean concave points': 0.007246, 'mean symmetry': 0.1535, 'mean fractal dimension': 0.06214, 'radius error': 0.1855, 'texture error': 0.6881, 'perimeter error': 1.263, 'area error': 12.98, 'smoothness error': 0.004259, 'compactness error': 0.01469, 'concavity error': 0.0194, 'concave points error': 0.004168, 'symmetry error': 0.01191, 'fractal dimension error': 0.003537, 'worst radius': 12.68, 'worst texture': 20.35, 'worst perimeter': 80.79, 'worst area': 496.7, 'worst smoothness': 0.112, 'worst compactness': 0.1879, 'worst concavity': 0.2079, 'worst concave points': 0.05556, 'worst symmetry': 0.259, 'worst fractal dimension': 0.09158}]


(HTTPProxyActor pid=26202) INFO 2022-07-29 15:53:28,335 http_proxy 127.0.0.1 http_proxy.py:315 - POST /rayair 307 3.3ms
(HTTPProxyActor pid=26202) INFO 2022-07-29 15:53:28,344 http_proxy 127.0.0.1 http_proxy.py:315 - POST /rayair 200 7.3ms
(ServeReplica:XGBoostService pid=26204) INFO 2022-07-29 15:53:28,343 XGBoostService XGBoostService#rUMqHy replica.py:482 - HANDLE __call__ OK 4.4ms
(ServeReplica:XGBoostService pid=26205) INFO 2022-07-29 15:53:28,334 XGBoostService XGBoostService#BWlSpu replica.py:482 - HANDLE __call__ OK 0.2ms


In [20]:
ray.shutdown()

### Homework

1. Have a go at Ray AIR examples in the documentation.


Done! 🍻
 