# Ray on compute instance

This notebook shows how to start a local Ray cluster and interactively run a Ray script on an Azure ML Compute Instance.

Before you begin, complete the Azure Machine Learning tutorial: [Get started creating your first ML experiment with the Python SDK](https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-setup).

You also need a valid subscription ID, a resource group, and an Azure Machine Learning workspace.

## Install required packages

For more information about installing Ray, see the [Ray installation guide](https://docs.ray.io/en/latest/ray-overview/installation.html).

In [None]:
%pip install --no-cache-dir \
  ../../private_wheel/azure_ai_ml-1.6.0a20230421002-py3-none-any.whl \
  'ray[default, air, tune]==2.4.0' \
  gpustat==1.0.0 \
  torch \
  torchvision

## Start a local Ray cluster

Calling `ray.init` starts a local, interactive Ray cluster on the compute instance. To open the Ray dashboard in a browser, use a link with the following pattern:
`https://{compute_instance_name}-{dashboard-port}.{region}.instances.azureml.ms`
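
As a minimal sketch of how that pattern expands, the snippet below fills it in with placeholder values. The helper name `build_dashboard_url`, the instance name `my-ci`, and the region `eastus` are hypothetical; substitute your own compute instance name and region.

```python
def build_dashboard_url(compute_instance_name: str, dashboard_port: int, region: str) -> str:
    """Fill in the dashboard URL pattern shown above."""
    return f"https://{compute_instance_name}-{dashboard_port}.{region}.instances.azureml.ms"


# "my-ci" and "eastus" are placeholder values, not real resources.
url = build_dashboard_url("my-ci", 8265, "eastus")
print(url)
```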

In [None]:
import ray
import configparser

dashboard_port = 8265

ray_instance = ray.init(
    include_dashboard=True,
    dashboard_port=dashboard_port,
    ignore_reinit_error=True
)


# Build the Ray dashboard link from the compute instance config file.
try:
    parser = configparser.ConfigParser()
    with open("/mnt/azmnt/.nbvm") as stream:
        parser.read_string("[config]\n" + stream.read())

    config = parser['config']
    ci_name = config['instance']
    domainsuffix = config['domainsuffix']

    dashboard_url = f'{ci_name}-{dashboard_port}.{domainsuffix}'
except Exception:
    # Fall back to the URL reported by Ray if the config file is missing or malformed.
    dashboard_url = ray_instance.dashboard_url

ray_instance.dashboard_url = dashboard_url
ray_instance
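
The cell above prepends a `[config]` line before parsing because `configparser` requires a section header, which suggests the `.nbvm` file has none. Here is a minimal sketch of that trick on an in-memory string, assuming the file holds plain `key=value` lines. The sample values are hypothetical, not the real file contents.

```python
import configparser

# Hypothetical stand-in for the contents of /mnt/azmnt/.nbvm.
sample = "instance=my-ci\ndomainsuffix=eastus.instances.azureml.ms\n"

parser = configparser.ConfigParser()
# Prepend a dummy section header so configparser accepts the header-less text.
parser.read_string("[config]\n" + sample)

config = parser["config"]
dashboard_url = f"{config['instance']}-8265.{config['domainsuffix']}"
print(dashboard_url)
```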

## Get Started with Model Training in Ray

With Ray installed and the local cluster running, we can train a model using Ray.

Let's use this `PyTorch` example from the Ray [website](https://docs.ray.io/en/latest/train/getting-started.html).

### First, set up your dataset and model.

In [None]:
import torch
import torch.nn as nn

num_samples = 20
input_size = 10
layer_size = 15
output_size = 5

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.layer1 = nn.Linear(input_size, layer_size)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(layer_size, output_size)

    def forward(self, input):
        return self.layer2(self.relu(self.layer1(input)))

# In this example we use a randomly generated dataset.
input = torch.randn(num_samples, input_size)
labels = torch.randn(num_samples, output_size)
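
As a quick sanity check on the model size, the parameter count follows from the layer dimensions: an `nn.Linear(in_features, out_features)` layer holds `in_features * out_features` weights plus `out_features` biases. A small sketch in plain Python, with the sizes copied from the cell above (the helper name `linear_param_count` is made up for illustration):

```python
def linear_param_count(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


input_size, layer_size, output_size = 10, 15, 5

layer1_params = linear_param_count(input_size, layer_size)   # 10*15 + 15
layer2_params = linear_param_count(layer_size, output_size)  # 15*5 + 5
total_params = layer1_params + layer2_params
print(total_params)
```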

### Next, define your multi-worker `PyTorch` training function.

In [None]:
from ray import train
import ray.train.torch  # load the submodule so train.torch.prepare_model resolves
from torch import optim

def train_func_distributed():
    num_epochs = 3
    model = NeuralNetwork()
    # Wrap the model for distributed training (device placement and DistributedDataParallel).
    model = train.torch.prepare_model(model)
    loss_fn = nn.MSELoss()
    optimizer = optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(num_epochs):
        output = model(input)
        loss = loss_fn(output, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print(f"epoch: {epoch}, loss: {loss.item()}")

### Then, train the model using `TorchTrainer`.

In [None]:
from ray.train.torch import TorchTrainer
from ray.air.config import ScalingConfig

num_workers = 3

trainer = TorchTrainer(
    train_func_distributed,
    scaling_config=ScalingConfig(num_workers=num_workers)
)

results = trainer.fit()

## Shut down the local Ray cluster

In [None]:
ray.shutdown()