# Train and deploy a model using AzureML

Now that we have trained a model and validated that its predictions are as expected, we want to prepare it for use in a production environment. Operationalizing a model involves more than training and inference: we must also consider the compute for training and retraining, distributed training, model management, containers for deployment, and other factors.

[**Azure Machine Learning service**](https://azure.microsoft.com/en-us/services/machine-learning/) is a cloud service that you can use to develop and deploy machine learning models. Using Azure Machine Learning service, you can track your models as you build, train, deploy, and manage them, all at the broad scale that the cloud provides.

![AML Overview](./images/aml-overview.png)

## Overview of this notebook

The AzureML SDK provides a rich set of capabilities for managing your machine learning lifecycle, from data prep to experimentation to monitoring and DevOps. In this notebook, we'll focus on training and deploying models. This includes:

1. **Register the workspace and set up compute resources**
2. **Train your model** - the selected compute target will be used to run the Python training script
3. **Register your model** - models registered in the registry are identified by name and version to keep track of all the models in your Azure Machine Learning workspace.
4. **Deploy your model as a web service** - the registered model, a defined scoring script, and dependent packages based on the environment configuration file, are deployed on a base container image that contains the execution environment for the model. The endpoint can be deployed on Azure Container Instances, Azure Kubernetes Service, or FPGAs, and the image has a load-balanced, HTTP endpoint that receives scoring requests that are sent to the web service.
5. **Test your service** - send requests to your web service endpoint and see your predictions in action!


## Let's Get Started

Install the Python packages from requirements.txt.

In [None]:
!pip install -r requirements.txt

**Let's make sure we have the SDK installed and check the version.**

In [None]:
import azureml.core

print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK. The latest version is 1.0.74.")
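If a later cell depends on a minimum SDK version, comparing parsed version tuples is more robust than reading the printed string. The sketch below is illustrative only: `parse_version` and `is_at_least` are hypothetical helpers, not part of the SDK, and they assume purely numeric, dot-separated versions such as `1.0.74`.

```python
def parse_version(version):
    """Turn a numeric dotted version such as '1.0.74' into (1, 0, 74)."""
    return tuple(int(part) for part in version.split("."))


def is_at_least(installed, minimum):
    """Return True when `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# In the notebook, the first argument would come from azureml.core.VERSION.
print(is_at_least("1.0.85", "1.0.74"))
print(is_at_least("1.0.9", "1.0.74"))
```

Comparing tuples avoids the pitfall of string comparison, where `"1.0.9"` would sort after `"1.0.74"`.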

## 1a. Connect to your AML Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. The workspace holds all your experiments, compute targets, models, datastores, etc.

You can open [ml.azure.com](https://ml.azure.com/) to access your workspace resources through a graphical user interface of Azure Machine Learning studio.

![AML workspace](./images/aml-workspace.png)

**You will be asked to log in during the next step.** Use the credentials you used to sign in to Azure.

If you've already created a workspace, we can load it now. Import the Workspace class, and load your subscription information from the file config.json using the function from_config(). This looks for the JSON file in the current directory by default, but you can also specify a path parameter to point to the file using from_config(path="your/file/path"). **In a cloud notebook server, the file is automatically in the root directory.**

In [None]:
from azureml.core import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Resource group: ' + ws.resource_group, sep = '\n')
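For reference, `from_config()` reads a small JSON file. The sketch below writes and re-reads such a file with the standard library only. It assumes the file holds the three keys `subscription_id`, `resource_group`, and `workspace_name`; every value shown is a placeholder, not a real identifier.

```python
import json
import os
import tempfile

# Placeholder values: substitute the details of your own workspace.
config = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}

# Write to a temporary directory so an existing config.json is not overwritten.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "config.json")
with open(config_path, "w") as f:
    json.dump(config, f, indent=4)

# Read the file back and confirm that the expected keys are present.
with open(config_path) as f:
    loaded = json.load(f)

missing = [key for key in config if key not in loaded]
print("Config written to:", config_path)
print("Missing keys:", missing)
```

With a file like this saved at a known location, `Workspace.from_config(path=...)` can be pointed at it as described above.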

## 1b. Create a remote compute target

A [compute target](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.computetarget?view=azure-ml-py) is a designated compute resource/environment where you run your training script or host your service deployment. This location may be your local machine or a cloud-based compute resource. Compute targets can be reused across the workspace for different runs and experiments.

In this tutorial, we will use a General Purpose D3_v2 VM as the training compute resource (see the [compute target documentation](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-compute-target) for other options). The code below creates a cluster if one does not already exist in your workspace.

Creation of the cluster takes approximately 5 minutes. If the cluster is already in your workspace this code will skip the cluster creation process.

In [None]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
cluster_name = "cpu-cluster"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target.')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D3_V2', 
                                                           max_nodes=6)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

compute_target.wait_for_completion(show_output=True)

# Use the 'status' property to get a detailed status for the current cluster. 
print(compute_target.status.serialize())

## 2. Train your Model

First we will create a project directory to hold all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on.

In [None]:
import os

project_folder = './pytorch-exp'
os.makedirs(project_folder, exist_ok=True)

### Training file

Now we will write the training script we went through in the first part of the workshop to a .py file - [PyTorch experiment](https://github.com/prabhat00155/onnx-odsc-tutorial/blob/master/pytorch%20experiment.ipynb)

In [None]:
%%writefile train.py
import argparse
import os
import numpy as np
import time
import torch
import torchvision
import torch.utils.data
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import (
    Variable,
)
from torch.nn import (
    init,
)
from torchvision import (
    datasets, 
    transforms,
    models,
    utils,
)

model_name = 'resnet18'
num_workers = 2
num_epochs = 2 
batch_size = 32
learning_rate = 0.01
momentum = 0.9
weight_decay = 1e-4
dropout_p = 0.4
decay_rate = 0.9999
max_grad_norm = 5.0
log_interval = 1
num_classes = 8

# reproducibility
seed = 42
torch.manual_seed(seed)

# Fetch the dataset (in RAR format) and extract it after downloading unrar.
os.system('wget https://www.rarlab.com/rar/rarlinux-x64-5.6.0.tar.gz')
os.system('tar -zxvf rarlinux-x64-5.6.0.tar.gz')
os.system('./rar/unrar')
os.system(
'wget http://vision.stanford.edu/lijiali/event_dataset/event_dataset.rar')
os.system('./rar/unrar x event_dataset.rar')

# Load the data, split it among train, test and validation set after applying a series of transforms.
image_folder = 'event_img/'
data_transforms = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.47637546, 0.485785  , 0.4522678 ], [0.24692202, 0.24377407, 0.2667196 ])
    ])
data = datasets.ImageFolder(root=image_folder, transform=data_transforms)
class_names = data.classes
train_len, val_len = int(len(data) * 0.75), int(len(data) * 0.2)
test_len = len(data) - train_len - val_len
train_set, val_set, test_set = torch.utils.data.random_split(data, [train_len, val_len, test_len])
loader = {
    'train': torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=num_workers),
    'test': torch.utils.data.DataLoader(test_set, batch_size=batch_size, shuffle=True, num_workers=num_workers),
    'val': torch.utils.data.DataLoader(val_set, batch_size=batch_size, shuffle=True, num_workers=num_workers)
}

# Pick one of the pre-trained models, replace its final layer setting its output to the number of classes.
model = models.__dict__[model_name](pretrained=True) # Set false to train from scratch
# Alter the final layer
final_layer_input = model.fc.in_features
# nn.Linear applies a linear transformation to the incoming data: y = x A^T + b
model.fc = nn.Linear(final_layer_input, num_classes)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), 
                      lr=learning_rate, 
                      momentum=momentum, 
                      weight_decay=weight_decay,
                     )


def process_batch(inputs, targets, model, criterion, optimizer, is_training):
    """
    Process a minibatch for loss and accuracy.
    """
    
    # CrossEntropyLoss expects integer class indices as targets
    y_batch = targets.long()

    # Forward pass and loss; gradients are tracked only while training
    with torch.set_grad_enabled(is_training):
        scores = model(inputs) # logits
        loss = criterion(scores, y_batch)
    
    # Accuracy (convert the count to a Python number before dividing)
    score, predicted = torch.max(scores, 1)
    accuracy = (y_batch == predicted).sum().item() / float(len(y_batch))
    
    if is_training:

        # PyTorch accumulates gradients on subsequent backward passes,
        # so we need to zero them before starting backpropagation.
        optimizer.zero_grad()
        loss.backward()
        
        # Clip the gradient norms (in-place variant; clip_grad_norm is deprecated)
        nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

        # Update params
        optimizer.step()

    return loss, accuracy


def train(model, criterion, optimizer, train_loader, test_loader, 
          num_epochs, batch_size, log_interval, learning_rate,
          dropout_p, decay_rate, max_grad_norm):
    """
    Training the model.
    """
    
    # Metrics
    train_loss, train_acc = [], []
    test_loss, test_acc = [], []

    # Training
    for num_train_epoch in range(num_epochs):
        # Timer
        start = time.time()

        # Decay learning rate
        learning_rate = learning_rate * (decay_rate ** (num_train_epoch // 1.0))
        for param_group in optimizer.param_groups:
            param_group['lr'] = learning_rate

        # Metrics
        train_batch_loss, train_batch_accuracy = 0.0, 0.0

        for train_batch_num, (inputs, target) in enumerate(train_loader):
            # Get metrics
            model.train()
            loss, accuracy = process_batch(
                inputs, target, model, criterion, optimizer, model.training)
            
            # Add to batch scalars
            train_batch_loss += loss.data.item() / float(len(inputs))
            train_batch_accuracy += accuracy
            
        # Add to global metrics
        train_loss.append(train_batch_loss / float(train_batch_num+1))
        train_acc.append(train_batch_accuracy / float(train_batch_num+1))

        # Testing
        model.eval()
        for num_test_epoch in range(1):
            # Metrics
            test_batch_loss, test_batch_accuracy = 0.0, 0.0

            for test_batch_num, (inputs, target) in enumerate(test_loader):
                # Get metrics
                model.eval()
                loss, accuracy = process_batch(
                    inputs, target, model, criterion, optimizer, model.training)
                # Add to batch scalars
                test_batch_loss += loss.data.item() / float(len(inputs))
                test_batch_accuracy += accuracy

            # Add to global metrics
            test_loss.append(test_batch_loss / float(test_batch_num+1))
            test_acc.append(test_batch_accuracy / float(test_batch_num+1))
                

            verbose_condition = ((num_train_epoch == 0) or (num_train_epoch % log_interval == 0) 
                                 or (num_train_epoch == num_epochs-1))

            # Verbose
            if verbose_condition:
                time_remain = (time.time() - start) * (num_epochs - (num_train_epoch + 1))
                minutes = time_remain // 60
                seconds = time_remain - minutes * 60
                print(f'TIME REMAINING: {minutes:.0f}m {seconds:.0f}s')
                print(f'[EPOCH]: {num_train_epoch},'
                      f'[TRAIN LOSS]: {train_batch_loss / float(train_batch_num+1):.6f},'
                      f'[TRAIN ACC]: {train_batch_accuracy / float(train_batch_num+1):.3f},'
                      f'[VAL LOSS]: {test_batch_loss / float(test_batch_num+1):.6f},'
                      f'[VAL ACC]: {test_batch_accuracy / float(test_batch_num+1):.3f}')
    return model


model = train(model, criterion, optimizer, loader['train'], loader['val'], 
              num_epochs, batch_size, log_interval, learning_rate,
              dropout_p, decay_rate, max_grad_norm)

parser = argparse.ArgumentParser(
    description='PyTorch Sports Image Classification')
parser.add_argument('--output-dir', type=str, default='outputs')
args = parser.parse_args()
output_dir = args.output_dir
os.makedirs(output_dir, exist_ok=True)
dummy_input = torch.randn(1, 3, 224, 224)
model_path = os.path.join(output_dir, 'sports_classification-1.onnx')
torch.onnx.export(model, dummy_input, model_path)

In [None]:
%%bash
mv train.py pytorch-exp
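One detail of `train.py` worth checking by hand is the learning-rate update. Because the decayed value is written back to `learning_rate` on every epoch, the factor `decay_rate ** epoch` compounds across epochs. The standalone sketch below reproduces only that arithmetic, using the hyperparameter values from the script; `lr_schedule` is a hypothetical helper name introduced for illustration.

```python
def lr_schedule(learning_rate, decay_rate, num_epochs):
    """Replay the per-epoch learning-rate update used in train()."""
    rates = []
    for epoch in range(num_epochs):
        # Same update as train(): the decayed value feeds into the next epoch.
        learning_rate = learning_rate * (decay_rate ** (epoch // 1.0))
        rates.append(learning_rate)
    return rates


for epoch, lr in enumerate(lr_schedule(0.01, 0.9999, 4)):
    print(f"epoch {epoch}: lr = {lr:.10f}")
```

With `decay_rate = 0.9999` and only two epochs, the change is negligible; the effect becomes visible only with a smaller decay rate or many more epochs.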

### Create an experiment
Create an Experiment to track all the runs in your workspace for this transfer learning PyTorch tutorial.

In [None]:
from azureml.core import Experiment

experiment_name = 'pytorch1-sports'
experiment = Experiment(ws, name=experiment_name)

### Create a PyTorch estimator
The AML SDK's PyTorch estimator enables you to easily submit PyTorch training jobs for both single-node and distributed runs. For more information on the PyTorch estimator, see [this page](https://docs.microsoft.com/en-gb/azure/machine-learning/service/how-to-train-pytorch). The following code will define a single-node PyTorch job.

In [None]:
from azureml.train.dnn import PyTorch

estimator = PyTorch(source_directory=project_folder, 
                    script_params={'--output-dir': './outputs'},
                    compute_target=compute_target,
                    entry_script='train.py',
                    use_gpu=False)

# upgrade to the latest version of PyTorch, which has better support for ONNX
estimator.conda_dependencies.remove_conda_package('pytorch=0.4.0')
estimator.conda_dependencies.add_conda_package('pytorch')
estimator.conda_dependencies.add_channel('pytorch')

The script_params parameter is a dictionary containing the command-line arguments to your training script entry_script.

Note that we specified the output directory as ./outputs. AML treats the outputs directory specially: everything in it is uploaded to your workspace as part of your run history, so files written there remain accessible after your remote run is over. In this tutorial, we save our trained model to this directory.
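To see how `script_params` reaches the training script, the sketch below parses the same dictionary with the argument definition from `train.py`. It assumes each key/value pair is passed on the command line as `--key value`; the flattening step is written out here for illustration and is not the SDK's own code.

```python
import argparse

# Same argument definition as in train.py.
parser = argparse.ArgumentParser(description='PyTorch Sports Image Classification')
parser.add_argument('--output-dir', type=str, default='outputs')

# The estimator's script_params dictionary, flattened into an argument list.
script_params = {'--output-dir': './outputs'}
argv = [token for pair in script_params.items() for token in pair]

# argparse exposes '--output-dir' as the attribute 'output_dir'.
args = parser.parse_args(argv)
print(argv)
print(args.output_dir)
```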

### Submit job
Run your experiment by submitting your estimator object. Note that this call is asynchronous.

In [None]:
run = experiment.submit(estimator)
print(run.get_details())

### Monitor your run
You can monitor the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous and provides live updates every 10-15 seconds until the job completes.

In [None]:
from azureml.widgets import RunDetails
RunDetails(run).show()

Alternatively, you can block until the script has completed training before running more code.

In [None]:
%%time
run.wait_for_completion(show_output=True)

### Download the model (optional)
Once the run completes, you can choose to download the ONNX model.

In [None]:
# list all the files from the run
run.get_file_names()

In [None]:
model_path = os.path.join('outputs', 'sports_classification-1.onnx')
run.download_file(model_path, output_file_path=model_path)

You can also view the ONNX model using [Netron](https://lutzroeder.github.io/netron/).

## 3. Register your trained model
To keep track of our models from various runs we may be testing, we will register the model from the run to our workspace. The model_path parameter takes in the relative path on the remote VM to the model file in your outputs directory. You can then deploy this registered model as a web service through the AML SDK.

In [None]:
model = run.register_model(model_name='sports_classification-1', model_path=model_path)
print(model.name, model.id, model.version, sep = '\t')

#### Displaying your registered models (optional)
To see all the models you've registered, you can list them as shown below.

In [None]:
models = ws.models
for name, m in models.items():
    print("Name:", name,"\tVersion:", m.version, "\tDescription:", m.description, m.tags)

## 4. Deploying your model as a web service
Now we are ready to [deploy](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where) the model as a web service. For this notebook we will deploy to [Azure Container Instances (ACI)](https://azure.microsoft.com/en-us/services/container-instances/), but you can alternatively run on your [local](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#local) machine or on [Azure Kubernetes Service (AKS)](https://azure.microsoft.com/en-us/services/kubernetes-service/).

Azure Machine Learning accomplishes this by constructing a Docker image with the scoring logic and model baked in. We will deploy our ONNX model on Azure ML using the ONNX Runtime inference engine.

**To build the correct environment, provide the following:**

* A scoring script to show how to use the model
* An environment file to show what packages need to be installed
* A configuration file to build the web service
* The model you trained before

### Write scoring file
We begin by writing a score.py file that will be invoked by the web service call.

Note that the scoring script must have two required functions, init() and run(input_data).
* The **init()** function is called once, when the Docker container is started, so we use it to load the model with ONNX Runtime into a global session object.
* In the **run(input_data)** function, the model is used to predict a value based on the input data. The input and output of run typically use JSON as the serialization and de-serialization format, but you are not limited to that.
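The contract between the two functions can be sketched without any ML dependencies. In the minimal example below, the "model" is a hypothetical stand-in (Python's built-in `sum`) so that the flow of `init()` followed by repeated `run()` calls can be tried locally; it is not the scoring script used for deployment.

```python
import json

model = None


def init():
    """Called once at container start: load the model into a global."""
    global model
    # Stand-in for loading a real model: it just sums its inputs.
    model = sum


def run(input_data_json):
    """Called per request: parse JSON, apply the model, return a serializable result."""
    try:
        values = json.loads(input_data_json)['data']
        return {"result": model(values)}
    except Exception as e:
        # Report failures in the response instead of crashing the service.
        return {"error": str(e)}


init()
print(run(json.dumps({'data': [1, 2, 3]})))
print(run('not valid json'))
```

Returning an error dictionary from `run` keeps the service responsive when a single request is malformed.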

In [None]:
%%writefile score.py
import json
import time
import sys
import os
from PIL import Image
import requests
from io import BytesIO
from azureml.core.model import Model
import numpy as np
from onnxruntime import InferenceSession
from torchvision import transforms

def init():
    global session
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)
    # For multiple models, it points to the folder containing all deployed models (./azureml-models)
    model_onnx = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sports_classification-1.onnx')
    session = InferenceSession(model_onnx)

def preprocess(input_data_json):
    input_url = json.loads(input_data_json)['data'][0]
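    # Convert the image URL into the tensor input.
    # Use deterministic transforms at inference time: the random crop and flip
    # used during training would make predictions vary between calls.
    data_transforms = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.47637546, 0.485785  , 0.4522678 ], [0.24692202, 0.24377407, 0.2667196 ])
    ])  
    response = requests.get(input_url)
    # Force three channels so grayscale or RGBA images match the model input
    image = Image.open(BytesIO(response.content)).convert('RGB')
    image = data_transforms(image)
    # Add a leading batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    image = image.numpy().reshape((1, *image.shape))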
    return image

def postprocess(result):
    class_names = [ 
        'RockClimbing',
        'badminton',
        'bocce',
        'croquet',
        'polo',
        'rowing',
        'sailing',
        'snowboarding'
    ]   
    return class_names[np.argmax(result[0])]

def run(input_data_json):
    try:
        start = time.time()   # start timer
        input_data = preprocess(input_data_json)
        input_name = session.get_inputs()[0].name  # get the id of the first input of the model   
        result = session.run(None, {input_name: input_data})
        end = time.time()     # stop timer
        return {"result": postprocess(result),
                "time": end - start}
    except Exception as e:
        result = str(e)
        return {"error": result}
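`transforms.Normalize(mean, std)` in `preprocess` applies `(value - mean) / std` to each colour channel. The pure-Python sketch below works through that arithmetic for a single RGB pixel, using the channel statistics from the script; `normalize_pixel` is a hypothetical helper for illustration, not part of torchvision.

```python
MEAN = [0.47637546, 0.485785, 0.4522678]
STD = [0.24692202, 0.24377407, 0.2667196]


def normalize_pixel(pixel, mean=MEAN, std=STD):
    """Normalize one RGB pixel whose channel values are already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))
# A pure white pixel ends up roughly two standard deviations above the mean.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

The same statistics must be used for training and scoring; otherwise the model sees inputs on a different scale from the one it was trained on.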


### Set the environment and inference configurations
First we create a YAML file that specifies which dependencies we would like to see in our container.


In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies.create(pip_packages=["numpy","onnxruntime","azureml-core", "Pillow", "torchvision"])

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())

Then we set up the [inference configuration](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py).

In [None]:
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(runtime= "python", 
                                   entry_script="score.py",
                                   conda_file="myenv.yml",
                                   extra_docker_file_steps = "Dockerfile")

### Deploy the model using [Azure Container Instances](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview)
**Estimated time to complete: about 3-7 minutes**

Configure the image and deploy.

In [None]:
from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, 
                                               memory_gb = 1, 
                                               tags = {'demo': 'onnx'}, 
                                               description = 'web service for sports classification ONNX model')

The following code goes through these steps:

* Build an image using:
  * The scoring file (score.py)
  * The environment file (myenv.yml)
  * The model file
* Apply the ACI deployment configuration.
* Send the image to the ACI container.
* Start up a container in ACI using the image.
* Get the web service HTTP endpoint.

In [None]:
from azureml.core.model import Model
from random import randint

aci_service_name = f'onnx-sports{randint(0,100)}'
print("Service", aci_service_name)
aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)
aci_service.wait_for_deployment(True)
print(aci_service.state)
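`randint(0, 100)` yields only 101 possible names, so a name can repeat across deployment attempts. The sketch below shows one way to generate a less collision-prone name. The naming rule encoded in `NAME_PATTERN` (lowercase letters, digits, and dashes, 3 to 32 characters, starting with a letter) is an assumption about the service's constraints, and `make_service_name` is a hypothetical helper.

```python
import re
import uuid

# Assumed naming rule: lowercase letters, digits and dashes, 3 to 32 characters,
# starting with a letter and ending with a letter or digit.
NAME_PATTERN = re.compile(r'^[a-z][a-z0-9-]{1,30}[a-z0-9]$')


def make_service_name(prefix='onnx-sports'):
    """Append a short random suffix so repeated deployments get distinct names."""
    suffix = uuid.uuid4().hex[:8]
    name = f'{prefix}-{suffix}'
    if not NAME_PATTERN.match(name):
        raise ValueError(f'Invalid service name: {name}')
    return name


print(make_service_name())
```

The resulting string could be passed to `Model.deploy` in place of `aci_service_name`.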

In case the deployment fails, you can check the logs. Make sure to delete your aci_service before trying again.

In [None]:
if aci_service.state != 'Healthy':
    # run this command for debugging.
    print(aci_service.get_logs())
    aci_service.delete()

### Success!
If you've made it this far, you've deployed a working web service that does sports image classification using an ONNX model. You can get the URL for the web service with the code below.

In [None]:
service_url = aci_service.scoring_uri
service_url

## 5. Test the service
To submit sample data to the running service, use the following code.

In [None]:
from IPython.display import Image

image_name = 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/30/DN_ice_boat--Ice_Nine--Lake_Sunapee_NH.jpg/220px-DN_ice_boat--Ice_Nine--Lake_Sunapee_NH.jpg'
Image(url=image_name)

In [None]:
import requests
import json
test_sample = json.dumps({'data': [
    image_name
]})
test_sample = bytes(test_sample,encoding = 'utf8')
headers = {'Content-Type':'application/json'}
resp = requests.post(service_url, test_sample, headers=headers)
print("prediction:", resp.text)
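Rather than printing the raw response text, a client will usually want to separate successful predictions from errors. The sketch below assumes that the service returns the dictionary from `run()` in score.py serialized as a JSON object, with either `result` and `time` keys or an `error` key. `build_payload` and `parse_response` are hypothetical helpers, and the response string is simulated rather than taken from a real call.

```python
import json


def build_payload(image_url):
    """Encode the request body in the shape that score.py's preprocess() expects."""
    return json.dumps({'data': [image_url]}).encode('utf8')


def parse_response(response_text):
    """Return (label, seconds), or raise if the service reported an error."""
    body = json.loads(response_text)
    if 'error' in body:
        raise RuntimeError(f"Scoring failed: {body['error']}")
    return body['result'], body['time']


# Simulated response text; a real one would come from resp.text.
label, seconds = parse_response('{"result": "sailing", "time": 0.12}')
print(label, seconds)
```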