Copyright (c) Microsoft Corporation. All rights reserved.  
Licensed under the MIT License.

# Want to *actually* do machine learning? 
## Part 3: Deploy models

*Made for Microsoft Build 2019*

This is the third in a series that walks through how Azure Machine Learning service can speed up your machine learning modelling workflow so you can focus on the interesting tasks that matter. 

**Goal:**
In this notebook, we'll prepare and deploy our PyTorch model to a cluster in Azure. The walkthrough below uses a CPU cluster.


In [1]:
!source activate py36


## Deploy a VM with your PyTorch model in the Cloud

### Load Azure ML workspace

We begin by instantiating a workspace object from the existing workspace created earlier in the configuration notebook.

In [2]:
from azureml.core import Workspace, Dataset
import azureml.dataprep as dprep
from azureml.core.model import Model
import pandas as pd




In [3]:
from azureml.core import Workspace

# read existing workspace from config.json
ws = Workspace.from_config()

print(ws.name, ws.location, ws.resource_group, sep = '\n')

builddemossus
southcentralus
build2019


### Get your registered model 

In [4]:
model = Model(name="pytorch_regression_remote", workspace=ws)


### Optional: Displaying your registered models

This step is not required, so feel free to skip it.

In [5]:
models = ws.models
for name, m in models.items():
    print("Name:", name,"\tVersion:", m.version, "\tDescription:", m.description, m.tags)

Name: pytorch_regression_remote 	Version: 3 	Description: None {}
Name: pytorch_regression 	Version: 3 	Description: Taxi PyTorch Regression {'onnx': 'demo'}


### Specify our Score and Environment Files

We are now going to deploy our PyTorch model on Azure Machine Learning (AML) for inference. We begin by writing a score.py file, which runs the model in our Azure ML virtual machine (VM), and then specify the environment by writing a yml file.

### Write Score File

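A score file tells our Azure cloud service how to serve the model. In the script below, `init()` loads the registered model when the container starts, and `run()` parses each JSON request, calls the model, and returns the prediction as JSON.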
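The shape of such a script can be sketched without Azure or PyTorch. The example below is only an illustration: the stand-in model (a function that sums the continuous features) is hypothetical, and only the `init()`/`run()` structure mirrors the score file in this notebook.

```python
import json

model = None  # populated once by init()


def init():
    """Runs once at startup; stands in for loading the real model."""
    global model
    # Hypothetical stand-in: sums the continuous features instead of
    # calling a trained PyTorch network.
    model = lambda x_cat, x_cont: float(sum(x_cont))


def run(input_data):
    """Runs per request: parse the JSON payload, predict, return JSON."""
    x_cat, x_cont = json.loads(input_data)
    return json.dumps(model(x_cat, x_cont))


init()
response = run(json.dumps([[1, 2], [0.5, 0.25]]))
print(response)
```

Keeping the model in a module-level global means the expensive load happens once, while `run()` stays cheap for every request.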

In [6]:
%%writefile score.py
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import pathlib
import matplotlib.pyplot as plt
from pylab import rcParams

import pandas as pd
import numpy as np
import seaborn as sns

from collections import defaultdict

from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.metrics import mean_squared_error

pd.options.mode.chained_assignment = None

from torch.nn import init
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import torch.nn.functional as F
from torch.utils import data
from torch.optim import lr_scheduler

device = torch.device("cpu")

from tqdm import tqdm
import json

from azureml.core.model import Model


def haversine_distance(df, start_lat, end_lat, start_lng, end_lng, prefix):
    """
    calculates haversine distance between 2 sets of GPS coordinates in df
    """
    R = 6371  #radius of earth in kilometers
       
    phi1 = np.radians(df[start_lat])
    phi2 = np.radians(df[end_lat])
    
    delta_phi = np.radians(df[end_lat]-df[start_lat])
    delta_lambda = np.radians(df[end_lng]-df[start_lng])
    
        
    a = np.sin(delta_phi / 2.0) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(delta_lambda / 2.0) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    d = (R * c) #in kilometers
    df[prefix+'distance_km'] = d

def add_datepart(df, col, prefix):
    attr = ['Year', 'Month', 'Week', 'Day', 'Dayofweek', 'Dayofyear',
            'Is_month_end', 'Is_month_start', 'Is_quarter_end', 'Is_quarter_start', 'Is_year_end', 'Is_year_start']
    attr = attr + ['Hour', 'Minute', 'Second']
    for n in attr: df[prefix + n] = getattr(df[col].dt, n.lower())
    df[prefix + 'Elapsed'] = df[col].astype(np.int64) // 10 ** 9
    df.drop(col, axis=1, inplace=True)
    
def reject_outliers(data, m = 2.):
    d = np.abs(data - np.median(data))
    mdev = np.median(d)
    s = d/(mdev if mdev else 1.)
    return s<m

def parse_gps(df, prefix):
    lat = prefix + '_latitude'
    lon = prefix + '_longitude'
    df[prefix + '_x'] = np.cos(df[lat]) * np.cos(df[lon])
    df[prefix + '_y'] = np.cos(df[lat]) * np.sin(df[lon]) 
    df[prefix + '_z'] = np.sin(df[lat])
    df.drop([lat, lon], axis=1, inplace=True)
    
def prepare_dataset(df):
    df['pickup_datetime'] = pd.to_datetime(df.pickup_datetime, infer_datetime_format=True)
    add_datepart(df, 'pickup_datetime', 'pickup')
    haversine_distance(df, 'pickup_latitude', 'dropoff_latitude', 'pickup_longitude', 'dropoff_longitude', '')
    parse_gps(df, 'pickup')
    parse_gps(df, 'dropoff')
    df.dropna(inplace=True)
    y = np.log(df.fare_amount)
    df.drop(['key', 'fare_amount'], axis=1, inplace=True)
    
    return df, y

def split_features(df):
    catf = ['pickupYear', 'pickupMonth', 'pickupWeek', 'pickupDay', 'pickupDayofweek', 
            'pickupDayofyear', 'pickupHour', 'pickupMinute', 'pickupSecond', 'pickupIs_month_end',
            'pickupIs_month_start', 'pickupIs_quarter_end', 'pickupIs_quarter_start',
            'pickupIs_year_end', 'pickupIs_year_start']

    numf = [col for col in df.columns if col not in catf]
    for c in catf: 
        df[c] = df[c].astype('category').cat.as_ordered()
        df[c] = df[c].cat.codes+1
    
    return catf, numf

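def numericalize(df, name):
    # convert one column to ordered category codes (0 is reserved for missing values)
    col = df[name].astype('category').cat.as_ordered()
    df[name] = col.cat.codes+1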

def split_dataset(df, y): return train_test_split(df, y, test_size=0.25, random_state=42)

def inv_y(y): return np.exp(y)

def get_numf_scaler(train): return preprocessing.StandardScaler().fit(train)

def scale_numf(df, numf, scaler):
    # scale only the numerical columns, then re-attach the remaining columns
    cols = numf
    index = df.index
    scaled = scaler.transform(df[numf])
    scaled = pd.DataFrame(scaled, columns=cols, index=index)
    return pd.concat([scaled, df.drop(numf, axis=1)], axis=1)

class RegressionColumnarDataset(data.Dataset):
    def __init__(self, df, cats, y):
        self.dfcats = df[cats]
        self.dfconts = df.drop(cats, axis=1)
        
        self.cats = np.stack([c.values for n, c in self.dfcats.items()], axis=1).astype(np.int64)
        self.conts = np.stack([c.values for n, c in self.dfconts.items()], axis=1).astype(np.float32)
        self.y = y.values.astype(np.float32)
        
    def __len__(self): return len(self.y)

    def __getitem__(self, idx):
        return [self.cats[idx], self.conts[idx], self.y[idx]]
    
def rmse(targ, y_pred):
    return np.sqrt(mean_squared_error(inv_y(y_pred), inv_y(targ))) #.detach().numpy()

def emb_init(x):
    x = x.weight.data
    sc = 2/(x.size(1)+1)
    x.uniform_(-sc,sc)

class MixedInputModel(nn.Module):
    def __init__(self, emb_szs, n_cont, emb_drop, out_sz, szs, drops, y_range, use_bn=True):
        super().__init__()
        for i,(c,s) in enumerate(emb_szs): assert c > 1, f'cardinality must be >=2, got emb_szs[{i}]: ({c},{s})'
        self.embs = nn.ModuleList([nn.Embedding(c, s) for c,s in emb_szs])
        for emb in self.embs: emb_init(emb)
        n_emb = sum(e.embedding_dim for e in self.embs)
        self.n_emb, self.n_cont=n_emb, n_cont
        
        szs = [n_emb+n_cont] + szs
        self.lins = nn.ModuleList([nn.Linear(szs[i], szs[i+1]) for i in range(len(szs)-1)])
        self.bns = nn.ModuleList([nn.BatchNorm1d(sz) for sz in szs[1:]])
        for o in self.lins: nn.init.kaiming_normal_(o.weight.data)
        self.outp = nn.Linear(szs[-1], out_sz)
        nn.init.kaiming_normal_(self.outp.weight.data)

        self.emb_drop = nn.Dropout(emb_drop)
        self.drops = nn.ModuleList([nn.Dropout(drop) for drop in drops])
        self.bn = nn.BatchNorm1d(n_cont)
        self.use_bn,self.y_range = use_bn,y_range

    def forward(self, x_cat, x_cont):
        if self.n_emb != 0:
            x = [e(x_cat[:,i]) for i,e in enumerate(self.embs)]
            x = torch.cat(x, 1)
            x = self.emb_drop(x)
        if self.n_cont != 0:
            x2 = self.bn(x_cont)
            x = torch.cat([x, x2], 1) if self.n_emb != 0 else x2
        #x = torch.ones(128,204)
        #print(x.shape)
        for l,d,b in zip(self.lins, self.drops, self.bns):
            x = F.relu(l(x))
            if self.use_bn: x = b(x)
            x = d(x)
        x = self.outp(x)
        if self.y_range:
            x = torch.sigmoid(x)
            x = x*(self.y_range[1] - self.y_range[0])
            x = x+self.y_range[0]
        return x.squeeze()

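def init():
    global model
    # the name must match the registered model ('pytorch_regression_remote')
    model_path = Model.get_model_path('pytorch_regression_remote')
    # the service runs on CPU nodes, so map any saved GPU tensors to the CPU device
    model = torch.load(model_path, map_location=device)
    model.eval()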
    


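def run(input_data):
    # the payload is a JSON list: [categorical codes, continuous features]
    x_cat, x_cont = json.loads(input_data)
    x_cat = torch.tensor(x_cat, dtype=torch.int64)
    x_cont = torch.tensor(x_cont, dtype=torch.float32)
    # the model indexes x_cat[:, i], so promote a single row to a batch of one
    if x_cat.dim() == 1:
        x_cat, x_cont = x_cat.unsqueeze(0), x_cont.unsqueeze(0)

    with torch.no_grad():
        result = model(x_cat, x_cont)
    # assumes one row per request, as in the original script
    return json.dumps(float(result))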

Overwriting score.py
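The `haversine_distance` helper in score.py works on DataFrame columns, which makes it awkward to check by hand. A scalar version of the same formula, written here with only the standard library as a sketch, gives a quick sanity check: one degree of latitude should come out at roughly 111 km for the Earth radius used above. The function name `haversine_km` and the coordinates are illustrative and not part of the notebook.

```python
import math


def haversine_km(lat1, lng1, lat2, lng2, radius_km=6371):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = math.radians(lat2 - lat1)
    delta_lambda = math.radians(lng2 - lng1)
    a = (math.sin(delta_phi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2.0) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return radius_km * c


# Illustrative points one degree of latitude apart on the same meridian.
one_degree = haversine_km(40.0, -74.0, 41.0, -74.0)
same_point = haversine_km(40.0, -74.0, 40.0, -74.0)
print(round(one_degree, 2), same_point)
```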


### Write Environment File

In [7]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies.create(pip_packages=["torch","numpy", "azureml-core", "azureml-dataprep","scikit-learn","seaborn","tqdm"])

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())

### Create the Container Image

This step will likely take a few minutes.

In [8]:
from azureml.core.image import ContainerImage
from azureml.core.model import Model

# configure a CPU image that runs score.py inside the myenv.yml environment
cpu_image_config = ContainerImage.image_configuration(execution_script = "score.py",
                                                  runtime = "python",
                                                  conda_file = "myenv.yml",
                                                  description = "Taxi Regression PyTorch container",
                                                  tags = {"demo": "pytorch"})

# use the ONNX Runtime CPU base image
#cpu_image_config.base_image = "mcr.microsoft.com/azureml/onnxruntime:v0.4.0"

cpu_image = ContainerImage.create(name = "pytorchimage.cpu",
                              # this is the model object
                              models = [model],
                              image_config = cpu_image_config,
                              workspace = ws)

cpu_image.wait_for_creation(show_output = True)

# OR use an existing image from your image registry
# cpu_image = ContainerImage(name = "pytorchimage.cpu", workspace = ws)

Creating image
Running.....................................................
SucceededImage creation operation finished for image pytorchimage.cpu:11, operation "Succeeded"


If the image build fails or you need to debug your code, the next line prints the URI of the image build log.

In [9]:
print(cpu_image.image_build_log_uri)

https://builddemossus1844419833.blob.core.windows.net/azureml/ImageLogs/21d5c7ca-f432-4365-9ff9-8b9538b12a34/build.log?sv=2018-03-28&sr=b&sig=tOwFTHrOw84Sh%2FHKDhgxHgXaLvxWMXeqdhawesYAI6M%3D&st=2019-05-08T14%3A26%3A47Z&se=2019-06-07T14%3A31%3A47Z&sp=rl
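The log link is a shared access signature (SAS) URL, so it stops working after the time in its `se` (signed expiry) query parameter. As a minimal sketch, the expiry can be read with the standard library; the helper name `sas_expiry` and the example URI below are hypothetical, and only the query layout follows the link printed above.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def sas_expiry(uri):
    """Return the expiry time stored in the `se` query parameter of a SAS URL."""
    query = parse_qs(urlparse(uri).query)  # parse_qs also decodes %3A to ':'
    expiry = datetime.strptime(query["se"][0], "%Y-%m-%dT%H:%M:%SZ")
    return expiry.replace(tzinfo=timezone.utc)


# Hypothetical URI that reuses the query layout of the build log link.
example_uri = (
    "https://example.blob.core.windows.net/azureml/ImageLogs/build.log"
    "?sv=2018-03-28&sr=b&st=2019-05-08T14%3A26%3A47Z"
    "&se=2019-06-07T14%3A31%3A47Z&sp=rl"
)
expiry = sas_expiry(example_uri)
print(expiry.isoformat())
```

If the link has expired, printing `cpu_image.image_build_log_uri` again may return a fresh one, though that depends on the SDK version in use.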


We're all done specifying what we want our virtual machine to do. Let's configure and deploy our container image.

### Deploy the container image

In [10]:
# create the AKS service with CPU nodes
from azureml.core import Workspace
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException
from azureml.core.webservice import Webservice, AksWebservice
from azureml.core.image import Image
from azureml.core.model import Model

cpu_aks_name = 'AKS-CPU'

# Reuse the cluster if it already exists; otherwise provision a new one

try:
    cpu_aks_target = ComputeTarget(workspace = ws, name=cpu_aks_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    # Provisioning configuration (vm_size and location can be customized)
    prov_config = AksCompute.provisioning_configuration(vm_size='Standard_D3', location='East US2' )
    # Create the cluster
    cpu_aks_target = ComputeTarget.create(workspace = ws, 
                                   name = cpu_aks_name, 
                                   provisioning_configuration = prov_config)

Found existing cluster, use it.


In [11]:
%%time
cpu_aks_target.wait_for_completion(show_output = True)
print(cpu_aks_target.provisioning_state)
print(cpu_aks_target.provisioning_errors)

Long running operation information not known, unable to poll. Current state is Succeeded
Succeeded
None
CPU times: user 13.1 ms, sys: 122 µs, total: 13.2 ms
Wall time: 142 ms


In [12]:
cpu_aks_config = AksWebservice.deploy_configuration()

In [13]:
%%time
cpu_aks_service_name ='cpu-aks-service'

cpu_aks_service = Webservice.deploy_from_image(workspace = ws, 
                                           name = cpu_aks_service_name,
                                           image = cpu_image,
                                           deployment_config = cpu_aks_config,
                                           deployment_target = cpu_aks_target)
cpu_aks_service.wait_for_deployment(show_output = True)
print(cpu_aks_service.state)
print(cpu_aks_service.get_logs())

Creating service
Running........................
FailedAKS service creation operation finished, operation "Failed"
Service creation polling reached terminal state, current service state: Transitioning
{
  "code": "KubernetesDeploymentFailed",
  "statusCode": 400,
  "message": "Kubernetes Deployment failed",
  "details": [
    {
      "code": "CrashLoopBackOff",
      "message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance cpu-aks-service.\nYou can also try to run image builddemossu40880fa1.azurecr.io/pytorchimage.cpu:11 locally. Please refer to http://aka.ms/debugimage for more information."
    }
  ]
}
Transitioning
2019-05-08T14:33:57,763305853+00:00 - nginx/run 
2019-05-08T14:33:57,764145256+00:00 - rsyslog/run 
2019-05-08T14:33:57,765378160+00:00 - iot-server/run 
2019-05-08T14:33:57,768913772+00:00 - gunicorn/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are no

In [14]:
if cpu_aks_service.state != 'Healthy':
    # run this command for debugging.
    print(cpu_aks_service.get_logs())
    # If your deployment fails, make sure to delete your aks_service before trying again!
    cpu_aks_service.delete()

2019-05-08T14:33:57,763305853+00:00 - nginx/run 
2019-05-08T14:33:57,764145256+00:00 - rsyslog/run 
2019-05-08T14:33:57,765378160+00:00 - iot-server/run 
2019-05-08T14:33:57,768913772+00:00 - gunicorn/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2019-05-08T14:33:57,838730506+00:00 - iot-server/finish 1 0
2019-05-08T14:33:57,839775409+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 19.6.0
Listening at: http://127.0.0.1:9090 (13)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 45
Initializing logger
Starting up app insights client
Starting up request id generator
Starting up app insight hooks
Invoking user's init function
2019-05-08 14:34:03,408 | azureml.core.run | DEBUG | Could not load run context Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run., switching offline: False
2019-05-08 14:34:03,408 | azureml.core.run | D
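When a deployment fails, the error payload printed above is plain JSON, so the detail codes can be pulled out programmatically rather than read by eye. The sketch below assumes the payload has been captured as a string; `error_text` is an abridged copy of the output above (the long message is shortened), and `detail_codes` is a hypothetical helper name.

```python
import json

# Abridged copy of the error payload shown in the deployment output.
error_text = """
{
  "code": "KubernetesDeploymentFailed",
  "statusCode": 400,
  "message": "Kubernetes Deployment failed",
  "details": [
    {
      "code": "CrashLoopBackOff",
      "message": "Your container application crashed."
    }
  ]
}
"""


def detail_codes(text):
    """Return the top-level error code and the list of detail codes."""
    payload = json.loads(text)
    return payload["code"], [d["code"] for d in payload.get("details", [])]


top_code, codes = detail_codes(error_text)
print(top_code, codes)
```

A `CrashLoopBackOff` detail points at the container itself, which is why the error message recommends checking the `init()` function in the scoring file.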

### Success!

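If your service reports a `Healthy` state, you've deployed a working PyTorch regression model running in the cloud using Azure ML. Congratulations! If it reports `Failed`, as in the sample output above, check the container logs and the `init()` function in score.py, delete the service, and deploy again.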

Let's see how well our model deals with our test data.

## Testing and Evaluation


### Try predicting your own trips!

In [None]:
input_data = ([1, 1, 5, 1, 20, 31, 21, 1, 2],
 [-0.20771729946136475,
  -0.20766863226890564,
  0.0,
  0.0,
  -0.25793421268463135,
  0.0,
  0.9385162591934204,
  1.1996960639953613,
  0.6472086310386658,
  -0.20173433423042297,
  -0.8676630854606628,
  -0.35962310433387756,
  -0.5872217416763306,
  -0.9175888895988464])  # this is your input data: (categorical codes, continuous features)

# input_data=json.dumps([([[ 1,  1,  5,  1, 20, 31, 21,  1,  2]]),
# ([[-0.2077173 , -0.20766863,  0.        ,  0.        , -0.2579342 ,
#           0.        ,  0.93851626,  1.1996961 ,  0.64720863, -0.20173433,
#          -0.8676631 , -0.3596231 , -0.58722174, -0.9175889 ]])])

import json

# call the deployed service created above
result = cpu_aks_service.run(input_data=json.dumps(input_data))
print(result)
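The score file defines `y = np.log(df.fare_amount)` and `inv_y` as `np.exp`, which suggests the model predicts the logarithm of the fare. Assuming that is how the model was trained, the returned value needs to be exponentiated before it reads as a fare amount. The sketch below uses a hypothetical response string and a hypothetical helper name, `fare_from_response`; it accepts either a JSON string or an already-decoded number because the exact return type may depend on the SDK version.

```python
import json
import math


def fare_from_response(response):
    """Convert a log-fare prediction (JSON string or number) to a fare amount."""
    log_fare = json.loads(response) if isinstance(response, str) else response
    return math.exp(float(log_fare))


# Hypothetical response: the log of a 12.50 fare, JSON-encoded like run() does.
example_response = json.dumps(math.log(12.5))
fare = fare_from_response(example_response)
print(round(fare, 2))
```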



## Cleanup

In [None]:
# remember to delete your service after you are done using it!
cpu_aks_service.delete()