# Serving PyTorch Models with CMLE Custom Prediction Code

Cloud ML Engine Online Prediction now supports custom Python code in two forms:

1. Custom transforms in scikit-learn pipelines.
2. Custom prediction routines, including custom pre/post-processing and/or models not created by the scikit-learn framework.

In this notebook, we show how to deploy a model created with [PyTorch](https://pytorch.org/) using CMLE Custom Prediction Code.

**Note**: You must be whitelisted to use the custom code feature. Please fill out [this Google form](https://docs.google.com/forms/d/e/1FAIpQLSc6fxgXQIyA6BDLfCKOJPu5CyCuOB_M_rGTws0629od5mlznw/viewform) to get started.

## Setup

Before we start, let's install the `google-cloud` Python package. The `gcloud` and `gsutil` command-line tools used below are part of the Google Cloud SDK, which needs to be installed separately:

In [1]:
!pip install -U google-cloud

Requirement already up-to-date: google-cloud in /Users/khalidsalama/Technology/python-venvs/py27-venv/lib/python2.7/site-packages (0.34.0)


Let's also define the project name, the GCS bucket name, and the region that we'll refer to later:

In [2]:
PROJECT='ksalama-gcp-playground' 
BUCKET='ksalama-gcs-cloudml'
REGION='europe-west1'

!gcloud config set project {PROJECT}
!gcloud config get-value project

Updated property [core/project].
ksalama-gcp-playground


## Download iris data
In this example, we want to build a classifier for the simple [iris dataset](https://archive.ics.uci.edu/ml/datasets/iris). First, we download the CSV data file locally.

In [3]:
!mkdir data
!mkdir models

mkdir: data: File exists
mkdir: models: File exists


In [4]:
import urllib

LOCAL_DATA_DIR = "data/iris.csv"

url_opener = urllib.URLopener()
url_opener.retrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", LOCAL_DATA_DIR)

('data/iris.csv', <httplib.HTTPMessage instance at 0x1080925a8>)
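
The cell above relies on `urllib.URLopener`, which is a Python 2 API. A minimal sketch of an equivalent download helper that should work on both Python 2 and Python 3 is shown below. The `download` function name is an assumption made for illustration, and the demo copies a local `file://` URL into a temporary directory so that it runs without network access.

```python
import os
import tempfile

try:
    from urllib.request import urlretrieve, pathname2url  # Python 3
except ImportError:
    from urllib import urlretrieve, pathname2url  # Python 2


def download(url, local_path):
    """Fetch url into local_path, creating the parent directory if needed."""
    target_dir = os.path.dirname(local_path)
    if target_dir and not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    urlretrieve(url, local_path)
    return local_path


# Demo with a local file URL instead of the real dataset URL.
src = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(src, 'w') as f:
    f.write('5.1,3.5,1.4,0.2,Iris-setosa\n')

dest = download('file://' + pathname2url(src),
                os.path.join(tempfile.mkdtemp(), 'data', 'iris.csv'))
print(os.path.exists(dest))
```

For the notebook itself, the same helper would be called with the iris URL and `LOCAL_DATA_DIR` as arguments.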

# A. Build a PyTorch NN Classifier

Make sure that the PyTorch package is [installed](https://pytorch.org/get-started/locally/).

In [5]:
import torch
from torch.autograd import Variable

print 'PyTorch Version: {}'.format(torch.__version__)

PyTorch Version: 0.4.1


## 1. Load Data to Pandas Dataframes

In [6]:
import pandas as pd

datatrain = pd.read_csv(LOCAL_DATA_DIR, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'])

#change string value to numeric
datatrain.loc[datatrain['species']=='Iris-setosa', 'species']=0
datatrain.loc[datatrain['species']=='Iris-versicolor', 'species']=1
datatrain.loc[datatrain['species']=='Iris-virginica', 'species']=2
datatrain = datatrain.apply(pd.to_numeric)

#change dataframe to array (.values replaces the deprecated as_matrix())
datatrain_array = datatrain.values

#split x and y (feature and target)
xtrain = datatrain_array[:,:4]
ytrain = datatrain_array[:,4]

print 'Records loaded: {}'.format(len(xtrain))

Records loaded: 150


## 2. Set model parameters

In [7]:
input_features = 4
hidden_units = 10
num_classes = 3
learning_rate = 0.1
momentum = 0.9
num_epoch = 10000

## 3. Define the PyTorch NN model

In [11]:
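model = torch.nn.Sequential(
    torch.nn.Linear(input_features, hidden_units),
    torch.nn.Sigmoid(),
    torch.nn.Linear(hidden_units, num_classes),
    # dim=1 normalizes across the classes of each row and avoids the implicit-dim warning
    torch.nn.Softmax(dim=1)
)

# Note: CrossEntropyLoss applies log-softmax internally, so with the Softmax layer
# above the scores are normalized twice, which limits how low the loss can go.
loss_metric = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum)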

## 4. Train the model

In [12]:
for epoch in range(num_epoch):
    
    x = Variable(torch.Tensor(xtrain).float())
    y = Variable(torch.Tensor(ytrain).long())

    optimizer.zero_grad()
    
    y_pred = model(x)
    loss = loss_metric(y_pred, y)

    loss.backward()
    optimizer.step()

    if (epoch) % 1000 == 0:
        print 'Epoch [{}/{}] Loss: {}'.format(epoch+1, num_epoch, round(loss.item(),3))
        
print 'Epoch [{}/{}] Loss: {}'.format(epoch+1, num_epoch, round(loss.item(),3))

Epoch [1/10000] Loss: 1.107
Epoch [1001/10000] Loss: 0.578
Epoch [2001/10000] Loss: 0.573
Epoch [3001/10000] Loss: 0.571
Epoch [4001/10000] Loss: 0.57
Epoch [5001/10000] Loss: 0.569
Epoch [6001/10000] Loss: 0.568
Epoch [7001/10000] Loss: 0.568
Epoch [8001/10000] Loss: 0.567
Epoch [9001/10000] Loss: 0.567
Epoch [10000/10000] Loss: 0.567
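
The loss levels off near 0.567 rather than approaching zero. One likely reason is that the model ends in a Softmax layer while `CrossEntropyLoss` applies its own log-softmax, so the loss only ever sees scores between 0 and 1. The sketch below, in plain Python without PyTorch, estimates the lowest loss reachable under that assumption. The function name `cross_entropy_from_scores` is illustrative and not part of any library.

```python
import math


def cross_entropy_from_scores(scores, target):
    """Cross-entropy for one example: -log(softmax(scores)[target])."""
    log_norm = math.log(sum(math.exp(s) for s in scores))
    return log_norm - scores[target]


# Best case when the scores are already probabilities: a one-hot vector.
floor = cross_entropy_from_scores([1.0, 0.0, 0.0], target=0)

# The same confident prediction expressed as raw logits.
with_logits = cross_entropy_from_scores([10.0, 0.0, 0.0], target=0)

print(round(floor, 3))      # 0.551
print(with_logits < 0.001)  # True
```

Under this assumption the loss cannot fall below roughly 0.551, which is close to where training stalls. Training on raw logits and applying softmax only at prediction time would be one way to remove this floor.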


## 5. Save and load the model

In [14]:
LOCAL_MODEL_DIR = "models/model.pt"

torch.save(model, LOCAL_MODEL_DIR)
iris_classifier = torch.load(LOCAL_MODEL_DIR)

## 6. Test the loaded model for predictions

In [15]:
def predict_class(instances, vocab):
    instances = torch.Tensor(instances)
    output = iris_classifier(instances)
    _ , predicted = torch.max(output, 1)
    return [vocab[class_index] for class_index in predicted]

Get predictions for the first 10 instances in the dataset:

In [16]:
print predict_class(xtrain[0:10], ['setosa', 'versicolor', 'virginica'])

['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa']


## 7. Upload trained model to Cloud Storage

In [18]:
GCS_MODEL_DIR='models/pytorch/iris_classifier/'

!gsutil -m cp -r {LOCAL_MODEL_DIR} gs://{BUCKET}/{GCS_MODEL_DIR}
!gsutil ls gs://{BUCKET}/{GCS_MODEL_DIR}

Copying file://models/model.pt [Content-Type=application/octet-stream]...
/ [1/1 files][  8.2 KiB/  8.2 KiB] 100% Done                                    
Operation completed over 1 objects/8.2 KiB.                                      
gs://ksalama-gcs-cloudml/models/pytorch/iris_classifier/model.pt


# B. Prepare the Custom Prediction Package

1. Implement a **custom model class** for pre/post-processing, as well as for loading your model and using it for prediction.
2. Prepare your **setup.py** file to include all the modules and packages you need in your custom model class.

## 1. Create the custom model class
In the **from_path** class method, you load the PyTorch model that you uploaded to GCS. Then, in the **predict** method, you use it for prediction.

In [19]:
%%writefile model.py

import os
import pandas as pd
from google.cloud import storage
import torch

class PyTorchIrisClassifier(object):
    
    def __init__(self, model):
        self._model = model
        self.class_vocab = ['setosa', 'versicolor', 'virginica']
        
    @classmethod
    def from_path(cls, model_dir):
        model_file = os.path.join(model_dir,'model.pt')
        model = torch.load(model_file)    
        return cls(model)

    def predict(self, instances, **kwargs):
        data = pd.DataFrame(instances).values
        inputs = torch.Tensor(data)
        outputs = self._model(inputs)
        _ , predicted = torch.max(outputs, 1)
        return [self.class_vocab[class_index] for class_index in predicted]

Overwriting model.py
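
Before packaging, it can help to exercise the same `from_path` / `predict` interface locally. The sketch below is a stand-in that needs neither PyTorch nor a saved model file: `StubPredictor` and its petal-length thresholds are made up purely for illustration, and only the method signatures mirror the class above. It also checks that the return value serializes to JSON, which is what an online prediction response has to carry.

```python
import json


class StubPredictor(object):
    """Hypothetical stand-in mirroring the from_path/predict interface."""

    def __init__(self, score_fn):
        self._score_fn = score_fn
        self.class_vocab = ['setosa', 'versicolor', 'virginica']

    @classmethod
    def from_path(cls, model_dir):
        # A real implementation would load model.pt from model_dir here.
        # These petal-length thresholds are arbitrary, for illustration only.
        return cls(lambda row: [1.0 if row[2] < 2.5 else 0.0,
                                1.0 if 2.5 <= row[2] < 5.0 else 0.0,
                                1.0 if row[2] >= 5.0 else 0.0])

    def predict(self, instances, **kwargs):
        labels = []
        for row in instances:
            scores = self._score_fn(row)
            # Index of the highest score, like torch.max(outputs, 1).
            best = max(range(len(scores)), key=scores.__getitem__)
            labels.append(self.class_vocab[best])
        return labels


predictor = StubPredictor.from_path('unused-dir')
result = predictor.predict([[5.1, 3.5, 1.4, 0.2], [6.8, 2.8, 4.8, 1.4]])
print(json.dumps({'predictions': result}))
```

Swapping the stub for `PyTorchIrisClassifier.from_path('models')` should give the same kind of smoke test against the real saved model.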


## 2. Create a setup.py module
Include **torch** (the PyTorch package) as a required package, as well as the **model.py** file that contains your custom model class.

In [20]:
%%writefile setup.py

from setuptools import setup

REQUIRED_PACKAGES = ['torch']

setup(
    name="iris-custom-model",
    version="0.1",
    scripts=["model.py"],
    install_requires=REQUIRED_PACKAGES
)

Overwriting setup.py


## 3. Create the package 

This will create a .tar.gz package under the dist/ directory. The package is named (name)-(version).tar.gz, where (name) and (version) are the values specified in setup.py.

In [21]:
!python setup.py sdist

running sdist
running egg_info
writing requirements to iris_custom_model.egg-info/requires.txt
writing iris_custom_model.egg-info/PKG-INFO
writing top-level names to iris_custom_model.egg-info/top_level.txt
writing dependency_links to iris_custom_model.egg-info/dependency_links.txt
reading manifest file 'iris_custom_model.egg-info/SOURCES.txt'
writing manifest file 'iris_custom_model.egg-info/SOURCES.txt'

running check


creating iris-custom-model-0.1
creating iris-custom-model-0.1/iris_custom_model.egg-info
copying files to iris-custom-model-0.1...
copying model.py -> iris-custom-model-0.1
copying setup.py -> iris-custom-model-0.1
copying iris_custom_model.egg-info/PKG-INFO -> iris-custom-model-0.1/iris_custom_model.egg-info
copying iris_custom_model.egg-info/SOURCES.txt -> iris-custom-model-0.1/iris_custom_model.egg-info
copying iris_custom_model.egg-info/dependency_links.txt -> iris-custom-model-0.1/iris_custom_model.egg-info
copying iris_custom_model.egg-info/requires.txt -> iris-
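
The naming rule described above can be sketched as a small helper, which is handy for deriving the upload path from the values in setup.py instead of typing it by hand. The function name `sdist_path` is an assumption. Some newer setuptools releases may normalize the package name differently, so it is worth checking the actual contents of dist/.

```python
import os


def sdist_path(name, version, dist_dir='dist'):
    """Expected sdist location, following the (name)-(version).tar.gz rule."""
    return os.path.join(dist_dir, '{}-{}.tar.gz'.format(name, version))


package_file = sdist_path('iris-custom-model', '0.1')
print(os.path.basename(package_file))  # iris-custom-model-0.1.tar.gz
```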

## 4. Upload the package to GCS

In [29]:
GCS_PACKAGE_URI='models/pytorch/packages/iris-custom-model-0.1.tar.gz'

!gsutil cp ./dist/iris-custom-model-0.1.tar.gz gs://{BUCKET}/{GCS_PACKAGE_URI}
!gsutil ls gs://{BUCKET}/{GCS_PACKAGE_URI}

Copying file://./dist/iris-custom-model-0.1.tar.gz [Content-Type=application/x-tar]...
/ [1 files][  1.0 KiB/  1.0 KiB]                                                
Operation completed over 1 objects/1.0 KiB.                                      
gs://ksalama-gcs-cloudml/models/pytorch/packages/iris-custom-model-0.1.tar.gz


# C. Deploy the Model to CMLE for Online Predictions

## 1. Create CMLE model

In [27]:
MODEL_NAME='torch_iris_classifier'

!gcloud ml-engine models create {MODEL_NAME} --regions {REGION}
!echo ''
!gcloud ml-engine models list | grep 'torch'

Created ml engine model [projects/ksalama-gcp-playground/models/torch_iris_classifier].

torch_iris_classifier


## 2. Create CMLE model version

Once your custom package is ready, you can specify it as an argument when creating a version resource. Note that you need to provide the path to your package (as `--package-uris`) and the name of the class that contains your custom predict method (as `--model-class`).

In [None]:
MODEL_VERSION='v1'
RUNTIME_VERSION='1.10'
MODEL_CLASS='model.PyTorchIrisClassifier'

!gcloud alpha ml-engine versions create {MODEL_VERSION} --model={MODEL_NAME} \
            --origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
            --runtime-version={RUNTIME_VERSION} \
            --framework='SCIKIT_LEARN' \
            --python-version=2.7 \
            --package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI}\
            --model-class={MODEL_CLASS}
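
To make it easier to see how the notebook variables expand into the final command, the sketch below assembles the same argument list in Python. It uses only the flags that appear in the cell above, and the helper name `version_create_args` is an assumption. Here `--origin` points at the GCS directory holding model.pt, `--package-uris` at the uploaded source distribution, and `--model-class` at `module.ClassName`.

```python
def version_create_args(bucket, model_name, version, model_dir, package_uri,
                        model_class, runtime_version='1.10',
                        python_version='2.7'):
    """Return the argument list for the version-create command shown above."""
    return [
        'gcloud', 'alpha', 'ml-engine', 'versions', 'create', version,
        '--model={}'.format(model_name),
        '--origin=gs://{}/{}'.format(bucket, model_dir),
        '--runtime-version={}'.format(runtime_version),
        '--framework=SCIKIT_LEARN',
        '--python-version={}'.format(python_version),
        '--package-uris=gs://{}/{}'.format(bucket, package_uri),
        '--model-class={}'.format(model_class),
    ]


args = version_create_args(
    bucket='ksalama-gcs-cloudml',
    model_name='torch_iris_classifier',
    version='v1',
    model_dir='models/pytorch/iris_classifier/',
    package_uri='models/pytorch/packages/iris-custom-model-0.1.tar.gz',
    model_class='model.PyTorchIrisClassifier',
)
print(' '.join(args))
```

Printing the list before running the command is a cheap way to catch an undefined or misspelled variable.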

In [30]:
!gcloud ml-engine versions list --model {MODEL_NAME}

NAME  DEPLOYMENT_URI                                           STATE
v1    gs://ksalama-gcs-cloudml/models/pytorch/iris_classifier  READY


# D. CMLE Online Prediction

In [31]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
                      discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json')


def estimate(project, model_name, version, instances):
    
    request_data = {'instances': instances}

    model_url = 'projects/{}/models/{}/versions/{}'.format(project, model_name, version)
    response = api.projects().predict(body=request_data, name=model_url).execute()

    #print response
    
    predictions = response["predictions"]
    return predictions
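
The `estimate` function sends two things to the API: the fully qualified version name and a JSON body with an `instances` list. The sketch below builds both without calling the service, which can be useful for inspecting a request before sending it. The helper name `build_predict_request` is an assumption made for illustration.

```python
import json


def build_predict_request(project, model_name, version, instances):
    """Return the (resource name, request body) pair that estimate() sends."""
    name = 'projects/{}/models/{}/versions/{}'.format(
        project, model_name, version)
    body = {'instances': instances}
    return name, body


name, body = build_predict_request(
    'ksalama-gcp-playground', 'torch_iris_classifier', 'v1',
    [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]])

print(name)
print(json.dumps(body))
```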

In [33]:
instances = [
    [6.8, 2.8, 4.8, 1.4],
    [6. , 3.4, 4.5, 1.6]
]

predictions = estimate(instances=instances
                     ,project=PROJECT
                     ,model_name=MODEL_NAME
                     ,version=MODEL_VERSION)

print(predictions)

# Questions? Feedback?
Feel free to send us an email (cloudml-feedback@google.com) if you run into any issues or have any questions/feedback!