## Part 3 - Deploy the model

In the second notebook we created a basic model and exported it to a file. In this notebook we'll use that same model file to create a REST API with Microsoft ML Server. The Ubuntu DSVM has an installation of ML Server for testing deployments. We'll create a REST API with our model and test it with the same truck image we used in notebook 2 to evaluate the model. 

There are two variables you must set before running this notebook. The first is the password for your ML Server instance. At MLADS we've already set this for you. If you're following this tutorial on your own, you should configure your ML Server instance for [one-box deployment](https://docs.microsoft.com/en-us/machine-learning-server/operationalize/configure-machine-learning-server-one-box). The second variable is the name of the deployed web service. This needs to be unique on the VM. We recommend that you use your username and a number, like *username5*.

In [None]:
# choose a unique service name. We recommend you use your username and a number, like alias3
service_name = ____SET_ME_TO_A_UNIQUE_VALUE_____

# set the ML Server admin password
ml_server_password =  ____SET_ME_____ 

## Microsoft ML Server Operationalization

ML Server Operationalization converts a model into a REST API that can be called from many languages.

In [5]:
from IPython.display import Image as ShowImage
ShowImage(url="https://docs.microsoft.com/en-us/machine-learning-server/media/what-is-operationalization/data-scientist-easy-deploy.png", width=800, height=800)

ML Server runs one or more web nodes as the front end for REST API calls and one or more compute nodes to perform the calculations for the deployed services. This VM was configured for ML Server Operationalization when it was created. Here we run a single web node and a single compute node on this VM in a *one-box* configuration.

ML Server provides the azureml.deploy Python package to deploy new REST API endpoints and call them.

In [4]:
from IPython.display import Image as ShowImage
ShowImage(url="https://docs.microsoft.com/en-us/machine-learning-server/operationalize/media/configure-machine-learning-server-one-box/setup-onebox.png", width=800, height=800)

More details are available in [the ML Server documentation](https://docs.microsoft.com/en-us/machine-learning-server/operationalize/configure-machine-learning-server-one-box-9-2). 

In [None]:
from azureml.deploy import DeployClient
from azureml.deploy.server import MLServer

HOST = 'http://localhost:12800'
context = ('admin', ml_server_password)
client = DeployClient(HOST, use=MLServer, auth=context)

Retrieve the truck image for testing our deployed service.

In [None]:
from PIL import Image
import pandas as pd
import numpy as np
from matplotlib.pyplot import imshow
from IPython.display import Image as ImageShow

try: 
    from urllib.request import urlopen 
except ImportError: 
    from urllib import urlopen

url = "https://cntk.ai/jup/201/00014.png"
myimg = np.array(Image.open(urlopen(url)), dtype=np.float32)
flattened = myimg.ravel()

ImageShow(url=url, width=64, height=64)

## Deploy the model

We need two functions to deploy a model in ML Server. The *init* function handles service initialization. The *eval* function evaluates a single input value and returns the result. *eval* will be called by the server when we call the REST API.

Our *eval* function accepts a single input: a 1D numpy array holding the flattened image to evaluate. It needs to (1) reshape the input data from a 1D array to a 3D image, (2) subtract the image mean, to mimic the inputs to the model during training, (3) transpose the image so the channel axis comes first, (4) evaluate the model on the image, and (5) return the results as a pandas DataFrame. Alternatively, we could return just the top result or the top three results.

In [None]:
import cntk

with open('model.cntk', mode='rb') as file: # b is important -> binary
    binary_model = file.read()

# define an init function to handle service initialization
def init():
    import cntk
    global loaded_model
    loaded_model = cntk.ops.functions.load_model(binary_model)
    
# define an eval function to handle scoring
def eval(image_data):
    import numpy as np
    import cntk
    from pandas import DataFrame
    
    image_data = image_data.copy().reshape((32, 32, 3))
    
    image_mean = 133.0
    image_data -= image_mean
    image_data = np.ascontiguousarray(np.transpose(image_data, (2, 0, 1)))
    
    results = loaded_model.eval({loaded_model.arguments[0]:[image_data]})
        
    return DataFrame(results)
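
The preprocessing inside *eval* can be checked locally before deploying, without ML Server or CNTK. The cell below is a minimal sketch, assuming a synthetic 32x32 RGB image in place of the truck image. `preprocess` is a hypothetical helper name that only mirrors the reshape, mean subtraction, and transpose steps of *eval*; it is not part of the deployed service.

```python
import numpy as np

IMAGE_MEAN = 133.0  # same constant used in eval


def preprocess(flat_image):
    """Hypothetical helper mirroring the preprocessing steps in eval."""
    # Copy so the caller's array is not modified by the in-place subtraction.
    image = flat_image.copy().reshape((32, 32, 3))  # height, width, channels
    image -= IMAGE_MEAN
    # Move the channel axis first: (H, W, C) -> (C, H, W), in contiguous memory.
    return np.ascontiguousarray(np.transpose(image, (2, 0, 1)))


# Synthetic stand-in for the flattened image (assumption: float32 pixels in 0..255).
rng = np.random.RandomState(0)
fake_flat = rng.randint(0, 256, size=32 * 32 * 3).astype(np.float32)

prepared = preprocess(fake_flat)
print(prepared.shape)  # (3, 32, 32)
print(prepared.dtype)  # float32
```

Because of the `copy()`, the flattened input is left untouched and can be reused for further calls.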

In [None]:
# create the API
service = client.service(service_name)\
        .version('1.0')\
        .code_fn(eval, init)\
        .inputs(image_data=np.array)\
        .outputs(results=pd.DataFrame)\
        .models(binary_model=binary_model)\
        .description('My CNTK model')\
        .deploy()
        
# help() prints the documentation itself and returns None, so no print() is needed
help(service)
service.capabilities()

Now call our newly created API with our truck image.

In [None]:
res = service.eval(flattened)

# -- Pluck out the named output `results` as defined during publishing and print --
print(res.output('results'))

# get the top 3 predictions
result = res.output('results')
# .values replaces DataFrame.as_matrix(), which newer pandas releases removed
result = result.values[0]
top_count = 3
result_indices = (-np.array(result)).argsort()[:top_count]

label_lookup = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]
print("Top 3 predictions:")
for i in range(top_count):
    print("\tLabel: {:10s}, confidence: {:.2f}%".format(label_lookup[result_indices[i]], result[result_indices[i]] * 100))

# -- Retrieve the URL of the swagger file for this service.
cap = service.capabilities()
swagger_URL = cap['swagger']
print(swagger_URL)
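
The top-3 ranking above can also be wrapped in a small function, which is one way to return only the best predictions instead of all ten scores. This is a sketch under stated assumptions: `top_k` is a hypothetical helper, and the scores below are made up for illustration rather than taken from the deployed model.

```python
import numpy as np

LABELS = ["airplane", "automobile", "bird", "cat", "deer",
          "dog", "frog", "horse", "ship", "truck"]


def top_k(scores, k=3, labels=LABELS):
    """Hypothetical helper: return the k highest-scoring (label, score) pairs."""
    scores = np.asarray(scores, dtype=np.float64).ravel()
    # argsort on the negated scores yields indices from highest to lowest score.
    order = np.argsort(-scores)[:k]
    return [(labels[i], float(scores[i])) for i in order]


# Made-up scores for illustration only; not output from the deployed model.
example_scores = [0.01, 0.05, 0.02, 0.03, 0.01, 0.02, 0.01, 0.04, 0.11, 0.70]
for label, score in top_k(example_scores):
    print("Label: {:10s}, score: {:.2f}".format(label, score))
```

The same ranking could be applied inside *eval* before building the DataFrame, so that the service returns only the top entries.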

In [None]:
print(service.swagger())
