# 3 - Deploying the Model

The final step in this module is to deploy the model that we created in the previous clip and make it available to call either through the SDK or through a REST endpoint.  This will allow us to operationalize the model within our organization.

In [None]:
%matplotlib inline
import numpy as np
import os
import matplotlib.pyplot as plt
 
import azureml.core
print("Azure ML SDK Version: ", azureml.core.VERSION)

## Validating our Model Locally

In the previous notebook we trained our model and registered it within our workspace.  We now need to make sure we can pull this model down to our notebook server and use it there.  This is a critical step because it lets us diagnose any problems locally before we deploy the model.

In [None]:
from azureml.core import Workspace
from azureml.core.model import Model
import tensorflow as tf
from tensorflow.keras.models import load_model

# Get a reference to our workspace
ws = Workspace.from_config()

# Get a reference to our model
amlModel=Model(ws, 'keras-mnist')

# Download the model to our local notebook server
amlModel.download(target_dir=os.getcwd(), exist_ok=True)

# Make sure that the file downloaded
file_path = os.path.join(os.getcwd(), "mnist.h5")
os.stat(file_path)

# Use Keras to load this model
model = load_model('mnist.h5')

Once we have our model downloaded locally, we can verify that it works for inference:

In [None]:
from utils import load_data

# DATA FOLDER
# Make sure we have the data folder created locally
data_folder = os.path.join(os.getcwd(), 'data')
os.makedirs(data_folder, exist_ok=True)

# LOAD DATA
# We use a slightly modified version of the logic we have for loading in our data
num_classes = 10

training_images = load_data(data_folder, "train-images-idx3-ubyte.gz", False) / 255.0
training_images = np.reshape(training_images, (-1, 28,28)).astype('float32')
test_images = load_data(data_folder, "t10k-images-idx3-ubyte.gz", False) / 255.0
test_images = np.reshape(test_images, (-1, 28,28)).astype('float32')

training_labels = load_data(data_folder, "train-labels-idx1-ubyte.gz", True).reshape(-1)
test_labels = load_data(data_folder, "t10k-labels-idx1-ubyte.gz", True).reshape(-1)

print(f'Training Image: {training_images.shape}')
print(f'Training Labels: {training_labels.shape}')
print(f'Test Images: {test_images.shape}')
print(f'Test Labels: {test_labels.shape}')

# CLASS NAMES
# Get the Class Names from the labels
class_names = np.unique(training_labels)

# PREDICT
# Provide the array of predictions for the passed in image
def predict_image(image):
    image_test = (np.expand_dims(image,0))
    return model.predict(image_test)[0]

# VISUALIZE
# Function to predict an image based on the model and visualize predictions
def visualize_image_prediction(image):
    predictions = predict_image(image)
    fig = plt.figure(figsize=(18, 5))
    grid = plt.GridSpec(1, 3, wspace=0.4, hspace=0.3)
    plt.subplot(grid[0, 0])
    plt.imshow(image, cmap='gray_r')
    plt.subplot(grid[0, 1:])
    plt.bar(range(10), predictions, color="#f05a28")
    plt.xticks(range(10), class_names)

Now that we have this logic in place, we can test against 5 of the training images:

In [None]:
for i in range(5):
    visualize_image_prediction(training_images[i+1])
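
The plots above give a visual spot check. A numeric check can complement them. The sketch below is one possible way to compute the share of correct predictions from per-class scores. The helper name `fraction_correct` is hypothetical and not part of the original notebook. It works on plain Python lists, so here it would be fed something like `model.predict(test_images[:100]).tolist()` and `test_labels[:100].tolist()`, which is an assumed usage.

```python
def fraction_correct(score_rows, labels):
    """Return the share of rows whose highest-scoring class equals the label."""
    if len(score_rows) != len(labels):
        raise ValueError("score_rows and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")
    correct = 0
    for scores, label in zip(score_rows, labels):
        # Index of the largest score, equivalent to argmax for one row
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == int(label):
            correct += 1
    return correct / len(labels)

# Tiny hand-made example with three classes (illustrative values only)
example_scores = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
example_labels = [1, 0, 2, 1]
print(fraction_correct(example_scores, example_labels))
```

A low value at this stage would suggest a problem with the downloaded model or the preprocessing, which is cheaper to find now than after deployment.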

## Deploy Model as Web Service

Now that we have validated that the model is working locally, we can transition to deploying the model as a web service.

### Configuring the Web Service

The first step is to create a script that will be called when the service is executed.  We will call this script `score.py`.

There are two functions that we must implement in this file:

* `init` - this function will handle the loading of the model
* `run` - this will handle running inference against the data that is passed in with each call

You can see an implementation of this below:

In [None]:
%%writefile score.py
import json
import numpy as np
import os
from tensorflow.keras.models import load_model

# Initialize the model
def init():
    global model
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'mnist.h5')
    model = load_model(model_path)

# Run inference against data that is passed in
def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    results = model.predict(data)
    output = []
    for result in results:
        output.append(construct_output(result))
    return output

# Utility function to construct output data per item passed in
def construct_output(result):
    # Index of the class with the highest predicted probability
    result_index = np.argmax(result)
    output = { 'value': str(result_index) }
    # .item() converts NumPy scalars to plain Python numbers for JSON
    output['certainty'] = result[result_index].item()
    possibilities = {}
    for i, val in enumerate(result):
        possibilities[i] = val.item()
    output['possibilities'] = possibilities
    return output
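
To make the shape of the response concrete, the sketch below mirrors `construct_output` using plain Python lists instead of NumPy arrays. The function `construct_output_sketch` is a hypothetical stand-in, not the code that runs in the service. It assumes the service returns the result of `run` as JSON, which is what the HTTP test relies on when it calls `json.loads`. One detail worth noticing is that the integer keys of `possibilities` come back as strings after a JSON round trip.

```python
import json

def construct_output_sketch(scores):
    """Build the same dictionary layout as construct_output, from a plain list."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return {
        "value": str(best_index),
        "certainty": scores[best_index],
        "possibilities": {i: score for i, score in enumerate(scores)},
    }

# Illustrative scores for a three-class problem (the real model has ten classes)
scores = [0.05, 0.05, 0.9]
response = [construct_output_sketch(scores)]

# Simulate what a client sees after the response travels as JSON
decoded = json.loads(json.dumps(response))
print(decoded[0]["value"])
print(sorted(decoded[0]["possibilities"]))
```

A client that looks up a class in `possibilities` should therefore use string keys such as `"7"` rather than the integer `7`.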

Next, we configure the environment that our service will run in.  This includes adding conda packages for both `tensorflow` and `keras`.

In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies()
myenv.add_conda_package("tensorflow")
myenv.add_conda_package("keras")

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())
    
# Review environment file
with open("myenv.yml","r") as f:
    print(f.read())

Once that is in place, we can create the deployment configuration using the `AciWebservice` class.  This allows our web service to be launched on Azure Container Instances.

In [None]:
from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "MNIST",  "method" : "keras"}, 
                                               description='Predict MNIST with keras')

### Executing the Deploy

Now that we have configured the deployment, we can execute it.  This will take some time to complete (around 10 minutes).

In [None]:
%%time
from azureml.core.webservice import Webservice
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(runtime= "python", 
                                   entry_script="score.py",
                                   conda_file="myenv.yml")

service = Model.deploy(workspace=ws, 
                       name='keras-mnist-svc', 
                       models=[amlModel], 
                       inference_config=inference_config, 
                       deployment_config=aciconfig)

service.wait_for_deployment(show_output=True)
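
If the deployment does not come up cleanly, the container logs are usually the quickest route to the cause. The sketch below shows one way to guard against a failed deployment before using the service. The helper `scoring_uri_if_healthy` and the `FakeService` stand-in are hypothetical. The helper relies on the `state`, `get_logs()` and `scoring_uri` members of the SDK's `Webservice` object, and it assumes that a successful deployment reports the state `"Healthy"`.

```python
def scoring_uri_if_healthy(service):
    """Return the scoring URI, or raise with the container logs attached."""
    if service.state != "Healthy":
        logs = service.get_logs()
        raise RuntimeError(f"Deployment state is {service.state!r}. Logs:\n{logs}")
    return service.scoring_uri

# Stand-in object so the sketch runs without Azure access (hypothetical values)
class FakeService:
    state = "Healthy"
    scoring_uri = "http://example.invalid/score"

    def get_logs(self):
        return "no logs"

print(scoring_uri_if_healthy(FakeService()))
```

In the notebook, the call would be `scoring_uri_if_healthy(service)` after `wait_for_deployment` returns.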

Once the deployment completes, we can get the URL of the scoring endpoint:

In [None]:
print(service.scoring_uri)

## Utilizing the Deployed Service

Now that the service has been deployed, we can validate it using two different approaches:

* Using the Azure ML Python SDK
* Using it as a REST endpoint

### Testing with the SDK

If we have a reference to the `service` object, we can simply call `service.run` and pass in the data we want to test:

In [None]:
import json

# Get a sample index
sample_indices = np.random.permutation(test_images.shape[0])[0:1]

# Structure input data
test_samples = json.dumps({"data": test_images[sample_indices].tolist()})
print("JSON Input: " + test_samples)
test_samples = bytes(test_samples, encoding='utf8')

# Execute the predictions
results = service.run(input_data=test_samples)

# Utility function to display the result
def display_result(image, result):
    fig = plt.figure(figsize=(12, 5))
    grid = plt.GridSpec(1, 3, wspace=0.4, hspace=0.3)
    plt.subplot(grid[0, 0])
    plt.imshow(image, cmap='gray_r')
    print("\n\n")
    print(f'Predicted Value: {result["value"]}')
    print(f'Certainty: {str(result["certainty"])}')
    print(f'Raw Result: {str(result)}')
 
# Show the results
for i, val in enumerate(results):
    display_result(test_images[sample_indices[i]], val)
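
The service expects the `data` field to hold a batch of images, so a single image still has to be wrapped in an outer list. The sketch below is one way to confirm the shape of a payload before sending it. The helper `payload_shape` is hypothetical, and the 28 by 28 size comes from the reshape step earlier in this notebook.

```python
import json

def payload_shape(json_text):
    """Return (count, rows, cols) for a {"data": [...]} payload, or raise ValueError."""
    images = json.loads(json_text)["data"]
    if not images:
        raise ValueError("payload contains no images")
    rows = len(images[0])
    cols = len(images[0][0])
    for image in images:
        # Every image must have the same number of rows and columns
        if len(image) != rows or any(len(row) != cols for row in image):
            raise ValueError("images in the payload do not share one shape")
    return len(images), rows, cols

# Two blank 28x28 images stand in for real MNIST samples
blank_image = [[0.0] * 28 for _ in range(28)]
example_payload = json.dumps({"data": [blank_image, blank_image]})
print(payload_shape(example_payload))
```

Because `json.loads` accepts both `str` and `bytes`, the helper can be applied to `test_samples` either before or after the conversion to bytes.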


### Testing via HTTP

Now that we have tested via the SDK, we can validate the service using any tool that can call the REST endpoint.  We will use the `requests` module in Python:

In [None]:
import requests

# Get a sample index
sample_indices = np.random.permutation(test_images.shape[0])[0:1]
    
# Structure input data
test_samples = json.dumps({"data": test_images[sample_indices].tolist()})
print("JSON Input: " + test_samples)
test_samples = bytes(test_samples, encoding='utf8')

# Set the header and perform a POST request
headers = {'Content-Type':'application/json'}
resp = requests.post(service.scoring_uri, test_samples, headers=headers)

print("POST to url", service.scoring_uri)

# Read the returned JSON data
results = json.loads(resp.text)

# Show the results
for i, val in enumerate(results):
    display_result(test_images[sample_indices[i]], val)
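
The HTTP cell above parses the response body without checking whether the request succeeded. The sketch below is one way to add that check. The helper `parse_scoring_response` is hypothetical, and it assumes, as the cell above does, that a successful response body is a JSON list with one entry per image.

```python
import json

def parse_scoring_response(status_code, body_text):
    """Return the decoded result list, or raise if the call did not succeed."""
    if status_code != 200:
        # Include the start of the body, which often carries the error message
        raise RuntimeError(
            f"Scoring request failed with HTTP {status_code}: {body_text[:200]}"
        )
    results = json.loads(body_text)
    if not isinstance(results, list):
        raise ValueError("expected a JSON list with one result per image")
    return results

# Illustrative response body, not real service output
example_body = '[{"value": "7", "certainty": 0.98}]'
print(parse_scoring_response(200, example_body)[0]["value"])
```

With the `requests` response from the cell above, the call would be `parse_scoring_response(resp.status_code, resp.text)`.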