# Converting a Keras model to ONNX

In the steps that follow, you will convert the Keras model you just trained to the ONNX format. This enables you to use the model for classification in a broad range of environments outside of Azure Databricks, including:

- Web services
- iOS and Android mobile apps
- Windows apps
- IoT devices

Furthermore, ONNX runtimes and libraries are designed to maximize inference performance across a wide range of hardware. In this lab, we will compare the inference performance of the ONNX and Keras models.

First we will load the trained Keras model from file, and then convert the model to ONNX.

## Load the Keras Model

Load both the saved Keras model and the associated vectorizer. We will convert the Keras model to ONNX format; however, we also need the vectorizer to transform the input data before making inferences.

In [None]:
import os
import numpy as np
import pandas as pd

np.random.seed(125)

from keras.models import load_model
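try:
    import joblib
except ImportError:
    # Older scikit-learn releases bundled joblib under sklearn.externals
    from sklearn.externals import joblib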

output_folder = './output'
model_filename = 'final_model.hdf5'

keras_model = load_model(os.path.join(output_folder, model_filename))
# summary() prints the table itself and returns None, so no print() is needed
keras_model.summary()

vectorizer_name = 'vectorizer'
vectorizer = joblib.load(os.path.join(output_folder, vectorizer_name))
print('{} loaded!'.format(vectorizer_name))

## Convert to ONNX

Convert the loaded Keras model to ONNX format, and save the ONNX model to the deployment folder.

In [None]:
import onnxmltools

deployment_folder = 'deploy'
onnx_export_folder = 'onnx'

# Convert the Keras model to ONNX
onnx_model_name = 'claim_classifier.onnx'
converted_model = onnxmltools.convert_keras(keras_model, onnx_model_name, target_opset=7)

# Save the model locally...
onnx_model_path = os.path.join(deployment_folder, onnx_export_folder)
os.makedirs(onnx_model_path, exist_ok=True)
onnxmltools.utils.save_model(converted_model, os.path.join(onnx_model_path,onnx_model_name))

## Make Inference using the ONNX Model

- Create an ONNX runtime InferenceSession
- Review the expected input shape to make inferences
- Prepare test data
- Make inferences using both the ONNX and the Keras Model on the test data

### ONNX Runtime InferenceSession

In [None]:
import onnxruntime
# Load the ONNX model and observe the expected input shape
onnx_session = onnxruntime.InferenceSession(
    os.path.join(deployment_folder, onnx_export_folder, onnx_model_name))
input_name = onnx_session.get_inputs()[0].name
output_name = onnx_session.get_outputs()[0].name
print('Expected input shape: ', onnx_session.get_inputs()[0].shape)
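
The shape printed above can mix fixed sizes with dynamic dimensions, which ONNX Runtime reports as a string name or `None`. The following is a minimal sketch of how you might check an array's shape against it before calling `run`. The helper name `shape_matches` and the example shape are assumptions for illustration, not part of the lab.

```python
from typing import Sequence, Union

# A dimension is a fixed int, a symbolic name (str), or unknown (None).
Dim = Union[int, str, None]


def shape_matches(expected: Sequence[Dim], actual: Sequence[int]) -> bool:
    """Return True if `actual` is compatible with an ONNX-style `expected` shape."""
    if len(expected) != len(actual):
        return False
    for exp, act in zip(expected, actual):
        # Only fixed (int) dimensions must match; dynamic ones accept any size.
        if isinstance(exp, int) and exp != act:
            return False
    return True


# Hypothetical shape: a free batch dimension and 1,000 features.
expected_shape = ["N", 1000]
print(shape_matches(expected_shape, (1, 1000)))
print(shape_matches(expected_shape, (1, 999)))
```

In the notebook, you could call it as `shape_matches(onnx_session.get_inputs()[0].shape, test_claim.shape)` once the test data is prepared.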

### Prepare test data

Load the helper code to normalize the test data

In [None]:
import sys
data_location = './data'
sys.path.append(data_location)
import textanalytics as ta

Normalize and transform the test data

In [None]:
test_claim = ['I crashed my car into a pole.']
test_claim = ta.normalize_corpus(test_claim)
test_claim = vectorizer.transform(test_claim)

test_claim = test_claim.toarray().astype(np.float32)
print(test_claim.shape)

### Make Inferences

Make inferences using both the ONNX and the Keras Model on the test data

In [None]:
# Run an ONNX session to classify the sample.
print('ONNX prediction: ', onnx_session.run([output_name], {input_name : test_claim}))

# Use Keras to make predictions on the same sample
print('Keras prediction: ', keras_model.predict(test_claim))
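
Rather than comparing the two print-outs by eye, you can check numerically that the converted model reproduces the Keras output within a tolerance. This is a sketch only: the helper name `predictions_agree`, the tolerances, and the sample values are assumptions, and the right tolerance depends on your model.

```python
import numpy as np


def predictions_agree(onnx_outputs, keras_output, rtol=1e-3, atol=1e-5):
    """Compare the first ONNX output with the Keras output element-wise."""
    # onnx_session.run returns a list of arrays, one per requested output.
    onnx_array = np.asarray(onnx_outputs[0])
    keras_array = np.asarray(keras_output)
    if onnx_array.shape != keras_array.shape:
        return False
    return bool(np.allclose(onnx_array, keras_array, rtol=rtol, atol=atol))


# Made-up values standing in for the two predictions printed above.
onnx_result = [np.array([[0.0213, 0.9787]], dtype=np.float32)]
keras_result = np.array([[0.02131, 0.97869]], dtype=np.float32)
print(predictions_agree(onnx_result, keras_result))
```

In the notebook, the arguments would be `onnx_session.run([output_name], {input_name: test_claim})` and `keras_model.predict(test_claim)`.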

## Compare Inference Performance: ONNX vs Keras

Evaluate the performance of ONNX and Keras by running the same sample 20,000 times. Run the next three cells and compare the performance in your environment.

In [None]:
# Next we will compare the performance of ONNX vs Keras
import timeit
n = 20000

In [None]:
start_time = timeit.default_timer()
for i in range(n):
    keras_model.predict(test_claim)
keras_elapsed = timeit.default_timer() - start_time
print('Keras performance: ', keras_elapsed)

In [None]:
start_time = timeit.default_timer()
for i in range(n):
    onnx_session.run([output_name], {input_name : test_claim})
onnx_elapsed = timeit.default_timer() - start_time
print('ONNX performance: ', onnx_elapsed)
print('ONNX is about {} times faster than Keras'.format(round(keras_elapsed/onnx_elapsed)))
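
The two timing cells repeat the same loop, and the first calls may include one-time start-up cost. As a hedged alternative, the sketch below wraps the loop in a reusable function with untimed warm-up calls and reports per-call latency. The function names and the stand-in workloads are assumptions for illustration.

```python
import timeit


def mean_latency(fn, n=1000, warmup=10):
    """Return the mean wall-clock seconds per call of `fn` over `n` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = timeit.default_timer()
    for _ in range(n):
        fn()
    return (timeit.default_timer() - start) / n


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Stand-in workloads; in the notebook these would be
#   lambda: keras_model.predict(test_claim)
#   lambda: onnx_session.run([output_name], {input_name: test_claim})
slow = mean_latency(lambda: sum(range(2000)), n=200)
fast = mean_latency(lambda: sum(range(20)), n=200)
print('Speedup: {:.1f}x'.format(speedup(slow, fast)))
```

Reporting seconds per call makes results comparable even if you change `n` between runs.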

# Deploy ONNX model to Azure Container Instance (ACI)

## Create and connect to an Azure Machine Learning Workspace

Review the workspace config file saved in the previous notebook.

In [None]:
!cat .azureml/config.json

**Create the `Workspace` from the saved config file**

In [None]:
import azureml.core

print(azureml.core.VERSION)

from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print(ws)

## Register the model with Azure Machine Learning

In the following cell, you register the model and the vectorizer with Azure Machine Learning, which saves a copy of each in the cloud.

In [None]:
#Register the model and vectorizer
from azureml.core.model import Model

registered_model_name = 'claim_classifier_onnx'
onnx_model_path = os.path.join(deployment_folder, onnx_export_folder, onnx_model_name)

registered_model = Model.register(model_path = onnx_model_path, # this points to a local file
                       model_name = registered_model_name, # this is the name the model is registered with         
                       description = "Claims classification model.",
                       workspace = ws)

print(registered_model.name, registered_model.description, registered_model.version)

output_folder = './output'
vectorizer_name = 'vectorizer'
vectorizer_path = os.path.join(output_folder, vectorizer_name)

registered_vectorizer = Model.register(model_path = vectorizer_path, # this points to a local file
                       model_name = vectorizer_name, # this is the name the model is registered with         
                       description = "Claims classification model vectorizer.",
                       workspace = ws)

print(registered_vectorizer.name, registered_vectorizer.description, registered_vectorizer.version)

## Create the scoring web service

When deploying models for scoring with Azure Machine Learning services, you need to define the code for a simple web service that will load your model and use it for scoring. By convention, this service has two methods: `init`, which loads the model, and `run`, which scores data using the loaded model.

This scoring service code will later be deployed inside of a specially prepared Docker container.
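
The `init`/`run` convention can be sketched in a few lines. The example below is a simplified illustration rather than the lab's scoring script: the keyword-based stand-in model and its label meaning (1 when the text mentions a car) are hypothetical, chosen only so that the sketch runs without any registered model.

```python
import json

model = None  # populated once by init()


def _keyword_model(text):
    """Hypothetical stand-in model: label 1 if the text mentions a car."""
    return 1 if 'car' in text.lower() else 0


def init():
    """Load the model a single time, when the service starts."""
    global model
    model = _keyword_model


def run(raw_data):
    """Score one request; `raw_data` is a JSON-encoded list of strings."""
    try:
        claims = json.loads(raw_data)
        return [model(claim) for claim in claims]
    except Exception as e:
        # Return the error text so the caller can see what went wrong.
        return str(e)


init()
print(run(json.dumps(['I crashed my car into a pole.'])))
```

Keeping the expensive loading in `init` means each call to `run` only pays for preprocessing and inference.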

Confirm that the current directory is the parent directory of the deployment folder.

In [None]:
cwd = os.getcwd()
if cwd.endswith(deployment_folder):
    os.chdir('../')

**Save the scoring web service Python file in the deployment folder**

Note that the scoring web service needs both registered models, the ONNX model and the vectorizer, to make inferences.

In [None]:
%%writefile $deployment_folder/scoring_service.py
import json
import numpy as np
import os
import sys
import urllib.request
import nltk
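try:
    import joblib
except ImportError:
    # Older scikit-learn releases bundled joblib under sklearn.externals
    from sklearn.externals import joblib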
from azureml.core.model import Model
import onnxruntime

onnx_model_name = 'claim_classifier_onnx'
vectorizer_name = 'vectorizer'

def init():

    global onnx_session
    global vectorizer
    
    try:
        # Takes at most a couple of minutes to download all NLTK content
        print("downloading nltk.")
        nltk.download("all")
        
        tempFolderName = './resources'
        os.makedirs(tempFolderName, exist_ok=True)
        print('Content files will be saved to {0}'.format(tempFolderName))
        
        base_data_url = 'https://databricksdemostore.blob.core.windows.net/data/05.03/'
        filesToDownload = ['contractions.py', 'textanalytics.py']
        
        for file in filesToDownload:
            # Build the URL by concatenation; base_data_url already ends with '/'
            data_url = base_data_url + file
            local_file_path = os.path.join(tempFolderName, file)
            urllib.request.urlretrieve(data_url, local_file_path)
            print('Downloaded file: ', file)
        
        print('Importing textanalytics...')
        sys.path.append(tempFolderName)
        import textanalytics as ta
        print('Done importing textanalytics.')
        
        # Retrieve the path to the model file using the model name
        onnx_model_path = Model.get_model_path(onnx_model_name)
        print('onnx_model_path: ', onnx_model_path)
        
        vectorizer_path = Model.get_model_path(vectorizer_name)
        print('vectorizer_path: ', vectorizer_path)
        
        onnx_session = onnxruntime.InferenceSession(onnx_model_path)
        print('Onnx Inference Session Created!')
        
        vectorizer = joblib.load(vectorizer_path)
        print('Vectorizer Loaded!')
    except Exception as e:
        print(e)

def run(raw_data):
    try:
        print("Received input: ", raw_data)
        
        print('Importing textanalytics...')
        import textanalytics as ta
        print('Done importing textanalytics.')
        
        print('Processing input...')
        input_data = np.array(json.loads(raw_data))
        input_data = ta.normalize_corpus(input_data)
        input_data = vectorizer.transform(input_data)
        input_data = input_data.toarray().astype(np.float32)
        print('Done processing input.')
        
        # Run an ONNX session to classify the input.
        result = onnx_session.run(None, {onnx_session.get_inputs()[0].name: input_data})[0].argmax(axis=1).item()
        # return just the classification index (0 or 1)
        return result
    except Exception as e:
        print(e)
        error = str(e)
        return error
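
The `run` method above ends with `.item()`, which only succeeds for a single-element array, so it handles one claim per request. If you wanted to score several claims in one call, the post-processing could return one index per row instead. The sketch below shows that step on plain Python lists with made-up scores; with a NumPy array, the equivalent is `.argmax(axis=1).tolist()`. The helper name `argmax_per_row` is an assumption for illustration.

```python
def argmax_per_row(scores):
    """Return the index of the largest score in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Made-up scores for three claims and two classes.
scores = [[0.02, 0.98], [0.91, 0.09], [0.30, 0.70]]
print(argmax_per_row(scores))  # [1, 0, 1]
```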

## Package Model

Your scoring service can have its dependencies installed by using a Conda environment file. Items listed in this file will be installed with conda or pip within the Docker container that is created, and will thus be available to your scoring web service logic.

Build a container image that names the scoring service script, the runtime (python or Spark), the conda file, and the registered models.

Run the following cell. This may take 5 to 10 minutes to complete.

In [None]:
# create a Conda dependencies environment file
print("Creating conda dependencies file locally...")
from azureml.core.conda_dependencies import CondaDependencies 
conda_packages = ['numpy', 'scikit-learn']
pip_packages = ['nltk', 'azureml-sdk', 'onnxruntime']
mycondaenv = CondaDependencies.create(conda_packages=conda_packages, pip_packages=pip_packages)

cwd = os.getcwd()
if not cwd.endswith(deployment_folder):
    os.chdir(deployment_folder)
    
conda_file = 'dependencies.yml'
with open(conda_file, 'w') as f:
    f.write(mycondaenv.serialize_to_string())

runtime = 'python'
execution_script = 'scoring_service.py'

# create container image configuration
print("Creating container image configuration...")
from azureml.core.image import ContainerImage
image_config = ContainerImage.image_configuration(execution_script = execution_script, 
                                                  runtime = runtime, conda_file = conda_file)

# create the image
image_name = 'claim-classifier-image'

from azureml.core import Image
image = Image.create(name=image_name, models=[registered_model, registered_vectorizer], 
                     image_config=image_config, workspace=ws)

# wait for image creation to finish
image.wait_for_creation(show_output=True)

os.chdir("..")

## Deploy Model to ACI

Deploy the webservice from the image created in the previous step.

You will see output similar to the following when your web service is ready: `Succeeded`, followed by `ACI service creation operation finished, operation "Succeeded"`.

Run the following cell. This takes around 5 minutes to complete.

In [None]:
from azureml.core.webservice import AciWebservice, Webservice

aci_config = AciWebservice.deploy_configuration(
    cpu_cores = 1, 
    memory_gb = 1, 
    tags = {'name': 'Claim Classification'}, 
    description = "Classifies a claim as home or auto.")

service_name = "claimclassservice"

aci_service = Webservice.deploy_from_image(deployment_config=aci_config, 
                                           image=image, 
                                           name=service_name, 
                                           workspace=ws)

aci_service.wait_for_deployment(show_output=True)

## Test Deployment

### Make direct calls on the service object

In [None]:
import json

test_claims = ['I crashed my car into a pole.', 
               'The flood ruined my house.', 
               'I lost control of my car and fell in the river.']

for i in range(len(test_claims)):
    result = aci_service.run(json.dumps([test_claims[i]]))
    print('Predicted label for test claim #{} is {}'.format(i+1, result))

### Make HTTP calls to test the deployed Web Service

To call the service from a REST client, you need the scoring URI. Take note of the printed scoring URI; you will need it in the last notebook.

The default settings used in deploying this service result in a service that does not require authentication, so the scoring URI is the only value you need to call this service.

In [None]:
import requests

url = aci_service.scoring_uri
print('ACI Service: Claim Classification scoring URI is: {}'.format(url))
headers = {'Content-Type':'application/json'}

for i in range(len(test_claims)):
    response = requests.post(url, json.dumps([test_claims[i]]), headers=headers)
    print('Predicted label for test claim #{} is {}'.format(i+1, response.text))
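
Because the scoring script returns an error string when something fails, a client cannot tell a label from an error by printing `response.text` alone. The sketch below shows one way to separate the two. It assumes the response body is the JSON-encoded return value of `run`, which you should confirm against your own service; the helper name `parse_label` is hypothetical.

```python
import json


def parse_label(status_code, body):
    """Turn an HTTP status and response body into an integer class index."""
    if status_code != 200:
        raise RuntimeError(
            'Scoring call failed with HTTP {}: {}'.format(status_code, body))
    payload = json.loads(body)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(payload, int) and not isinstance(payload, bool):
        return payload
    # run() returns str(e) on failure, so anything else is treated as an error.
    raise ValueError('Service returned an error: {}'.format(payload))


print(parse_label(200, '1'))
```

In the loop above, you would call it as `parse_label(response.status_code, response.text)`.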