# Step 4: Model operationalization & Deployment

In this notebook, a model is saved as a `.model` file along with the relevant schema for deployment. The scoring functions are first tested locally before the model is operationalized with the Azure Machine Learning Model Management environment for real-time use in production.

**Note:** This notebook will take about 1 minute to execute all cells, depending on the compute configuration you have set up.

In [1]:
## setup our environment by importing required libraries
import json
import os
import shutil
import time

from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier

# for creating pipelines and model
from pyspark.ml.feature import StringIndexer, VectorAssembler, VectorIndexer

# setup the pyspark environment
from pyspark.sql import SparkSession

from azureml.api.schema.dataTypes import DataTypes
from azureml.api.schema.sampleDefinition import SampleDefinition
from azureml.api.realtime.services import generate_schema

# For Azure blob storage access
from azure.storage.blob import BlockBlobService
from azure.storage.blob import PublicAccess

# For logging model evaluation parameters back into the
# AML Workbench run history plots.
import logging
from azureml.logging import get_azureml_logger

amllog = logging.getLogger("azureml")
amllog.level = logging.INFO

# Turn on cell level logging.
%azureml history on
%azureml history show

# Time the notebook execution. 
# This will only make sense if you "Run all cells"
tic = time.time()

logger = get_azureml_logger() # logger writes to AMLWorkbench runtime view
spark = SparkSession.builder.getOrCreate()

# Telemetry
logger.log('amlrealworld.predictivemaintenance.operationalization','true')

History logging enabled
History logging is enabled


<azureml.logging.script_run_request.ScriptRunRequest at 0x7f622700d470>

We need to load the feature data set from blob storage to construct the operationalization schema. As in the earlier notebooks, this requires your storage account name and account key to connect to the blob storage.

In [2]:
# Enter your Azure blob storage details here 
ACCOUNT_NAME = "<your blob storage account name>"

# You can find the account key under the _Access Keys_ link in the 
# [Azure Portal](portal.azure.com) page for your Azure storage account.
ACCOUNT_KEY = "<your blob storage account key>"
#-------------------------------------------------------------------------------------------
# This container holds the feature engineering results written by the
# earlier notebooks. It is created here only if it does not already exist;
# this notebook reads from it and does not erase its contents.
CONTAINER_NAME = "featureengineering"
FE_DIRECTORY = 'featureengineering_files.parquet'

MODEL_CONTAINER = 'modeldeploy'

# Connect to your blob service     
az_blob_service = BlockBlobService(account_name=ACCOUNT_NAME, account_key=ACCOUNT_KEY)

# Create a new container if necessary, otherwise you can use an existing container.
# This command creates the container if it does not already exist. Else it does nothing.
az_blob_service.create_container(CONTAINER_NAME, 
                                 fail_on_exist=False, 
                                 public_access=PublicAccess.Container)

# create a local path where to store the results later.
if not os.path.exists(FE_DIRECTORY):
    os.makedirs(FE_DIRECTORY)

# download the entire parquet result folder to local path for a new run 
for blob in az_blob_service.list_blobs(CONTAINER_NAME):
    if CONTAINER_NAME in blob.name:
        local_file = os.path.join(FE_DIRECTORY, os.path.basename(blob.name))
        az_blob_service.get_blob_to_path(CONTAINER_NAME, blob.name, local_file)

fedata = spark.read.parquet(FE_DIRECTORY)

fedata.limit(5).toPandas().head(5)


| | machineID | dt_truncated | volt_rollingmean_3 | rotate_rollingmean_3 | pressure_rollingmean_3 | vibration_rollingmean_3 | volt_rollingmean_24 | rotate_rollingmean_24 | pressure_rollingmean_24 | vibration_rollingmean_24 | ... | error5sum_rollingmean_24 | comp1sum | comp2sum | comp3sum | comp4sum | model | age | model_encoded | failure | label_e |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 27 | 2016-01-01 06:00:00 | 147.813753 | 410.546469 | 103.110374 | 39.881874 | 166.464991 | 449.92176 | 100.608971 | 40.31356 | ... | 0.0 | 504.0 | 564.0 | 444.0 | 399.0 | model2 | 9 | (0.0, 0.0, 1.0) | 0.0 | 0.0 |
| 1 | 27 | 2016-01-01 03:00:00 | 161.893907 | 457.866635 | 106.67166 | 42.281086 | 167.917852 | 459.85011 | 99.954524 | 40.198525 | ... | 0.0 | 504.0 | 564.0 | 444.0 | 399.0 | model2 | 9 | (0.0, 0.0, 1.0) | 0.0 | 0.0 |
| 2 | 27 | 2016-01-01 00:00:00 | 159.216094 | 466.617543 | 102.92824 | 39.135677 | 169.175332 | 456.416658 | 99.402692 | 39.688645 | ... | 0.0 | 504.0 | 564.0 | 444.0 | 399.0 | model2 | 9 | (0.0, 0.0, 1.0) | 0.0 | 0.0 |
| 3 | 27 | 2015-12-31 21:00:00 | 173.141342 | 466.089834 | 102.410363 | 40.737921 | 170.269608 | 453.365861 | 97.793726 | 39.614332 | ... | 0.0 | 503.0 | 563.0 | 443.0 | 398.0 | model2 | 9 | (0.0, 0.0, 1.0) | 0.0 | 0.0 |
| 4 | 27 | 2015-12-31 18:00:00 | 173.328305 | 445.790528 | 96.623228 | 39.30975 | 168.427467 | 452.489297 | 96.946852 | 39.826918 | ... | 0.0 | 503.0 | 563.0 | 443.0 | 398.0 | model2 | 9 | (0.0, 0.0, 1.0) | 0.0 | 0.0 |


## Define init and run functions
Start by defining the `init()` and `run()` functions as shown in the cell below. They are later written to the `pdmscore.py` file, which loads the model, performs the prediction, and returns the result.

The `init()` function initializes your web service, loading any data or models needed to score your inputs. In the example below, we load the trained model. This function runs when the Docker container hosting your service starts.

The `run()` function defines what is executed on a scoring call. In our simple example, we load the input as a data frame, run our pipeline on it, and return the prediction.

In [3]:
def init():
    # read in the model file
    from pyspark.ml import PipelineModel
    global pipeline
    
    # os.path.join works whether or not the share directory ends with a separator
    pipeline = PipelineModel.load(os.path.join(os.environ['AZUREML_NATIVE_SHARE_DIRECTORY'], 'pdmrfull.model'))
    
def run(input_df):
    import json
    response = ''
    try:
        #Get prediction results for the dataframe
        
        # We'll use the known label, key variables and 
        # a few extra columns we won't need.
        key_cols =['label_e','machineID','dt_truncated', 'failure','model_encoded','model' ]

        # Then get the remaining feature names from the data
        input_features = input_df.columns

        # Remove the extra stuff if it's in the input_df
        input_features = [x for x in input_features if x not in set(key_cols)]
        
        # Vectorize as in model building
        va = VectorAssembler(inputCols=(input_features), outputCol='features')
        data = va.transform(input_df).select('machineID','features')
        score = pipeline.transform(data)
        predictions = score.collect()

        #Get each scored result
        preds = [str(x['prediction']) for x in predictions]
        response = ",".join(preds)
    except Exception as e:
        print("Error: {0}".format(e))
        return (str(e))
    
    # Return results
    print(json.dumps(response))
    return json.dumps(response)

### Create schema and schema file
Create a schema for the input to the web service and generate the schema file. This will be used to create a Swagger file for your web service which can be used to discover its input and sample data when calling it.

In [4]:
# define the input data frame
inputs = {"input_df": SampleDefinition(DataTypes.SPARK, 
                                       fedata.drop("dt_truncated","failure","label_e", "model","model_encoded"))}

json_schema = generate_schema(run_func=run, inputs=inputs, filepath='service_schema.json')


### Test init and run
We can then test the init() and run() functions right here in the notebook, before we decide to actually publish a web service.

In [5]:
# We'll use the known label, key variables and 
# a few extra columns we won't need. (machineID is required)
key_cols =['label_e','dt_truncated', 'failure','model_encoded','model' ]

# Then get the remaining feature names from the data
input_features = fedata.columns
# Remove the extra stuff if it's in the input_df
input_features = [x for x in input_features if x not in set(key_cols)]


# this is an example input data record
input_data = [[114, 163.375732902,333.149484586,100.183951698,44.0958812638,164.114723991,
               277.191815232,97.6289110707,50.8853505161,21.0049565219,67.5287259378,12.9361526861,
               4.61359760918,15.5377738062,67.6519885441,10.528274633,6.94129487555,0.0,0.0,0.0,
               0.0,0.0,489.0,549.0,549.0,564.0,18.0]]

df = (spark.createDataFrame(input_data, input_features))

# test init() in local notebook
init()

# test run() in local notebook
run(df)

"0.0"


'"0.0"'
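The local test above passes a positional list to `spark.createDataFrame`, whereas the sample calls elsewhere in this notebook send a JSON payload keyed by column name under `input_df`. A minimal sketch of turning positional records into that payload shape follows. The feature names are copied from the sample payload in the `pdmscore.py` cell, and `FEATURE_NAMES` and `build_payload` are illustrative names, not part of the Azure ML SDK.

```python
import json

# Column order of the example record, copied from the sample payload
# in pdmscore.py. These names are specific to this scenario.
FEATURE_NAMES = [
    "machineID",
    "volt_rollingmean_3", "rotate_rollingmean_3",
    "pressure_rollingmean_3", "vibration_rollingmean_3",
    "volt_rollingmean_24", "rotate_rollingmean_24",
    "pressure_rollingmean_24", "vibration_rollingmean_24",
    "volt_rollingstd_3", "rotate_rollingstd_3",
    "pressure_rollingstd_3", "vibration_rollingstd_3",
    "volt_rollingstd_24", "rotate_rollingstd_24",
    "pressure_rollingstd_24", "vibration_rollingstd_24",
    "error1sum_rollingmean_24", "error2sum_rollingmean_24",
    "error3sum_rollingmean_24", "error4sum_rollingmean_24",
    "error5sum_rollingmean_24",
    "comp1sum", "comp2sum", "comp3sum", "comp4sum", "age",
]


def build_payload(rows, feature_names=FEATURE_NAMES):
    """Return a JSON string of the form {"input_df": [{name: value, ...}, ...]}."""
    records = []
    for row in rows:
        # Guard against silently misaligned columns.
        if len(row) != len(feature_names):
            raise ValueError(
                "expected %d values, got %d" % (len(feature_names), len(row))
            )
        records.append(dict(zip(feature_names, row)))
    return json.dumps({"input_df": records})
```

Calling `build_payload(input_data)` with the example record above would produce a string that can be compared against the hard-coded sample payload, which makes column-ordering mistakes easier to spot.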

## Persist model assets

Next we persist the assets we have created to disk for use in operationalization.

In [6]:
# save the schema file for deployment
out = json.dumps(json_schema)
with open(os.path.join(os.environ['AZUREML_NATIVE_SHARE_DIRECTORY'], 'service_schema.json'), 'w') as f:
    f.write(out)

Now we use the `%%writefile` cell magic to save the `init()` and `run()` functions to the `pdmscore.py` file.

In [7]:
%%writefile {os.environ['AZUREML_NATIVE_SHARE_DIRECTORY']}/pdmscore.py

import json
from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier, DecisionTreeClassifier

# for creating pipelines and model
from pyspark.ml.feature import StringIndexer, VectorAssembler, VectorIndexer

def init():
    # read in the model file
    from pyspark.ml import PipelineModel
    global pipeline
    pipeline = PipelineModel.load('pdmrfull.model')
    
def run(input_df):
    response = ''
    try:
       
        # We'll use the known label, key variables and 
        # a few extra columns we won't need.
        key_cols =['label_e','machineID','dt_truncated', 'failure','model_encoded','model' ]

        # Then get the remaining feature names from the data
        input_features = input_df.columns

        # Remove the extra stuff if it's in the input_df
        input_features = [x for x in input_features if x not in set(key_cols)]
        
        # Vectorize as in model building
        va = VectorAssembler(inputCols=(input_features), outputCol='features')
        data = va.transform(input_df).select('machineID','features')
        score = pipeline.transform(data)
        predictions = score.collect()

        #Get each scored result
        preds = [str(x['prediction']) for x in predictions]
        response = ",".join(preds)
    except Exception as e:
        print("Error: {0}".format(e))
        return (str(e))
    
    # Return results
    print(json.dumps(response))
    return json.dumps(response)

if __name__ == "__main__":
    init()
    run("{\"input_df\":[{\"machineID\":114,\"volt_rollingmean_3\":163.375732902,\"rotate_rollingmean_3\":333.149484586,\"pressure_rollingmean_3\":100.183951698,\"vibration_rollingmean_3\":44.0958812638,\"volt_rollingmean_24\":164.114723991,\"rotate_rollingmean_24\":277.191815232,\"pressure_rollingmean_24\":97.6289110707,\"vibration_rollingmean_24\":50.8853505161,\"volt_rollingstd_3\":21.0049565219,\"rotate_rollingstd_3\":67.5287259378,\"pressure_rollingstd_3\":12.9361526861,\"vibration_rollingstd_3\":4.61359760918,\"volt_rollingstd_24\":15.5377738062,\"rotate_rollingstd_24\":67.6519885441,\"pressure_rollingstd_24\":10.528274633,\"vibration_rollingstd_24\":6.94129487555,\"error1sum_rollingmean_24\":0.0,\"error2sum_rollingmean_24\":0.0,\"error3sum_rollingmean_24\":0.0,\"error4sum_rollingmean_24\":0.0,\"error5sum_rollingmean_24\":0.0,\"comp1sum\":489.0,\"comp2sum\":549.0,\"comp3sum\":549.0,\"comp4sum\":564.0,\"age\":18.0}]}")

Writing /azureml-share//pdmscore.py


These files are stored in the `os.environ['AZUREML_NATIVE_SHARE_DIRECTORY']` location on the kernel host machine, alongside the model saved by the `3_model_building.ipynb` notebook. To share these assets and operationalize the model, we create a new blob container and store a compressed file containing them for later retrieval.

In [8]:
# Compress the operationalization assets for easy blob storage transfer
MODEL_O16N = shutil.make_archive('o16n', 'zip', os.environ['AZUREML_NATIVE_SHARE_DIRECTORY'])

# Create a new container if necessary, otherwise you can use an existing container.
# This command creates the container if it does not already exist. Else it does nothing.
az_blob_service.create_container(MODEL_CONTAINER, 
                                 fail_on_exist=False, 
                                 public_access=PublicAccess.Container)

# Transfer the compressed operationalization assets into the blob container.
az_blob_service.create_blob_from_path(MODEL_CONTAINER, "o16n.zip", str(MODEL_O16N) ) 


# Time the notebook execution. 
# This will only make sense if you "Run All" cells
toc = time.time()
print("Full run took %.2f minutes" % ((toc - tic)/60))

logger.log("Operationalization Run time", ((toc - tic)/60))

Full run took 0.68 minutes


<azureml.logging.script_run_request.ScriptRunRequest at 0x7f6226fc3048>

## Deployment

Once the assets are stored, we can download them into a local compute context for operationalization on an Azure web service.

We demonstrate how to set up this web service through a CLI window opened in the AML Workbench application.

### Download the model

To download the model we've saved, follow these instructions on a local computer.

- Open the [Azure Portal](http://portal.azure.com)
- In the left-hand pane, click on __All resources__
- Search for the storage account using the name you provided earlier in this notebook.
- Choose the storage account from the search results list; this opens the storage account panel.
- On the storage account panel, choose __Blobs__
- On the Blobs panel, choose the container __modeldeploy__
- Select the file `o16n.zip` and, on the properties pane for that blob, choose download.

Once downloaded, unzip the file into the directory of your choosing. The zip file contains three deployment assets:

- the `pdmscore.py` file
- a `pdmrfull.model` directory
- the `service_schema.json` file
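Before unzipping, it can help to confirm that the downloaded archive actually contains the three assets listed above. A minimal sketch using only the Python standard library; `missing_assets` is a hypothetical helper name, and the check assumes the assets sit at the top level of the archive, as they would when the share directory itself is zipped.

```python
import zipfile

# The three deployment assets named in this notebook.
EXPECTED_ASSETS = ("pdmscore.py", "pdmrfull.model", "service_schema.json")


def missing_assets(archive_path, expected=EXPECTED_ASSETS):
    """Return the expected assets that are not present in the zip archive.

    An asset counts as present if it appears as a file at the top level,
    or as a directory prefix (pdmrfull.model is saved as a directory).
    """
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()

    missing = []
    for asset in expected:
        found = any(
            name == asset or name.startswith(asset + "/") for name in names
        )
        if not found:
            missing.append(asset)
    return missing
```

An empty list from `missing_assets("o16n.zip")` suggests the archive is complete; any names returned point to assets that were not written to the share directory before the archive was created.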



### Create a model management endpoint 

Create a model management account under your Azure account. We will call this `pdmmodelmanagement`. The remaining defaults are acceptable.

`az ml account modelmanagement create --location <ACCOUNT_REGION> --resource-group <RESOURCE_GROUP> --name pdmmodelmanagement`


### Check environment settings

Show what environment is currently active:

`az ml env show`

If nothing is set, we first set up the environment with the existing model management context:

`az ml env setup --location <ACCOUNT_REGION> --resource-group <RESOURCE_GROUP> --name pdmmodelmanagement`

then set the current environment:

`az ml env set --resource-group <RESOURCE_GROUP> --cluster-name pdmmodelmanagement`

Check that the environment is now set:

`az ml env show`


### Deploy your web service 

Once the environment is set up, we'll deploy the web service from the CLI.

These commands assume the current directory contains the web service assets we created throughout the notebooks in this scenario (`pdmscore.py`, `service_schema.json` and `pdmrfull.model`). If your kernel has run locally, the assets will be in the `os.environ['AZUREML_NATIVE_SHARE_DIRECTORY']` directory.

On Windows this points to:

```
cd C:\Users\<username>\.azureml\share\<team account>\<Project Name>
```

On Linux variants this points to:

```
cd ~/.azureml/share/<team account>/<Project Name>
```


The command to create a web service (`<SERVICE_ID>`) with these operationalization assets in the current directory is:

```
az ml service create realtime -f <filename> -r <TARGET_RUNTIME> -m <MODEL_FILE> -s <SCHEMA_FILE> -n <SERVICE_ID> --cpu 0.1
```

The default cluster has only 2 nodes with 2 cores each. Some cores are taken for system components. AMLWorkbench asks for 1 core per service. To deploy multiple services into this cluster, we specify the cpu requirement in the service create command as (--cpu 0.1) to request 10% of a core. 
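The reasoning above is a small capacity calculation. A rough sketch of that arithmetic follows, assuming each service's CPU request must fit on a single node and that a fixed amount of CPU per node is reserved for system components. The 500 millicore reservation used in the example is a made-up placeholder, since the text does not say how much is reserved, and the real scheduler may apply other constraints.

```python
def max_services(nodes, cores_per_node, reserved_millicores_per_node,
                 request_millicores):
    """Estimate how many services fit, working in integer millicores
    (1 core = 1000 millicores) to avoid floating-point rounding."""
    available = cores_per_node * 1000 - reserved_millicores_per_node
    if available <= 0:
        return 0
    # Each service must fit entirely on one node.
    return nodes * (available // request_millicores)


# 2 nodes x 2 cores, hypothetical 500 millicores reserved per node.
print(max_services(2, 2, 500, 1000))  # 1 core per service   -> 2
print(max_services(2, 2, 500, 100))   # --cpu 0.1 per service -> 30
```

Under these assumptions, requesting a full core per service leaves room for only a couple of services, while requesting a tenth of a core leaves room for many more.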

For this example, we will call our webservice `amlworkbenchpdmwebservice`. This `SERVICE_ID` must be all lowercase, with no spaces:

```
az ml service create realtime -f pdmscore.py -r spark-py -m pdmrfull.model -s service_schema.json --cpu 0.1 -n amlworkbenchpdmwebservice
```

This command will take some time to execute. 
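Because deployment takes a while, it can be worth checking a candidate `SERVICE_ID` against the naming rule before running the command. A minimal sketch that checks only the two rules stated here (all lowercase, no spaces); the CLI may enforce further constraints that this does not cover, and `is_valid_service_id` is a hypothetical helper, not part of the Azure CLI.

```python
def is_valid_service_id(service_id):
    """Check the two naming rules stated in this notebook:
    the name must be all lowercase and contain no whitespace."""
    if not service_id:
        return False
    has_whitespace = any(ch.isspace() for ch in service_id)
    return service_id == service_id.lower() and not has_whitespace


print(is_valid_service_id("amlworkbenchpdmwebservice"))  # True
print(is_valid_service_id("AML Workbench PDM"))          # False
```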

Once complete, the command returns sample usage commands to test the service for both PowerShell and the cmd prompt. We can execute these commands from the command line as well. For our example:

```
az ml service run realtime -i amlworkbenchpdmwebservice --% -d "{\"input_df\": [{\"rotate_rollingstd_24\": 0.3233426394949046, \"error3sum_rollingmean_24\": 0.0, \"age\": 14, \"machineID\": 45, \"error5sum_rollingmean_24\": 0.0, \"pressure_rollingstd_24\": 0.1945085296751734, \"vibration_rollingstd_24\": 0.36239263228769986, \"rotate_rollingmean_3\": 527.816906803798, \"error1sum_rollingmean_24\": 0.0, \"volt_rollingmean_24\": 185.92637096180658, \"pressure_rollingmean_3\": 117.22597085550017, \"volt_rollingstd_24\": 0.03361414142292652, \"comp1sum\": 474.0, \"comp3sum\": 384.0, \"pressure_rollingmean_24\": 113.56479908060074, \"rotate_rollingstd_3\": 2.2898301915618045, \"volt_rollingmean_3\": 174.88172665757065, \"comp2sum\": 459.0, \"error2sum_rollingmean_24\": 0.0, \"rotate_rollingmean_24\": 470.1219658987775, \"vibration_rollingmean_3\": 39.472146777953654, \"vibration_rollingstd_3\": 0.8102848856599294, \"pressure_rollingstd_3\": 0.010565393835276299, \"error4sum_rollingmean_24\": 0.0, \"volt_rollingstd_3\": 8.308641250692387, \"vibration_rollingmean_24\": 39.93637676066078, \"comp4sum\": 579.0}, {\"rotate_rollingstd_24\": 1.5152162169310932, \"error3sum_rollingmean_24\": 0.0, \"age\": 14, \"machineID\": 45, \"error5sum_rollingmean_24\": 0.0, \"pressure_rollingstd_24\": 0.012495480312639678, \"vibration_rollingstd_24\": 0.21106710997624312, \"rotate_rollingmean_3\": 474.63178724391287, \"error1sum_rollingmean_24\": 0.0, \"volt_rollingmean_24\": 186.1033733765524, \"pressure_rollingmean_3\": 124.26190112949568, \"volt_rollingstd_24\": 0.7740120822459206, \"comp1sum\": 474.0, \"comp3sum\": 384.0, \"pressure_rollingmean_24\": 112.46729566613514, \"rotate_rollingstd_3\": 13.920898245623066, \"volt_rollingmean_3\": 188.406673928196, \"comp2sum\": 459.0, \"error2sum_rollingmean_24\": 0.0, \"rotate_rollingmean_24\": 461.1030486200735, \"vibration_rollingmean_3\": 38.869583185731614, \"vibration_rollingstd_3\": 1.9805973022526275, \"pressure_rollingstd_3\": 1.7895872952762106, \"error4sum_rollingmean_24\": 0.0, \"volt_rollingstd_3\": 4.60785082568852, \"vibration_rollingmean_24\": 39.96976455089771, \"comp4sum\": 579.0}, {\"rotate_rollingstd_24\": 2.017971138478601, \"error3sum_rollingmean_24\": 0.0, \"age\": 14, \"machineID\": 45, \"error5sum_rollingmean_24\": 0.0, \"pressure_rollingstd_24\": 0.2620300574897778, \"vibration_rollingstd_24\": 0.16523682934622702, \"rotate_rollingmean_3\": 454.8717742309143, \"error1sum_rollingmean_24\": 0.0, \"volt_rollingmean_24\": 184.4934951791266, \"pressure_rollingmean_3\": 123.02912082922734, \"volt_rollingstd_24\": 0.4103068092842077, \"comp1sum\": 473.6666666666667, \"comp3sum\": 383.6666666666667, \"pressure_rollingmean_24\": 110.24028050598271, \"rotate_rollingstd_3\": 15.91959183377542, \"volt_rollingmean_3\": 171.32900821497492, \"comp2sum\": 458.6666666666667, \"error2sum_rollingmean_24\": 0.0, \"rotate_rollingmean_24\": 458.14752146073414, \"vibration_rollingmean_3\": 37.71234613693027, \"vibration_rollingstd_3\": 2.3594190696788924, \"pressure_rollingstd_3\": 1.808640841551748, \"error4sum_rollingmean_24\": 0.0, \"volt_rollingstd_3\": 7.16544669362819, \"vibration_rollingmean_24\": 39.621269267841434, \"comp4sum\": 578.6666666666666}]}"
```

This submits 3 records to the model through the web service and returns the predicted output label for each of the three rows:
```
"0.0,0.0,0.0"
```

This indicates that these records are not predicted to fail within the requested time.
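The response is a JSON-encoded string of comma-separated labels, because `run()` joins the predictions with `","` and then calls `json.dumps` on the result. A minimal sketch of decoding it back into one number per input row; `parse_predictions` is a hypothetical client-side helper, and it assumes the call succeeded (on failure `run()` returns the raw exception text, which is not valid JSON and would raise a `ValueError` here).

```python
import json


def parse_predictions(response):
    """Decode the service response into a list of float labels,
    one per submitted record."""
    labels = json.loads(response)  # e.g. "0.0,0.0,0.0"
    if not labels:
        return []
    return [float(label) for label in labels.split(",")]


print(parse_predictions('"0.0,0.0,0.0"'))  # [0.0, 0.0, 0.0]
```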

## Conclusion
