## Develop Code for Real-Time Deployment
Deploying a model registered in an Azure Machine Learning (AML) workspace to a real-time endpoint requires a scoring script that loads the model, formats incoming data, feeds it through the model, and returns a response. Developing this script against a `LocalWebservice` makes it easier to troubleshoot issues before deploying to a remote target. [More details can be found here.](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-troubleshoot-deployment-local)

In [None]:
# Import required packages
from azureml.core import Workspace, Environment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

In [None]:
# Connect to AML Workspace
ws = Workspace.from_config()

# Select AML compute cluster
cpu_cluster_name = 'cpucluster'

# Reuse the cluster if it already exists; otherwise create it
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Found an existing cluster, using it instead.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D3_V2',
                                                           min_nodes=0,
                                                           max_nodes=1)
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)
    cpu_cluster.wait_for_completion(show_output=True)
    
# Get default datastore
default_ds = ws.get_default_datastore()

The scoring script associated with the deployed model should have a structure similar to what is shown below. Every scoring script must implement two functions: `init()` and `run()`. The `init()` function runs once when the service starts and is responsible for loading the model artifact from the AML registry into memory. The `run()` function is called for each request and is responsible for parsing the data sent by the user, feeding it into the model, and formatting the result before returning a response.
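
As a minimal sketch of this contract, the skeleton below uses a stand-in model so that it runs without any AML or TensorFlow dependencies. The `_StubModel` class and its output are hypothetical placeholders, not part of the actual deployment; only the `init()`/`run()` entry points mirror what the scoring script needs to provide.

```python
import json

model = None  # populated by init()


class _StubModel:
    """Hypothetical stand-in for a real model artifact (assumption for illustration)."""

    def predict(self, rows):
        # Return the row sums so the example has a deterministic output
        return [sum(row) for row in rows]


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    model = _StubModel()


def run(raw_data):
    """Called per request: parse the payload, score it, return a serializable result."""
    try:
        rows = json.loads(raw_data)
        return model.predict(rows)
    except Exception as e:
        # Return the error text so the caller can see what went wrong
        return str(e)


if __name__ == "__main__":
    init()
    print(run("[[1, 2, 3]]"))
```

Keeping the model in a module-level global means it is loaded once in `init()` and reused by every `run()` call, rather than being reloaded per request.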

In [None]:
%%writefile scoring_scripts/score.py

import os
import numpy as np
import pandas as pd
import joblib
import h5py

import tensorflow as tf

# Only load_model is needed at inference time
from keras.models import load_model

def init():
    global model
    global scaler
    global init_error
    
    try:

        init_error = None

        scaler_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model_files', 'scaler.pkl')
        model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model_files', 'anomaly_detection_encoder_model.h5')
        
        print('Loading scaler from:', scaler_path)
        scaler = joblib.load(scaler_path)
        print(scaler)

        print('Loading model from:', model_path)
        model = load_model(model_path)
        print(model)

    except Exception as e:
        init_error = e
        print(e)
        
# Note: the input is reshaped into a single row, so send one set of sensor readings per request
def run(raw_data):

    if init_error is not None:
        return 'Init error: {}'.format(str(init_error))

    try:
        print("Received input:", raw_data)
    
        input_df = pd.read_json(raw_data, orient='values')
        print(input_df)
    
        sensor_readings = np.array(input_df)
        scaled_sensor_readings = scaler.transform(sensor_readings.reshape(1,-1))

        pred_sensor_readings = model.predict(scaled_sensor_readings)
        score = np.mean(np.abs(scaled_sensor_readings - pred_sensor_readings[0]))

        if score > 0.01:
            print('WARNING! Abnormal conditions detected')
            return 1
        else:
            print('Everything is ok')
            return 0

    except Exception as e:
        error = str(e)
        return error


if __name__ == "__main__":
    # Test scoring
    init()
    test_row = '[[14.23, 41, 14.4, 318.50, 601.95]]'
    prediction = run(test_row)
    print("Test result: ", prediction)
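
The decision rule inside `run()` can be sketched in isolation as follows. This is a pure-Python illustration, assuming the autoencoder's reconstruction is already available as a list; the function names and sample values are hypothetical, and only the 0.01 threshold comes from the script above.

```python
def mean_absolute_error(actual, reconstructed):
    """Average absolute difference between the scaled input and its reconstruction."""
    if not actual or len(actual) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - r) for a, r in zip(actual, reconstructed)) / len(actual)


def is_anomaly(actual, reconstructed, threshold=0.01):
    """Return 1 when the reconstruction error exceeds the threshold, else 0."""
    return 1 if mean_absolute_error(actual, reconstructed) > threshold else 0


if __name__ == "__main__":
    scaled = [0.20, 0.50, 0.30]    # hypothetical scaled sensor readings
    close = [0.201, 0.499, 0.300]  # reconstruction close to the input
    far = [0.40, 0.10, 0.90]       # reconstruction far from the input
    print(is_anomaly(scaled, close))
    print(is_anomaly(scaled, far))
```

An autoencoder trained on normal operating data tends to reconstruct normal readings well, so a large reconstruction error is treated as a sign of abnormal conditions.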

Test deployment of the model as a `LocalWebservice`.

In [None]:
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import LocalWebservice


# Create inference configuration based on the environment definition and the entry script
myenv = Environment.get(ws, 'tf_keras_autoencoder_env')
inference_config = InferenceConfig(entry_script="scoring_scripts/score.py", environment=myenv)
# Create a local deployment, using port 8890 for the web service endpoint
deployment_config = LocalWebservice.deploy_configuration(port=8890)
model = Model(ws, name='Autoencoder_PredMaintenance')
# Deploy the service
service = Model.deploy(
    ws, "autoencoderpredmaintenance", [model], inference_config, deployment_config)
# Wait for the deployment to complete
service.wait_for_deployment(True)
# Display the port that the web service is available on
print(service.port)

Submit predictions to the `LocalWebservice` to confirm that the deployed model operates as expected. After modifying the scoring script, you can restart the service with `service.reload()` to pick up the changes.
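
In addition to calling `service.run()`, the endpoint can be exercised over plain HTTP, which is closer to how a client would call it. The sketch below builds such a request with the standard library. The helper names are hypothetical, and the URL `http://localhost:8890/score` is an assumption based on the port configured above; in practice, read the address from `service.scoring_uri`.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows):
    """Build a POST request carrying the rows as a JSON body."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(scoring_uri, rows, timeout=10):
    """Send the request and decode the JSON response (requires a running service)."""
    request = build_scoring_request(scoring_uri, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed address; replace with service.scoring_uri from the deployment
    req = build_scoring_request(
        "http://localhost:8890/score", [[70, 200, 60.6, 0, 1448.17]]
    )
    print(req.get_method(), req.full_url)
    print(req.data)
```

Only `build_scoring_request` runs without a live service; `score` assumes the local container is up and that the response body is valid JSON.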

In [None]:
import json

# Test normal operating conditions
test_row = json.dumps([[70, 200, 60.6, 0, 1448.17]])
test_sample = bytes(test_row, encoding='utf8')
prediction = service.run(input_data=test_sample)
print('Expected Result: 0')
print('Predicted Result: {}'.format(str(prediction)))

# Test failure conditions
test_row = json.dumps([[14.23, 41, 14.4, 318.50, 601.95]])
test_sample = bytes(test_row, encoding='utf8')
prediction = service.run(input_data=test_sample)
print('Expected Result: 1')
print('Predicted Result: {}'.format(str(prediction)))


Once your deployment functions as expected, delete the service.

In [None]:
service.delete()