# Automated ML

Import dependencies. The cell below imports all the dependencies needed to complete the project.

In [1]:
from azureml.core import Workspace, Experiment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.widgets import RunDetails
from azureml.train.hyperdrive.run import PrimaryMetricGoal
import os
import joblib
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig
from azureml.core.model import Model
from azureml.core.webservice import AciWebservice
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig
import requests
import json
from azureml.core.conda_dependencies import CondaDependencies
import sklearn
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType, Int64TensorType
import onnxruntime as rt

## Dataset

### Overview

I use the Heart-Failure dataset from Kaggle; the task is to predict patient mortality from clinical records. I first load the workspace and create an experiment in Azure ML. I then check whether the compute target already exists and create it if it does not. Finally, I display the first rows of the Heart-Failure dataset.

In [2]:
# Load the workspace from its config file and create an experiment in Azure ML
ws = Workspace.from_config()
experiment_name = 'Heart-Failure-AutoMlProject'
project_folder = './Heart-Failure-project'

experiment=Experiment(ws, experiment_name)

# Check whether the cluster exists; if it does not, create one.
# Choose a name for your cluster
cluster_name = "cpu-cluster"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS3_V2', 
                                                           max_nodes=4)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

    # can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it uses the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

# use get_status() to get a detailed status for the current cluster. 
print(compute_target.get_status().serialize())

Creating a new compute target...
Creating
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2020-11-20T19:00:21.318000+00:00', 'errors': None, 'creationTime': '2020-11-20T19:00:17.183430+00:00', 'modifiedTime': '2020-11-20T19:00:33.633076+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_DS3_V2'}


In [3]:
# Get the registered Heart-Failure dataset from the workspace
ds = Dataset.get_by_name(ws, name='Heart-Failure')

In [4]:
#Review the first 5 rows in the dataset
ds.take(5).to_pandas_dataframe()

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
0,75.0,0,582,0,20,1,265000.0,1.9,130,1,0,4,1
1,55.0,0,7861,0,38,0,263358.03,1.1,136,1,0,6,1
2,65.0,0,146,0,20,0,162000.0,1.3,129,1,1,7,1
3,50.0,1,111,0,20,0,210000.0,1.9,137,1,0,7,1
4,65.0,1,160,1,20,0,327000.0,2.7,116,0,0,8,1


In [5]:
# Split the dataset into training and test datasets
training_data, test_data = ds.random_split(percentage=0.8, seed=223)

# Convert the test data to a pandas DataFrame and separate the features from the label
df_test = test_data.to_pandas_dataframe()
y_test = df_test['DEATH_EVENT']
X_test = df_test.drop(['DEATH_EVENT'], axis=1)
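
The Azure ML documentation describes `random_split` as splitting records randomly and approximately by the given percentage, so the two parts are not guaranteed to be exactly 80% and 20%. The sketch below is a plain-Python illustration of a seeded, per-row random assignment. It is an assumption about the general technique, not the SDK's implementation, and `seeded_split` and the placeholder rows are hypothetical.

```python
import random

def seeded_split(rows, fraction=0.8, seed=223):
    """Assign each row to the first part with probability `fraction`.

    Illustrative only: this is not how azureml implements random_split.
    """
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second

rows = list(range(1000))  # placeholder row ids, not the real patient records
train_rows, test_rows = seeded_split(rows)

# The sizes are close to 80/20 but usually not exact.
print(len(train_rows), len(test_rows))

# The same seed reproduces the same split.
print(seeded_split(rows) == (train_rows, test_rows))
```

Fixing the seed is what makes the split reproducible across notebook runs.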

## AutoML Configuration
AutoML is used to solve the classification problem on the Heart-Failure dataset. The AutoML run searches for the best model as measured by accuracy. The target column is "DEATH_EVENT". The experiment timeout is 30 minutes, and at most 5 iterations are executed in parallel. ONNX-compatible models are enabled so that the best model can later be retrieved in ONNX format.

In [6]:
# AutoML settings
automl_settings = {
    "experiment_timeout_minutes": 30,
    "max_concurrent_iterations": 5,
    "primary_metric" : 'accuracy',
    "enable_onnx_compatible_models": True
}

# AutoML config
automl_config = AutoMLConfig(compute_target=compute_target,
                             task = "classification",
                             training_data=training_data,
                             validation_data=test_data,
                             label_column_name="DEATH_EVENT",   
                             path = project_folder,
                             enable_early_stopping= True,
                             featurization= 'auto',
                             debug_log = "automl_errors.log",
                             **automl_settings
                            )
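
The `**automl_settings` argument relies on Python's keyword unpacking: each key of the dictionary becomes a keyword argument of the call. The sketch below uses a hypothetical `make_config` function (a stand-in, not `AutoMLConfig`) to show this behaviour, including the `TypeError` that Python raises when the same key is passed both explicitly and through the dictionary.

```python
def make_config(task, **kwargs):
    """Hypothetical stand-in for a config constructor; returns what it received."""
    return {"task": task, **kwargs}

settings = {
    "experiment_timeout_minutes": 30,
    "max_concurrent_iterations": 5,
    "primary_metric": "accuracy",
}

# Each key in `settings` is passed as a keyword argument.
config = make_config(task="classification", **settings)
print(config["primary_metric"])

# Passing a key both explicitly and via the dictionary is rejected by Python.
try:
    make_config(task="classification", primary_metric="AUC_weighted", **settings)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

Keeping shared settings in one dictionary makes it easy to reuse them across runs, provided no key is repeated in the explicit arguments.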

In [7]:
# Submit experiment
Automl_run = experiment.submit(automl_config)

Running on remote.


## Run Details

The `RunDetails` widget shows the progress of the AutoML run and its child runs.

In [8]:
RunDetails(Automl_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

## Best Model

Get the best model from the AutoML run and display its properties.



In [16]:
best_run, onnx_mdl = Automl_run.get_output(return_onnx_model=True)
print(best_run)
print(onnx_mdl)

Run(Experiment: Heart-Failure-AutoMlProject,
Id: AutoML_77d4339f-6fc3-481c-a70a-a0d97df937ff_38,
Type: azureml.scriptrun,
Status: Completed)
ir_version: 6
producer_name: "AutoML"
producer_version: "1.17.0"
domain: "ai.onnx"
model_version: 0
doc_string: "{\"AutoMLSDKVer\": \"1.17.0\", \"ExperimentName\": \"Heart-Failure-AutoMlProject\", \"RunId\": \"AutoML_77d4339f-6fc3-481c-a70a-a0d97df937ff_38\", \"PipeId\": \"__AutoML_Ensemble__\"}"
graph {
  node {
    input: "age"
    output: "variable_c0_t0"
    name: "Imputer"
    op_type: "Imputer"
    attribute {
      name: "imputed_value_floats"
      floats: 61.08779525756836
      type: FLOATS
    }
    attribute {
      name: "replaced_value_float"
      f: nan
      type: FLOAT
    }
    domain: "ai.onnx.ml"
  }
  node {
    input: "anaemia"
    output: "variable_c1_t0"
    name: "Imputer1"
    op_type: "Imputer"
    attribute {
      name: "imputed_value_floats"
      floats: 0.4238682985305786
      type: FLOATS
    }
    attribute {
  

In [15]:
#Save the best model
from azureml.automl.runtime.onnx_convert import OnnxConverter
onnx_fl_path = "./best_model.onnx"
OnnxConverter.save_onnx_model(onnx_mdl, onnx_fl_path)

In [17]:
#Register onnx model
auto_ml_onx = Model.register(model_path = "best_model.onnx",
                       model_name = "heartfailure",
                       tags = {"onnx": "demo"},
                       description = "Heart Failure",
                       workspace = ws)

Registering model heartfailure


In [None]:
import json
import sys  # needed for the Python version check below
from azureml.automl.core.onnx_convert import OnnxConvertConstants
from azureml.train.automl import constants

# The inference helper only runs on Python versions below the incompatible version
python_version_compatible = sys.version_info < OnnxConvertConstants.OnnxIncompatiblePythonVersion

import onnxruntime
from azureml.automl.runtime.onnx_convert import OnnxInferenceHelper

def get_onnx_res(run):
    res_path = 'onnx_resource.json'
    run.download_file(name=constants.MODEL_RESOURCE_PATH_ONNX, output_file_path=res_path)
    with open(res_path) as f:
        onnx_res = json.load(f)
    return onnx_res

if python_version_compatible:
    test_df = X_test  # test features (label dropped), created when the dataset was split
    mdl_bytes = onnx_mdl.SerializeToString()
    onnx_res = get_onnx_res(best_run)

    onnxrt_helper = OnnxInferenceHelper(mdl_bytes, onnx_res)
    pred_onnx, pred_prob_onnx = onnxrt_helper.predict(test_df)

    print(pred_onnx)
    print(pred_prob_onnx)
else:
    print('Please use Python version 3.6 or 3.7 to run the inference helper.')
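
Since accuracy is the primary metric of the run, the ONNX predictions can be checked against the held-out labels. The sketch below is a minimal plain-Python accuracy function with made-up demo labels. In the notebook, `list(y_test)` and `list(pred_onnx)` would presumably take the place of the demo lists, which is an assumption about the shape of those objects.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up labels for illustration only; not results from the experiment.
y_true_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0]

print(accuracy(y_true_demo, y_pred_demo))  # 4 of 5 match -> 0.8
```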

## Model Deployment

Deploy the registered ONNX model as a web service.

In [18]:
%%writefile scoreonx.py
import json
import time
import sys
import os
from azureml.core.model import Model
import numpy as np    # we're going to use numpy to process input and output data
import onnxruntime    # to inference ONNX models, we use the ONNX Runtime

def init():
    global session
    model = Model.get_model_path(model_name = 'heartfailure')
    session = onnxruntime.InferenceSession(model)

def preprocess(input_data_json):
    # convert the JSON data into the tensor input
    return np.array(json.loads(input_data_json)['data']).astype('float64')

def postprocess(result):
    return np.array(result).tolist()

def run(input_data_json):
    try:
        start = time.time()   # start timer
        input_data = preprocess(input_data_json)
        input_name = session.get_inputs()[0].name  # get the id of the first input of the model   
        result = session.run([], {input_name: input_data})
        end = time.time()     # stop timer
        return {"result": postprocess(result),
                "time": end - start}
    except Exception as e:
        result = str(e)
        return {"error": result}

Writing scoreonx.py
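
The entry script expects a JSON body with a `data` key and returns either a `result`/`time` dictionary or an `error` dictionary. That contract can be exercised without a deployed model. The sketch below mirrors the control flow of `run` in plain Python, with a hypothetical `StubSession` standing in for `onnxruntime.InferenceSession`. It skips the NumPy conversion and returns dummy predictions, so it checks only the request and response shape, not the model.

```python
import json
import time

class StubSession:
    """Hypothetical stand-in for onnxruntime.InferenceSession."""

    def run(self, output_names, feeds):
        # Return one dummy prediction (0) per input row.
        rows = next(iter(feeds.values()))
        return [[0 for _ in rows]]

def run_with(session, input_data_json, input_name="input"):
    """Mirror the try/except structure of run() in scoreonx.py."""
    try:
        start = time.time()
        data = json.loads(input_data_json)["data"]
        result = session.run([], {input_name: data})
        return {"result": result, "time": time.time() - start}
    except Exception as e:
        return {"error": str(e)}

# Well-formed request: a list of rows under the "data" key.
ok = run_with(StubSession(), json.dumps({"data": [[75.0, 0, 582]]}))
print(ok["result"])

# Malformed request: the "data" key is missing, so an error dict is returned.
bad = run_with(StubSession(), json.dumps({"rows": []}))
print(bad)
```

Because every exception is converted into an `error` dictionary, a caller should check for that key before reading `result`.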


In [26]:
# Reuse the environment of the best run
env = best_run.get_environment()
# Add the onnxruntime pip package, which the entry script needs for inference
env.python.conda_dependencies.add_pip_package("onnxruntime")

In [30]:
#Set up the inference_config
inference_config = InferenceConfig(entry_script='scoreonx.py', environment=env)

In [31]:
#Local Deployment
from azureml.core.webservice import LocalWebservice
local_config = LocalWebservice.deploy_configuration(port=9000)
local_service = Model.deploy(ws, "test", [auto_ml_onx], inference_config, local_config)
local_service.wait_for_deployment(show_output=True)

Downloading model heartfailure:1 to /tmp/azureml_qdbn863_/heartfailure/1
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry 0268a4e06c4f4840a0e70bb63be19da2.azurecr.io
Logging into Docker registry 0268a4e06c4f4840a0e70bb63be19da2.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM 0268a4e06c4f4840a0e70bb63be19da2.azurecr.io/azureml/azureml_15af802a4bfabbdb50c2d4680e433a1b
 ---> ab4bf4c85e09
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 6252a837b762
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjU3ODFiZTRlLTc4NjItNDJmOS04YWU4LWU4NzljNzExMDM5YiIsInJlc291cmNlR3JvdXBOYW1lIjoiYW1sLXF1aWNrc3RhcnRzLTEyNzA0NSIsImFjY291bnROYW1lIjoicXVpY2stc3RhcnRzLXdzLTEyNzA0NSIsIndvcmtzcGFjZUlkIjoiMDI2OGE0ZTAtNmM0Zi00ODQwLWEwZTctMGJiNjNiZTE5ZGEyIn0sIm1vZGVscyI6e30sIm1vZGVsc0luZm8iOnt9fQ== | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 9b7e3bf9f016
 ---> 771ac8d5dcf5
Step

In [34]:
# Set up the deployment configuration for an Azure Container Instances (ACI) web service
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1, enable_app_insights=True)

#Deploy the model
service = Model.deploy(
    workspace = ws,
    name = "mywebservice1",
    models = [auto_ml_onx],
    inference_config = inference_config,
    deployment_config = aci_config, overwrite=True)

#wait until deployment is complete
service.wait_for_deployment(show_output = True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running......................................................
Succeeded
ACI service creation operation finished, operation "Succeeded"


In [35]:
if service.state != 'Healthy':
    # run this command for debugging.
    print(service.get_logs())
    service.delete()

In [36]:
#Print the state
print(service.state)

#Print the scoring uri of the service
print(service.scoring_uri)

Healthy
http://da81b11a-8381-4bf2-b3c6-4e4dbc779d2a.southcentralus.azurecontainer.io/score
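
With the service healthy, it can be queried by POSTing JSON to the scoring URI. The sketch below builds a payload in the `{"data": [...]}` shape that `scoreonx.py` parses, using the first row of the dataset shown earlier. The feature order in `FEATURES`, the omission of the index column, and the helper names `build_payload` and `score` are assumptions, and whether the deployed model accepts a single array in this layout is not confirmed here. The standard-library `urllib` is used instead of `requests` to keep the example self-contained.

```python
import json
import urllib.request

# Assumed feature order: the dataset columns without the index and the label.
FEATURES = [
    "age", "anaemia", "creatinine_phosphokinase", "diabetes",
    "ejection_fraction", "high_blood_pressure", "platelets",
    "serum_creatinine", "serum_sodium", "sex", "smoking", "time",
]

def build_payload(records):
    """Serialize a list of feature dicts into the JSON body the entry script parses."""
    rows = [[record[name] for name in FEATURES] for record in records]
    return json.dumps({"data": rows})

def score(scoring_uri, payload, timeout=30):
    """POST the payload to the scoring URI and decode the JSON response."""
    request = urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# First row of the dataset as displayed above.
sample = {
    "age": 75.0, "anaemia": 0, "creatinine_phosphokinase": 582, "diabetes": 0,
    "ejection_fraction": 20, "high_blood_pressure": 1, "platelets": 265000.0,
    "serum_creatinine": 1.9, "serum_sodium": 130, "sex": 1, "smoking": 0, "time": 4,
}

payload = build_payload([sample])
print(payload)

# Requires the deployed service, so it is left commented out:
# response = score(service.scoring_uri, payload)
```

The response should be checked for an `error` key, since the entry script returns errors in the body rather than raising them.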
