# Deploying a Subscription Classifier
I built a classifier that predicts whether a client will subscribe to a term deposit with the bank, and deployed it to an Azure Container Instance (ACI). The data comes from the UCI [Bank Marketing dataset](https://archive.ics.uci.edu/ml/datasets/bank+marketing).

In [1]:
import logging
import os
import csv

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import datasets
import pkg_resources

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.core.dataset import Dataset

from azureml.pipeline.steps import AutoMLStep

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.28.0


## Setup

#### Create an Azure Workspace

In [2]:
# from dotenv import load_dotenv

# load_dotenv()
# ws = Workspace.create(name='bank_marketing',
#                subscription_id=os.getenv('subscription_id'),
#                resource_group='rg20210512',
#                create_resource_group=True,
#                location='westus2'
#                )
ws = Workspace.from_config()

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.
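
For unattended runs, the warning above points to service principal authentication. The following is a minimal sketch, assuming the credentials are stored in environment variables named `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET`. Those variable names and both helper functions are illustrative and not part of the original notebook.

```python
import os


def service_principal_settings(env=None):
    """Read service principal credentials from environment variables.

    The variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    names = {
        "tenant_id": "AZURE_TENANT_ID",
        "service_principal_id": "AZURE_CLIENT_ID",
        "service_principal_password": "AZURE_CLIENT_SECRET",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in names.items()}


def load_workspace_unattended(env=None):
    """Load the workspace from config.json without an interactive login."""
    # Imported here so the credential helper works without the SDK installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**service_principal_settings(env))
    return Workspace.from_config(auth=auth)
```

Calling `ws = load_workspace_unattended()` would then stand in for the interactive `Workspace.from_config()` call, provided the service principal has access to the workspace.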


#### Create an Azure ML experiment

In [3]:
experiment_name = 'pipeline'

experiment = Experiment(ws, experiment_name)
experiment

Name,Workspace,Report Page,Docs Page
pipeline,bank_marketing,Link to Azure Machine Learning studio,Link to Documentation


#### Create or Attach an AmlCompute cluster

In [4]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# Name of the CPU cluster; change it to match an existing cluster in your workspace
cluster_name = "cpu"

# Reuse the cluster if it already exists; otherwise create it
try:
    compute_target = ComputeTarget(workspace=ws, 
                                   name=cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', # for GPU, use "STANDARD_NC6"
                                                           #vm_priority = 'lowpriority', # optional
                                                           max_nodes=4)
    compute_target = ComputeTarget.create(ws, 
                                          cluster_name, 
                                          compute_config)

compute_target.wait_for_completion(show_output=True, 
                                   min_node_count = 1, 
                                   timeout_in_minutes = 10)

Found existing cluster, use it.
Succeeded....................................................................................................................
AmlCompute wait for completion finished

Wait timeout has been reached
Current provisioning state of AmlCompute is "Succeeded" and current node count is "0"


## Data

In [5]:
from train import data_split
from azureml.core.dataset import Dataset 

train_data, val_data, test_data = data_split()
datastore = ws.get_default_datastore()
train_ds = Dataset.Tabular.register_pandas_dataframe(dataframe=train_data, 
                                                     target=datastore, 
                                                     name='train_data')
val_ds = Dataset.Tabular.register_pandas_dataframe(dataframe=val_data, 
                                                   target=datastore, 
                                                   name='val_data')
test_ds = Dataset.Tabular.register_pandas_dataframe(dataframe=test_data, 
                                                    target=datastore, 
                                                    name='test_data')

Method register_pandas_dataframe: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to managed-dataset/230b0784-664d-4075-89ce-26c3740e3742/
Successfully uploaded file to datastore.
Creating and registering a new dataset.


Successfully created and registered a new dataset.
Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to managed-dataset/d4f3c044-4821-4fc4-8ccd-4c09e7e876d3/
Successfully uploaded file to datastore.
Creating and registering a new dataset.


Successfully created and registered a new dataset.
Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to managed-dataset/176234b1-c410-4ba9-b5b1-6a9dfbaa366a/
Successfully uploaded file to datastore.
Creating and registering a new dataset.
Successfully created and registered a new dataset.


#### Review the Dataset Result

In [6]:
train_ds.take(5).to_pandas_dataframe()

Unnamed: 0,job_blue-collar,job_entrepreneur,job_housemaid,job_management,job_retired,job_self-employed,job_services,job_student,job_technician,job_unemployed,...,age,duration,campaign,pdays,previous,emp_var_rate,cons_price_idx,cons_conf_idx,euribor3m,nr_employed
0,0,1,0,0,0,0,0,0,0,0,...,-0.194227,1.445235,-0.565922,-5.06384,1.671136,-1.134279,0.779734,0.475915,-1.570139,-2.428157
1,0,0,0,0,0,0,0,0,1,0,...,0.573445,0.079895,-0.565922,0.195414,-0.349494,-1.197935,-0.864955,-1.425496,-1.267445,-0.940281
2,0,0,0,0,0,0,0,0,0,0,...,-0.674021,2.455741,3.405226,0.195414,-0.349494,-1.197935,-1.17938,-1.231034,-1.37065,-0.940281
3,1,0,0,0,0,0,0,0,0,0,...,-0.76998,-0.355933,-0.565922,0.195414,-0.349494,0.839061,0.591424,-0.474791,0.770116,0.84517
4,1,0,0,0,0,0,0,0,0,0,...,0.381527,-0.81876,0.156105,0.195414,-0.349494,0.839061,0.591424,-0.474791,0.772999,0.84517


## Train

In [7]:
from azureml.train.automl import AutoMLConfig

automl_settings = {
    "experiment_timeout_hours" : 0.5,
    "enable_early_stopping" : True,
    "iteration_timeout_minutes": 5,
    "max_concurrent_iterations": 4,
    "max_cores_per_iteration": -1,
    "primary_metric": 'AUC_weighted',
    "featurization": 'off'
}

automl_config = AutoMLConfig(task = 'classification',
                             debug_log = 'automl_errors.log',
                             compute_target=compute_target,
                             experiment_exit_score = 0.95,
                             enable_onnx_compatible_models=True,
                             training_data = train_ds,
                             label_column_name = 'y_yes',
                             validation_data = val_ds,
                             **automl_settings
                            )
automl_run = experiment.submit(automl_config, show_output=True)

Submitting remote run.
No run_configuration provided, running on cpu with default configuration
Running on remote compute: cpu


Experiment,Id,Type,Status,Details Page,Docs Page
pipeline,AutoML_bbc83d66-b31c-4b81-83e3-97e04d8b7849,automl,NotStarted,Link to Azure Machine Learning studio,Link to Documentation



Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
DATA GUARDRAILS: 

TYPE:         Class balancing detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and all classes are balanced in your training data.
              Learn more about imbalanced data: https://aka.ms/AutomatedMLImbalancedData

****************************************************************************************************

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   

#### Create Pipeline and AutoMLStep

In [8]:
from azureml.pipeline.core import PipelineData, TrainingOutput

metrics_output_name = 'metrics_output'
best_model_output_name = 'best_model_output'

metrics_data = PipelineData(name='metrics_data',
                           datastore=datastore,
                           pipeline_output_name=metrics_output_name,
                           training_output=TrainingOutput(type='Metrics'))
model_data = PipelineData(name='model_data',
                           datastore=datastore,
                           pipeline_output_name=best_model_output_name,
                           training_output=TrainingOutput(type='Model'))

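Wrap the AutoML configuration in an AutoMLStep that writes the metrics and the best model to the pipeline outputs defined above.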

In [9]:
automl_step = AutoMLStep(
    name='automl_module',
    automl_config=automl_config,
    outputs=[metrics_data, model_data],
    allow_reuse=True)

In [10]:
from azureml.pipeline.core import Pipeline
pipeline = Pipeline(
    description="pipeline_with_automlstep",
    workspace=ws,    
    steps=[automl_step])

In [11]:
pipeline_run = experiment.submit(pipeline)



Created step automl_module [f0c491bf][49fb7713-19ed-480a-b65c-57715aed6287], (This step will run and generate new outputs)
Submitted PipelineRun bd09ed5a-bf59-4f9c-ae34-8e0252220ba5
Link to Azure Machine Learning Portal: https://ml.azure.com/runs/bd09ed5a-bf59-4f9c-ae34-8e0252220ba5?wsid=/subscriptions/45a69fd7-1b5c-4963-a9c8-1c33e27e9b14/resourcegroups/rg20210512/workspaces/bank_marketing&tid=10e19cba-5b4d-42f0-a5b1-0e066efe7fe1


In [12]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

_PipelineWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', …

In [13]:
pipeline_run.wait_for_completion()

PipelineRunId: bd09ed5a-bf59-4f9c-ae34-8e0252220ba5
Link to Azure Machine Learning Portal: https://ml.azure.com/runs/bd09ed5a-bf59-4f9c-ae34-8e0252220ba5?wsid=/subscriptions/45a69fd7-1b5c-4963-a9c8-1c33e27e9b14/resourcegroups/rg20210512/workspaces/bank_marketing&tid=10e19cba-5b4d-42f0-a5b1-0e066efe7fe1
PipelineRun Status: Running


StepRunId: e710b521-3912-4e89-9220-4e1b9968b806
Link to Azure Machine Learning Portal: https://ml.azure.com/runs/e710b521-3912-4e89-9220-4e1b9968b806?wsid=/subscriptions/45a69fd7-1b5c-4963-a9c8-1c33e27e9b14/resourcegroups/rg20210512/workspaces/bank_marketing&tid=10e19cba-5b4d-42f0-a5b1-0e066efe7fe1
StepRun( automl_module ) Status: Running





StepRun(automl_module) Execution Summary
StepRun( automl_module ) Status: Finished



PipelineRun Execution Summary
PipelineRun Status: Finished
{'runId': 'bd09ed5a-bf59-4f9c-ae34-8e0252220ba5', 'status': 'Completed', 'startTimeUtc': '2021-05-13T01:17:09.048799Z', 'endTimeUtc': '2021-05-13T01:42:57.734362Z', 'properties': {'azureml.runsource': 'azureml.PipelineRun', 'runSource': 'SDK', 'runType': 'SDK', 'azureml.parameters': '{}'}, 'inputDatasets': [], 'outputDatasets': [], 'logFiles': {'logs/azureml/executionlogs.txt': 'https://bankmarkstorage8e1b54384.blob.core.windows.net/azureml/ExperimentRun/dcid.bd09ed5a-bf59-4f9c-ae34-8e0252220ba5/logs/azureml/executionlogs.txt?sv=2019-02-02&sr=b&sig=wJkcSvjmQfF1KCKaKcX6cOg3LoS0MCIkawLyZx8NzXQ%3D&st=2021-05-13T01%3A07%3A14Z&se=2021-05-13T09%3A17%3A14Z&sp=r', 'logs/azureml/stderrlogs.txt': 'https://bankmarkstorage8e1b54384.blob.core.windows.net/azureml/ExperimentRun/dcid.bd09ed5a-bf59-4f9c-ae34-8e0252220ba5/logs/azureml/stderrlogs.txt?sv=2019-02

'Finished'

## Results

#### Retrieve the metrics of all child runs

In [14]:
metrics_output = pipeline_run.get_pipeline_output(metrics_output_name)
num_file_downloaded = metrics_output.download('.', show_progress=True)

Downloading azureml/e710b521-3912-4e89-9220-4e1b9968b806/metrics_data
Downloaded azureml/e710b521-3912-4e89-9220-4e1b9968b806/metrics_data, 1 files out of an estimated total of 1


In [15]:
import json
with open(metrics_output._path_on_datastore) as f:
    metrics_output_result = f.read()
    
deserialized_metrics_output = json.loads(metrics_output_result)
df = pd.DataFrame(deserialized_metrics_output)
df

Unnamed: 0,e710b521-3912-4e89-9220-4e1b9968b806_2,e710b521-3912-4e89-9220-4e1b9968b806_0,e710b521-3912-4e89-9220-4e1b9968b806_12,e710b521-3912-4e89-9220-4e1b9968b806_7,e710b521-3912-4e89-9220-4e1b9968b806_10,e710b521-3912-4e89-9220-4e1b9968b806_13,e710b521-3912-4e89-9220-4e1b9968b806_8,e710b521-3912-4e89-9220-4e1b9968b806_6,e710b521-3912-4e89-9220-4e1b9968b806_11,e710b521-3912-4e89-9220-4e1b9968b806_14,...,e710b521-3912-4e89-9220-4e1b9968b806_3,e710b521-3912-4e89-9220-4e1b9968b806_36,e710b521-3912-4e89-9220-4e1b9968b806_42,e710b521-3912-4e89-9220-4e1b9968b806_20,e710b521-3912-4e89-9220-4e1b9968b806_25,e710b521-3912-4e89-9220-4e1b9968b806_26,e710b521-3912-4e89-9220-4e1b9968b806_37,e710b521-3912-4e89-9220-4e1b9968b806_43,e710b521-3912-4e89-9220-4e1b9968b806_38,e710b521-3912-4e89-9220-4e1b9968b806_45
log_loss,[0.4066249672240658],[0.313727550648466],[0.40998296756893426],[0.47908879665535625],[0.4593036717591154],[0.39754159283602575],[0.4578131608552609],[0.42744558909886166],[0.5107172590110861],[0.5587991738160417],...,[0.5092827556440753],[0.389979879122874],[0.3245103471973568],[0.5165744266224434],[0.3609454275707394],[0.4098915954154826],[0.3374671481862528],[0.3108743074095318],[0.3379582462933538],[0.4126793703771991]
average_precision_score_macro,[0.7794560171602452],[0.8161259749759122],[0.7727392710991922],[0.7278371907585033],[0.7416968540611147],[0.791995374015688],[0.7515974539944923],[0.765994518494427],[0.7138522398707068],[0.7049832371988719],...,[0.7133671932769448],[0.7971970051422962],[0.8076636000708397],[0.7149683267930131],[0.7780900353394151],[0.7880177565900948],[0.805746513567974],[0.8181805979369151],[0.8005671027456742],[0.7853365845394036]
norm_macro_recall,[0.7153654206055953],[0.7845314074571716],[0.728480253371083],[0.5715743317817554],[0.6779170065742117],[0.7294723835116848],[0.5529290033110994],[0.6150996928835357],[0.4744841403138347],[0.47692487643361003],...,[0.49825687413023667],[0.7680262968472575],[0.7790297039205336],[0.5016327558903977],[0.7297159172705023],[0.7500131964105763],[0.7544088008061807],[0.7711292528432265],[0.7604995441239981],[0.7404320024953213]
accuracy,[0.8409808205875212],[0.856275795095897],[0.8509346928866229],[0.8380674921097354],[0.8145180869142996],[0.7987375576596262],[0.831755280407866],[0.8288419519300801],[0.775673707210488],[0.8446224811847536],...,[0.8259286234522942],[0.8570041272153436],[0.8565185724690458],[0.8169458606457878],[0.8349113862588007],[0.8409808205875212],[0.8414663753338189],[0.8477785870356883],[0.8366108278708424],[0.835882495751396]
average_precision_score_micro,[0.8928679822149159],[0.9522242046731907],[0.8904593710966416],[0.8407767967780616],[0.8638242012279752],[0.9084199982001964],[0.8505023997393667],[0.8846159687012316],[0.8094620738150652],[0.8134721770425539],...,[0.804618402822344],[0.9461618341771405],[0.9509598079398294],[0.8175143171429147],[0.927199848759614],[0.9353594863996713],[0.936926202900912],[0.9506755344263067],[0.9402463744084354],[0.934085426152748]
precision_score_macro,[0.6911830714972748],[0.7109632472301868],[0.7000871216190123],[0.6691249292711862],[0.6700618270395791],[0.670483290053097],[0.6617009669089786],[0.669068039727762],[0.6194435384503878],[0.6601754354863014],...,[0.6489529499947997],[0.7095890366111972],[0.7105029758431556],[0.6437567685574507],[0.6891136132999045],[0.6957368030186714],[0.6966321646939193],[0.7030822391721804],[0.6943097629680943],[0.6911707859196443]
f1_score_micro,[0.8409808205875212],[0.856275795095897],[0.8509346928866229],[0.8380674921097354],[0.8145180869142997],[0.7987375576596262],[0.8317552804078661],[0.8288419519300801],[0.775673707210488],[0.8446224811847536],...,[0.8259286234522942],[0.8570041272153436],[0.8565185724690458],[0.8169458606457878],[0.8349113862588007],[0.8409808205875212],[0.8414663753338189],[0.8477785870356883],[0.8366108278708424],[0.835882495751396]
AUC_weighted,[0.9185340587360238],[0.9462330246173043],[0.9302443735303998],[0.8763643289025385],[0.8912633763616296],[0.9291475718604539],[0.8930778828158741],[0.905934485819857],[0.8591933993953645],[0.8276491794231969],...,[0.845709666970584],[0.9423046931234703],[0.9453986515667739],[0.8372540668938049],[0.9287585776668745],[0.935440160276405],[0.9382809875713806],[0.94521180238975],[0.9344678247516677],[0.9295089735591918]
AUC_micro,[0.9087115818363388],[0.9495355490260877],[0.9177028910665176],[0.8678397546740243],[0.8792827676219741],[0.9023783871908324],[0.8774327910715923],[0.8976625295492598],[0.8525921686113906],[0.86416284744675],...,[0.8525970607021824],[0.9440264359155851],[0.9485430440038851],[0.855257061394148],[0.9251302637055019],[0.932939985657333],[0.933943630500736],[0.947573172269201],[0.9363939196380371],[0.930281399545837]
f1_score_weighted,[0.8643743322832139],[0.8772423798017513],[0.8720403157421499],[0.8584474358460511],[0.844055103498409],[0.8332399085925508],[0.8533566090622818],[0.853050212164036],[0.8112529167290097],[0.8597387023952033],...,[0.8474463493718358],[0.8774391473507566],[0.8773098090333062],[0.8412423031302676],[0.8601910061163248],[0.8651166733436184],[0.8655678104919785],[0.8706135435973935],[0.8620722037858756],[0.8611316336379601]


#### Explain the best model

In [16]:
from azureml.train.automl.run import AutoMLRun

# # Retrieve best model from Pipeline Run
# best_model_output = pipeline_run.get_pipeline_output(best_model_output_name)
# num_file_downloaded = best_model_output.download('.', show_progress=True)

# import pickle

# with open(best_model_output._path_on_datastore, "rb" ) as f:
#     best_model = pickle.load(f)
# print(best_model)
# print(best_model.steps)

# Retrieve the automl step from pipeline_run
automl_run_id = pipeline_run.find_step_run('automl_module')[0].id
remote_run = AutoMLRun(experiment=experiment, run_id=automl_run_id)
remote_run

Experiment,Id,Type,Status,Details Page,Docs Page
pipeline,e710b521-3912-4e89-9220-4e1b9968b806,azureml.StepRun,Completed,Link to Azure Machine Learning studio,Link to Documentation


In [17]:
# Retrieve the best run and the best model
best_run, best_model = remote_run.get_output()
best_run, best_model

(Run(Experiment: pipeline,
 Id: e710b521-3912-4e89-9220-4e1b9968b806_49,
 Type: azureml.scriptrun,
 Status: Completed),
 Pipeline(memory=None,
          steps=[('prefittedsoftvotingclassifier',
                  PreFittedSoftVotingClassifier(classification_labels=array([0, 1]), estimators=[('31', Pipeline(memory=None, steps=[('StandardScalerWrapper', StandardScalerWrapper(
     copy=True,
     with_mean=False,
     with_std=False
 )), ('XGBoostClassifier', XGBoostClassifier(booster='gbtree', colsample_bytree=0.6, eta=0.05, gr...
     weight_column_name=None,
     cv_split_column_names=None,
     enable_streaming=None,
     timeseries_param_dict=None,
     gpu_training_param_dict={'processing_unit_type': 'cpu'}
 ), random_state=0, reg_alpha=1.3541666666666667, reg_lambda=2.291666666666667, subsample=0.9, tree_method='hist'))], verbose=False))], flatten_transform=None, weights=[0.2, 0.2, 0.2, 0.1, 0.1, 0.2]))],
          verbose=False))

In [18]:
# Wait for the best model explanation run to complete
from azureml.core.run import Run
model_explainability_run_id = remote_run.id + "_" + "ModelExplain"
print(model_explainability_run_id)
model_explainability_run = Run(experiment=experiment, run_id=model_explainability_run_id)
model_explainability_run.wait_for_completion()

e710b521-3912-4e89-9220-4e1b9968b806_ModelExplain


{'runId': 'e710b521-3912-4e89-9220-4e1b9968b806_ModelExplain',
 'target': 'cpu',
 'status': 'Completed',
 'startTimeUtc': '2021-05-13T01:40:49.332831Z',
 'endTimeUtc': '2021-05-13T01:45:33.448507Z',
 'properties': {'azureml.runsource': 'automl',
  'parentRunId': 'e710b521-3912-4e89-9220-4e1b9968b806_49',
  '_azureml.ComputeTargetType': 'amlcompute',
  'ContentSnapshotId': '703072d0-f7dc-476a-9180-11c4e6bda7a0',
  'ProcessInfoFile': 'azureml-logs/process_info.json',
  'ProcessStatusFile': 'azureml-logs/process_status.json',
  'dependencies_versions': '{"azureml-train-automl": "1.28.0", "azureml-train-automl-runtime": "1.28.0", "azureml-train-automl-client": "1.28.0", "azureml-telemetry": "1.28.0", "azureml-model-management-sdk": "1.0.1b6.post1", "azureml-interpret": "1.28.0", "azureml-explain-model": "1.28.0", "azureml-defaults": "1.28.0", "azureml-dataset-runtime": "1.28.0", "azureml-dataprep": "2.15.0", "azureml-dataprep-rslex": "1.13.0", "azureml-dataprep-native": "33.0.0", "azureml-

In [19]:
from azureml.interpret import ExplanationClient
client = ExplanationClient.from_run(best_run)
engineered_explanations = client.download_model_explanation(raw=False)
# Get feature importances
exp_data = engineered_explanations.get_feature_importance_dict()
exp_data



{'duration': 1.36021080140982,
 'nr_employed': 0.6506184929226811,
 'emp_var_rate': 0.5226955466298302,
 'cons_conf_idx': 0.1689837677861738,
 'euribor3m': 0.1653866374664852,
 'pdays': 0.064621272674185,
 'cons_price_idx': 0.061244001994764485,
 'age': 0.05318699305907205,
 'month_may': 0.048228735870395785,
 'campaign': 0.045884424932106344,
 'default_unknown': 0.029163893584115797,
 'job_blue-collar': 0.027993125300153927,
 'education_university_degree': 0.024958753741712802,
 'poutcome_nonexistent': 0.022029419135736524,
 'poutcome_success': 0.020463355184522816,
 'day_of_week_thu': 0.013959146212743977,
 'housing_yes': 0.012774578780432578,
 'contact_telephone': 0.012712526080932058,
 'month_mar': 0.011983939616728567,
 'month_oct': 0.010744626050913968,
 'month_nov': 0.010245941807608948,
 'education_basic_9y': 0.009790422852264103,
 'month_jul': 0.009789682339101847,
 'marital_single': 0.008629293193229299,
 'day_of_week_mon': 0.007217024306643511,
 'day_of_week_tue': 0.00697281
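
To compare the leading features at a glance, the importances can be ranked and expressed as shares. This is a small sketch with a hypothetical helper, `top_features`, applied to the first eight entries of the output above. Because the sample is a subset, the shares are relative to those eight features only, not to the full model.

```python
def top_features(importances, k=5):
    """Return the k most important features with their share of the listed total."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [(name, value, value / total) for name, value in ranked[:k]]


# First eight entries from the output above.
sample_importances = {
    "duration": 1.36021080140982,
    "nr_employed": 0.6506184929226811,
    "emp_var_rate": 0.5226955466298302,
    "cons_conf_idx": 0.1689837677861738,
    "euribor3m": 0.1653866374664852,
    "pdays": 0.064621272674185,
    "cons_price_idx": 0.061244001994764485,
    "age": 0.05318699305907205,
}

for name, value, share in top_features(sample_importances, k=3):
    print(f"{name:<15} {value:.3f}  ({share:.1%} of the listed features)")
```

Passing the full `exp_data` dictionary instead of the sample would give shares across all engineered features.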

#### Retrieve and Save the Best ONNX Model

In [20]:
from azureml.automl.runtime.onnx_convert import OnnxConverter
from azureml.train.automl import constants
if not os.path.isdir('onnx'):
    os.mkdir('onnx')
onnx_fl_path = "onnx/best_model.onnx"
best_run, onnx_mdl = remote_run.get_output(return_onnx_model=True)
OnnxConverter.save_onnx_model(onnx_mdl, 
                              onnx_fl_path)
res_path = "onnx/onnx_resource.json"
best_run.download_file(name=constants.MODEL_RESOURCE_PATH_ONNX, 
                       output_file_path=res_path)




#### Load ONNX Model

In [21]:
import onnx
import json
onnx_model = onnx.load(onnx_fl_path)
with open(res_path) as f:
    onnx_res = json.load(f)

#### Predict with the ONNX model, using onnxruntime package

In [22]:
import sys
import json
from azureml.automl.core.onnx_convert import OnnxConvertConstants
import onnxruntime
from azureml.automl.runtime.onnx_convert import OnnxInferenceHelper

# ONNX inference is only attempted on Python versions below OnnxIncompatiblePythonVersion
python_version_compatible = sys.version_info < OnnxConvertConstants.OnnxIncompatiblePythonVersion

if python_version_compatible:
    test_df = test_ds.to_pandas_dataframe()
    mdl_bytes = onnx_model.SerializeToString()
    onnxrt_helper = OnnxInferenceHelper(mdl_bytes, onnx_res)
    pred_onnx, pred_prob_onnx = onnxrt_helper.predict(test_df)

    print(pred_onnx)
    print(pred_prob_onnx)
else:
    print('Please use Python version 3.6 or 3.7 to run the inference helper.')

[0 0 0 ... 0 1 0]
[[0.91860694 0.08139309]
 [0.95949346 0.04050656]
 [0.94088775 0.05911225]
 ...
 [0.957514   0.04248616]
 [0.15309845 0.8469016 ]
 [0.91075104 0.08924901]]
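
The two printed arrays are related: each label appears to be the class with the larger probability in its row. The sketch below illustrates that relationship on the six visible rows. It assumes the helper picks the highest-probability class, which is consistent with the output but not confirmed by it, and `labels_from_probabilities` is a hypothetical function.

```python
def labels_from_probabilities(prob_rows, class_labels=(0, 1)):
    """Pick, for each row, the class label with the highest probability."""
    labels = []
    for row in prob_rows:
        best_index = max(range(len(row)), key=lambda i: row[i])
        labels.append(class_labels[best_index])
    return labels


# The six probability rows visible in the output above.
visible_rows = [
    [0.91860694, 0.08139309],
    [0.95949346, 0.04050656],
    [0.94088775, 0.05911225],
    [0.957514, 0.04248616],
    [0.15309845, 0.8469016],
    [0.91075104, 0.08924901],
]

print(labels_from_probabilities(visible_rows))  # [0, 0, 0, 0, 1, 0]
```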


In [24]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(test_df['y_yes'], pred_onnx)
# Visualize the confusion matrix
pd.DataFrame(cm).style.background_gradient(cmap='Blues', 
                                           low=0, 
                                           high=0.9)

Unnamed: 0,0,1
0,6102,1162
1,55,919


#### Load Test Data

In [25]:
X_test = test_data.drop(columns=['y_yes'])
y_test = test_data['y_yes']

#### Testing the Best Fitted Model

In [26]:
ypred = best_model.predict(X_test)
cm = confusion_matrix(y_test, ypred)

In [27]:
# Visualize the confusion matrix
pd.DataFrame(cm).style.background_gradient(cmap='Blues', 
                                           low=0, 
                                           high=0.9)

Unnamed: 0,0,1
0,6102,1162
1,55,919


## Deploy

#### Publish and run from REST endpoint

In [28]:
published_pipeline = pipeline_run.publish_pipeline(
    name="Bankmarketing Train", 
    description="Training bankmarketing pipeline", 
    version="1.0")

published_pipeline

Name,Id,Status,Endpoint
Bankmarketing Train,31ffd3a1-609c-40a6-9cbb-199a1922ce0b,Active,REST Endpoint


In [29]:
from azureml.core.authentication import InteractiveLoginAuthentication

# Authenticate once again, to retrieve the auth_header so that the endpoint can be used
interactive_auth = InteractiveLoginAuthentication()
auth_header = interactive_auth.get_authentication_header()

In [30]:
import requests

# Get the REST url from the endpoint property of the published pipeline object
rest_endpoint = published_pipeline.endpoint

# Build an HTTP POST request to the endpoint, specifying the authentication header
# The JSON payload names the experiment under which the new run is recorded
response = requests.post(rest_endpoint, 
                         headers=auth_header, 
                         json={"ExperimentName": "pipeline-rest-endpoint"}
                        )

In [31]:
try:
    response.raise_for_status()
except Exception:    
    raise Exception("Received bad response from the endpoint: {}\n"
                    "Response Code: {}\n"
                    "Headers: {}\n"
                    "Content: {}".format(rest_endpoint, 
                                         response.status_code, 
                                         response.headers, 
                                         response.content))

# Access the Id key from the response dict to get the value of the run id
run_id = response.json().get('Id')
print('Submitted pipeline run: ', run_id)

Submitted pipeline run:  87daaa4f-4e65-4d06-a8b9-606dbf2a7510


In [32]:
from azureml.pipeline.core.run import PipelineRun
from azureml.widgets import RunDetails

# Use the run id to monitor the status of the new run
published_pipeline_run = PipelineRun(
    ws.experiments["pipeline-rest-endpoint"], 
    run_id)
RunDetails(published_pipeline_run).show()

_PipelineWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', …

#### Register the best model

In [33]:
model_name = best_run.properties['model_name']

# Download the AutoML-generated scoring script to use as the service entry point
script_file_name = 'inference/score.py'
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_name)

In [34]:
description = 'AutoML voting ensemble trained on bank marketing data to predict whether a client will subscribe to a term deposit'
tags = None
model = remote_run.register_model(model_name = model_name, 
                                  description = description, 
                                  tags = tags)

print(remote_run.model_id)

e710b52139124e849


#### Deploy the model to ACI

In [37]:
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
from azureml.core.webservice import Webservice
from azureml.core.model import Model
from azureml.core.environment import Environment

inference_config = InferenceConfig(entry_script=script_file_name)

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, 
                                               memory_gb = 1, 
                                               tags = {'area': "bank_marketing", 'type': "classification"}, 
                                               description = 'ACI_service')

aci_service_name = 'bank-marketing-aci'
aci_service = Model.deploy(ws, 
                           aci_service_name, 
                           [model], 
                           inference_config, 
                           aciconfig)
aci_service.wait_for_deployment(True)
print(aci_service.state)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-05-12 21:05:43-05:00 Creating Container Registry if not exists.
2021-05-12 21:05:45-05:00 Use the existing image.
2021-05-12 21:05:46-05:00 Generating deployment configuration.
2021-05-12 21:05:47-05:00 Submitting deployment to compute..
2021-05-12 21:05:50-05:00 Checking the status of deployment bank-marketing-aci..
2021-05-12 21:10:56-05:00 Checking the status of inference endpoint bank-marketing-aci.
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy


In [38]:
aci_service.get_logs()

'2021-05-13T02:10:54,206085600+00:00 - rsyslog/run \n2021-05-13T02:10:54,206085600+00:00 - gunicorn/run \n2021-05-13T02:10:54,283917100+00:00 - iot-server/run \n2021-05-13T02:10:54,420095500+00:00 - nginx/run \nrsyslogd: /azureml-envs/azureml_48d60bd6e7fab6edf5a4021f49cfe5d3/lib/libuuid.so.1: no version information available (required by rsyslogd)\nEdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...\n2021-05-13T02:10:56,276640500+00:00 - iot-server/finish 1 0\n2021-05-13T02:10:56,282134300+00:00 - Exit code 1 is normal. Not restarting iot-server.\nStarting gunicorn 20.1.0\nListening at: http://127.0.0.1:31311 (68)\nUsing worker: sync\nworker timeout is set to 300\nBooting worker with pid: 96\nSPARK_HOME not set. Skipping PySpark Initialization.\nGenerating new fontManager, this may take some time...\nInitializing logger\n2021-05-13 02:11:00,186 | root | INFO | Starting up app insights client\n2021-05-13 02:11:00,188 | root | INFO | Starting up request id generato

## Test

In [39]:
import json
import requests

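X_test_json = X_test.to_json(orient='records')
# Wrap the records in the {"data": [...]} envelope sent to the scoring endpoint
data = '{"data": ' + X_test_json + '}'
headers = {'Content-Type': 'application/json'}

resp = requests.post(aci_service.scoring_uri, data, headers=headers)

# The response body is a JSON-encoded string, so it is decoded twice
y_pred = json.loads(json.loads(resp.text))['result']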
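
The request and response handling can be exercised without a live service. The sketch below simulates the round trip under the assumption that the service returns a JSON string that itself contains JSON, as the double `json.loads` in the cell above implies. `build_payload`, `parse_predictions`, and the sample feature values are hypothetical.

```python
import json


def build_payload(records):
    """Serialize a list of feature dicts into the {"data": [...]} request body."""
    return json.dumps({"data": records})


def parse_predictions(response_text):
    """Decode a response whose body may be a JSON string that contains JSON."""
    decoded = json.loads(response_text)
    if isinstance(decoded, str):  # double-encoded, as in the cell above
        decoded = json.loads(decoded)
    return decoded["result"]


# Simulated round trip; the feature names and values are made up for illustration.
payload = build_payload([{"age": 0.57, "duration": 0.08}])
fake_response = json.dumps(json.dumps({"result": [0]}))

print(payload)
print(parse_predictions(fake_response))
```

Checking the decoded type before the second `json.loads` lets the same function handle a service that returns plain JSON.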

In [40]:
cm = confusion_matrix(y_test, y_pred)
# Visualize the confusion matrix
pd.DataFrame(cm).style.background_gradient(cmap='Blues', 
                                           low=0, 
                                           high=0.9)

Unnamed: 0,0,1
0,6102,1162
1,55,919
