# Hyperparameter Tuning using HyperDrive

Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
import azureml.core
from azureml.core import Workspace, Experiment
from azureml.widgets import RunDetails
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import normal, uniform, choice
from azureml.core import Environment
from azureml.core import ScriptRunConfig
import os

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.27.0


## Initialize Workspace

In [2]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

quick-starts-ws-144884
aml-quickstarts-144884
southcentralus
f9d5a085-54dc-4215-9ba6-dad5d86e60a0


## Create an Azure ML experiment

In [3]:
experiment_name = 'loan-prediction-h'
project_folder = './loan-prediction-h-project'

experiment=Experiment(ws, experiment_name)
experiment

Name,Workspace,Report Page,Docs Page
loan-prediction-h,quick-starts-ws-144884,Link to Azure Machine Learning studio,Link to Documentation


## Create a compute cluster

In [4]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# NOTE: update the cluster name to match the existing cluster
# Choose a name for your CPU cluster
amlcompute_cluster_name = "cpu-cluster-h"

# Verify that cluster does not exist already
try:
    compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='Standard_DS12_v2',# for GPU, use "STANDARD_NC6"
                                                           #vm_priority = 'lowpriority', # optional
                                                           max_nodes=10)
    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)
    compute_target.wait_for_completion(show_output=True, min_node_count = 1, timeout_in_minutes = 10)

Found existing cluster, use it.


## Dataset
In this project, we use a [loan prediction problem dataset](https://www.kaggle.com/altruistdelhite04/loan-prediction-problem-dataset) from Kaggle.
The dataset contains 11 features and the target column **Loan_Status**.

The dataset is retrieved inside **train.py**.

## Hyperdrive Configuration

In this project, we use the [Scikit-learn Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) as a classification algorithm.

We tune two hyperparameters: the inverse of regularization strength (**C**) and the maximum number of iterations allowed for the solver to converge (**max_iter**).
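
HyperDrive hands each sampled value to the training script as a command-line argument named after its key in the sampling dictionary. The sketch below shows one way a script such as train.py could read the two values. The function name `parse_hyperparameters` and the default values are assumptions for illustration, since train.py itself is not shown in this notebook.

```python
import argparse

def parse_hyperparameters(argv=None):
    """Parse the two hyperparameters passed on the command line."""
    parser = argparse.ArgumentParser()
    # Defaults are illustrative assumptions, not values taken from train.py.
    parser.add_argument("--C", type=float, default=1.0,
                        help="Inverse of regularization strength")
    parser.add_argument("--max_iter", type=int, default=100,
                        help="Maximum number of iterations to converge")
    return parser.parse_args(argv)

# Example call with an explicit argument list instead of sys.argv.
args = parse_hyperparameters(["--C", "0.1", "--max_iter", "100"])
print(args.C, args.max_iter)
```

Passing an explicit list keeps the function easy to try from a notebook cell; calling it with no argument falls back to the real command line.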

For parameter sampling, we use [Random Parameter Sampling](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.randomparametersampling?view=azure-ml-py). Random sampling supports early termination of low-performing runs, which saves training time and compute cost and makes it well suited to an initial search.
Here, **C** is sampled from a choice of 6 values and **max_iter** from a choice of 4 values.
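
To get a feel for the size of this search space, the following sketch enumerates every combination of the two choice lists. It is plain Python and does not call the Azure ML SDK; the variable names are chosen for illustration.

```python
from itertools import product

# Candidate values copied from the sampling space used in this notebook.
c_values = [0.001, 0.01, 0.1, 1, 10, 100]
max_iter_values = [25, 50, 75, 100]

# Every distinct (C, max_iter) pair the sampler can draw from.
combinations = list(product(c_values, max_iter_values))
print(len(combinations))
```

The count is useful context when choosing a run budget such as **max_total_runs**, because a discrete space of this kind only holds a limited number of distinct configurations.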

For the early termination policy, we use [Bandit Policy](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.banditpolicy?view=azure-ml-py).
This policy ends runs whose primary metric is not within the specified slack factor or slack amount of the most successful run.
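
As a rough illustration of the slack-factor rule for a metric that is maximized, the Azure documentation describes the cutoff as the best metric divided by (1 + slack factor). The sketch below reproduces only that arithmetic; `bandit_cutoff` and `is_terminated` are hypothetical helpers, not SDK functions, and the best metric is the score reported for this experiment.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric value that survives a slack-factor check when maximizing."""
    return best_metric / (1 + slack_factor)

def is_terminated(run_metric, best_metric, slack_factor):
    """True when a run falls below the cutoff derived from the best run so far."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)

# Best score from this experiment, with the slack factor used in the policy.
cutoff = bandit_cutoff(best_metric=0.8333333333333334, slack_factor=0.2)
print(round(cutoff, 4))
print(is_terminated(0.65, 0.8333333333333334, 0.2))
```

The policy only applies at the configured evaluation intervals, so a run that reports its metric fewer times than **delay_evaluation** requires may never be checked.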

In the HyperDrive configuration, we specify **Accuracy** as the primary metric, the same metric used in the AutoML project, and set the primary metric goal to **PrimaryMetricGoal.MAXIMIZE**.

We also set the following two parameters to limit the number of runs.
- **max_total_runs**: 1000 (The maximum total number of runs to create)
- **max_concurrent_runs**: 10 (The maximum number of runs to execute concurrently)

Reference:
[HyperDriveConfig Class](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.hyperdriveconfig?view=azure-ml-py)

In [5]:
# TODO: Create an early termination policy. This is not required if you are using Bayesian sampling.
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#early-termination
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.banditpolicy?view=azure-ml-py#definition
early_termination_policy = BanditPolicy(evaluation_interval=100, delay_evaluation=200, slack_factor=0.2)

#TODO: Create the different params that you will be using during training
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#random-sampling
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.randomparametersampling?view=azure-ml-py
# https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
# https://towardsdatascience.com/dont-sweat-the-solver-stuff-aea7cddc3451
# https://www.kaggle.com/joparga3/2-tuning-parameters-for-logistic-regression
#param_sampling = RandomParameterSampling({
#    "C": choice(0.1, 0.5, 1, 1.5, 2.5, 5, 7.5, 10),
#    "max_iter": choice(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
#})
param_sampling = RandomParameterSampling({
    "C": choice(0.001, 0.01, 0.1, 1, 10, 100),
    "max_iter": choice(25,50,75,100)
})

#if "training" not in os.listdir():
#    os.mkdir("./training")

# Create a ScriptRunConfig for train.py (used here in place of the SKLearn estimator documented below)
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.sklearn.sklearn?view=azure-ml-py
sklearn_env = Environment.from_conda_specification(name='sklearn-env', file_path='./conda_dependencies.yml')
src = ScriptRunConfig(source_directory='.',
                     script='./train.py',
                     compute_target=compute_target,
                     environment=sklearn_env)

# Create a HyperDriveConfig using the run configuration, hyperparameter sampler, and policy.
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.hyperdriveconfig?view=azure-ml-py
hd_config = HyperDriveConfig(run_config=src,
                                    hyperparameter_sampling=param_sampling,
                                    policy=early_termination_policy,
                                    primary_metric_name='Accuracy',
                                    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                    max_total_runs=1000,
                                    max_concurrent_runs=10)

In [7]:
#TODO: Submit your experiment
hd_run = experiment.submit(config=hd_config)

## Run Details

In the cell below, use the `RunDetails` widget to show the different experiments.

In [8]:
RunDetails(hd_run).show()

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

In [9]:
hd_run.wait_for_completion(show_output=True)

RunId: HD_99bc19ac-9339-4e14-b251-51dc1dcac56b
Web View: https://ml.azure.com/runs/HD_99bc19ac-9339-4e14-b251-51dc1dcac56b?wsid=/subscriptions/f9d5a085-54dc-4215-9ba6-dad5d86e60a0/resourcegroups/aml-quickstarts-144884/workspaces/quick-starts-ws-144884&tid=660b3398-b80e-49d2-bc5b-ac1dc93b5254

Streaming azureml-logs/hyperdrive.txt

"<START>[2021-05-15T16:03:19.392481][API][INFO]Experiment created<END>\n""<START>[2021-05-15T16:03:19.921107][GENERATOR][INFO]Trying to sample '10' jobs from the hyperparameter space<END>\n""<START>[2021-05-15T16:03:20.120447][GENERATOR][INFO]Successfully sampled '10' jobs, they will soon be submitted to the execution target.<END>\n"

Execution Summary
RunId: HD_99bc19ac-9339-4e14-b251-51dc1dcac56b
Web View: https://ml.azure.com/runs/HD_99bc19ac-9339-4e14-b251-51dc1dcac56b?wsid=/subscriptions/f9d5a085-54dc-4215-9ba6-dad5d86e60a0/resourcegroups/aml-quickstarts-144884/workspaces/quick-starts-ws-144884&tid=660b3398-b80e-49d2-bc5b-ac1dc93b5254



{'runId': 'HD_99bc19ac-9339-4e14-b251-51dc1dcac56b',
 'target': 'cpu-cluster-h',
 'status': 'Completed',
 'startTimeUtc': '2021-05-15T16:03:19.140683Z',
 'endTimeUtc': '2021-05-15T16:16:53.775863Z',
 'properties': {'primary_metric_config': '{"name": "Accuracy", "goal": "maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': '21cbaf6f-607e-4c35-b47c-0e21a9118381',
  'score': '0.8333333333333334',
  'best_child_run_id': 'HD_99bc19ac-9339-4e14-b251-51dc1dcac56b_11',
  'best_metric_status': 'Succeeded'},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/hyperdrive.txt': 'https://mlstrg144884.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_99bc19ac-9339-4e14-b251-51dc1dcac56b/azureml-logs/hyperdrive.txt?sv=2019-02-02&sr=b&sig=BuPp6hWg2B%2FdI%2Bz5XJNoiJY4bYqzC4BMFCGDJvNWLkc%3D&st=2021-05-15T16%3A06%3A59Z&se=2021-05-16T00%3A16%3A59Z&sp=r'},
 'submittedBy': 'ODL_User 144884

## Best Model

In the cell below, get the best model from the hyperdrive experiments and display all the properties of the model.

In [10]:
import joblib
# Get your best run and save the model from that run.
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#find-the-best-model
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn?view=azure-ml-py#save-and-register-the-model

best_run = hd_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
arguments = best_run.get_details()['runDefinition']['arguments']
print('Best Run Id: ', best_run.id)
print('Accuracy: ', best_run_metrics['Accuracy'])
print('C: ', arguments[1])
print('max_iter: ', arguments[3])

Best Run Id:  HD_99bc19ac-9339-4e14-b251-51dc1dcac56b_11
Accuracy:  0.8333333333333334
C:  0.1
max_iter:  100
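
Reading `arguments[1]` and `arguments[3]` relies on the order in which the hyperparameters were appended. The sketch below pairs each flag with its value instead, assuming the list alternates `--name value` as the indexes above imply. `arguments_to_dict` is a hypothetical helper and the example list is made up to match that shape.

```python
def arguments_to_dict(arguments):
    """Pair each '--flag' with the value that follows it."""
    pairs = {}
    for flag, value in zip(arguments[0::2], arguments[1::2]):
        pairs[flag.lstrip("-")] = value
    return pairs

# Example list shaped like the one the positional indexes assume.
example_arguments = ["--C", "0.1", "--max_iter", "100"]
params = arguments_to_dict(example_arguments)
print("C:", params["C"])
print("max_iter:", params["max_iter"])
```

Looking values up by name keeps the printout correct even if the sampler emits the arguments in a different order.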


### Register the best model

In [6]:
# Register the best model
model = best_run.register_model(model_name='loan-prediction-hd-model',
                               model_path='./outputs/model.joblib',
                               tags={'Method':'Hyperdrive'},
                                description='Hyperdrive Model trained on loan prediction data to predict a loan status of customers',
                               properties={'Accuracy':best_run_metrics['Accuracy']})

NameError: name 'best_run' is not defined

In [12]:
# Save the model in the local project folder
best_run.download_file('outputs/model.joblib', project_folder + '/outputs/model.joblib')

### Prepare sample datasets for Model registration

In [28]:
from azureml.core import Dataset

# Try to load the dataset from the Workspace. Otherwise, create it from the file
# NOTE: update the key to match the dataset name
key = "raw-loan-prediction-dataset"
description_text = "Loan prediction dataset before cleaning"

if key in ws.datasets:
    raw_dataset = ws.datasets[key]
else:
    # Create AML Dataset and register it into Workspace
    example_data = 'https://raw.githubusercontent.com/fnakashima/nd00333-capstone/master/starter_file/dataset/train_u6lujuX_CVtuZ9i.csv'
    raw_dataset = Dataset.Tabular.from_delimited_files(example_data)
    # Register Dataset in Workspace
    raw_dataset = raw_dataset.register(workspace=ws,
                                       name=key,
                                       description=description_text)

df = raw_dataset.to_pandas_dataframe()
df

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,False,0,Graduate,False,5849,0.0,,360.0,1.0,Urban,True
1,LP001003,Male,True,1,Graduate,False,4583,1508.0,128.0,360.0,1.0,Rural,False
2,LP001005,Male,True,0,Graduate,True,3000,0.0,66.0,360.0,1.0,Urban,True
3,LP001006,Male,True,0,Not Graduate,False,2583,2358.0,120.0,360.0,1.0,Urban,True
4,LP001008,Male,False,0,Graduate,False,6000,0.0,141.0,360.0,1.0,Urban,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...
609,LP002978,Female,False,0,Graduate,False,2900,0.0,71.0,360.0,1.0,Rural,True
610,LP002979,Male,True,3+,Graduate,False,4106,0.0,40.0,180.0,1.0,Rural,True
611,LP002983,Male,True,1,Graduate,False,8072,240.0,253.0,360.0,1.0,Urban,True
612,LP002984,Male,True,2,Graduate,False,7583,0.0,187.0,360.0,1.0,Urban,True


In [29]:
def clean_data(data):
    # Dict for cleaning data
    dependents = {"0":0, "1":1, "2":2, "3+":3}
    property_areas = {"Urban":1, "Semiurban":2, "Rural":3}

    # Drop rows with missing values and encode categorical columns as integers
    x_df = data.dropna()
    x_df.drop("Loan_ID", axis=1, inplace=True)

    # Filtering on "True", "Yes", "Y" won't work because the dataset framework automatically recognises these values as booleans
    x_df.loc[:,('Gender')] = x_df.Gender.apply(lambda s: 1 if s == "Male" else 2)
    x_df.loc[:,('Married')] = x_df.Married.apply(lambda s: 1 if s else 0)
    x_df.loc[:,('Dependents')] = x_df.Dependents.map(dependents)
    x_df.loc[:,('Education')] = x_df.Education.apply(lambda s: 1 if s == "Graduate" else 0)
    x_df.loc[:,('Self_Employed')] = x_df.Self_Employed.apply(lambda s: 1 if s else 0)
    x_df.loc[:,('Property_Area')] = x_df.Property_Area.map(property_areas)

    y_df = x_df.pop("Loan_Status").apply(lambda s: 1 if s else 0)
    return x_df, y_df

In [30]:
from sklearn.model_selection import train_test_split
import pandas as pd

df
x, y = clean_data(df)
x


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


Unnamed: 0,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area
1,1,1,1,1,0,4583,1508.0,128.0,360.0,1.0,3
2,1,1,0,1,1,3000,0.0,66.0,360.0,1.0,1
3,1,1,0,0,0,2583,2358.0,120.0,360.0,1.0,1
4,1,0,0,1,0,6000,0.0,141.0,360.0,1.0,1
5,1,1,2,1,1,5417,4196.0,267.0,360.0,1.0,1
...,...,...,...,...,...,...,...,...,...,...,...
609,2,0,0,1,0,2900,0.0,71.0,360.0,1.0,3
610,1,1,3,1,0,4106,0.0,40.0,180.0,1.0,3
611,1,1,1,1,0,8072,240.0,253.0,360.0,1.0,1
612,1,1,2,1,0,7583,0.0,187.0,360.0,1.0,1


In [31]:
# Split data into train and test sets.
# https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
# Default test_size: 0.25
x_train, x_test, y_train, y_test = train_test_split(x, y)

In [33]:
# https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb
import numpy as np

from azureml.core import Dataset

np.savetxt('features.csv', x_test, delimiter=',')
np.savetxt('labels.csv', y_test, delimiter=',')

datastore = ws.get_default_datastore()
datastore.upload_files(files=['./features.csv', './labels.csv'],
                       target_path='loan-prediction-h-project/',
                       overwrite=True)

input_dataset = Dataset.Tabular.from_delimited_files(path=[(datastore, 'loan-prediction-h-project/features.csv')])
output_dataset = Dataset.Tabular.from_delimited_files(path=[(datastore, 'loan-prediction-h-project/labels.csv')])

Uploading an estimated of 2 files
Uploading ./features.csv
Uploaded ./features.csv, 1 files out of an estimated total of 2
Uploading ./labels.csv
Uploaded ./labels.csv, 2 files out of an estimated total of 2
Uploaded 2 files
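
One detail worth checking: `np.savetxt` writes no header row, while `Dataset.Tabular.from_delimited_files` by default treats the first row of a file as column names. If that default applies here, the first test record would be read as a header. The sketch below writes a CSV with an explicit header using only the standard library. `write_csv_with_header` and the output file name are hypothetical; the column names and sample rows are taken from the cleaned frame shown above.

```python
import csv

FEATURE_COLUMNS = ["Gender", "Married", "Dependents", "Education", "Self_Employed",
                   "ApplicantIncome", "CoapplicantIncome", "LoanAmount",
                   "Loan_Amount_Term", "Credit_History", "Property_Area"]

def write_csv_with_header(path, header, rows):
    """Write rows to path, preceded by a single header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)

# Two rows taken from the cleaned frame shown above.
sample_rows = [
    [1, 1, 1, 1, 0, 4583, 1508.0, 128.0, 360.0, 1.0, 3],
    [1, 1, 0, 1, 1, 3000, 0.0, 66.0, 360.0, 1.0, 1],
]
write_csv_with_header("features_with_header.csv", FEATURE_COLUMNS, sample_rows)
```

With a pandas frame, `x_test.to_csv('features.csv', index=False)` produces the same layout in one call.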


## Model Deployment

Remember you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

In the cell below, register the model and deploy the model as a web service.

In [5]:
from azureml.core import Model
model = Model(ws, 'loan-prediction-hd-model')

In [14]:
with open(project_folder + '/score.py') as f:
    print(f.read())

import json
import logging
import os
import pickle
import numpy as np
import pandas as pd
import joblib

import azureml.automl.core
from azureml.automl.core.shared import logging_utilities, log_server
from azureml.telemetry import INSTRUMENTATION_KEY

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

try:
    log_server.enable_telemetry(INSTRUMENTATION_KEY)
    log_server.set_verbosity('INFO')
    #logger = logging.getLogger(__name__)
    logger = logging.getLogger('azureml.automl.core.scoring_script')
except:
    pass

# The init() method is called once, when the web service starts up.
#
# Typically you would deserialize the model file, as shown here using joblib,
# and store it in a global variable so your run() method can access it later.
def init():
    global model

    # The AZUREML_MO

In [None]:
sklearn_env = Environment.from_conda_specification(name='sklearn-env', file_path='./conda_dependencies.yml')

In [6]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
import sklearn


environment = Environment('my-sklearn-environment')
environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[
    'azureml-defaults',
    'inference-schema[numpy-support]',
    'joblib',
    'numpy',
    'scikit-learn=={}'.format(sklearn.__version__)
])
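
The scoring script printed earlier imports `pandas` and `azureml.automl.core`, while the environment above lists only five pip packages. A container that crashes during startup can be caused by an import that is not installed, so it may be worth comparing the two. The sketch below is one possible check using the standard `ast` module. The helper names and the `available` set of import prefixes are assumptions; whether a module is really present in the image also depends on transitive dependencies.

```python
import ast

def imported_modules(source):
    """Return the dotted names of all absolute imports in a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules

def missing_modules(source, available_prefixes):
    """List imports not covered by any available module prefix."""
    def covered(module):
        return any(module == prefix or module.startswith(prefix + ".")
                   for prefix in available_prefixes)
    return sorted(m for m in imported_modules(source) if not covered(m))

# A few import lines copied from the score.py printed earlier.
score_source = """
import json
import numpy as np
import pandas as pd
import joblib
import azureml.automl.core
from inference_schema.schema_decorators import input_schema, output_schema
"""

# Assumed import names provided by the standard library and the five pip packages.
available = {"json", "logging", "os", "pickle", "numpy", "joblib",
             "sklearn", "inference_schema", "azureml.core"}

print(missing_modules(score_source, available))
```

Any name this reports is only a candidate for investigation; `service.get_logs()` remains the authoritative source for the actual startup error.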

In [15]:
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
from azureml.core import Model

service_name = 'loan-prediction-hd-service3'

inference_config = InferenceConfig(entry_script=project_folder + '/score.py', environment=environment)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                enable_app_insights=True,
                                                description="Loan status prediction service")

service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-05-15 18:42:45+00:00 Creating Container Registry if not exists.
2021-05-15 18:42:46+00:00 Registering the environment.
2021-05-15 18:42:47+00:00 Use the existing image.
2021-05-15 18:42:47+00:00 Generating deployment configuration.
2021-05-15 18:42:48+00:00 Submitting deployment to compute.
2021-05-15 18:42:50+00:00 Checking the status of deployment loan-prediction-hd-service3..
2021-05-15 18:45:33+00:00 Checking the status of inference endpoint loan-prediction-hd-service3.
Failed


Service deployment polling reached non-successful terminal state, current service state: Failed
Operation ID: 35f12206-f7d3-4fab-a713-1b21f1e40268
More information can be found using '.get_logs()'
Error:
{
  "code": "AciDeploymentFailed",
  "statusCode": 400,
  "message": "Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
	1. Please check the logs for your container instance: loan-prediction-hd-service3. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
	2. You can interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
	3. You can also try to run image 3c9f512a451e46feb66962f4f34f6621.azurecr.io/azureml/azureml_73804c8efe1927e444374f2dcb6e4227 locally. Please refer to https://aka.ms/debugimage#servi

WebserviceException: WebserviceException:
	Message: Service deployment polling reached non-successful terminal state, current service state: Failed
Operation ID: 35f12206-f7d3-4fab-a713-1b21f1e40268
More information can be found using '.get_logs()'
Error:
{
  "code": "AciDeploymentFailed",
  "statusCode": 400,
  "message": "Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
	1. Please check the logs for your container instance: loan-prediction-hd-service3. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
	2. You can interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
	3. You can also try to run image 3c9f512a451e46feb66962f4f34f6621.azurecr.io/azureml/azureml_73804c8efe1927e444374f2dcb6e4227 locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information.",
  "details": [
    {
      "code": "CrashLoopBackOff",
      "message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.
	1. Please check the logs for your container instance: loan-prediction-hd-service3. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
	2. You can interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
	3. You can also try to run image 3c9f512a451e46feb66962f4f34f6621.azurecr.io/azureml/azureml_73804c8efe1927e444374f2dcb6e4227 locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information."
    },
    {
      "code": "AciDeploymentFailed",
      "message": "Your container application crashed. Please follow the steps to debug:
	1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. Please refer to https://aka.ms/debugimage#dockerlog for more information.
	2. If your container application crashed. This may be caused by errors in your scoring file's init() function. You can try debugging locally first. Please refer to https://aka.ms/debugimage#debug-locally for more information.
	3. You can also interactively debug your scoring file locally. Please refer to https://docs.microsoft.com/azure/machine-learning/how-to-debug-visual-studio-code#debug-and-troubleshoot-deployments for more information.
	4. View the diagnostic events to check status of container, it may help you to debug the issue.
"RestartCount": 3
"CurrentState": {"state":"Waiting","startTime":null,"exitCode":null,"finishTime":null,"detailStatus":"CrashLoopBackOff: Back-off restarting failed"}
"PreviousState": {"state":"Terminated","startTime":"2021-05-15T18:46:54.17Z","exitCode":111,"finishTime":"2021-05-15T18:46:59.794Z","detailStatus":"Error"}
"Events":
{"count":1,"firstTimestamp":"2021-05-15T18:43:01Z","lastTimestamp":"2021-05-15T18:43:01Z","name":"Pulling","message":"pulling image "3c9f512a451e46feb66962f4f34f6621.azurecr.io/azureml/azureml_73804c8efe1927e444374f2dcb6e4227@sha256:301311c8354761d43cab182892134da9c8d533f3c47d0c84aa42cbc9149007ff"","type":"Normal"}
{"count":1,"firstTimestamp":"2021-05-15T18:44:48Z","lastTimestamp":"2021-05-15T18:44:48Z","name":"Pulled","message":"Successfully pulled image "3c9f512a451e46feb66962f4f34f6621.azurecr.io/azureml/azureml_73804c8efe1927e444374f2dcb6e4227@sha256:301311c8354761d43cab182892134da9c8d533f3c47d0c84aa42cbc9149007ff"","type":"Normal"}
{"count":4,"firstTimestamp":"2021-05-15T18:45:21Z","lastTimestamp":"2021-05-15T18:46:54Z","name":"Started","message":"Started container","type":"Normal"}
{"count":4,"firstTimestamp":"2021-05-15T18:45:26Z","lastTimestamp":"2021-05-15T18:46:59Z","name":"Killing","message":"Killing container with id 9b8e465259e228ba82936858c7cebfdeabbd94b00206b28eb226afa090406245.","type":"Normal"}
"
    }
  ]
}
	InnerException None

In [13]:
print(ws.webservices)

# Choose the webservice you are interested in

from azureml.core import Webservice

service = Webservice(ws, service_name)
print(service.get_logs())

{'loan-prediction-hd-service2': AciWebservice(workspace=Workspace.create(name='quick-starts-ws-144884', subscription_id='f9d5a085-54dc-4215-9ba6-dad5d86e60a0', resource_group='aml-quickstarts-144884'), name=loan-prediction-hd-service2, image_id=None, compute_type=None, state=ACI, scoring_uri=None, tags=None, properties={}, created_by={'hasInferenceSchema': 'False', 'hasHttps': 'False'}), 'loan-prediction-hd-service1': AciWebservice(workspace=Workspace.create(name='quick-starts-ws-144884', subscription_id='f9d5a085-54dc-4215-9ba6-dad5d86e60a0', resource_group='aml-quickstarts-144884'), name=loan-prediction-hd-service1, image_id=None, compute_type=None, state=ACI, scoring_uri=None, tags=None, properties=None, created_by={'hasInferenceSchema': 'False', 'hasHttps': 'False'}), 'loan-prediction-hd-service': AciWebservice(workspace=Workspace.create(name='quick-starts-ws-144884', subscription_id='f9d5a085-54dc-4215-9ba6-dad5d86e60a0', resource_group='aml-quickstarts-144884'), name=loan-predict

In [24]:
import json
import logging
import os
import pickle
import numpy as np
import pandas as pd
import joblib

# The init() method is called once, when the web service starts up.
#
# Typically you would deserialize the model file, as shown here using joblib,
# and store it in a global variable so your run() method can access it later.
target_model_filename = 'outputs/model.joblib'
target_model_path = os.path.join(os.environ['AZUREML_MODEL_DIR'], target_model_filename)
print(target_model_path)
loaded_model = joblib.load(target_model_path)
print(loaded_model)

KeyError: 'AZUREML_MODEL_DIR'
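
`AZUREML_MODEL_DIR` is set inside the deployed container, not in the notebook kernel, which explains the `KeyError` above. The sketch below reads the variable with a fallback and then searches for the file by name, so it works whether the model file sits at the root of the model directory or in a subfolder such as `outputs`. `find_model_file` and its `fallback_dir` parameter are hypothetical names chosen for illustration.

```python
import os

def find_model_file(filename, fallback_dir="."):
    """Locate a model file under AZUREML_MODEL_DIR, or under fallback_dir when unset."""
    root = os.environ.get("AZUREML_MODEL_DIR", fallback_dir)
    # Walk the directory tree so the exact subfolder layout does not matter.
    for current_dir, _subdirs, files in os.walk(root):
        if filename in files:
            return os.path.join(current_dir, filename)
    raise FileNotFoundError(f"{filename} not found under {root}")
```

Inside `init()`, a call such as `joblib.load(find_model_file("model.joblib"))` would then load the model; in the notebook, passing `fallback_dir=project_folder` points the search at the locally downloaded copy.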

In [22]:
#import logging
#logging.basicConfig(level=logging.DEBUG)
print(Model.get_model_path(model_name='loan-prediction-hd-model'))

DEBUG:azureml.core.run:Could not load run context RunEnvironmentException:
	Message: Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run.
	InnerException None
	ErrorResponse 
{
    "error": {
        "message": "Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run."
    }
}, switching offline: False
DEBUG:azureml.core.run:Could not load the run context and allow_offline set to False
DEBUG:azureml.core.model:RunEnvironmentException: RunEnvironmentException:
	Message: Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run.
	InnerException RunEnvironmentException:
	Message: Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run.
	InnerException None
	ErrorResponse 
{
    "error": {
   

ModelNotFoundException: ModelNotFoundException:
	Message: Model loan-prediction-hd-model not found in cache at azureml-models or in current working directory /mnt/batch/tasks/shared/LS_root/mounts/clusters/jupyter-cpmpute-f/code/Users/odl_user_144884. For more info, set logging level to DEBUG.
	InnerException None
	ErrorResponse 
{
    "error": {
        "message": "Model loan-prediction-hd-model not found in cache at azureml-models or in current working directory /mnt/batch/tasks/shared/LS_root/mounts/clusters/jupyter-cpmpute-f/code/Users/odl_user_144884. For more info, set logging level to DEBUG."
    }
}

In [34]:
# https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb
import sklearn

from azureml.core import Model
from azureml.core.resource_configuration import ResourceConfiguration


model = Model.register(workspace=ws,
                       model_name='loan-prediction-hd-model',                # Name of the registered model in your workspace.
                       model_path='./loan-prediction-h-project/outputs/model.joblib',  # Local file to upload and register as a model.
                       model_framework=Model.Framework.SCIKITLEARN,  # Framework used to create the model.
                       model_framework_version=sklearn.__version__,  # Version of scikit-learn used to create the model.
                       sample_input_dataset=input_dataset,
                       sample_output_dataset=output_dataset,
                       resource_configuration=ResourceConfiguration(cpu=1, memory_in_gb=0.5),
                       description='HyperDrive model trained on loan prediction data to predict the loan status of customers')

print('Name:', model.name)
print('Version:', model.version)

Registering model loan-prediction-hd-model
Name: loan-prediction-hd-model
Version: 2


In [35]:
from azureml.core import Model
service_name = 'loan-prediction-service'

service = Model.deploy(ws, service_name, [model], overwrite=True)
service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-05-15 12:35:48+00:00 Creating Container Registry if not exists.
2021-05-15 12:35:48+00:00 Registering the environment.
2021-05-15 12:35:49+00:00 Uploading autogenerated assets for no-code-deployment.
2021-05-15 12:35:52+00:00 Building image..
2021-05-15 12:40:21+00:00 Generating deployment configuration.
2021-05-15 12:40:22+00:00 Submitting deployment to compute..
2021-05-15 12:40:36+00:00 Checking the status of deployment loan-prediction-service..
2021-05-15 12:42:19+00:00 Checking the status of inference endpoint loan-prediction-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


### Turn on Application Insights for logging

In [36]:
service.update(enable_app_insights=True)

### Test the API using the test dataset
In the cell below, send a request to the deployed web service to test it.

### Prepare test data

In [38]:
x_test[0:5].values

array([[1.000e+00, 1.000e+00, 3.000e+00, 1.000e+00, 0.000e+00, 3.430e+03,
        1.250e+03, 1.280e+02, 3.600e+02, 0.000e+00, 2.000e+00],
       [1.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 1.000e+00, 5.800e+03,
        0.000e+00, 1.320e+02, 3.600e+02, 1.000e+00, 2.000e+00],
       [1.000e+00, 0.000e+00, 0.000e+00, 1.000e+00, 0.000e+00, 2.935e+03,
        0.000e+00, 9.800e+01, 3.600e+02, 1.000e+00, 2.000e+00],
       [1.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 2.927e+03,
        2.405e+03, 1.110e+02, 3.600e+02, 1.000e+00, 2.000e+00],
       [1.000e+00, 1.000e+00, 2.000e+00, 1.000e+00, 0.000e+00, 3.200e+03,
        7.000e+02, 7.000e+01, 3.600e+02, 1.000e+00, 1.000e+00]])

### Run the service with test data

In [39]:
import json


input_payload = json.dumps({
    'data': x_test[0:5].values.tolist(),
    'method': 'predict'  # If you have a classification model, you can get probabilities by changing this to 'predict_proba'.
})

output = service.run(input_payload)

print(output)

{'predict': [0, 1, 1, 1, 1]}
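
The request body used above is a JSON object with a `data` list of feature rows and a `method` name. As a rough sketch, the payload can be built and checked before it is sent. The helper `build_payload` is hypothetical, and the expected feature count of 11 is taken from the `x_test` rows printed earlier.

```python
import json

N_FEATURES = 11  # assumption: each x_test row printed above has 11 values


def build_payload(rows, method="predict", n_features=N_FEATURES):
    """Serialize feature rows into the JSON shape used by the request above.

    `build_payload` is a hypothetical helper, not part of the Azure ML SDK.
    """
    clean_rows = [[float(value) for value in row] for row in rows]
    for index, row in enumerate(clean_rows):
        if len(row) != n_features:
            raise ValueError(
                f"row {index} has {len(row)} features, expected {n_features}"
            )
    return json.dumps({"data": clean_rows, "method": method})


# First test row from the notebook output, written without scientific notation.
sample_rows = [[1.0, 1.0, 3.0, 1.0, 0.0, 3430.0, 1250.0, 128.0, 360.0, 0.0, 2.0]]
input_payload = build_payload(sample_rows)
print(input_payload)
```

Validating the row length locally gives a clearer error than a failed request if a column is dropped during preprocessing.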


### Check the result

In [40]:
y_test[0:5].values

array([0, 1, 1, 1, 1])
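
Instead of comparing the two printed outputs by eye, the check can be done in code. The sketch below assumes the response is a dict with a `predict` key, as in the output shown earlier; `compare_predictions` is a hypothetical helper name.

```python
def compare_predictions(response, expected, key="predict"):
    """Return (accuracy, mismatched_indices) for a service response and true labels.

    `compare_predictions` is a hypothetical helper used for illustration.
    """
    predicted = list(response[key])
    expected = list(expected)
    if not expected:
        raise ValueError("expected labels must not be empty")
    if len(predicted) != len(expected):
        raise ValueError(
            f"got {len(predicted)} predictions for {len(expected)} labels"
        )
    mismatches = [i for i, (p, e) in enumerate(zip(predicted, expected)) if p != e]
    accuracy = 1.0 - len(mismatches) / len(expected)
    return accuracy, mismatches


# Values copied from the notebook outputs above.
response = {"predict": [0, 1, 1, 1, 1]}
labels = [0, 1, 1, 1, 1]
accuracy, mismatches = compare_predictions(response, labels)
print(accuracy, mismatches)  # prints: 1.0 []
```

Five samples are enough for a smoke test of the endpoint but say little about model quality.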

In the cells below, print the logs of the web service and then delete the service.

In [41]:
# Show the logs of the web service
print(service.get_logs())

2021-05-15T12:42:10,239973500+00:00 - iot-server/run 
2021-05-15T12:42:10,239158600+00:00 - gunicorn/run 
File not found: /var/azureml-app/.
Starting HTTP server
2021-05-15T12:42:10,339944500+00:00 - rsyslog/run 
2021-05-15T12:42:10,430002500+00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2021-05-15T12:42:12,640725400+00:00 - iot-server/finish 1 0
2021-05-15T12:42:12,720681800+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 19.9.0
Listening at: http://127.0.0.1:31311 (62)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 89
SPARK_HOME not set. Skipping PySpark Initialization.
Initializing logger
2021-05-15 12:42:24,349 | root | INFO | Starting up app insights client
logging socket was found. logging is available.
logging socket was found. logging is available.
2021-05-15 12:42:24,419 | root | INFO | Starting up request id generator
2021-05-15 12:42:24,419 | root | INFO | Starting up app in
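
Service logs are long, so it can help to filter them for likely problems before reading them in full. The sketch below is one way to do that on the text returned by `service.get_logs()`. The helper `find_suspect_lines` and the marker strings are assumptions, not an official log format.

```python
def find_suspect_lines(log_text, markers=("ERROR", "Traceback", "Exception")):
    """Return (line_number, line) pairs whose text contains any marker.

    The marker list is an assumption; adjust it to the messages you care about.
    """
    suspects = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            suspects.append((number, line.strip()))
    return suspects


# A few lines copied from the log output above.
sample_log = "\n".join([
    "File not found: /var/azureml-app/.",
    "Starting HTTP server",
    "2021-05-15T12:42:12,720681800+00:00 - Exit code 1 is normal. Not restarting iot-server.",
    "2021-05-15 12:42:24,349 | root | INFO | Starting up app insights client",
])
print(find_suspect_lines(sample_log))  # prints: []
```

An empty result only means that none of the chosen markers appeared; it does not prove the deployment is healthy.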

In [43]:
# Delete the service
service.delete()