# Hyperparameter Tuning using HyperDrive

Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
import azureml.core
from azureml.core import Workspace, Experiment
from azureml.widgets import RunDetails
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import normal, uniform, choice
from azureml.core import Environment
from azureml.core import ScriptRunConfig
import os

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.27.0


## Initialize Workspace

In [2]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

quick-starts-ws-144966
aml-quickstarts-144966
southcentralus
a0a76bad-11a1-4a2d-9887-97a29122c8ed


## Create an Azure ML experiment

In [3]:
experiment_name = 'loan-prediction-h'
project_folder = './loan-prediction-h-project'

experiment=Experiment(ws, experiment_name)
experiment

Name,Workspace,Report Page,Docs Page
loan-prediction-h,quick-starts-ws-144966,Link to Azure Machine Learning studio,Link to Documentation


## Create a compute cluster

In [5]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# NOTE: update the cluster name to match the existing cluster
# Choose a name for your CPU cluster
amlcompute_cluster_name = "cpu-cluster-h"

# Verify that cluster does not exist already
try:
    compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='Standard_DS12_v2',# for GPU, use "STANDARD_NC6"
                                                           #vm_priority = 'lowpriority', # optional
                                                           max_nodes=10)
    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)
    compute_target.wait_for_completion(show_output=True, min_node_count = 1, timeout_in_minutes = 10)

Found existing cluster, use it.


## Dataset
In this project, we use a [loan prediction problem dataset](https://www.kaggle.com/altruistdelhite04/loan-prediction-problem-dataset) from Kaggle.
The dataset contains 11 features and the target column **Loan_Status**.

The dataset will be retrieved in the **train.py**.

## Hyperdrive Configuration

In this project, we use the [Scikit-learn Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) as a classification algorithm.

We tune two hyperparameters: the inverse of regularization strength (**C**) and the maximum number of iterations allowed for the solver to converge (**max_iter**).

For parameter sampling, we use [Random Parameter Sampling](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.randomparametersampling?view=azure-ml-py). Random sampling supports early termination of low-performing runs, which saves training time and compute cost and makes it a good fit for an initial search.
Here, **C** is sampled from 6 discrete values and **max_iter** from 8 discrete values.
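To make the size of this search space concrete, the following plain-Python sketch enumerates the grid and draws a few random configurations from it. It only illustrates the idea of sampling from discrete choices; it is not HyperDrive's implementation, and the helper name `sample_config`, the fixed seed, and the assumption that every value is equally likely are ours.

```python
import itertools
import random

# Discrete search space copied from the RandomParameterSampling call in this notebook.
search_space = {
    "C": [0.01, 0.1, 1, 5, 7.5, 10],
    "max_iter": [10, 25, 50, 60, 70, 80, 90, 100],
}


def sample_config(space, rng):
    """Pick one value per hyperparameter, independently (hypothetical helper)."""
    return {name: rng.choice(values) for name, values in space.items()}


# Count the distinct (C, max_iter) pairs in the grid: 6 * 8.
n_combinations = len(list(itertools.product(*search_space.values())))
print("Distinct configurations:", n_combinations)

rng = random.Random(0)  # fixed seed so repeated executions draw the same samples
samples = [sample_config(search_space, rng) for _ in range(10)]
for config in samples:
    print(config)
```

The grid holds 48 distinct configurations, far fewer than the `max_total_runs` value of 1000 used later in this notebook.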

For early termination, we use [Bandit Policy](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.banditpolicy?view=azure-ml-py).
This policy ends a run when its primary metric is not within the specified slack factor or slack amount of the most successful run.
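The slack-factor rule can be sketched in a few lines of plain Python. This follows the formula given in the Azure ML documentation for a metric that is being maximized: a run is cancelled when its metric falls below `best / (1 + slack_factor)`. The function names are hypothetical, and the sketch ignores `evaluation_interval` and `delay_evaluation`, which control when the check is applied.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric value a run may report without being cancelled (maximize goal)."""
    return best_metric / (1.0 + slack_factor)


def should_terminate(run_metric, best_metric, slack_factor):
    """Return True when the run falls outside the allowed slack of the best run."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)


# Example from the documentation: best run at 0.8 with slack_factor=0.2.
print(bandit_cutoff(0.8, 0.2))
print(should_terminate(0.60, 0.8, 0.2))
print(should_terminate(0.70, 0.8, 0.2))
```

With the `slack_factor=0.2` used in this notebook and the best accuracy of 0.8917 reported in the run output, the cutoff would be roughly 0.743. Each time the training script logs the primary metric counts as one interval, so with `evaluation_interval=100` and `delay_evaluation=200` the policy may rarely or never trigger if `train.py` logs Accuracy only once per run (an assumption, since `train.py` is not shown here).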

In the HyperDrive configuration, we specify **Accuracy** as the primary metric, the same metric used in the AutoML project, and set the primary metric goal to **PrimaryMetricGoal.MAXIMIZE**.

We also specify the following two parameters to limit the number of runs.
- **max_total_runs**: 1000 (The maximum total number of runs to create)
- **max_concurrent_runs**: 10 (The maximum number of runs to execute concurrently)

Reference:
[HyperDriveConfig Class](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.hyperdriveconfig?view=azure-ml-py)

In [11]:
# TODO: Create an early termination policy. This is not required if you are using Bayesian sampling.
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#early-termination
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.banditpolicy?view=azure-ml-py#definition
early_termination_policy = BanditPolicy(evaluation_interval=100, delay_evaluation=200, slack_factor=0.2)

#TODO: Create the different params that you will be using during training
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#random-sampling
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.randomparametersampling?view=azure-ml-py
# https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
# https://towardsdatascience.com/dont-sweat-the-solver-stuff-aea7cddc3451
# https://www.kaggle.com/joparga3/2-tuning-parameters-for-logistic-regression
#param_sampling = RandomParameterSampling({
#    "C": choice(0.1, 0.5, 1, 1.5, 2.5, 5, 7.5, 10),
#    "max_iter": choice(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
#})
param_sampling = RandomParameterSampling({
    "C": choice(0.01, 0.1, 1, 5, 7.5, 10),
    "max_iter": choice(10, 25, 50, 60, 70, 80, 90, 100)
})

#if "training" not in os.listdir():
#    os.mkdir("./training")

# Create a ScriptRunConfig for train.py (used here in place of the SKLearn estimator linked below)
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.sklearn.sklearn?view=azure-ml-py
sklearn_env = Environment.from_conda_specification(name='sklearn-env', file_path='./conda_dependencies.yml')
src = ScriptRunConfig(source_directory='.',
                     script='./train.py',
                     compute_target=compute_target,
                     environment=sklearn_env)

# Create a HyperDriveConfig using the run configuration, hyperparameter sampler, and policy.
# https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.hyperdrive.hyperdriveconfig?view=azure-ml-py
hd_config = HyperDriveConfig(run_config=src,
                                    hyperparameter_sampling=param_sampling,
                                    policy=early_termination_policy,
                                    primary_metric_name='Accuracy',
                                    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                    max_total_runs=1000,
                                    max_concurrent_runs=10)

In [12]:
#TODO: Submit your experiment
hd_run = experiment.submit(config=hd_config)

## Run Details

In the cell below, use the `RunDetails` widget to monitor the child runs of the experiment.

In [13]:
RunDetails(hd_run).show()

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

In [14]:
hd_run.wait_for_completion(show_output=True)

RunId: HD_be774c70-5155-40dd-afa8-5672ac2e6b5d
Web View: https://ml.azure.com/runs/HD_be774c70-5155-40dd-afa8-5672ac2e6b5d?wsid=/subscriptions/a0a76bad-11a1-4a2d-9887-97a29122c8ed/resourcegroups/aml-quickstarts-144966/workspaces/quick-starts-ws-144966&tid=660b3398-b80e-49d2-bc5b-ac1dc93b5254

Streaming azureml-logs/hyperdrive.txt

"<START>[2021-05-16T14:42:12.713274][API][INFO]Experiment created<END>\n""<START>[2021-05-16T14:42:13.259838][GENERATOR][INFO]Trying to sample '10' jobs from the hyperparameter space<END>\n""<START>[2021-05-16T14:42:13.471561][GENERATOR][INFO]Successfully sampled '10' jobs, they will soon be submitted to the execution target.<END>\n"<START>[2021-05-16T14:42:43.0588710Z][SCHEDULER][INFO]Scheduling job, id='HD_be774c70-5155-40dd-afa8-5672ac2e6b5d_0'<END><START>[2021-05-16T14:42:43.0709583Z][SCHEDULER][INFO]Scheduling job, id='HD_be774c70-5155-40dd-afa8-5672ac2e6b5d_7'<END><START>[2021-05-16T14:42:43.0703708Z][SCHEDULER][INFO]Scheduling job, id='HD_be774c70-5155

{'runId': 'HD_be774c70-5155-40dd-afa8-5672ac2e6b5d',
 'target': 'cpu-cluster-h',
 'status': 'Completed',
 'startTimeUtc': '2021-05-16T14:42:12.480626Z',
 'endTimeUtc': '2021-05-16T14:55:23.241066Z',
 'properties': {'primary_metric_config': '{"name": "Accuracy", "goal": "maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': '20df194b-5e3b-46bc-afa4-488fdeeb421a',
  'score': '0.8916666666666667',
  'best_child_run_id': 'HD_be774c70-5155-40dd-afa8-5672ac2e6b5d_33',
  'best_metric_status': 'Succeeded'},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/hyperdrive.txt': 'https://mlstrg144966.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_be774c70-5155-40dd-afa8-5672ac2e6b5d/azureml-logs/hyperdrive.txt?sv=2019-02-02&sr=b&sig=x4o%2BNQanq8CzZQ%2FWYMtrzhxJaJ4O9rW5PONahZvWCAo%3D&st=2021-05-16T14%3A45%3A35Z&se=2021-05-16T22%3A55%3A35Z&sp=r'},
 'submittedBy': 'ODL_User 144966

## Best Model

In the cell below, get the best model from the hyperdrive experiments and display all the properties of the model.

In [15]:
import joblib
# Get your best run and save the model from that run.
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#find-the-best-model
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn?view=azure-ml-py#save-and-register-the-model

best_run = hd_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
arguments = best_run.get_details()['runDefinition']['arguments']
print('Best Run Id: ', best_run.id)
print('Accuracy: ', best_run_metrics['Accuracy'])
print('C: ', arguments[1])
print('max_iter: ', arguments[3])

Best Run Id:  HD_be774c70-5155-40dd-afa8-5672ac2e6b5d_33
Accuracy:  0.8916666666666667
C:  10
max_iter:  70
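Reading `arguments[1]` and `arguments[3]` depends on the order in which the arguments were passed to the script. A slightly more robust alternative is to convert the flat argument list into a dictionary first. The sketch below assumes the list has the form `['--C', '10', '--max_iter', '70']`, which is consistent with the indices used above but is not shown in the output; `parse_script_arguments` is a hypothetical helper, not part of the Azure ML SDK.

```python
def parse_script_arguments(arguments):
    """Convert a flat list like ['--C', '10', '--max_iter', '70'] into a dict."""
    parsed = {}
    i = 0
    while i < len(arguments):
        token = str(arguments[i])
        if token.startswith("--"):
            name = token[2:]
            # A flag has a value only if the next token exists and is not another flag.
            has_value = (
                i + 1 < len(arguments)
                and not str(arguments[i + 1]).startswith("--")
            )
            parsed[name] = str(arguments[i + 1]) if has_value else None
            i += 2 if has_value else 1
        else:
            i += 1  # skip stray positional tokens
    return parsed


# Assumed shape of best_run.get_details()['runDefinition']['arguments'].
example_arguments = ["--C", "10", "--max_iter", "70"]
best_params = parse_script_arguments(example_arguments)
print("C:", best_params["C"])
print("max_iter:", best_params["max_iter"])
```

In the notebook, `example_arguments` would be replaced by the `arguments` variable from the cell above.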


### Register the best model

In [16]:
# Register the best model
model = best_run.register_model(model_name='loan-prediction-hd-model',
                                model_path='./outputs/model.joblib',
                                tags={'Method': 'Hyperdrive'},
                                description='Hyperdrive Model trained on loan prediction data to predict a loan status of customers',
                                properties={'Accuracy': best_run_metrics['Accuracy']})

In [17]:
# Save the model in the local project folder
best_run.download_file('outputs/model.joblib', project_folder + '/outputs/model.joblib')

## Model Deployment

Remember that you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

In the cell below, register the model, create an inference config and deploy the model as a web service.

In [5]:
# This is needed only when resuming the project
from azureml.core import Model
model = Model(ws, 'loan-prediction-hd-model')

#### Prepare a service environment

In [21]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
import sklearn


service_env = Environment('my-sklearn-environment')
service_env_dependencies = CondaDependencies.create(pip_packages=[
    'azureml-defaults',
    'inference-schema[numpy-support]',
    'joblib',
    'numpy',
    'scikit-learn=={}'.format(sklearn.__version__)
])
service_env.python.conda_dependencies = service_env_dependencies

In [22]:
# Save the environment definition to the local project folder
with open(project_folder + '/service_env.yml', "w") as f:
    f.write(service_env_dependencies.serialize_to_string())

#### Deploy the best model

In [23]:
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
from azureml.core import Model

service_name = 'loan-prediction-hd-service'

inference_config = InferenceConfig(entry_script=project_folder + '/score.py', environment=service_env)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                description="Loan status prediction service")

service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-05-16 15:28:25+00:00 Creating Container Registry if not exists.
2021-05-16 15:28:26+00:00 Registering the environment.
2021-05-16 15:28:27+00:00 Building image..
2021-05-16 15:34:10+00:00 Generating deployment configuration.
2021-05-16 15:34:11+00:00 Submitting deployment to compute.
2021-05-16 15:34:14+00:00 Checking the status of deployment loan-prediction-hd-service..
2021-05-16 15:36:28+00:00 Checking the status of inference endpoint loan-prediction-hd-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


### _Standout Suggestions - Enable logging in your deployed web app_

In [24]:
# Enable ApplicationInsights for logging in the endpoint
service.update(enable_app_insights=True)

### Test the API by using test dataset
In the cell below, send a request to the web service you deployed to test it.

#### Prepare test data

In [25]:
from azureml.data.dataset_factory import TabularDatasetFactory
from train import clean_data #import from local file
from sklearn.model_selection import train_test_split
import pandas as pd

web_path = ['https://raw.githubusercontent.com/fnakashima/nd00333-capstone/master/starter_file/dataset/train_u6lujuX_CVtuZ9i.csv']
ds = TabularDatasetFactory.from_delimited_files(path=web_path)

# Clean up data
x, y = clean_data(ds)

# Split data into train and test sets.
# https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
# Default test_size: 0.25
x_train, x_test, y_train, y_test = train_test_split(x, y)


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


#### Run the service with test data

In [32]:
import json

input_payload = json.dumps({
    'data': x_test[0:3].values.tolist()
})

output = service.run(input_payload)

print(output)

{"result": [1, 1, 1]}


#### Check the result

In [33]:
y_test[0:3].values

array([1, 1, 1])
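Comparing the two outputs by eye works for three rows; for a larger sample, the comparison can be automated. The following sketch assumes the service returns a JSON string with a `result` list, as in the output above, and uses a hypothetical helper name `fraction_correct`.

```python
import json


def fraction_correct(response_json, expected_labels):
    """Share of service predictions that match the expected labels."""
    predicted = json.loads(response_json)["result"]
    if len(predicted) != len(expected_labels):
        raise ValueError("prediction and label counts differ")
    if not predicted:
        raise ValueError("no predictions to compare")
    matches = sum(int(p) == int(y) for p, y in zip(predicted, expected_labels))
    return matches / len(predicted)


# Values copied from the two cell outputs above.
service_output = '{"result": [1, 1, 1]}'
expected = [1, 1, 1]
print(fraction_correct(service_output, expected))
```

In the notebook, `service_output` would be the `output` variable and `expected` would be `y_test[0:3].values.tolist()`.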

In the cell below, print the logs of the web service and delete the service.

In [34]:
# Show the logs of the web service
print(service.get_logs())

2021-05-16T15:45:48,739529400+00:00 - iot-server/run 
2021-05-16T15:45:48,738694900+00:00 - gunicorn/run 
2021-05-16T15:45:48,755710600+00:00 - rsyslog/run 
2021-05-16T15:45:48,788362000+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_17d14353fe479000f7ea8024c63d2036/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_17d14353fe479000f7ea8024c63d2036/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_17d14353fe479000f7ea8024c63d2036/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_17d14353fe479000f7ea8024c63d2036/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_17d14353fe479000f7ea8024c63d2036/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
EdgeHubC

In [43]:
# Delete the service
service.delete()