# Tuning Hyperparameters

Many machine learning algorithms require *hyperparameters*: parameter values that influence training but can't be determined from the training data itself. For example, when training a logistic regression model, you can use a *regularization rate* hyperparameter to counteract overfitting; when training a convolutional neural network, you can use hyperparameters such as *learning rate* and *batch size* to control how weights are adjusted and how many data items are processed in a mini-batch, respectively. The choice of hyperparameter values can significantly affect the performance of a trained model or the time taken to train it, and you often need to try multiple combinations to find a good solution.

In this project, you'll use a logistic regression model with three hyperparameters, but the principles apply to any kind of model you can train with Azure Machine Learning.
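To make the idea of a hyperparameter concrete, here is a minimal sketch using plain Python. It swaps in one-dimensional ridge regression (not the logistic regression used later) because its solution has a simple closed form; the toy data and the name `ridge_weight` are assumptions made for illustration only. The regularization rate `lam` is chosen by you, not learned from the data, and larger values shrink the fitted weight.

```python
# Closed-form 1-D ridge regression without an intercept:
#   w = sum(x * y) / (sum(x * x) + lam)
def ridge_weight(xs, ys, lam):
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Toy data (hypothetical), roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Fit the same data with several values of the regularization hyperparameter
weights = {lam: ridge_weight(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, w in weights.items():
    print(f"lambda={lam:>6}: w={w:.3f}")
```

The data never changes, yet each value of `lam` produces a different model, which is why hyperparameters have to be tuned by trying several values.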

## Connect to Your Workspace

The first thing you need to do is to connect to your workspace using the Azure ML SDK.

> **Note**: If the authenticated session with your Azure subscription has expired since you completed the previous exercise, you'll be prompted to reauthenticate.

In [1]:
import azureml.core
from azureml.core import Workspace
import os
from azureml.core import Experiment
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive import BanditPolicy, HyperDriveConfig, PrimaryMetricGoal, uniform, choice, RandomParameterSampling
from azureml.widgets import RunDetails
from azureml.core import Run, ScriptRunConfig

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.19.0


In [2]:
# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

Ready to use Azure ML 1.19.0 to work with quick-starts-ws-136014
quick-starts-ws-136014
aml-quickstarts-136014
southcentralus
3e42d11f-d64d-4173-af9b-12ecaa1030b3


## Prepare Data for an Experiment

In this lab, you'll use the heart failure clinical records dataset. Run the cell below to create this dataset (if it is already registered in your workspace, the code loads the existing dataset instead).

In [3]:
from azureml.core import Dataset

# Load the dataset from the workspace if it is already registered; otherwise create it from the file
# NOTE: update the key to match the dataset name
key = "Heartfailure Dataset"
description_text = "Heart failure DataSet for Kaggle or archive.ics.uci.edu machine-learning"

if key in ws.datasets.keys():
    dataset = ws.datasets[key]
else:
    # Create AML Dataset and register it into Workspace
    heartfailure_data = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv'
    dataset = Dataset.Tabular.from_delimited_files(heartfailure_data)
    # Register Dataset in Workspace
    dataset = dataset.register(workspace=ws,
                               name=key,
                               description=description_text)


df = dataset.to_pandas_dataframe()
df.describe()

,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
count,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0
mean,60.833893,0.431438,581.839465,0.41806,38.083612,0.351171,263358.029264,1.39388,136.625418,0.648829,0.32107,130.26087,0.32107
std,11.894809,0.496107,970.287881,0.494067,11.834841,0.478136,97804.236869,1.03451,4.412477,0.478136,0.46767,77.614208,0.46767
min,40.0,0.0,23.0,0.0,14.0,0.0,25100.0,0.5,113.0,0.0,0.0,4.0,0.0
25%,51.0,0.0,116.5,0.0,30.0,0.0,212500.0,0.9,134.0,0.0,0.0,73.0,0.0
50%,60.0,0.0,250.0,0.0,38.0,0.0,262000.0,1.1,137.0,1.0,0.0,115.0,0.0
75%,70.0,1.0,582.0,1.0,45.0,1.0,303500.0,1.4,140.0,1.0,1.0,203.0,1.0
max,95.0,1.0,7861.0,1.0,80.0,1.0,850000.0,9.4,148.0,1.0,1.0,285.0,1.0


## Prepare a Training Script

Let's start by creating a folder for the training script you'll use to train a logistic regression model.

In [5]:
experiment_folder = 'hyperdrive'
os.makedirs(experiment_folder, exist_ok=True)

print('Folder ready.')

Folder ready.


Now create the Python script to train the model. This must include:

- A parameter for each hyperparameter you want to optimize (in this case, `--C`, `--max_iter`, and `--regularization`)
- Code to log the performance metric you want to optimize for (in this case, you'll log both AUC and accuracy, so you can choose to optimize the model for either of these)
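The two requirements above can be sketched without any Azure dependency. This is only an illustration: `log_metric` is a hypothetical stand-in for `run.log`, and the labels and predictions are made-up values rather than output from the real model.

```python
import argparse

def parse_hyperparameters(argv=None):
    # One command-line argument per hyperparameter to tune
    parser = argparse.ArgumentParser()
    parser.add_argument('--C', type=float, default=1.0)
    parser.add_argument('--max_iter', type=int, default=100)
    return parser.parse_args(argv)

# Hypothetical stand-in for run.log(): collects metrics in a dictionary
logged_metrics = {}

def log_metric(name, value):
    logged_metrics[name] = float(value)

# Simulate the arguments a tuning service would pass to the script
args = parse_hyperparameters(['--C', '0.5', '--max_iter', '150'])

# Made-up labels and predictions, used only to compute a metric to log
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

log_metric('Accuracy', accuracy)
print(args.C, args.max_iter, logged_metrics)
```

The metric name you log must match the primary metric name you later give to the tuning configuration, otherwise the runs cannot be ranked.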

In [6]:
%%writefile $experiment_folder/heart_training.py

# Import libraries
import argparse
import os

import joblib
import numpy as np
from azureml.core.run import Run
from azureml.data.dataset_factory import TabularDatasetFactory
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Parse the hyperparameters passed to the script
parser = argparse.ArgumentParser()

parser.add_argument('--C', type=float, default=1.0, help="Inverse of regularization strength. Smaller values cause stronger regularization")
parser.add_argument('--max_iter', type=int, default=100, help="Maximum number of iterations to converge")
# NOTE: --regularization is only logged below; the model itself is regularized through --C
parser.add_argument('--regularization', type=float, dest='reg_rate', default=0.01, help='regularization rate')
args = parser.parse_args()
reg = args.reg_rate

# Get the experiment run context
run = Run.get_context()

# Load the heart failure clinical records dataset
ds = TabularDatasetFactory.from_delimited_files(path="https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv")

def clean_data(data):
    # Drop rows with missing values, then separate the features from the DEATH_EVENT label
    x_df = data.to_pandas_dataframe().dropna()
    y_df = x_df.pop("DEATH_EVENT")
    return x_df, y_df

x, y = clean_data(ds)

# Split data into training set and test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=0)

# Train a logistic regression model
print('Training a logistic regression model with regularization rate of', reg)
# Log the hyperparameter values as built-in Python numbers
run.log('Regularization Rate', float(reg))
run.log("Regularization Strength:", float(args.C))
run.log("Max iterations:", int(args.max_iter))
model = LogisticRegression(C=args.C, max_iter=args.max_iter, solver="liblinear").fit(x_train, y_train)

# Calculate accuracy
y_hat = model.predict(x_test)
acc = np.average(y_hat == y_test)
print('Accuracy:', acc)
run.log('Accuracy', float(acc))

# Calculate AUC
y_scores = model.predict_proba(x_test)
auc = roc_auc_score(y_test, y_scores[:, 1])
print('AUC: ' + str(auc))
run.log('AUC', float(auc))

os.makedirs('outputs', exist_ok=True)
# Note: files saved in the outputs folder are automatically uploaded into the experiment record
joblib.dump(value=model, filename='outputs/best_run_hd.pkl')

run.complete()


Writing hyperdrive/heart_training.py


## Prepare a Compute Target

One of the benefits of cloud compute is that it scales on demand, enabling you to provision enough compute resources to process multiple runs of an experiment in parallel, each with different hyperparameter values.

You'll use an Azure Machine Learning compute cluster, such as one you created in an earlier lab. The code below looks for a cluster named **project-compute** and creates it if it doesn't exist.

> **Important**: Change the value of `cluster_name` in the code below to the name of your compute cluster before running it!

In [7]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
# Use the existing compute cluster if there is one; otherwise create it
# (max_nodes should be no greater than 4)

# Choose a name for your cluster
cluster_name = "project-compute"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS3_V2', 
                                                           max_nodes=4)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

# Can poll for a minimum number of nodes and for a specific timeout.
# If no min node count is provided, it uses the scale settings for the cluster.
compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=10)

# Use get_status() to get a detailed status for the current cluster.
print(compute_target.get_status().serialize())

Found existing compute target
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 1, 'targetNodeCount': 1, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 1, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2021-01-25T18:08:55.781000+00:00', 'errors': None, 'creationTime': '2021-01-25T17:20:54.580024+00:00', 'modifiedTime': '2021-01-25T17:21:10.614943+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_DS3_V2'}


## Run a Hyperdrive Experiment

Azure Machine Learning includes a hyperparameter tuning capability through *Hyperdrive* experiments. These experiments launch multiple child runs, each with a different hyperparameter combination. The run producing the best model (as determined by the logged target performance metric for which you want to optimize) can be identified, and its trained model selected for registration and deployment.
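As a rough local illustration of what random sampling over a search space means, the sketch below draws six hyperparameter combinations and keeps the best-scoring one. It is not how Hyperdrive is implemented: the `search_space` encoding, the `sample` helper, and especially `toy_score` (a made-up stand-in for training a model and logging a metric) are all assumptions for illustration.

```python
import random

# Search space mirroring two of the hyperparameters tuned in this notebook
search_space = {
    "--C": ("uniform", 0.1, 1.0),
    "--max_iter": ("choice", [50, 100, 150, 200]),
}

def sample(space, rng):
    # Draw one value per hyperparameter according to its distribution
    combo = {}
    for name, spec in space.items():
        if spec[0] == "uniform":
            combo[name] = rng.uniform(spec[1], spec[2])
        else:
            combo[name] = rng.choice(spec[1])
    return combo

def toy_score(combo):
    # Hypothetical metric; a real child run would train a model instead
    return 1.0 - abs(combo["--C"] - 0.6) - abs(combo["--max_iter"] - 150) / 1000

rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [sample(search_space, rng) for _ in range(6)]
best = max(trials, key=toy_score)
print("Best combination:", best)
```

In the real experiment each combination becomes a child run on the compute cluster, and the ranking uses the metric that the training script logs.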

In [8]:
from azureml.core.environment import Environment
# Sample a range of parameter values
params = RandomParameterSampling({
    "--C" : uniform(0.1,1),
    "--max_iter" : choice(50,100,150,200),
    "--regularization": choice(0.001, 0.005, 0.01, 0.05, 0.1, 1.0)
    }
)
# Specify a Policy
policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Get the training dataset registered earlier (same name as `key`)
heartfailure_ds = ws.datasets.get("Heartfailure Dataset")
# Curated environment; retrieved here but not passed to the SKLearn estimator below
sklearn_env = Environment.get(workspace=ws, name="AzureML-Tutorial")
# Create an estimator that uses the remote compute
hyper_estimator = SKLearn(source_directory=experiment_folder,
                          inputs=[heartfailure_ds.as_named_input('heartfailure')], # Pass the dataset as an input...
                          pip_packages=['azureml-sdk'], # ...so we need azureml-dataprep (it's in the SDK!)
                          entry_script='heart_training.py',
                          compute_target=compute_target)

# Configure hyperdrive settings
hyperdrive = HyperDriveConfig(estimator=hyper_estimator, 
                          hyperparameter_sampling=params, 
                          policy=policy, 
                          primary_metric_name='AUC', 
                          primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, 
                          max_total_runs=6,
                          max_concurrent_runs=4)

# Run the experiment
experiment = Experiment(workspace = ws, name = 'heartfailuretraining_hyperdrive')
run = experiment.submit(config=hyperdrive)

# Show the status in the notebook as the experiment runs
RunDetails(run).show()
run.wait_for_completion()

'SKLearn' estimator is deprecated. Please use 'ScriptRunConfig' from 'azureml.core.script_run_config' with your own defined environment or the AzureML-Tutorial curated environment.


_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

{'runId': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388',
 'target': 'project-compute',
 'status': 'Completed',
 'startTimeUtc': '2021-01-25T18:17:33.792125Z',
 'endTimeUtc': '2021-01-25T18:30:26.159464Z',
 'properties': {'primary_metric_config': '{"name": "AUC", "goal": "maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': '99852d51-e54d-494f-abe4-d4050cb5f6b7',
  'score': '0.8237327188940091',
  'best_child_run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_0',
  'best_metric_status': 'Succeeded'},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/hyperdrive.txt': 'https://mlstrg136014.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_93501b78-a2ca-4df3-9184-1a1ba57dd388/azureml-logs/hyperdrive.txt?sv=2019-02-02&sr=b&sig=62%2BA8ZjA9p8HIjgVUJo%2Fafd2mcsKusEPWdjKSDmRCeE%3D&st=2021-01-25T18%3A20%3A59Z&se=2021-01-26T02%3A30%3A59Z&sp=r'}}

You can view the experiment run status in the widget above. You can also view the main Hyperdrive experiment run and its child runs in [Azure Machine Learning studio](https://ml.azure.com).

> **Note**: The widget may not refresh. You'll see summary information displayed below the widget when the run has completed.
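The `BanditPolicy` configured in the experiment cell stops poorly performing runs early. As a rough sketch of the rule the Azure ML documentation describes for a metric that is being maximized, a run is cancelled when its metric falls below the best metric so far divided by `1 + slack_factor`. The function names and the example metric values below are hypothetical; `slack_factor=0.1` matches the policy used in this notebook.

```python
def bandit_cutoff(best_metric, slack_factor):
    # Lowest metric value a run may report without being cancelled
    # (for a primary metric with a "maximize" goal)
    return best_metric / (1.0 + slack_factor)

def should_terminate(run_metric, best_metric, slack_factor):
    # True when the run falls outside the allowed slack
    return run_metric < bandit_cutoff(best_metric, slack_factor)

# Hypothetical example: the best run so far reports AUC = 0.80
cutoff = bandit_cutoff(0.80, 0.1)
print(f"Runs below {cutoff:.3f} would be cancelled")
print(should_terminate(0.70, 0.80, 0.1))  # below the cutoff
print(should_terminate(0.75, 0.80, 0.1))  # within the slack
```

The check is applied every `evaluation_interval` times the primary metric is reported.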

## Determine the Best Performing Run

When all of the runs have finished, you can find the best one based on the performance metric you specified (in this case, the one with the best AUC).

In [9]:
for child_run in run.get_children_sorted_by_primary_metric():
    print(child_run)

best_run = run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
parameter_values = best_run.get_details()['runDefinition']['arguments']

print('Best Run Id: ', best_run.id)
print(' -AUC:', best_run_metrics['AUC'])
print(' -Accuracy:', best_run_metrics['Accuracy'])
print(' -Arguments:', parameter_values)

{'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_4', 'hyperparameters': '{"--C": 0.8362407576509557, "--max_iter": 200, "--regularization": 0.01}', 'best_primary_metric': 0.8237327188940091, 'status': 'Completed'}
{'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_5', 'hyperparameters': '{"--C": 0.38840604031390524, "--max_iter": 100, "--regularization": 0.05}', 'best_primary_metric': 0.8237327188940091, 'status': 'Completed'}
{'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_2', 'hyperparameters': '{"--C": 0.7949422999618715, "--max_iter": 150, "--regularization": 0.05}', 'best_primary_metric': 0.8237327188940091, 'status': 'Completed'}
{'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_1', 'hyperparameters': '{"--C": 0.7882363738431387, "--max_iter": 50, "--regularization": 0.1}', 'best_primary_metric': 0.8237327188940091, 'status': 'Completed'}
{'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_0', 'hyperparameters': '{"--C": 0.9939177083701627, "--max_iter": 150, "--regular
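Each child-run record printed above stores its hyperparameters as a JSON string, which makes the records awkward to compare by eye. The sketch below, which assumes records shaped like that output and copies two of them verbatim, decodes the string and sorts the runs by metric. The `rows` structure and the `AUC` key are illustrative choices, not part of the SDK.

```python
import json

# Two records copied from the output above
child_runs = [
    {'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_4',
     'hyperparameters': '{"--C": 0.8362407576509557, "--max_iter": 200, "--regularization": 0.01}',
     'best_primary_metric': 0.8237327188940091, 'status': 'Completed'},
    {'run_id': 'HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_5',
     'hyperparameters': '{"--C": 0.38840604031390524, "--max_iter": 100, "--regularization": 0.05}',
     'best_primary_metric': 0.8237327188940091, 'status': 'Completed'},
]

rows = []
for record in child_runs:
    # Decode the JSON string into a dictionary of hyperparameter values
    params = json.loads(record['hyperparameters'])
    rows.append({'run_id': record['run_id'],
                 'AUC': record['best_primary_metric'],
                 **params})

# Highest metric first; runs with equal metrics keep their original order
rows.sort(key=lambda row: row['AUC'], reverse=True)
for row in rows:
    print(row['run_id'][-2:], row['AUC'], row['--C'], row['--max_iter'])
```

Laying the values out this way also makes it easy to notice when several runs report the same metric, as these two do.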

Now that you've found the best run, you can register the model it trained.

In [10]:
# Get your best run and save the model from that run.
best_run = run.get_best_run_by_primary_metric()
print(best_run)
best_run_metrics = best_run.get_metrics()
for metric_name in best_run_metrics:
    metric = best_run_metrics[metric_name]
    print(metric_name, metric) 

Run(Experiment: heartfailuretraining_hyperdrive,
Id: HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_0,
Type: azureml.scriptrun,
Status: Completed)
Regularization Rate 0.05
Regularization Strength: 0.9939177083701627
Max iterations: 150
Accuracy 0.8222222222222222
AUC 0.8237327188940091


In [11]:
best_run

Experiment,Id,Type,Status,Details Page,Docs Page
heartfailuretraining_hyperdrive,HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_0,azureml.scriptrun,Completed,Link to Azure Machine Learning studio,Link to Documentation


In [12]:
best_run = run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()

print('Best Run Id: ', best_run.id)
print('\n Accuracy:', best_run_metrics['Accuracy'])

Best Run Id:  HD_93501b78-a2ca-4df3-9184-1a1ba57dd388_0

 Accuracy: 0.8222222222222222


In [13]:
# let's list the model files uploaded during the run
print(best_run.get_file_names())

['azureml-logs/55_azureml-execution-tvmps_9ec7ae94923cb5683d2e01c935512aaced5e9dd20192b73877effaeb99a15062_d.txt', 'azureml-logs/65_job_prep-tvmps_9ec7ae94923cb5683d2e01c935512aaced5e9dd20192b73877effaeb99a15062_d.txt', 'azureml-logs/70_driver_log.txt', 'azureml-logs/75_job_post-tvmps_9ec7ae94923cb5683d2e01c935512aaced5e9dd20192b73877effaeb99a15062_d.txt', 'azureml-logs/process_info.json', 'azureml-logs/process_status.json', 'logs/azureml/105_azureml.log', 'logs/azureml/dataprep/backgroundProcess.log', 'logs/azureml/dataprep/backgroundProcess_Telemetry.log', 'logs/azureml/job_prep_azureml.log', 'logs/azureml/job_release_azureml.log', 'outputs/best_run_hd.pkl']


In [14]:
from azureml.core import Model

# Download the model file from the best run, then register the model in the workspace
best_run.download_file("outputs/best_run_hd.pkl", "./outputs/best_run_hd.pkl")
model = best_run.register_model(model_name='model',
                                model_path='outputs/best_run_hd.pkl',
                                tags={'Training context': 'Hyperdrive'},
                                properties={'Accuracy': best_run_metrics['Accuracy']})


In [15]:
model

Model(workspace=Workspace.create(name='quick-starts-ws-136014', subscription_id='3e42d11f-d64d-4173-af9b-12ecaa1030b3', resource_group='aml-quickstarts-136014'), name=model, id=model:1, version=1, tags={'Training context': 'Hyperdrive'}, properties={'Accuracy': '0.8222222222222222'})

> **More Information**: For more information about Hyperdrive, see the [Azure ML documentation](https://docs.microsoft.com/azure/machine-learning/how-to-tune-hyperparameters).

In [22]:
from azureml.core import Model

for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')


model version: 1
	 Training context : Hyperdrive
	 Accuracy : 0.8222222222222222


Automl-Heartfailure-Model version: 1


