# Hyperparameter Tuning using HyperDrive

Import all the dependencies that you will need to complete the project.

In [1]:
from azureml.core import Workspace, Experiment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

In [2]:
import os

currDir = os.getcwd()
print(currDir)
os.listdir(currDir)

/mnt/batch/tasks/shared/LS_root/mounts/clusters/notebook138754/code/Users


['.ipynb_checkpoints',
 'automl.ipynb',
 'hyperparameter_tuning.ipynb',
 'train.py']

## Dataset

This section contains the code that accesses the data used in this project. The dataset is external.

### Connect to a workspace

In [3]:
ws = Workspace.from_config()

print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Workspace name: quick-starts-ws-138754
Azure region: southcentralus
Subscription id: a0a76bad-11a1-4a2d-9887-97a29122c8ed
Resource group: aml-quickstarts-138754


### Create an Azure ML experiment

In [4]:
# Choose a name for the experiment
experiment_name = 'hdr_heart_failure_experiment'
project_folder = './hyperdrive-model'
experiment = Experiment(ws, experiment_name)
run = experiment.start_logging()

### Create or Attach a Compute Resource

In [5]:
# Create compute cluster
# Use vm_size = "STANDARD_D3_V2" in provisioning configuration.
# max_nodes 5.

# Choose a name for CPU cluster
cluster_name = "my-cpu-cluster"

# Check if the compute target exists
try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target, use it')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='Standard_D3_V2', 
                                                           max_nodes=5)
    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

compute_target.wait_for_completion(show_output=True)

# get a detailed status for the current cluster
print(compute_target.get_status().serialize())



Creating a new compute target...
Creating
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2021-02-13T00:28:26.894000+00:00', 'errors': None, 'creationTime': '2021-02-13T00:28:24.556267+00:00', 'modifiedTime': '2021-02-13T00:28:39.925629+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 5, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D3_V2'}


## Hyperdrive Configuration

TODO: Explain the model you are using and the reason for choosing the different hyperparameters, termination policy and config settings.

We use the RandomParameterSampling method over the hyperparameter search space to randomly select values for C (choice among the discrete values 0.01, 1.0, 3.0) and max_iter (choice among the discrete values 50, 100, 150). We limited the number of values so that the experiment completes faster. Random sampling supports both discrete and continuous hyperparameters and allows us to refine the search space later to improve results.
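
To make the size of this search space concrete, the sketch below enumerates it and draws a few random configurations in plain Python. It only imitates random sampling over discrete choices and does not call HyperDrive; `search_space` and `sample_configuration` are illustrative names, and the values mirror the `choice(...)` expressions in the code cell of this section.

```python
import itertools
import random

# Discrete search space mirroring the choice(...) expressions in the code cell.
search_space = {
    "--C": [0.01, 1.0, 3.0],
    "--max_iter": [50, 100, 150],
}


def sample_configuration(space, rng):
    """Draw one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in space.items()}


# Every distinct combination that sampling from this space can produce.
all_combinations = list(itertools.product(*search_space.values()))
print(len(all_combinations), "distinct combinations")

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for _ in range(4):
    print(sample_configuration(search_space, rng))
```

Because the space holds fewer distinct combinations than the `max_total_runs=24` budget used in this notebook, widening the value lists or switching to a continuous expression such as `uniform` would be one way to refine the search.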

We also use BanditPolicy, which defines an early termination policy based on slack_factor=0.1 and evaluation_interval=2. The slack_factor is the ratio used to calculate the allowed distance from the best performing run. The evaluation_interval is the frequency for applying the policy, where each time the training script logs the primary metric counts as one interval.
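
As a rough illustration of how the slack factor translates into a cutoff when the goal is to maximize the metric, the sketch below treats a run as a termination candidate when its metric falls below `best / (1 + slack_factor)`. This is a simplified model of the policy, not the service's implementation; the function names and the accuracy values are hypothetical.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric a run may report and still continue (maximize goal)."""
    return best_metric / (1 + slack_factor)


def should_terminate(run_metric, best_metric, slack_factor=0.1):
    """True if the run falls outside the allowed slack from the best run."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)


best_so_far = 0.78  # hypothetical best accuracy at an evaluation interval
print(f"cutoff: {bandit_cutoff(best_so_far, 0.1):.4f}")
for accuracy in (0.75, 0.70):  # hypothetical accuracies of two other runs
    decision = "terminate" if should_terminate(accuracy, best_so_far) else "continue"
    print(accuracy, decision)
```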

In [6]:
from azureml.core import ScriptRunConfig
from azureml.core.environment import Environment
from azureml.widgets import RunDetails

from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import uniform, choice
import os
import shutil


# Create an early termination policy (BanditPolicy).
early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Create the different params that you will be using during training
param_sampling = RandomParameterSampling({
        '--C': choice(0.01, 1.0, 3.0),
        '--max_iter': choice(50, 100, 150)
    }
)

# Copy the training script into a dedicated source directory
script_dir = "./training"
os.makedirs(script_dir, exist_ok=True)
shutil.copy('train.py', script_dir)


# Create a SKLearn estimator for use with train.py
estimator = SKLearn(source_directory=script_dir, entry_script='train.py', compute_target=compute_target)

# Create a HyperDriveConfig using the estimator, hyperparameter sampler, and policy.
hyperdrive_run_config = HyperDriveConfig(estimator=estimator, 
                             hyperparameter_sampling=param_sampling,
                             policy=early_termination_policy,
                             primary_metric_name='Accuracy', 
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, 
                             max_total_runs=24,
                             max_concurrent_runs=4)

'SKLearn' estimator is deprecated. Please use 'ScriptRunConfig' from 'azureml.core.script_run_config' with your own defined environment or the AzureML-Tutorial curated environment.
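
The warning above suggests replacing the `SKLearn` estimator with `ScriptRunConfig`. A minimal sketch of that change follows, assuming the `AzureML-Tutorial` curated environment named in the warning is available in the workspace and contains the packages `train.py` needs. The helper name `build_hyperdrive_config` is hypothetical, and the sketch has not been run against this workspace.

```python
def build_hyperdrive_config(ws, compute_target, param_sampling, policy,
                            script_dir="./training"):
    """Build a HyperDriveConfig from a ScriptRunConfig instead of SKLearn."""
    # Imported inside the function so the sketch can be defined without the SDK.
    from azureml.core import Environment, ScriptRunConfig
    from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal

    # Curated environment named in the deprecation warning (assumed to exist
    # in the workspace and to include scikit-learn).
    env = Environment.get(workspace=ws, name="AzureML-Tutorial")

    # Describes how to run train.py; HyperDrive appends the sampled arguments.
    src = ScriptRunConfig(source_directory=script_dir,
                          script="train.py",
                          compute_target=compute_target,
                          environment=env)

    # Same settings as the estimator-based configuration in this notebook.
    return HyperDriveConfig(run_config=src,
                            hyperparameter_sampling=param_sampling,
                            policy=policy,
                            primary_metric_name="Accuracy",
                            primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                            max_total_runs=24,
                            max_concurrent_runs=4)
```

It could be called as `build_hyperdrive_config(ws, compute_target, param_sampling, early_termination_policy)` and the result passed to `experiment.submit` in place of `hyperdrive_run_config`.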


In [7]:
# Submit experiment

hyperdrive_run = experiment.submit(config=hyperdrive_run_config, show_output = True)



## Run Details


Use the `RunDetails` widget to monitor the runs of the HyperDrive experiment.

In [8]:
RunDetails(hyperdrive_run).show()
hyperdrive_run.wait_for_completion(show_output=True)


RunId: HD_c54db12a-daac-4291-8e96-382b79119b65
Web View: https://ml.azure.com/experiments/hdr_heart_failure_experiment/runs/HD_c54db12a-daac-4291-8e96-382b79119b65?wsid=/subscriptions/a0a76bad-11a1-4a2d-9887-97a29122c8ed/resourcegroups/aml-quickstarts-138754/workspaces/quick-starts-ws-138754

Streaming azureml-logs/hyperdrive.txt

<START>[2021-02-13T00:29:01.386467][API][INFO]Experiment created<END>
<START>[2021-02-13T00:29:02.376288][GENERATOR][INFO]Trying to sample '4' jobs from the hyperparameter space<END>
<START>[2021-02-13T00:29:02.5223064Z][SCHEDULER][INFO]The execution environment is being prepared. Please be patient as it can take a few minutes.<END>
<START>[2021-02-13T00:29:02.703293][GENERATOR][INFO]Successfully sampled '4' jobs, they will soon be submitted to the execution target.<END>

Execution Summary
RunId: HD_c54db12a-daac-4291-8e96-382b79119b65
Web View: https://ml.azure.com/experiments/hdr_heart_failure_experiment/runs/HD_c54db12a-daac-4291-8e96-382b79119b65

{'runId': 'HD_c54db12a-daac-4291-8e96-382b79119b65',
 'target': 'my-cpu-cluster',
 'status': 'Completed',
 'startTimeUtc': '2021-02-13T00:29:01.150112Z',
 'endTimeUtc': '2021-02-13T00:38:19.051224Z',
 'properties': {'primary_metric_config': '{"name": "Accuracy", "goal": "maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': '822a42e6-add9-46ac-8db6-2cd7b3ff46a9',
  'score': '0.7833333333333333',
  'best_child_run_id': 'HD_c54db12a-daac-4291-8e96-382b79119b65_3',
  'best_metric_status': 'Succeeded'},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/hyperdrive.txt': 'https://mlstrg138754.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_c54db12a-daac-4291-8e96-382b79119b65/azureml-logs/hyperdrive.txt?sv=2019-02-02&sr=b&sig=o5sOTtpu%2BXzrcZKGhNT6cJ%2BIzud3%2BwCZ%2F%2BpDv5bg1IY%3D&st=2021-02-13T00%3A28%3A26Z&se=2021-02-13T08%3A38%3A26Z&sp=r'},
 'submittedBy': 'ODL_User 

In [9]:
hyperdrive_run.get_status()

'Completed'

In [10]:
hyperdrive_run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| hdr_heart_failure_experiment | HD_c54db12a-daac-4291-8e96-382b79119b65 | hyperdrive | Completed | Link to Azure Machine Learning studio | Link to Documentation |


## Best Model

Get the best model from the hyperdrive experiments and display all the properties of the model.

In [11]:
# Get your best run 

best_run = hyperdrive_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
best_run_details = best_run.get_details()
# Reuse the details fetched above instead of calling get_details() twice
parameter_values = best_run_details['runDefinition']['arguments']
best_run_files = best_run.get_file_names()

print('Best Run ID',best_run.id)
print('\n Metrics: ', best_run_metrics)
print('\n Parameters: ', parameter_values,sep='\n')
print('\nAccuracy of Best run',best_run_metrics['Accuracy'],sep='\n')
print('\nBest run file names',best_run_files,sep='\n')

Best Run ID HD_c54db12a-daac-4291-8e96-382b79119b65_3

 Metrics:  {'Regularization Strength:': 3.0, 'Max iterations:': 150, 'Accuracy': 0.7833333333333333}

 Parameters: 
['--C', '3', '--max_iter', '150']

Accuracy of Best run
0.7833333333333333

Best run file names
['azureml-logs/55_azureml-execution-tvmps_6372a8b0833dcb54ae3af8463d95688bc0546f8cc08c492a4b9bee6bac55b502_d.txt', 'azureml-logs/65_job_prep-tvmps_6372a8b0833dcb54ae3af8463d95688bc0546f8cc08c492a4b9bee6bac55b502_d.txt', 'azureml-logs/70_driver_log.txt', 'azureml-logs/75_job_post-tvmps_6372a8b0833dcb54ae3af8463d95688bc0546f8cc08c492a4b9bee6bac55b502_d.txt', 'azureml-logs/process_info.json', 'azureml-logs/process_status.json', 'logs/azureml/105_azureml.log', 'logs/azureml/job_prep_azureml.log', 'logs/azureml/job_release_azureml.log', 'outputs/hyperdrive_model.joblib']
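
The `Parameters` list printed above is the command line that HyperDrive passed to `train.py`. Since `train.py` is not shown here, the parser below is an assumed stand-in (its defaults and help texts are hypothetical); it shows one way to turn that argument list back into typed hyperparameter values.

```python
import argparse


def build_parser():
    """Parser for the two tuned hyperparameters (assumed to mirror train.py)."""
    parser = argparse.ArgumentParser()
    # Defaults are placeholders chosen for this sketch only.
    parser.add_argument("--C", type=float, default=1.0,
                        help="value sampled for the C hyperparameter")
    parser.add_argument("--max_iter", type=int, default=100,
                        help="value sampled for the max_iter hyperparameter")
    return parser


# Argument list as printed for the best run in this notebook.
arguments = ["--C", "3", "--max_iter", "150"]

args = build_parser().parse_args(arguments)
best_params = {"C": args.C, "max_iter": args.max_iter}
print(best_params)
```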


In [12]:
import joblib

# Register the best model
model = best_run.register_model(model_name='best_hyperdrive_model', model_path='outputs/hyperdrive_model.joblib')
print(model)



# Download the best model file, using the name listed by get_file_names()
best_run.download_file('outputs/hyperdrive_model.joblib', 'outputs/hyperdrive_model.joblib')

Model(workspace=Workspace.create(name='quick-starts-ws-138754', subscription_id='a0a76bad-11a1-4a2d-9887-97a29122c8ed', resource_group='aml-quickstarts-138754'), name=best_hyperdrive_model, id=best_hyperdrive_model:1, version=1, tags={}, properties={})


## Model Deployment

Remember that you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

TODO: In the cell below, send a request to the web service you deployed to test it.

TODO: In the cell below, print the logs of the web service and delete the service