# Hyperparameter Tuning using HyperDrive

In [12]:
from azureml.core import Workspace, Experiment
#from azureml.core.environment import Environment
#from azureml.core.model import InferenceConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.parameter_expressions import uniform, choice
from azureml.widgets import RunDetails
#import requests
#import json
#from azureml.core.webservice import AciWebservice, LocalWebservice
#import sklearn

## Workspace setup

First, we set up our workspace to work with Azure.

In [3]:
ws = Workspace.from_config()
experiment_name = 'hyperdrive'

experiment = Experiment(ws, experiment_name)

## Dataset

The dataset used is the [UCI Glass Identification](https://archive.ics.uci.edu/ml/datasets/Glass+Identification) dataset. All data importing and cleaning is done by the [train.py](https://github.com/reis-r/nd00333-capstone/blob/master/train.py) script, which is also the script used by our HyperDrive run. The objective is to classify the glass type according to its composition and other characteristics. This dataset was chosen because it needs little cleaning and is a well-known dataset for experimenting with machine learning.
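HyperDrive hands each sampled hyperparameter to the entry script as a command-line argument, which is why the best-run details further down list `--C` and `--kernel` as arguments. The actual `train.py` is linked above and is not reproduced here. The following is only a minimal sketch of how such a script might read those two arguments. The function name `parse_hyperparameters` and the default values are assumptions made for illustration.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse the two hyperparameters that the HyperDrive sampler supplies.

    Hypothetical helper: the real train.py may name or handle these differently.
    """
    parser = argparse.ArgumentParser()
    # Argument names match the keys used in the parameter sampling space.
    parser.add_argument("--kernel", type=str, default="rbf",
                        help="Kernel type passed to the SVC (assumed default)")
    parser.add_argument("--C", type=float, default=1.0,
                        help="Regularization parameter (assumed default)")
    return parser.parse_args(argv)


# Example: the argument list reported for the best run in this notebook.
args = parse_hyperparameters(["--C", "0.9644719191806422", "--kernel", "linear"])
print(args.kernel, args.C)
```

Passing an explicit list to `parse_hyperparameters` makes the parsing easy to try locally. Inside a real run, calling it without arguments would read `sys.argv` instead.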

## Create a compute cluster

In [7]:
cluster_name = "hyperdrive"
# Check if a compute cluster already exists
try:
    print("Trying to connect to an existing cluster...")
    compute_cluster = ComputeTarget(workspace=ws, name=cluster_name)
except ComputeTargetException:
    print("Creating a compute cluster...")
    compute_configuration = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    compute_cluster = ComputeTarget.create(ws, cluster_name, compute_configuration)
    compute_cluster.wait_for_completion(show_output=True)
print("Success!")

Trying to connect to an existing cluster...
Creating a compute cluster...
Creating
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
Success!


## Hyperdrive Configuration

For the HyperDrive configuration, `BanditPolicy` was chosen as the early termination policy. It terminates any run whose accuracy is not within the slack amount of the best-performing run. It is a less conservative policy that might prove sufficient for this experiment.
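To make the slack rule concrete, here is a small sketch in plain Python. It is not the Azure ML implementation. It assumes the documented behaviour for a maximized metric: with a slack factor `s`, a run is cancelled when its metric falls below `best / (1 + s)`. The function name `bandit_should_terminate` is hypothetical.

```python
def bandit_should_terminate(run_metric, best_metric, slack_factor):
    """Return True if a run falls outside the allowed slack (maximize goal).

    Illustrative only: assumes the threshold is best_metric / (1 + slack_factor).
    """
    threshold = best_metric / (1 + slack_factor)
    return run_metric < threshold


# With slack_factor=0.1 (the value used in this notebook) and a best accuracy
# of 0.65, the cut-off is 0.65 / 1.1, roughly 0.59.
print(bandit_should_terminate(0.55, best_metric=0.65, slack_factor=0.1))
print(bandit_should_terminate(0.62, best_metric=0.65, slack_factor=0.1))
```

In the service, the check is applied only at the configured `evaluation_interval`, counted in reports of the primary metric, so a run is not compared against the best one on every single report.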

The algorithm chosen is SVC, the scikit-learn classifier based on support-vector machines.

The parameter sampler is `RandomParameterSampling`. Random sampling is faster than an exhaustive search, but may not find the best possible configuration. The regularization parameter (penalty, `--C`) is sampled with `uniform`, which returns a value uniformly distributed between the given minimum and maximum. It is the most basic and safe sampling method for continuous variables.

The kernel is picked at random from every value supported by scikit-learn. Note that the `precomputed` kernel expects a precomputed kernel matrix as input, so runs that sample it may fail unless the training script handles that case.
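As a rough illustration of what random sampling over this search space does, the sketch below draws configurations with Python's standard `random` module. It only mimics the idea behind `choice` and `uniform`. It does not call Azure ML, and the helper name `sample_configuration` and the fixed seed are assumptions made for the example.

```python
import random

# Search space mirroring the one used in this notebook.
KERNELS = ["linear", "poly", "rbf", "sigmoid", "precomputed"]
C_MIN, C_MAX = 0.1, 1.0


def sample_configuration(rng):
    """Draw one configuration: a discrete choice and a continuous uniform value."""
    return {
        "--kernel": rng.choice(KERNELS),
        "--C": rng.uniform(C_MIN, C_MAX),
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Ten draws, matching max_total_runs=10 in the HyperDrive configuration.
configurations = [sample_configuration(rng) for _ in range(10)]
for config in configurations:
    print(config)
```

Because every draw is independent, ten runs cover only a small part of the space, which is the trade-off between speed and result quality mentioned above.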

In [13]:
# Create an early termination policy
early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Create the different params that will be used during training
param_sampling = RandomParameterSampling({
    "--kernel": choice(['linear', 'poly', 'rbf', 'sigmoid', 'precomputed']),
    "--C": uniform(0.1, 1.0)
    })

# Create estimator and hyperdrive config
estimator = SKLearn(source_directory="./",
                    entry_script="train.py",
                    compute_target=compute_cluster)  # cluster created above

hyperdrive_run_config = HyperDriveConfig(estimator=estimator,
                                         hyperparameter_sampling=param_sampling,
                                         primary_metric_name='Accuracy',
                                         primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                         max_total_runs=10,
                                         policy=early_termination_policy,
                                         max_concurrent_runs=2)

'SKLearn' estimator is deprecated. Please use 'ScriptRunConfig' from 'azureml.core.script_run_config' with your own defined environment or the AzureML-Tutorial curated environment.


In [14]:
# Submit the experiment
run = experiment.submit(hyperdrive_run_config)



## Run Details

We can follow the progress of the training runs using a widget:

In [15]:
RunDetails(run).show()

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

## Best Model

In [16]:
best_run = run.get_best_run_by_primary_metric()
best_run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| hyperdrive | HD_dbdc5b56-74e2-452a-b1c5-40b9dfe76457_0 | azureml.scriptrun | Completed | Link to Azure Machine Learning studio | Link to Documentation |


In [17]:
print(best_run.get_details()['runDefinition']['arguments'])
print('Run properties:')
print(best_run.get_properties())
print('Best metrics:')
print(best_run.get_metrics())

['--C', '0.9644719191806422', '--kernel', 'linear']
Run properties:
{'_azureml.ComputeTargetType': 'amlcompute', 'ContentSnapshotId': '0f887c90-8327-488e-bf22-6f3691d2bdfe', 'ProcessInfoFile': 'azureml-logs/process_info.json', 'ProcessStatusFile': 'azureml-logs/process_status.json'}
Best metrics:
{'Kernel type': 'linear', 'Regularization parameter': 0.9644719191806422, 'Accuracy': 0.6511627906976745}


We obtained an accuracy of about 65%. Some runs had problems, which will require a closer look in the future. We now retrieve and save the best model:

In [18]:
import os

# Make sure the local output directory exists
os.makedirs("outputs", exist_ok=True)

# Save the best model
model_path = "outputs/model.joblib"
best_run.download_file(model_path, output_file_path=model_path)
print("Best model saved.")

Best model saved.


In [19]:
# Delete the compute target for cost-saving
compute_cluster.delete()

Current provisioning state of AmlCompute is "Deleting"

