# Hyperparameter Tuning using HyperDrive

In [1]:
from azureml.core import Workspace, Experiment, Model
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.parameter_expressions import uniform, choice
from azureml.widgets import RunDetails
import requests
import json
from azureml.core.webservice import AciWebservice, LocalWebservice

## Workspace setup

First, we set up our workspace to work with Azure.

In [2]:
ws = Workspace.from_config()
experiment_name = 'hyperdrive'

experiment = Experiment(ws, experiment_name)

## Dataset

The dataset used is the [UCI Glass Identification](https://archive.ics.uci.edu/ml/datasets/Glass+Identification) dataset. All data importing and preprocessing is done by the [train.py](https://github.com/reis-r/nd00333-capstone/blob/master/train.py) script, which is also the script used by our HyperDrive run. The objective is to classify the glass type according to its composition and other characteristics. This dataset was chosen because it needs little cleaning and is a well-known dataset for experimenting with machine learning.

## Create a compute cluster

In [3]:
cluster_name = "hyperdrive"
# Check if a compute cluster already exists
try:
    print("Trying to connect to an existing cluster...")
    compute_cluster = ComputeTarget(workspace=ws, name=cluster_name)
except ComputeTargetException:
    print("Creating a compute cluster...")
    compute_configuration = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    compute_cluster = ComputeTarget.create(ws, cluster_name, compute_configuration)
    compute_cluster.wait_for_completion(show_output=True)
print("Success!")

Trying to connect to an existing cluster...
Creating a compute cluster...
Creating
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
Success!


## Hyperdrive Configuration

For the HyperDrive configuration, BanditPolicy was chosen as the early termination policy. It terminates a run when its accuracy is not within the slack amount of the best-performing run. It is a less conservative policy that might prove sufficient for this experiment.
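
To make the slack rule concrete, here is a minimal pure-Python sketch of the check that a bandit policy with a slack factor performs when the primary metric is being maximized. The function name `should_terminate` is hypothetical and is not part of the Azure ML SDK. It only illustrates the rule that a run is stopped when its metric falls below `best / (1 + slack_factor)`.

```python
def should_terminate(run_metric: float, best_metric: float, slack_factor: float = 0.1) -> bool:
    """Return True if a run falls outside the allowed slack (maximization goal)."""
    # With slack_factor=0.1, the threshold is about 91% of the best metric so far.
    threshold = best_metric / (1.0 + slack_factor)
    return run_metric < threshold


best = 0.95  # illustrative value for the best accuracy reported so far
for accuracy in (0.90, 0.80):
    print(accuracy, should_terminate(accuracy, best))
```

In the configuration used in this notebook, `evaluation_interval=2` means the policy is applied every second time the training script reports the metric, rather than on every report.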

The algorithm chosen is SVC, a good classification algorithm based on support-vector machines.

The parameter sampler is the random sampler. This method is faster than exhaustively searching the parameter space, but may not find the best possible results. The regularization parameter (penalty) was configured with uniform sampling, which gives a value uniformly distributed between the configured minimum and maximum. It is the most basic and safe sampling method for continuous variables.

The kernel is chosen at random from all the kernel values supported by scikit-learn.
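
As a rough illustration of what random sampling over this search space means, the sketch below draws independent hyperparameter combinations with Python's standard `random` module. The names `KERNELS`, `C_RANGE` and `sample_configuration` are hypothetical. This is not how HyperDrive is implemented internally, only a picture of the idea: a discrete choice for the kernel and a uniform draw for `C`.

```python
import random

# Search space mirroring the configuration used in this notebook.
KERNELS = ['linear', 'poly', 'rbf', 'sigmoid', 'precomputed']
C_RANGE = (0.1, 1.0)


def sample_configuration(rng: random.Random) -> dict:
    """Draw one hyperparameter combination, independently of earlier draws."""
    return {
        "--kernel": rng.choice(KERNELS),   # discrete choice
        "--C": rng.uniform(*C_RANGE),      # continuous, uniformly distributed
    }


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
samples = [sample_configuration(rng) for _ in range(10)]  # 10 matches max_total_runs
for sample in samples:
    print(sample)
```

Because every draw is independent, nothing guarantees that all five kernels are tried within ten runs, which is one reason random sampling may miss the best combination.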

In [4]:
# Create an early termination policy
early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Create the different params that will be used during training
param_sampling = RandomParameterSampling({
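    # Note: scikit-learn's 'precomputed' kernel expects a square kernel matrix as input,
    # so runs that sample it will fail unless train.py computes one.
    "--kernel": choice(['linear', 'poly', 'rbf', 'sigmoid', 'precomputed']),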
    "--C": uniform(0.1, 1.0)
    })

# Create estimator and hyperdrive config
estimator = SKLearn(source_directory="./",
                    entry_script="train.py",
                    compute_target=compute_cluster)  # cluster created in the cell above

hyperdrive_run_config = HyperDriveConfig(estimator=estimator,
                                         hyperparameter_sampling=param_sampling, 
                                         primary_metric_name='Accuracy',
                                         primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                         max_total_runs=10,
                                         policy=early_termination_policy,
                                         max_concurrent_runs=2)

'SKLearn' estimator is deprecated. Please use 'ScriptRunConfig' from 'azureml.core.script_run_config' with your own defined environment or the AzureML-Tutorial curated environment.


In [5]:
# Submit the experiment
run = experiment.submit(hyperdrive_run_config)



## Run Details

In [6]:
RunDetails(run).show()


We obtained an accuracy of 100%. While this usually means the model has overfit the training dataset, it should be taken into account that this is a simple, classic machine learning problem. This may suggest that using a feature as advanced as HyperDrive was probably overkill.

## Best Model

In [8]:
best_run = run.get_best_run_by_primary_metric()
best_run

| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| hyperdrive | HD_9545ef39-47e2-4d3f-a0be-213aec4cc469_1 | azureml.scriptrun | Completed | Link to Azure Machine Learning studio | Link to Documentation |


In [9]:
print(best_run.get_details()['runDefinition']['arguments'])
print('Run properties:')
print(best_run.get_properties())
print('Best metrics:')
print(best_run.get_metrics())

['--C', '0.7119893367586747', '--kernel', 'linear']
Run properties:
{'_azureml.ComputeTargetType': 'amlcompute', 'ContentSnapshotId': '6d735651-4d2c-4341-99ee-1587096986fa', 'ProcessInfoFile': 'azureml-logs/process_info.json', 'ProcessStatusFile': 'azureml-logs/process_status.json'}
Best metrics:
{'Kernel type': 'linear', 'Regularization parameter': 0.7119893367586747, 'Accuracy': 1.0}
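
The argument list printed above alternates flags and values. If the best hyperparameters are needed later as a dictionary, a small helper can pair them up. The sketch below is plain Python, and `arguments_to_dict` is a hypothetical name, not an Azure ML function.

```python
def arguments_to_dict(arguments: list) -> dict:
    """Pair up ['--flag', 'value', ...] into {'flag': 'value', ...}."""
    if len(arguments) % 2 != 0:
        raise ValueError("expected an even number of items (flag/value pairs)")
    flags = arguments[0::2]    # items at even positions are flags
    values = arguments[1::2]   # items at odd positions are their values
    return {flag.lstrip("-"): value for flag, value in zip(flags, values)}


# The argument list reported for the best run in this notebook.
args = ['--C', '0.7119893367586747', '--kernel', 'linear']
best_params = arguments_to_dict(args)
best_params["C"] = float(best_params["C"])  # values arrive as strings
print(best_params)
```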


In [10]:
import os

# Create the local output directory if it does not exist yet
os.makedirs("outputs", exist_ok=True)

# Save the best model: download the file stored by the run to a local path
model_filename = "outputs/model.joblib"
best_run.download_file(model_filename, output_file_path="outputs/hyperdrive-model.pkl")
print("Best model saved.")

Best model saved.


## Model Deployment

Remember that you only have to deploy one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

In [51]:
# Register the best model
model = best_run.register_model(model_name='glass-prediction', model_path='outputs/model.joblib')

In [52]:
# Setup environment
env = best_run.get_environment()

In [53]:
# Define an inference configuration
inference_config = InferenceConfig(entry_script='score.py',
                                    environment=env)

# Create deployment configurations
# ACI configuration, kept for a cloud deployment (not used below)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1, auth_enabled=True, enable_app_insights=True)

# Local configuration, used here to debug the service in a local Docker container
deployment_config = LocalWebservice.deploy_configuration(port=8890)

# Deploy model
service = Model.deploy(ws, "glass-prediction", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output = True)
print(service.state)

Downloading model glass-prediction:3 to /tmp/azureml_ri1z2_5s/glass-prediction/3
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry f37ef909c23c4f20a0aefd752cd96caf.azurecr.io
Logging into Docker registry f37ef909c23c4f20a0aefd752cd96caf.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM f37ef909c23c4f20a0aefd752cd96caf.azurecr.io/azureml/azureml_2a0e99d2b6c0b56ef3ce1012e5647b1d
 ---> 0b1ee8524b7f
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 68b450da82c5
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjFiOTQ0YTliLWZkYWUtNGY5Ny1hZWIxLWI3ZWVhMGJlYWM1MyIsInJlc291cmNlR3JvdXBOYW1lIjoiYW1sLXF1aWNrc3RhcnRzLTEzNjMxOCIsImFjY291bnROYW1lIjoicXVpY2stc3RhcnRzLXdzLTEzNjMxOCIsIndvcmtzcGFjZUlkIjoiZjM3ZWY5MDktYzIzYy00ZjIwLWEwYWUtZmQ3NTJjZDk2Y2FmIn0sIm1vZGVscyI6e30sIm1vZGVsc0luZm8iOnt9fQ== | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in ba3c0cbc65b2
 ---> a69281adc

ERROR:azureml._model_management._util:Error: Container has crashed. Did your init method fail?




Container Logs:
2021-01-28T02:26:39,952558770+00:00 - rsyslog/run 
2021-01-28T02:26:39,954543793+00:00 - iot-server/run 
2021-01-28T02:26:39,958405838+00:00 - gunicorn/run 
2021-01-28T02:26:39,967890049+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_ba9520bf386d662001eeb9523395794e/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_ba9520bf386d662001eeb9523395794e/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_ba9520bf386d662001eeb9523395794e/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_ba9520bf386d662001eeb9523395794e/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_ba9520bf386d662001eeb9523395794e/lib/libssl.so.1.0.0: no version information available (required by /usr/sbi

WebserviceException: WebserviceException:
	Message: Error: Container has crashed. Did your init method fail?
	InnerException None
	ErrorResponse 
{
    "error": {
        "message": "Error: Container has crashed. Did your init method fail?"
    }
}

TODO: In the cell below, send a request to the web service you deployed to test it.
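
The deployment above crashed, so no request could be sent in this notebook. As a sketch of what the test could look like once the service is healthy, the code below builds a JSON POST request with the standard library only. Several things here are assumptions: the `{"data": [...]}` payload shape depends on how `score.py` parses its input, the `/score` path and the sample values are illustrative, and `build_scoring_request` is a hypothetical helper. In practice the URL would come from `service.scoring_uri`.

```python
import json
import urllib.request
from typing import Optional


def build_scoring_request(scoring_uri: str, rows: list,
                          key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a scoring endpoint."""
    body = json.dumps({"data": rows}).encode("utf-8")  # assumed payload shape
    headers = {"Content-Type": "application/json"}
    if key:
        # Only needed when key-based authentication is enabled on the service.
        headers["Authorization"] = f"Bearer {key}"
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Made-up values, one per Glass attribute: RI, Na, Mg, Al, Si, K, Ca, Ba, Fe.
sample_row = [1.52, 13.6, 3.5, 1.3, 72.7, 0.5, 8.8, 0.0, 0.0]
request = build_scoring_request("http://localhost:8890/score", [sample_row])
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually send it once the service is running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```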

TODO: In the cell below, print the logs of the web service and delete the service
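
A minimal sketch for this last step, assuming `service` is the object returned by `Model.deploy` above. Azure ML web services expose `get_logs()` and `delete()`. The wrapper `print_logs_and_delete` is a hypothetical helper that only makes sure the service is deleted even if fetching the logs fails.

```python
def print_logs_and_delete(service) -> None:
    """Print the service's container logs, then delete the service."""
    try:
        print(service.get_logs())
    finally:
        # Delete even if fetching the logs raised, so no service is left running.
        service.delete()
```

In this notebook it would be called as `print_logs_and_delete(service)`. The logs are also the first place to look for the cause of the "Container has crashed" error shown above.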