# Hyperparameter Tuning using HyperDrive

## Dependencies
The cell below imports all the dependencies needed to complete the project.

In [1]:
from azureml.core import Environment, ScriptRunConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core.dataset import Dataset

from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.data.dataset_factory import TabularDatasetFactory

from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import choice, uniform

from azureml.widgets import RunDetails

from sklearn import datasets
from azureml.train.sklearn import SKLearn

import pandas as pd
import numpy as np
import os

## Initialization

In [2]:
ws = Workspace.from_config()
experiment_name = 'hyperdrive_experiment'

experiment=Experiment(ws, experiment_name)

run = experiment.start_logging()

## Dataset
The dataset for heart failure prediction was taken from Kaggle. The cell below reads a copy of the CSV file from the provided link, registers it in the workspace, and converts it to a pandas DataFrame.

The data includes several columns, namely:
`age`, `anaemia`, `creatinine_phosphokinase`, `diabetes`, `ejection_fraction`, `high_blood_pressure`, `platelets`, `serum_creatinine`, `serum_sodium`, `sex`, `smoking`, `time` and `DEATH_EVENT`.

The patient's `DEATH_EVENT` is the target and is predicted from the other columns in the dataset.

In [3]:
url = "https://raw.githubusercontent.com/mirsadraee/Udacity_ND_Azure_Machine_Learning_Projects/develop/Project_03/heart_failure_clinical_records_dataset.csv"

dataset = TabularDatasetFactory.from_delimited_files(url)

key = "heart-failure-prediction"
description_text = "Prediction of heart failure - NDAML"

dataset = dataset.register(workspace=ws,
                           name=key,
                           description=description_text)
dataset = dataset.to_pandas_dataframe()

dataset.head()

x = dataset.drop('DEATH_EVENT', axis=1)
y = dataset[['DEATH_EVENT']]

In [4]:
# Choose a name for your CPU cluster
cluster_name = "compute-cluster-p31"

# Create compute cluster
try:
    aml_compute = ComputeTarget(workspace=ws, name=cluster_name)
    print('An existing cluster will be used!')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    aml_compute = ComputeTarget.create(ws, cluster_name, compute_config)
    print('A new cluster will be used now!')

aml_compute.wait_for_completion(show_output=True)

print(aml_compute.get_status().serialize())


A new cluster will be used now!
InProgress..
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2023-04-23T16:35:07.527000+00:00', 'errors': None, 'creationTime': '2023-04-23T16:35:03.039282+00:00', 'modifiedTime': '2023-04-23T16:35:10.007769+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT1800S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D2_V2'}


## HyperDrive Configuration
The code provided sets up a HyperDrive experiment for training a machine learning model using Azure Machine Learning. HyperDrive is a service that helps automate hyperparameter tuning, which involves finding the optimal set of hyperparameters for a model to achieve the best performance.

The components of the code are as follows:

- The early termination policy (`BanditPolicy`) stops poorly performing runs early to save time and resources. Here `slack_factor` is 0.1, so with a maximization goal a run is terminated when its best accuracy falls below the best run's accuracy divided by 1.1 (roughly 91% of the best). `delay_evaluation` is 2, which postpones the first policy evaluation for the first 2 intervals; after that, the policy is applied at the default `evaluation_interval` of 1.
- The `RandomParameterSampling` object randomly samples hyperparameters from a defined set of values. The hyperparameters sampled here are the regularization parameter `C` and the maximum number of iterations `max_iter`.
- An estimator is created with the `SKLearn` estimator class to specify the details of the training job: the script source directory, the compute target, the virtual machine size, and the name of the entry script.
- A `HyperDriveConfig` object specifies the configuration of the HyperDrive experiment: the estimator, the hyperparameter sampling method, the early termination policy, the primary metric to optimize (`Accuracy`), the primary metric goal (maximize), the maximum number of runs (20), and the maximum number of concurrent runs (4).
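
To make the slack-factor rule concrete, here is a minimal plain-Python sketch of how a Bandit-style cutoff could be computed for a metric that is being maximized. The helper names `bandit_cutoff` and `should_terminate` are illustrative and are not part of the Azure ML SDK. The exact interval at which the delay ends is an assumption of this sketch.

```python
def bandit_cutoff(best_metric: float, slack_factor: float) -> float:
    """Lowest metric a run may report and still continue (maximize goal)."""
    return best_metric / (1.0 + slack_factor)


def should_terminate(run_metric: float, best_metric: float, slack_factor: float,
                     interval: int, delay_evaluation: int) -> bool:
    """Return True if a run would be cancelled under this simplified rule."""
    # Assumption: no run is cancelled while the delay is still in effect.
    if interval < delay_evaluation:
        return False
    return run_metric < bandit_cutoff(best_metric, slack_factor)


# With the best accuracy reported in this notebook and slack_factor = 0.1:
cutoff = bandit_cutoff(0.8333333333333334, 0.1)  # roughly 0.7576
```

Under these assumptions, a run reporting an accuracy of 0.70 after the delay would be stopped, while a run reporting 0.78 would continue.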

In [5]:
# Create an early termination policy. This is not required if you are using Bayesian sampling.
early_termination_policy = BanditPolicy(slack_factor = 0.1, delay_evaluation = 2)

# Create the different params that you will be using during training
param_sampling = RandomParameterSampling(
    {
        '--C' : choice(0.01,0.1,1,10,100,500,1000),
        '--max_iter': choice(1,10,50,100,200)
    }
)

if "training" not in os.listdir():
    os.mkdir("./training")
    
# Setup environment for your training run
sklearn_env = Environment.from_conda_specification(name='sklearn-env', file_path='conda_dependencies.yml')

# Create a ScriptRunConfig Object to specify the configuration details of your training job
# est = ScriptRunConfig(source_directory='.',
#                      command=['python', 'train.py'],
#                      compute_target=aml_compute,
#                      environment=sklearn_env)

est = SKLearn(source_directory = "./",
            compute_target=aml_compute,
            vm_size='STANDARD_D2_V2',
            entry_script="train.py")

# Create a HyperDriveConfig using the estimator, hyperparameter sampler, and policy.
hyperdrive_run_config = HyperDriveConfig(estimator=est,
                                     hyperparameter_sampling=param_sampling,
                                     policy=early_termination_policy,
                                     primary_metric_name='Accuracy',
                                     primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                     max_total_runs=20,
                                     max_concurrent_runs=4)

'SKLearn' estimator is deprecated. Please use 'ScriptRunConfig' from 'azureml.core.script_run_config' with your own defined environment or the AzureML-Tutorial curated environment.
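
Because both hyperparameters are defined with `choice`, the search space is a finite grid. The following plain-Python sketch makes no Azure ML calls. The dictionary simply mirrors the values passed to `RandomParameterSampling`, and `sample_configuration` is a hypothetical helper. It counts the possible combinations and draws random configurations from them.

```python
import itertools
import random

# Mirror of the values passed to RandomParameterSampling in the cell above.
search_space = {
    "--C": [0.01, 0.1, 1, 10, 100, 500, 1000],
    "--max_iter": [1, 10, 50, 100, 200],
}

# Every possible combination of the two discrete hyperparameters.
all_combinations = list(itertools.product(*search_space.values()))
space_size = len(all_combinations)  # 7 * 5 = 35


def sample_configuration(rng: random.Random) -> dict:
    """Pick one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in search_space.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sampled = [sample_configuration(rng) for _ in range(20)]  # max_total_runs = 20
```

With 35 combinations and `max_total_runs=20`, the experiment explores at most 20 of the 35 grid points.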


In [6]:
# Submit your experiment
hyperdrive_run = experiment.submit(hyperdrive_run_config)
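
Each child run calls `train.py` with the sampled values passed as the `--C` and `--max_iter` command-line arguments. `train.py` is not shown in this notebook, so the following is only a sketch of how such a script might parse them. The function name `parse_args`, the help texts, and the default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two hyperparameters that HyperDrive samples (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Illustrative train.py argument parsing")
    # Assumed defaults; the real script may use different ones.
    parser.add_argument("--C", type=float, default=1.0,
                        help="Regularization parameter")
    parser.add_argument("--max_iter", type=int, default=100,
                        help="Maximum number of iterations")
    return parser.parse_args(argv)


# Simulate the arguments a single child run might receive.
args = parse_args(["--C", "0.1", "--max_iter", "50"])
```

A real training script would then fit the model with `args.C` and `args.max_iter` and log the resulting accuracy under the metric name given in `primary_metric_name`.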



## Run Details
In the cell below, the `RunDetails` widget is used to monitor the progress of the HyperDrive child runs.

In [7]:
RunDetails(hyperdrive_run).show()

hyperdrive_run.wait_for_completion(show_output=True)

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

RunId: HD_95ed1c1f-171b-4072-9935-99ef3d992da6
Web View: https://ml.azure.com/runs/HD_95ed1c1f-171b-4072-9935-99ef3d992da6?wsid=/subscriptions/976ee174-3882-4721-b90a-b5fef6b72f24/resourcegroups/aml-quickstarts-231777/workspaces/quick-starts-ws-231777&tid=660b3398-b80e-49d2-bc5b-ac1dc93b5254

Streaming azureml-logs/hyperdrive.txt

[2023-04-23T16:35:16.404890][GENERATOR][INFO]Trying to sample '4' jobs from the hyperparameter space
[2023-04-23T16:35:16.8486421Z][SCHEDULER][INFO]Scheduling job, id='HD_95ed1c1f-171b-4072-9935-99ef3d992da6_0' 
[2023-04-23T16:35:17.0553826Z][SCHEDULER][INFO]Scheduling job, id='HD_95ed1c1f-171b-4072-9935-99ef3d992da6_2' 
[2023-04-23T16:35:16.9361353Z][SCHEDULER][INFO]Scheduling job, id='HD_95ed1c1f-171b-4072-9935-99ef3d992da6_1' 
[2023-04-23T16:35:17.2010520Z][SCHEDULER][INFO]Scheduling job, id='HD_95ed1c1f-171b-4072-9935-99ef3d992da6_3' 
[2023-04-23T16:35:17.070015][GENERATOR][INFO]Successfully sampled '4' jobs, they will soon be submitted to the execution t

{'runId': 'HD_95ed1c1f-171b-4072-9935-99ef3d992da6',
 'target': 'compute-cluster-p31',
 'status': 'Completed',
 'startTimeUtc': '2023-04-23T16:35:15.452642Z',
 'endTimeUtc': '2023-04-23T16:49:20.074081Z',
 'services': {},
 'properties': {'primary_metric_config': '{"name":"Accuracy","goal":"maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': 'a059f284-1038-4578-8aaa-279505fe3ef9',
  'user_agent': 'python/3.8.5 (Linux-5.15.0-1035-azure-x86_64-with-glibc2.10) msrest/0.7.1 Hyperdrive.Service/1.0.0 Hyperdrive.SDK/core.1.49.0',
  'space_size': '35',
  'score': '0.8333333333333334',
  'best_child_run_id': 'HD_95ed1c1f-171b-4072-9935-99ef3d992da6_1',
  'best_metric_status': 'Succeeded',
  'best_data_container_id': 'dcid.HD_95ed1c1f-171b-4072-9935-99ef3d992da6_1'},
 'inputDatasets': [],
 'outputDatasets': [],
 'runDefinition': {'configuration': None,
  'attribution': None,
  'telemetryValues': {'am

## Best Model

In [8]:
import joblib
# Get your best run and save the model from that run.
best_run = hyperdrive_run.get_best_run_by_primary_metric()

best_run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| hyperdrive_experiment | HD_95ed1c1f-171b-4072-9935-99ef3d992da6_1 | azureml.scriptrun | Completed | Link to Azure Machine Learning studio | Link to Documentation |


In [9]:
best_run_metrics = best_run.get_metrics()
best_run_metrics 

{'Regularization Strength:': 1.0,
 'Max iterations:': 200,
 'Accuracy': 0.8333333333333334}
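
Conceptually, selecting the best run means picking the child run with the highest value of the primary metric when the goal is to maximize. The plain-Python sketch below illustrates that selection. The run IDs and accuracy values in `child_metrics` are made up for illustration and are not results from this experiment. `best_by_metric` is a hypothetical helper, not an SDK function.

```python
# Hypothetical child-run metrics; values are illustrative only.
child_metrics = [
    {"run_id": "example_0", "Accuracy": 0.75},
    {"run_id": "example_1", "Accuracy": 0.80},
    {"run_id": "example_2", "Accuracy": 0.70},
]


def best_by_metric(runs, metric_name, maximize=True):
    """Return the run dict with the best value of metric_name, or None."""
    # Ignore runs that never logged the metric (for example, failed runs).
    candidates = [r for r in runs if metric_name in r]
    if not candidates:
        return None
    pick = max if maximize else min
    return pick(candidates, key=lambda r: r[metric_name])


best = best_by_metric(child_metrics, "Accuracy")
```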

In [12]:
# Register the best model, pointing at the model file saved by the training run
best_run.register_model(model_name='Model_Hyperdive_best_run.pkl', model_path='outputs/model.pkl')

# Download the model file
best_run.download_file('outputs/model.pkl', 'Model_Hyperdive_best_run.pkl')

**Submission Checklist**
- [X]  I have registered the model.
- [ ]  I have deployed the model with the best accuracy as a webservice.
- [ ]  I have tested the webservice by sending a request to the model endpoint.
- [ ]  I have deleted the webservice and shutdown all the computes that I have used.
- [X]  I have taken a screenshot showing the model endpoint as active.
- [X]  The project includes a file containing the environment details.

