# Hyperparameter Tuning using HyperDrive

In the cell below, we import all the dependencies that we need to complete the project.

In [25]:
import joblib
import uuid
import requests
import json

from azureml.core import (
    Workspace,
    Experiment,
    Dataset,
    ComputeTarget,
    ScriptRunConfig,
    Environment
)

from azureml.train.hyperdrive import (
    BanditPolicy, 
    RandomParameterSampling,
    choice, 
    loguniform, 
    HyperDriveConfig, 
    PrimaryMetricGoal
)

from azureml.widgets import RunDetails
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

## Workspace

In [26]:
workspace = Workspace.from_config()

In [27]:
print("Subscription ID:", workspace.subscription_id)
print("Resource group:", workspace.resource_group)
print("Workspace name:", workspace.name)

Subscription ID: 510b94ba-e453-4417-988b-fbdc37b55ca7
Resource group: aml-quickstarts-239639
Workspace name: quick-starts-ws-239639


## Experiment

In [28]:
experiment_name = 'edu_hf_hyperdrive_exp'
experiment = Experiment(workspace, experiment_name)

## Compute target

We assume a compute cluster with the given name has already been created.

In [29]:
compute_cluster_name = "edu-compute-cluster"
compute_target = workspace.compute_targets[compute_cluster_name]

## Dataset

We use the [heart failure dataset](https://www.kaggle.com/datasets/andrewmvd/heart-failure-clinical-data) from Kaggle.
We assume it has already been registered as an Azure ML dataset.

In [30]:
dataset_name = 'edu_heart_failure_dataset'
dataset = Dataset.get_by_name(workspace, name=dataset_name)

In [31]:
# Make a dataframe and take a look at it
patients = dataset.to_pandas_dataframe()
patients.head()

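| | age | anaemia | creatinine_phosphokinase | diabetes | ejection_fraction | high_blood_pressure | platelets | serum_creatinine | serum_sodium | sex | smoking | time | DEATH_EVENT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 75.0 | 0 | 582 | 0 | 20 | 1 | 265000.0 | 1.9 | 130 | 1 | 0 | 4 | 1 |
| 1 | 55.0 | 0 | 7861 | 0 | 38 | 0 | 263358.03 | 1.1 | 136 | 1 | 0 | 6 | 1 |
| 2 | 65.0 | 0 | 146 | 0 | 20 | 0 | 162000.0 | 1.3 | 129 | 1 | 1 | 7 | 1 |
| 3 | 50.0 | 1 | 111 | 0 | 20 | 0 | 210000.0 | 1.9 | 137 | 1 | 0 | 7 | 1 |
| 4 | 65.0 | 1 | 160 | 1 | 20 | 0 | 327000.0 | 2.7 | 116 | 0 | 0 | 8 | 1 |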


## HyperDrive Configuration

We're using a random forest (RF) classifier, because random forests tend to generate reasonable predictions across a wide range of data while requiring little configuration.

We're letting HyperDrive search for the best combination of two hyperparameters: `n_estimators`, the number of trees in the forest, and `min_samples_split`, the minimum fraction of samples required to split an internal node.
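To make the search range for `min_samples_split` concrete: Azure ML documents `loguniform(min_value, max_value)` as drawing `exp(uniform(min_value, max_value))`, so the bounds `-6` and `-2` used in the configuration cell correspond to fractions between roughly 0.0025 and 0.135. The following is a minimal standard-library sketch of that behaviour; the helper name `sample_loguniform` is hypothetical and is not part of the Azure ML SDK.

```python
import math
import random


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw exp(u) with u ~ Uniform(low, high).

    Hypothetical helper that mimics the documented semantics of
    HyperDrive's loguniform(); it is not the SDK implementation.
    """
    return math.exp(rng.uniform(low, high))


# Bounds implied by loguniform(-6, -2)
lower, upper = math.exp(-6), math.exp(-2)
print(f"min_samples_split range: [{lower:.4f}, {upper:.4f}]")

# Draw a few values to see how they spread across orders of magnitude
rng = random.Random(0)
samples = [sample_loguniform(-6, -2, rng) for _ in range(1000)]
print(f"smallest draw: {min(samples):.4f}, largest draw: {max(samples):.4f}")
```

Sampling on a log scale spends as many trials between 0.0025 and 0.02 as between 0.02 and 0.135, which suits a parameter whose effect varies by order of magnitude.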

We're using a Bandit early termination policy, which ends a run when its primary metric falls outside the specified slack factor relative to the best-performing run so far.
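As a rough illustration of the slack factor arithmetic: for a metric that is being maximized, the Azure ML documentation describes the cutoff as the best metric divided by `(1 + slack_factor)`. The sketch below applies that formula with a slack factor of 0.2 and a best accuracy of 0.92. The function names are hypothetical, and the real policy is evaluated by the service at each evaluation interval, so this is a simplification rather than the actual implementation.

```python
def bandit_cutoff(best_metric: float, slack_factor: float) -> float:
    """Smallest metric value a run may report without being terminated
    (simplified, for a metric that is maximized)."""
    return best_metric / (1.0 + slack_factor)


def should_terminate(run_metric: float, best_metric: float, slack_factor: float) -> bool:
    """True if the run falls below the slack-adjusted cutoff."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)


cutoff = bandit_cutoff(best_metric=0.92, slack_factor=0.2)
print(f"cutoff: {cutoff:.4f}")

# A run at 0.70 accuracy would be stopped; a run at 0.80 would continue.
print(should_terminate(0.70, 0.92, 0.2))
print(should_terminate(0.80, 0.92, 0.2))
```

A larger slack factor lowers the cutoff and therefore terminates fewer runs.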

Our primary metric is mean accuracy, which training should maximize.

In [32]:
primary_metric_name = "mean accuracy"

venv = Environment.from_pip_requirements(name="venv", file_path="requirements.txt")

train_cfg = ScriptRunConfig(
    source_directory="steps",
    script="train.py",
    environment=venv,
    compute_target=compute_target,
)

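# Random sampling over the search space:
# - n_estimators is picked from a discrete set of values
# - min_samples_split is exp(u) with u uniform in [-6, -2], i.e. roughly 0.0025 to 0.135
param_sampling = RandomParameterSampling({
    "n_estimators": choice(20, 50, 100, 200),
    "min_samples_split": loguniform(-6, -2),
})

# End runs whose primary metric falls outside a 20% slack of the best run so far
early_termination_policy = BanditPolicy(slack_factor=0.2)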

hyperdrive_run_config = HyperDriveConfig(
    run_config=train_cfg,
    hyperparameter_sampling=param_sampling,
    policy=early_termination_policy,
    primary_metric_name=primary_metric_name,
    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
    max_total_runs=100,
    max_concurrent_runs=4
)

In [33]:
hyperdrive_run = experiment.submit(hyperdrive_run_config)
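
For each child run, HyperDrive passes the sampled values to the training script as command-line arguments named after the keys of the search space, and it reads the primary metric from whatever the script logs under the configured metric name. The `steps/train.py` script itself is not shown in this notebook, so the following is only a sketch of how such a script might receive its arguments and report the metric. The defaults and the `report` helper are assumptions; in a real run the logging callback would typically be `Run.get_context().log`.

```python
import argparse

# Must match primary_metric_name in the HyperDriveConfig exactly
PRIMARY_METRIC_NAME = "mean accuracy"


def parse_args(argv=None):
    """Parse the hyperparameters that HyperDrive passes on the command line."""
    parser = argparse.ArgumentParser()
    # Default values are assumptions made for this sketch
    parser.add_argument("--n_estimators", type=int, default=100)
    # Parsed as float so that a classifier such as scikit-learn's random
    # forest would interpret the value as a fraction of the samples
    parser.add_argument("--min_samples_split", type=float, default=0.01)
    return parser.parse_args(argv)


def report(log, accuracy: float) -> None:
    """Log the primary metric through the given callback (hypothetical helper)."""
    log(PRIMARY_METRIC_NAME, accuracy)


# Simulate the arguments of one child run and capture what would be logged
args = parse_args(["--n_estimators", "20", "--min_samples_split", "0.0187"])
logged = {}
report(logged.__setitem__, 0.92)
print(args.n_estimators, args.min_samples_split, logged)
```

If the logged name and `primary_metric_name` differ, HyperDrive has no metric to rank the child runs by.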

## Run Details

In the cell below, we use the `RunDetails` widget to monitor the progress of the HyperDrive child runs.

In [34]:
RunDetails(hyperdrive_run).show()

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

## Best Model

In the cells below, we retrieve the best run from the HyperDrive experiment and display the metrics it logged.

In [37]:
best_run = hyperdrive_run.get_best_run_by_primary_metric()
print(f"Best run id: {best_run.id}")

Best run id: HD_6e40d050-a942-4999-a542-f6090badbf9e_28


In [38]:
best_run_metrics = best_run.get_metrics()
print(best_run_metrics)

{'Number of trees in the forest': 20, 'Minimum fraction of samples required to split an internal node': 0.018706869030727838, 'mean accuracy': 0.92}


## Model Deployment

We deploy the best model that comes out of the HyperDrive run.

In the cells below, we register the model, create an inference config and deploy the model as a web service.

In [40]:
model_name = "hyperdrive_best_model"
model = best_run.register_model(model_name=model_name, model_path="outputs")

In [41]:
deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
)

inference_config = InferenceConfig(
    entry_script='score.py',
    environment=venv,
)

service = Model.deploy(
    workspace,
    "edu-service",
    [model],
    inference_config,
    deployment_config,
    overwrite=True,
)

service.wait_for_deployment(show_output=True)

azureml.core.model:
To leverage new model deployment capabilities, AzureML recommends using CLI/SDK v2 to deploy models as online endpoint, 
please refer to respective documentations 
https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoints /
https://docs.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-anywhere 
For more information on migration, see https://aka.ms/acimoemigration 


Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2023-08-02 21:13:28+00:00 Creating Container Registry if not exists.
2023-08-02 21:13:28+00:00 Registering the environment.
2023-08-02 21:13:29+00:00 Use the existing image.
2023-08-02 21:13:29+00:00 Generating deployment configuration.
2023-08-02 21:13:29+00:00 Submitting deployment to compute.
2023-08-02 21:13:31+00:00 Checking the status of deployment edu-service..
2023-08-02 21:17:34+00:00 Checking the status of inference endpoint edu-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


In the cells below, we send a request to the web service we deployed to test it.
As the data payload, we select five random rows from the dataset, with the target column dropped.

In [42]:
target_column = "DEATH_EVENT"

json_payload = {
    "data": patients.drop(columns=target_column).sample(n=5).to_dict("records")
}

raw_data = json.dumps(json_payload)
print(raw_data)

{"data": [{"age": 80.0, "anaemia": 1, "creatinine_phosphokinase": 123, "diabetes": 0, "ejection_fraction": 35, "high_blood_pressure": 1, "platelets": 388000.0, "serum_creatinine": 9.4, "serum_sodium": 133, "sex": 1, "smoking": 1, "time": 10}, {"age": 70.0, "anaemia": 0, "creatinine_phosphokinase": 2695, "diabetes": 1, "ejection_fraction": 40, "high_blood_pressure": 0, "platelets": 241000.0, "serum_creatinine": 1.0, "serum_sodium": 137, "sex": 1, "smoking": 0, "time": 247}, {"age": 94.0, "anaemia": 0, "creatinine_phosphokinase": 582, "diabetes": 1, "ejection_fraction": 38, "high_blood_pressure": 1, "platelets": 263358.03, "serum_creatinine": 1.83, "serum_sodium": 134, "sex": 1, "smoking": 0, "time": 27}, {"age": 68.0, "anaemia": 1, "creatinine_phosphokinase": 220, "diabetes": 0, "ejection_fraction": 35, "high_blood_pressure": 1, "platelets": 289000.0, "serum_creatinine": 0.9, "serum_sodium": 140, "sex": 1, "smoking": 1, "time": 20}, {"age": 63.0, "anaemia": 0, "creatinine_phosphokinase"

In [43]:
headers = {"Content-Type": "application/json"}
uri = service.scoring_uri

response = requests.post(uri, data=raw_data, headers=headers)
print(response.json())

[1, 0, 1, 1, 0]
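
The response is a plain list with one predicted label per record in the payload. The `score.py` entry script that produces it is not shown in this notebook. Azure ML entry scripts define an `init()` function that loads the model once and a `run()` function that handles each request, so a script for this payload format might look roughly like the sketch below. `StandInModel` and its toy decision rule are hypothetical placeholders so that the example runs without the registered model; a real script would load the trained random forest in `init()` instead.

```python
import json

# Feature order as it appears in the request payload
FEATURE_COLUMNS = [
    "age", "anaemia", "creatinine_phosphokinase", "diabetes",
    "ejection_fraction", "high_blood_pressure", "platelets",
    "serum_creatinine", "serum_sodium", "sex", "smoking", "time",
]


class StandInModel:
    """Hypothetical placeholder for the trained classifier."""

    def predict(self, rows):
        # Toy rule for illustration only; NOT the behaviour of the real model
        time_index = FEATURE_COLUMNS.index("time")
        return [1 if row[time_index] < 50 else 0 for row in rows]


model = None


def init():
    """Called once when the service starts: load the model."""
    global model
    model = StandInModel()


def run(raw_data: str):
    """Called per request: parse the JSON payload and return predictions."""
    records = json.loads(raw_data)["data"]
    rows = [[record[column] for column in FEATURE_COLUMNS] for record in records]
    return [int(label) for label in model.predict(rows)]


init()
payload = json.dumps({"data": [{
    "age": 80.0, "anaemia": 1, "creatinine_phosphokinase": 123, "diabetes": 0,
    "ejection_fraction": 35, "high_blood_pressure": 1, "platelets": 388000.0,
    "serum_creatinine": 9.4, "serum_sodium": 133, "sex": 1, "smoking": 1, "time": 10,
}]})
print(run(payload))
```

Reading the features by name rather than by position keeps the column order fixed even if a client sends the keys in a different order.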


In the cells below, we print the logs of the web service, then delete the service, the registered model, and the compute cluster.

In [44]:
print(service.get_logs())

/bin/bash: /azureml-envs/azureml_58f440cd5db21b9ab3c363f087e7b355/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /azureml-envs/azureml_58f440cd5db21b9ab3c363f087e7b355/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /azureml-envs/azureml_58f440cd5db21b9ab3c363f087e7b355/lib/libtinfo.so.6: no version information available (required by /bin/bash)
2023-08-02T21:16:51,786422155+00:00 - rsyslog/run 
2023-08-02T21:16:51,795404455+00:00 - gunicorn/run 
2023-08-02T21:16:51,797069355+00:00 | gunicorn/run | 
bash: /azureml-envs/azureml_58f440cd5db21b9ab3c363f087e7b355/lib/libtinfo.so.6: no version information available (required by bash)
2023-08-02T21:16:51,804790955+00:00 | gunicorn/run | ###############################################
2023-08-02T21:16:51,814796455+00:00 | gunicorn/run | AzureML Container Runtime Information
2023-08-02T21:16:51,818501955+00:00 - nginx/run 
2023-08-02T21:16:51,823134655+00:00 | gunico

In [45]:
service.delete()
model.delete()

In [47]:
compute_target.delete()

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a web service.
- I have tested the web service by sending a request to the model endpoint.
- I have deleted the web service and shut down all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.

