# Hyperparameter Tuning using HyperDrive

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [None]:
#!pip install --upgrade azureml-sdk
#!pip install --upgrade azureml-core

In [None]:
!pip list

In [1]:
import azureml.core
from azureml.core import Workspace, Experiment, Dataset, ScriptRunConfig
from azureml.core import Environment
from azureml.core.environment import CondaDependencies
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.exceptions import ComputeTargetException
from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveConfig, PrimaryMetricGoal
from azureml.train.hyperdrive import choice
import pandas as pd
import os
import shutil
from azureml.widgets import RunDetails

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.51.0


In [2]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

script_folder = './hyper_data'
os.makedirs(script_folder, exist_ok=True)

# Move the train.py file to the script_folder
src_file = "train.py"
if os.path.exists(src_file):
    dest_file = os.path.join(script_folder, "train.py")
    shutil.move(src_file,dest_file)
    print(f"Moved {src_file} to {script_folder}")
else:
    print(f"{src_file} not found in the root directory. Nothing to do.")


experiment_name = 'hyper_drive_exp'
exp=Experiment(ws, experiment_name)

run = exp.start_logging()

quick-starts-ws-239958
aml-quickstarts-239958
southcentralus
6971f5ac-8af1-446e-8034-05acea24681f
Moved train.py to ./hyper_data


In [3]:
# Create the cluster
cluster_name = "auto-ml"
try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing cluster')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)
    compute_target.wait_for_completion(show_output=True)

InProgress..
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


## Dataset

TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

In [4]:
import pandas as pd
# Create AML Dataset and register it into Workspace
key='car evaluation data set'
try:
    dataset = Dataset.get_by_name(ws, name=key)
    print("Dataset found.")
except Exception as e:
    print("Dataset not found. Loading data from URL...")
    data = 'https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data'
    df = pd.read_csv(data)
    columns = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'class']
    df.columns = columns
    # Convert the DataFrame to a TabularDataset
    dataset = Dataset.Tabular.register_pandas_dataframe(
        dataframe=df, 
        target=(ws.get_default_datastore(), key), 
        name=key, 
        description='car evaluation data set')
    print(df.head())

Dataset not found. Loading data from URL...
Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to car evaluation data set/19a4ef56-faeb-435f-9c05-a650b8f53dcf/
Successfully uploaded file to datastore.
Creating and registering a new dataset.
Successfully created and registered a new dataset.
  buying  maint doors persons lug_boot safety  class
0  vhigh  vhigh     2       2    small    med  unacc
1  vhigh  vhigh     2       2    small   high  unacc
2  vhigh  vhigh     2       2      med    low  unacc
3  vhigh  vhigh     2       2      med    med  unacc
4  vhigh  vhigh     2       2      med   high  unacc


## Hyperdrive Configuration

TODO: Explain the model you are using and the reason for chosing the different hyperparameters, termination policy and config settings.

model:RamdomForests

    Random Forests is an ensemble learning method that combines multiple decision trees. It can exhibit strong classification performance on datasets that include categorical data. By combining multiple decision trees, it helps mitigate overfitting and improves generalization performance.
    
hyperparameters:

    n_estimators: the model's complexity and expressive power
    
    min_samples_split: the minimum number of samples required for a split node. affect to the model's generalization performance.
    
    min_samples_leaf: the minimum number of samples required for a leaf node.
    
    
tarmination policy:

    slack_factor; triggers early termination if the performance of the current run is more than 15% worse than the best performing run.
    
    evaluation_interval;The progress is evaluated at each to make decisions for early termination.
    
    delay_evaluation; to avoid the possibility of the early termination policy reaching the termination condition before the first evaluation.
    
config setting:
    

In [14]:
# TODO: Create an early termination policy. This is not required if you are using Bayesian sampling.
early_termination_policy = BanditPolicy(slack_factor=0.15, evaluation_interval=1, delay_evaluation=10)

#TODO: Create the different params that you will be using during training
param_sampling = RandomParameterSampling({
    "--n_estimators": choice(100, 500, 1000),
    "--min_samples_split": choice(2, 10, 20),
    "--min_samples_leaf": choice(1, 5, 10),
})

#TODO: Create your estimator and hyperdrive config
env=Environment.get(workspace=ws, name="AzureML-Tutorial")

src = ScriptRunConfig(
    source_directory=script_folder+"/",
    script="train.py",
    compute_target=compute_target,
    environment=env
)

hyperdrive_config = HyperDriveConfig(
    run_config=src,
    hyperparameter_sampling=param_sampling,
    policy=early_termination_policy,
    primary_metric_name="auc_weighted",
    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
    max_total_runs=100,
    max_concurrent_runs=4)

In [15]:
#TODO: Submit your experiment
hyperdrive_run = exp.submit(hyperdrive_config, show_output=True)


## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [16]:
RunDetails(hyperdrive_run).show()
hyperdrive_run.wait_for_completion(show_output=True)

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

RunId: HD_c633875c-024f-4842-906a-3b7c0edd25af
Web View: https://ml.azure.com/runs/HD_c633875c-024f-4842-906a-3b7c0edd25af?wsid=/subscriptions/6971f5ac-8af1-446e-8034-05acea24681f/resourcegroups/aml-quickstarts-239958/workspaces/quick-starts-ws-239958&tid=660b3398-b80e-49d2-bc5b-ac1dc93b5254

Streaming azureml-logs/hyperdrive.txt

[2023-08-09T02:08:26.310601][GENERATOR][INFO]Trying to sample '4' jobs from the hyperparameter space
[2023-08-09T02:08:26.9412782Z][SCHEDULER][INFO]Scheduling job, id='HD_c633875c-024f-4842-906a-3b7c0edd25af_0' 
[2023-08-09T02:08:26.9972149Z][SCHEDULER][INFO]Scheduling job, id='HD_c633875c-024f-4842-906a-3b7c0edd25af_1' 
[2023-08-09T02:08:27.1197073Z][SCHEDULER][INFO]Scheduling job, id='HD_c633875c-024f-4842-906a-3b7c0edd25af_2' 
[2023-08-09T02:08:27.2551399Z][SCHEDULER][INFO]Scheduling job, id='HD_c633875c-024f-4842-906a-3b7c0edd25af_3' 
[2023-08-09T02:08:27.199094][GENERATOR][INFO]Successfully sampled '4' jobs, they will soon be submitted to the execution t

ActivityFailedException: ActivityFailedException:
	Message: Activity Failed:
{
    "error": {
        "code": "UserError",
        "message": "Execution failed. User process '/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/bin/python' exited with status code 1. Please check log file 'user_logs/std_log.txt' for error details. Error: Traceback (most recent call last):\n  File \"train.py\", line 66, in <module>\n    accuracy = accuracy_score(y_test, y_prob)\n  File \"/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/lib/python3.6/site-packages/sklearn/metrics/_classification.py\", line 185, in accuracy_score\n    y_type, y_true, y_pred = _check_targets(y_true, y_pred)\n  File \"/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/lib/python3.6/site-packages/sklearn/metrics/_classification.py\", line 90, in _check_targets\n    \"and {1} targets\".format(type_true, type_pred))\nValueError: Classification metrics can't handle a mix of multiclass and continuous-multioutput targets\n\n Marking the experiment as failed because initial child jobs have failed due to user error",
        "messageParameters": {},
        "details": []
    },
    "time": "0001-01-01T00:00:00.000Z"
}
	InnerException None
	ErrorResponse 
{
    "error": {
        "message": "Activity Failed:\n{\n    \"error\": {\n        \"code\": \"UserError\",\n        \"message\": \"Execution failed. User process '/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/bin/python' exited with status code 1. Please check log file 'user_logs/std_log.txt' for error details. Error: Traceback (most recent call last):\\n  File \\\"train.py\\\", line 66, in <module>\\n    accuracy = accuracy_score(y_test, y_prob)\\n  File \\\"/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/lib/python3.6/site-packages/sklearn/metrics/_classification.py\\\", line 185, in accuracy_score\\n    y_type, y_true, y_pred = _check_targets(y_true, y_pred)\\n  File \\\"/azureml-envs/azureml_1296d9ccb6d6509a0126eeef4e26fcc9/lib/python3.6/site-packages/sklearn/metrics/_classification.py\\\", line 90, in _check_targets\\n    \\\"and {1} targets\\\".format(type_true, type_pred))\\nValueError: Classification metrics can't handle a mix of multiclass and continuous-multioutput targets\\n\\n Marking the experiment as failed because initial child jobs have failed due to user error\",\n        \"messageParameters\": {},\n        \"details\": []\n    },\n    \"time\": \"0001-01-01T00:00:00.000Z\"\n}"
    }
}

KeyError: 'log_files'

## Best Model

TODO: In the cell below, get the best model from the hyperdrive experiments and display all the properties of the model.

In [None]:
best_run = hyperdrive_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
print("Best Run ID: ", best_run.id)
print("auc_weighted: ", best_run_metrics["auc_weighted"])

In [None]:
#TODO: Save the best model
model = best_run.register_model(
    model_name='hyperdrive_model',
    model_path=script_folder+'/hyperdrive_model.joblib'
    )
print("Model registered:", model.name, model.id)

## Model Deployment

Remember you have to deploy only one of the two models you trained but you still need to register both the models. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

TODO: In the cell below, send a request to the web service you deployed to test it.

TODO: In the cell below, print the logs of the web service and delete the service

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a webservice.
- I have tested the webservice by sending a request to the model endpoint.
- I have deleted the webservice and shutdown all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.

