# Hyperparameter Tuning using HyperDrive

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [73]:
import pickle
import json
import urllib.request
import os
import ssl
import requests
from azureml.widgets import RunDetails
from azureml.core import Workspace, Dataset, Experiment
from azureml.core.model import InferenceConfig
from azure.ai.ml.entities import Environment, Model
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.automl import AutoMLConfig
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException
from azureml.core.webservice import AciWebservice
from azure.ai.ml.sweep import BanditPolicy
from azureml.train.hyperdrive.parameter_expressions import uniform, choice
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azure.ai.ml import MLClient, command
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

In [74]:
# Create an MLClient object; it acts as a handle to the workspace (SDK v2)

ML_CLIENT = MLClient.from_config(DefaultAzureCredential())

Found the config file in: ./config.json


## Dataset

TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

**The dataset is actually loaded in the train.py script, so please look at that file to check the code.**

In [77]:
ws = Workspace.from_config()
experiment_name = 'hyperdrive-exper'

experiment=Experiment(ws, experiment_name)

In [78]:
amlcompute_cluster_name = "notebook247476"

# Verify that cluster does not exist already
try:
    compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                           max_nodes=4)
    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)

compute_target.wait_for_completion(show_output=True)

Found existing cluster, use it.

Running


## Hyperdrive Configuration

TODO: Explain the model you are using and the reason for choosing the different hyperparameters, termination policy and config settings.

I'm using a linear SVM model because it suits the dataset: the task is classification and all features are numeric.
The two hyperparameters tuned are:
+ C
+ max_iter

These are two of the most important parameters of the SVM model. As further work, I might expand the search space with more parameters and wider value ranges.
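To make the search space concrete, here is a minimal pure-Python sketch of random sampling over the same two hyperparameters. It does not use the Azure ML SDK; `SEARCH_SPACE` and `sample_trial` are hypothetical names introduced only for illustration, and the ranges mirror the ones used in the sweep configuration in this notebook.

```python
import random

# Hypothetical stand-in for the sweep's search space:
# C is drawn uniformly from [0.01, 1], max_iter is picked from a fixed list.
SEARCH_SPACE = {
    "C": ("uniform", 0.01, 1.0),
    "max_iter": ("choice", [50, 100, 200]),
}


def sample_trial(space, rng):
    """Draw one hyperparameter combination from the search space."""
    trial = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "uniform":
            _, low, high = spec
            trial[name] = rng.uniform(low, high)
        elif kind == "choice":
            trial[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return trial


# 20 trials, matching max_total_trials in the sweep limits.
rng = random.Random(0)
trials = [sample_trial(SEARCH_SPACE, rng) for _ in range(20)]
for trial in trials[:3]:
    print(trial)
```

Random sampling draws each trial independently, so widening the range of `C` or adding values to `max_iter` changes only the search space definition, not the sampling logic.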

The termination policy is `BanditPolicy(slack_factor=0.15, evaluation_interval=1, delay_evaluation=10)`.
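As a rough illustration of what these settings mean, the sketch below re-implements the slack-factor rule in plain Python. It assumes the rule as described in the Azure ML documentation for a maximized metric: a run is cancelled when its metric falls below `best_metric / (1 + slack_factor)`, and the policy is first applied at interval `delay_evaluation`. The function names are hypothetical, and this is not the service's actual implementation.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric a run may report and still continue (goal: maximize)."""
    return best_metric / (1 + slack_factor)


def should_terminate(run_metric, best_metric, interval,
                     slack_factor=0.15, evaluation_interval=1,
                     delay_evaluation=10):
    """Return True if the sketched policy would cancel the run."""
    # The policy is not applied before the delay has passed.
    if interval < delay_evaluation:
        return False
    # The policy is applied only every `evaluation_interval` intervals.
    if interval % evaluation_interval != 0:
        return False
    return run_metric < bandit_cutoff(best_metric, slack_factor)


best = 0.95
print(round(bandit_cutoff(best, 0.15), 4))
print(should_terminate(0.80, best, interval=12))  # below the cutoff
print(should_terminate(0.85, best, interval=12))  # within the slack
print(should_terminate(0.50, best, interval=3))   # still inside the delay
```

A larger `slack_factor` tolerates runs that trail the best one by more, at the cost of spending compute on trials that are unlikely to win.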


In [79]:
from azure.ai.ml import command, Input
from azure.ai.ml.sweep import Choice, Uniform, MedianStoppingPolicy

early_termination_policy = BanditPolicy(slack_factor=0.15, evaluation_interval=1, delay_evaluation=10)
command_job_for_sweep = command(
    code="./",
    command="python train.py --dataset_name ${{inputs.dataset_name}} --C ${{inputs.C}} --max_iter ${{inputs.max_iter}}",
    environment=Environment(image="mcr.microsoft.com/azureml/curated/responsibleai-ubuntu20.04-py38-cpu:38"),
    inputs={
        "dataset_name": "IRIS-Dataset",
        # Define the search space for the hyperparameters
        "C": Uniform(min_value=0.01, max_value=1),
        "max_iter": Choice(values=[50, 100, 200]),
    },
    compute="cpu-cluster",
)

sweep_job = command_job_for_sweep.sweep(
    compute=amlcompute_cluster_name, 
    sampling_algorithm="random",
    primary_metric="Accuracy",
    goal="Maximize",
    early_termination_policy=early_termination_policy
)

sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=10, timeout=7200)
# Specify your experiment details
sweep_job.display_name = "hp-iris"
sweep_job.experiment_name = "hp-iris"
sweep_job.description = "Run a hyperparameter sweep job on Iris dataset."


# submit the sweep


In [51]:
#TODO: Submit your experiment
returned_sweep_job = ML_CLIENT.create_or_update(sweep_job)

Uploading odl_user_247476 (0.5 MBs): 100%|██████████| 497304/497304 [00:00<00:00, 3416504.22it/s]




In [52]:
ML_CLIENT.jobs.stream(returned_sweep_job.name)

RunId: olive_library_qpm1j0g79c
Web View: https://ml.azure.com/runs/olive_library_qpm1j0g79c?wsid=/subscriptions/cdbe0b43-92a0-4715-838a-f2648cc7ad21/resourcegroups/aml-quickstarts-247476/workspaces/quick-starts-ws-247476

Streaming azureml-logs/hyperdrive.txt

[2023-12-31T11:00:20.076742][GENERATOR][INFO]Trying to sample '10' jobs from the hyperparameter space
[2023-12-31T11:00:20.5389318Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_0' 
[2023-12-31T11:00:20.7782580Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_2' 
[2023-12-31T11:00:20.7770264Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_1' 
[2023-12-31T11:00:20.8918988Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_3' 
[2023-12-31T11:00:21.0252133Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_4' 
[2023-12-31T11:00:21.1242149Z][SCHEDULER][INFO]Scheduling job, id='olive_library_qpm1j0g79c_5' 
[2023-12-31T11:00:21.2348949Z][SCHEDULER][INFO]Successfully 

## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

**I'm running the experiment as an SDK v2 sweep job, which does not support the `RunDetails` widget.**

## Best Model

TODO: In the cell below, get the best model from the hyperdrive experiments and display all the properties of the model.

## Model Deployment

The model was deployed successfully, as the screenshots below show.

![deployed](../images/hp-model-deployed.png)


![deployed](../images/hp-consume.png)

Remember you have to deploy only one of the two models you trained but you still need to register both the models. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

### Getting the best model

The code below downloads the best run's model to local files. These files are then used for deployment.

In [53]:
# Download best model to local
DEPLOY_PATH = "hp-deploy"
os.makedirs(DEPLOY_PATH, exist_ok=True)
MODEL_PATH = MODEL_NAME = "model"
ML_CLIENT.jobs.download(name=returned_sweep_job.name, download_path=DEPLOY_PATH, all=True)

Downloading artifact azureml://subscriptions/cdbe0b43-92a0-4715-838a-f2648cc7ad21/resourcegroups/aml-quickstarts-247476/workspaces/quick-starts-ws-247476/datastores/workspaceartifactstore/paths/ExperimentRun/dcid.olive_library_qpm1j0g79c_6/model to hp-deploy/named-outputs/mlflow_log_model_1902316930
Downloading artifact azureml://subscriptions/cdbe0b43-92a0-4715-838a-f2648cc7ad21/resourcegroups/aml-quickstarts-247476/workspaces/quick-starts-ws-247476/datastores/workspaceartifactstore/paths/ExperimentRun/dcid.olive_library_qpm1j0g79c_6/hp to hp-deploy/named-outputs/mlflow_log_model_806180258
Downloading artifact azureml://datastores/workspaceartifactstore/ExperimentRun/dcid.olive_library_qpm1j0g79c_6 to hp-deploy/artifacts
Downloading artifact azureml://datastores/workspaceartifactstore/ExperimentRun/dcid.olive_library_qpm1j0g79c_6 to hp-deploy/hd-artifacts
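The download writes to several folders, so it can help to check which of them actually holds an MLflow model before registering it. An MLflow model folder is marked by an `MLmodel` file at its root. The sketch below searches a directory tree for such folders; `find_mlflow_models` is a hypothetical helper, and the demo runs against a temporary directory that only imitates the layout shown in the log above.

```python
import tempfile
from pathlib import Path


def find_mlflow_models(root):
    """Return every directory under `root` that contains an MLmodel file."""
    return sorted(p.parent for p in Path(root).rglob("MLmodel") if p.is_file())


# Demo on a throwaway directory that imitates the download layout.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "artifacts" / "model"
    model_dir.mkdir(parents=True)
    (model_dir / "MLmodel").write_text("flavors: {}\n")
    (Path(tmp) / "hd-artifacts").mkdir()

    found = find_mlflow_models(tmp)
    relative = [p.relative_to(tmp).as_posix() for p in found]
    print(relative)
```

In this notebook the same check would be run as `find_mlflow_models(DEPLOY_PATH)` after the download has finished.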


### Best model screenshot
![Best model](../images/hp-best-model.png)

### Attach the endpoint and model to workspace

* Register the endpoint
* Register the model

In [54]:
from azure.ai.ml.entities import ManagedOnlineEndpoint
import datetime
endpoint_name = f"hp-{datetime.datetime.now().strftime('%m%d%H%M%f')}"
ENDPOINT = ManagedOnlineEndpoint(
    name=endpoint_name
)
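The timestamp suffix keeps the endpoint name unique between runs. The sketch below wraps the same format string in a helper and validates the result. The validation pattern is an assumption (3 to 32 characters, starting with a letter, containing only letters, digits and hyphens, ending with a letter or digit) and should be checked against the current Azure ML documentation; `make_endpoint_name` and `ENDPOINT_NAME_RE` are hypothetical names.

```python
import datetime
import re

# Assumed naming rule, not taken from this notebook: verify before relying on it.
ENDPOINT_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9-]{1,30}[A-Za-z0-9]$")


def make_endpoint_name(prefix="hp", now=None):
    """Build '<prefix>-<month day hour minute microsecond>' and validate it."""
    now = now or datetime.datetime.now()
    name = f"{prefix}-{now.strftime('%m%d%H%M%f')}"
    if not ENDPOINT_NAME_RE.match(name):
        raise ValueError(f"invalid endpoint name: {name!r}")
    return name


# Reproduces the name that appears in this notebook's output.
fixed = datetime.datetime(2023, 12, 31, 11, 5, 0, 866892)
print(make_endpoint_name(now=fixed))  # hp-12311105866892
```

Validating the name locally surfaces a bad prefix before the create request is sent.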

In [55]:
# Register the endpoint
endpoint = ML_CLIENT.begin_create_or_update(ENDPOINT)

In [57]:
import time
time.sleep(2)
print("Waiting endpoint deployed...")
status = ML_CLIENT.online_endpoints.get(endpoint_name).provisioning_state
while status == 'Creating':
    time.sleep(1)
    status = ML_CLIENT.online_endpoints.get(endpoint_name).provisioning_state

if (status == "Succeeded"):
    print(f"Deployed endpoint {endpoint_name} success. Status: {status}")
else:
    print(f"Deployed endpoint {endpoint_name} failed. Status: {status}")

Waiting endpoint deployed...
Deployed endpoint hp-12311105866892 success. Status: Succeeded
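The polling loop above keeps running for as long as the state stays `Creating`. A minimal sketch of the same idea with a timeout is shown below. `wait_while` is a hypothetical helper, and the demo uses a fake state source instead of a real endpoint so that it runs without Azure access.

```python
import time


def wait_while(get_state, pending_states, timeout_s=600.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it leaves pending_states or the timeout expires."""
    deadline = clock() + timeout_s
    state = get_state()
    while state in pending_states:
        if clock() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout_s} seconds")
        sleep(interval_s)
        state = get_state()
    return state


# Fake state source standing in for the provisioning_state lookup.
states = iter(["Creating", "Creating", "Succeeded"])
final = wait_while(lambda: next(states), {"Creating", "Updating"}, interval_s=0.0)
print(final)  # Succeeded
```

In this notebook, `get_state` would be `lambda: ML_CLIENT.online_endpoints.get(endpoint_name).provisioning_state`. Since `begin_create_or_update` returns an `LROPoller`, calling `.result()` on the returned poller is another way to block until the operation finishes.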


In [None]:

model = Model(
        path=f"{DEPLOY_PATH}/artifacts/{MODEL_PATH}",
        name="hp-model-2",
        type=AssetTypes.MLFLOW_MODEL
    )
ML_CLIENT.models.create_or_update(model) 

Uploading model (0.0 MBs): 100%|██████████| 2183/2183 [00:00<00:00, 60471.06it/s]




Model({'job_name': None, 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'hp-model-1', 'description': None, 'tags': {}, 'properties': {}, 'print_as_yaml': True, 'id': '/subscriptions/cdbe0b43-92a0-4715-838a-f2648cc7ad21/resourceGroups/aml-quickstarts-247476/providers/Microsoft.MachineLearningServices/workspaces/quick-starts-ws-247476/models/hp-model-1/versions/1', 'Resource__source_path': None, 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/notebook247476/code/Users/odl_user_247476', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7f99a4212b50>, 'serialize': <msrest.serialization.Serializer object at 0x7f99a4212610>, 'version': '1', 'latest_version': None, 'path': 'azureml://subscriptions/cdbe0b43-92a0-4715-838a-f2648cc7ad21/resourceGroups/aml-quickstarts-247476/workspaces/quick-starts-ws-247476/datastores/workspaceblobstore/paths/LocalUpload/93f6bbbce0ff7f96922faa92

### Create the Deployment object

In [None]:
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration
deployment_name = "snow-wolf"
DEPLOYMENT = ManagedOnlineDeployment(
    name=deployment_name,
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1
)

ML_CLIENT.begin_create_or_update(DEPLOYMENT)

Check: endpoint hp-12311105866892 exists


<azure.core.polling._poller.LROPoller at 0x7f998c9dcd00>

...

In [None]:
with open("log.log", "w") as f:

    f.write(ML_CLIENT.online_deployments.get_logs(deployment_name, endpoint_name, 50))

In [None]:
import time
print("Waiting deployment deploying...")
status = ML_CLIENT.online_deployments.get(deployment_name, endpoint_name).provisioning_state
while status == 'Updating':
    time.sleep(1)
    status = ML_CLIENT.online_deployments.get(deployment_name, endpoint_name).provisioning_state

if (status == "Succeeded"):
    print(f"Deployed deployment {deployment_name} success. Status: {status}")
else:
    print(f"Deployed deployment {deployment_name} failed. Status: {status}")

Waiting deployment deploying...
..............................................................................................Deployed deployment snow-wolf success. Status: Succeeded
.

In [133]:
ML_CLIENT.online_deployments.begin_delete(deployment_name, endpoint_name)
ML_CLIENT.online_endpoints.begin_delete(endpoint_name)

<azure.core.polling._poller.LROPoller at 0x7ff86498b490>

..

TODO: In the cell below, send a request to the web service you deployed to test it.

In [72]:


def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True) # this line is needed if you use self-signed certificate in your scoring service.

# Request data goes here
# The example below assumes JSON formatting which may be updated
# depending on the format your endpoint expects.
# More information can be found here:
# https://docs.microsoft.com/azure/machine-learning/how-to-deploy-advanced-entry-script
data =  {
  "input_data": {
    "columns": [
      "SepalLengthCm",
      "SepalWidthCm",
      "PetalLengthCm",
      "PetalWidthCm"
    ],
    "index": [1],
    "data": [[2, 3, 4, 5]]
  },
  "params": {}
}

body = str.encode(json.dumps(data))

url = ML_CLIENT.online_endpoints.get(endpoint_name).scoring_uri
api_key = ML_CLIENT.online_endpoints.get_keys(endpoint_name)

headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key.primary_key), 'azureml-model-deployment': 'snow-wolf' }
req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)

    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))

    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))

b'[2]'
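The request body and the raw response can be handled with two small helpers, sketched below. `build_payload` and `parse_response` are hypothetical names; the column names and payload layout are copied from the request above, while the zero-based `index` values are an assumption made for the example.

```python
import json

COLUMNS = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]


def build_payload(rows):
    """Wrap feature rows in the input_data layout used by the request above."""
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got {len(row)}")
    return {
        "input_data": {
            "columns": COLUMNS,
            "index": list(range(len(rows))),  # assumed: zero-based row labels
            "data": [list(row) for row in rows],
        },
        "params": {},
    }


def parse_response(raw):
    """Decode the raw bytes returned by the endpoint into Python objects."""
    return json.loads(raw.decode("utf-8"))


body = json.dumps(build_payload([[2, 3, 4, 5], [5.1, 3.5, 1.4, 0.2]])).encode("utf-8")
print(parse_response(b"[2]"))  # [2]
```

The parsed value is a list with one predicted class index per input row. How an index maps to a species name depends on the label encoding used in train.py.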


TODO: In the cell below, print the logs of the web service and delete the service

In [None]:
# Get the logs of the deployment created above
ML_CLIENT.online_deployments.get_logs(name=deployment_name, endpoint_name=endpoint_name, lines=1000)

In [None]:
# Delete the service
ML_CLIENT.online_endpoints.begin_delete(name=endpoint_name)

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a webservice.
- I have tested the webservice by sending a request to the model endpoint.
- I have deleted the webservice and shutdown all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.
