# Hyperparameter Tuning using HyperDrive

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [169]:
from azureml.core import Workspace, Experiment
from azureml.core import Model 
from azureml.train.automl import AutoMLConfig
from azureml.widgets import RunDetails
from azureml.core.compute import ComputeTarget, AmlCompute
from pprint import pprint 
import joblib
import os

In [170]:
%%writefile train.py
from sklearn.linear_model import LogisticRegression
import sklearn.datasets
import argparse
import os
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense
import pandas as pd
from azureml.core.run import Run
from azureml.data.dataset_factory import TabularDatasetFactory


def make_data():
    data  = sklearn.datasets.load_boston()
    x = data.data
    y = data.target
    return x, y

def main():
    # Add arguments to script
    parser = argparse.ArgumentParser()
    parser.add_argument('--D', type=int, default=20, help="Inverse of regularization strength. Smaller values cause stronger regularization")
    parser.add_argument('--learningrate', type=float, default=0.001, help="Maximum number of iterations to converge")
    args = parser.parse_args()

    run = Run.get_context()
    run.log("Dense Number:", np.float(args.D))
    run.log("Learning Rate:", np.int(args.learningrate))
    x, y = make_data()

    #Split data into train and test sets.
    x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.3)

    #make model
    model = Sequential()
    model.add(Dense(args.D, input_dim=13))
    model.add(Dense(args.D, activation='relu'))
    model.add(Dense(1))
    model.compile(loss="mean_absolute_error", optimizer=Adam(lr=args.learningrate))
    model.summary()
    validation_split_rate=0.2
    history=model.fit(x_train,y_train,batch_size=16,epochs=100,validation_data=(x_test,y_test))

    loss = model.evaluate(x_test, y_test)
    run.log("Loss", np.float(loss))
    import joblib
    # insert this after fitting the model
    # create an output folder
    os.makedirs('outputs', exist_ok=True)
    model.save('outputs/model.h5')

if __name__ == '__main__':
    main()


Overwriting train.py


## Dataset

TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

In [171]:
import sklearn.datasets
from sklearn.model_selection import train_test_split
import random
import pandas as pd
data  = sklearn.datasets.load_boston()
df_X = pd.DataFrame(data.data, columns=data.feature_names)
df_Y = pd.DataFrame(data.target, columns=['PRICE'])
print(df_X)
print(df_Y)

        CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
0    0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0   
1    0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0   
2    0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0   
3    0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0   
4    0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0   
..       ...   ...    ...   ...    ...    ...   ...     ...  ...    ...   
501  0.06263   0.0  11.93   0.0  0.573  6.593  69.1  2.4786  1.0  273.0   
502  0.04527   0.0  11.93   0.0  0.573  6.120  76.7  2.2875  1.0  273.0   
503  0.06076   0.0  11.93   0.0  0.573  6.976  91.0  2.1675  1.0  273.0   
504  0.10959   0.0  11.93   0.0  0.573  6.794  89.3  2.3889  1.0  273.0   
505  0.04741   0.0  11.93   0.0  0.573  6.030  80.8  2.5050  1.0  273.0   

     PTRATIO       B  LSTAT  
0       15.3  396.90   4.98  
1       17.8  396.90   9.14  
2       1

In [172]:
ws = Workspace.from_config()
experiment_name = 'exp_hyperdrive'
experiment=Experiment(ws, experiment_name)
run = experiment.start_logging()

In [173]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
cluster_name = "udacity-cluster"

#: Create compute cluster
# Use vm_size = "Standard_D2_V2" in your provisioning configuration.
# max_nodes should be no greater than 4.

### YOUR CODE HERE ###
vm_size = "Standard_D2_V2"
# Check if the cluster already exists, and create it if it doesn't
try:
    compute_cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print("Using existing compute cluster.")
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size=vm_size, max_nodes=4)
    compute_cluster = ComputeTarget.create(ws, cluster_name, compute_config)

compute_cluster.wait_for_completion(show_output=True)

InProgress.
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


In [174]:
%%writefile conda_dependencies.yml
dependencies:
  - python=3.6.2
  - tensorflow==2.0.0
  - scikit-learn
  - numpy
  - pandas
  - pip:
    - azureml-defaults

Overwriting conda_dependencies.yml


## Hyperdrive Configuration

TODO: Explain the model you are using and the reason for chosing the different hyperparameters, termination policy and config settings.

In this project, MLP model is used for prediction.
And I swing the two parameters that is number of dense matrix and learning raet.
This is because the two parameters is affect to the accuracy of model.

And I set early stopping policy strictly since this is not real project and I want to reduce learing cost.

This target metrix of this project is loss of test data.
The hypderdrive would try to minimize this.

In [175]:
from azureml.widgets import RunDetails
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.policy import BanditPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import choice, uniform
from azureml.core import Environment, ScriptRunConfig
import os
# TODO: Create an early termination policy. This is not required if you are using Bayesian sampling.
early_termination_policy = BanditPolicy(slack_factor = 0.1, evaluation_interval = 1, delay_evaluation= 5 )

if "training" not in os.listdir():
    os.mkdir("./training")

#TODO: Create the different params that you will be using during training
param_sampling = RandomParameterSampling(parameter_space={
    '--D': choice(10, 20, 50, 100),
    '--learningrate' : choice(0.01,0.001,0.0001)})

# Setup environment for your training run
tf_env = Environment.from_conda_specification(name='tf-env', file_path='conda_dependencies.yml')

#TODO: Create your estimator and hyperdrive config
estimator = ScriptRunConfig(source_directory=".",
                      script="train.py",
                      compute_target=cluster_name,
                      environment=tf_env)

hyperdrive_run_config = HyperDriveConfig(
    run_config = estimator, 
    hyperparameter_sampling = param_sampling,
    policy = early_termination_policy,
    primary_metric_name = "Loss",
    primary_metric_goal = PrimaryMetricGoal.MINIMIZE,
    max_total_runs = 6,
    max_concurrent_runs = 4
    )

In [176]:
#TODO: Submit your experiment
hyperdrive_run = experiment.submit(hyperdrive_run_config, show_output=True)

## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [177]:
RunDetails(hyperdrive_run).show()
hyperdrive_run.wait_for_completion(show_output=True)

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

RunId: HD_51552781-bb2d-4218-9203-c1a36be3f3d8
Web View: https://ml.azure.com/runs/HD_51552781-bb2d-4218-9203-c1a36be3f3d8?wsid=/subscriptions/8d1a13c1-dda4-4fdf-a927-e08a4213f4e3/resourcegroups/resoure-udacity/workspaces/ws-udacity&tid=71373672-464e-4725-990c-20c60c08821e

Streaming azureml-logs/hyperdrive.txt

[2023-07-07T05:31:59.917273][GENERATOR][INFO]Trying to sample '4' jobs from the hyperparameter space
[2023-07-07T05:32:00.5912200Z][SCHEDULER][INFO]Scheduling job, id='HD_51552781-bb2d-4218-9203-c1a36be3f3d8_0' 
[2023-07-07T05:32:00.7012997Z][SCHEDULER][INFO]Scheduling job, id='HD_51552781-bb2d-4218-9203-c1a36be3f3d8_1' 
[2023-07-07T05:32:00.8586529Z][SCHEDULER][INFO]Scheduling job, id='HD_51552781-bb2d-4218-9203-c1a36be3f3d8_2' 
[2023-07-07T05:32:01.0344302Z][SCHEDULER][INFO]Scheduling job, id='HD_51552781-bb2d-4218-9203-c1a36be3f3d8_3' 
[2023-07-07T05:32:01.0344734Z][SCHEDULER][INFO]Successfully scheduled a job. Id='HD_51552781-bb2d-4218-9203-c1a36be3f3d8_0' 
[2023-07-07T05:3

{'runId': 'HD_51552781-bb2d-4218-9203-c1a36be3f3d8',
 'target': 'udacity-cluster',
 'status': 'Completed',
 'startTimeUtc': '2023-07-07T05:31:58.961826Z',
 'endTimeUtc': '2023-07-07T05:38:04.953666Z',
 'services': {},
 'properties': {'primary_metric_config': '{"name":"Loss","goal":"minimize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': 'cb6a199c-4602-48e0-b40f-94c6ca428013',
  'user_agent': 'python/3.8.5 (Linux-5.15.0-1035-azure-x86_64-with-glibc2.10) msrest/0.7.1 Hyperdrive.Service/1.0.0 Hyperdrive.SDK/core.1.49.0',
  'space_size': '12',
  'score': '3.3897126850328947',
  'best_child_run_id': 'HD_51552781-bb2d-4218-9203-c1a36be3f3d8_0',
  'best_metric_status': 'Succeeded',
  'best_data_container_id': 'dcid.HD_51552781-bb2d-4218-9203-c1a36be3f3d8_0'},
 'inputDatasets': [],
 'outputDatasets': [],
 'runDefinition': {'configuration': None,
  'attribution': None,
  'telemetryValues': {'amlClientT

## Best Model

TODO: In the cell below, get the best model from the hyperdrive experiments and display all the properties of the model.

In [191]:
best_run = hyperdrive_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
parameter_values = best_run.get_details()['runDefinition']['arguments']
print(best_run_metrics)
print(parameter_values)
print(best_run)

{'Dense Number:': 20.0, 'Learning Rate:': 0, 'Loss': 3.3897126850328947}
['--D', '20', '--learningrate', '0.01']
Run(Experiment: exp_hyperdrive,
Id: HD_51552781-bb2d-4218-9203-c1a36be3f3d8_0,
Type: azureml.scriptrun,
Status: Completed)


In [181]:
best_run.get_file_names()
index=best_run.get_file_names().index('outputs/model.h5')
print(index)

1
Current provisioning state of AmlCompute is "Deleting"

Current provisioning state of AmlCompute is "Deleting"

Current provisioning state of AmlCompute is "Deleting"

Current provisioning state of AmlCompute is "Deleting"



In [185]:
#TODO: Save the best model
best_run.download_file(best_run.get_file_names()[index], output_file_path='./outputs/')
best_run.register_model(model_path='outputs/model.h5',model_name='hyperdrive_best')

Model(workspace=Workspace.create(name='ws-udacity', subscription_id='8d1a13c1-dda4-4fdf-a927-e08a4213f4e3', resource_group='resoure-udacity'), name=hyperdrive_best, id=hyperdrive_best:1, version=1, tags={}, properties={})

## Model Deployment

Remember you have to deploy only one of the two models you trained but you still need to register both the models. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

In [186]:
%%writefile score_boston.py
import json
import numpy as np
import os
from tensorflow.keras.models import load_model

def init():
    global model
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)
    # For multiple models, it points to the folder containing all deployed models (./azureml-models)
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.h5')
    model = load_model(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_pred = model.predict(data)
    # you can return any data type as long as it is JSON-serializable
    return json.dumps(str(y_pred[0]))

Overwriting score_boston.py


In [187]:
from azureml.core.webservice import AciWebservice
from azureml.core.model import Model
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=4, 
                                               description='Predict Boston House Price')

inference_config = InferenceConfig(entry_script="score_boston.py", environment=tf_env)
service_name = 'udacity-report3-hyp3'
model=ws.models['hyperdrivebest']

service = Model.deploy(workspace=ws, 
                       name=service_name, 
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aciconfig)
service.wait_for_deployment(show_output=True)

azureml.core.model:
To leverage new model deployment capabilities, AzureML recommends using CLI/SDK v2 to deploy models as online endpoint, 
please refer to respective documentations 
https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoints /
https://docs.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-anywhere 
For more information on migration, see https://aka.ms/acimoemigration 


Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2023-07-07 05:41:24+00:00 Creating Container Registry if not exists.
2023-07-07 05:41:24+00:00 Registering the environment.
2023-07-07 05:41:25+00:00 Use the existing image.
2023-07-07 05:41:26+00:00 Submitting deployment to compute.
2023-07-07 05:41:31+00:00 Checking the status of deployment udacity-report3-hyp3..
2023-07-07 05:43:34+00:00 Checking the status of inference endpoint udacity-report3-hyp3.
Succeeded
ACI service creation operation finished, operation "Succeeded"


TODO: In the cell below, send a request to the web service you deployed to test it.

In [189]:
import numpy as np
import requests
import json

data  = sklearn.datasets.load_boston()
x = data.data[0]

endpoint='http://a4a50760-3fc6-4bb9-9d04-cd48c291a476.japaneast.azurecontainer.io/score'
input_data=[x.tolist()]
headers = {'Content-Type':'application/json'}
input_json=json.dumps({"data":input_data})
req=requests.post(endpoint,input_json,headers=headers)
pred=json.loads(req.json())
print('predict='+pred)

predict=[33.91492]


TODO: In the cell below, print the logs of the web service and delete the service

In [190]:
print(service.get_logs())
service.delete()
compute_cluster.delete()

2023-07-07T05:43:19,490016791+00:00 - gunicorn/run 
2023-07-07T05:43:19,491194718+00:00 | gunicorn/run | 
2023-07-07T05:43:19,492369345+00:00 | gunicorn/run | ###############################################
2023-07-07T05:43:19,484958676+00:00 - rsyslog/run 
2023-07-07T05:43:19,497657665+00:00 | gunicorn/run | AzureML Container Runtime Information
2023-07-07T05:43:19,484753271+00:00 - iot-server/run 
2023-07-07T05:43:19,500222424+00:00 | gunicorn/run | ###############################################
2023-07-07T05:43:19,508604515+00:00 | gunicorn/run | 
2023-07-07T05:43:19,511593383+00:00 | gunicorn/run | 
2023-07-07T05:43:19,524263672+00:00 | gunicorn/run | AzureML image information: openmpi4.1.0-ubuntu20.04, Materializaton Build:20230120.v2
2023-07-07T05:43:19,526843630+00:00 | gunicorn/run | 
2023-07-07T05:43:19,532217753+00:00 | gunicorn/run | 
2023-07-07T05:43:19,532820267+00:00 - nginx/run 
2023-07-07T05:43:19,534219199+00:00 | gunicorn/run | PATH environment variable: /azureml-env

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a webservice.
- I have tested the webservice by sending a request to the model endpoint.
- I have deleted the webservice and shutdown all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.

