# Hyperparameter Tuning using HyperDrive

Import the dependencies used throughout this notebook.

In [48]:
from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.data.dataset_factory import TabularDatasetFactory as tdf
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.widgets import RunDetails
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.policy import BanditPolicy, MedianStoppingPolicy
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.parameter_expressions import uniform, choice
import os
import joblib

## Set Up

### Load Workspace Elements, Create an Experiment

In [19]:
ws = Workspace.from_config()

# choose a name for experiment
experiment_name = 'MS-Malware-Hyper'
project_folder = '.'

experiment=Experiment(ws, experiment_name)

print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Workspace name: quick-starts-ws-133708
Azure region: southcentralus
Subscription id: 5a4ab2ba-6c51-4805-8155-58759ad589d8
Resource group: aml-quickstarts-133708


### Create/Get Compute Cluster

In [8]:
cpu_cluster_name = "malware-compute"

# if cluster already exists, use it
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Cluster {} exists. Will use this cluster.'.format(cpu_cluster_name))
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)
    
    cpu_cluster.wait_for_completion(show_output=True)

Cluster malware-compute exists. Will use this cluster.


### Create an Environment

In [16]:
%%writefile hyperdrive_dependencies.yml

dependencies:
- python=3.6.2
- scikit-learn
- pandas 
- numpy
- pip:
    - azureml-defaults
    - xgboost

Overwriting hyperdrive_dependencies.yml


In [17]:
hyperdrive_env = Environment.from_conda_specification(name = 'hyperdrive-env', file_path = './hyperdrive_dependencies.yml')

## Data Set

See the `automl.ipynb` notebook for a description of the data set.

In [12]:
# Load Data set
data_path = 'https://raw.githubusercontent.com/tybyers/AZMLND_projects/capstone/capstone/data/train_1_10k.csv'
dataset = tdf.from_delimited_files(path=data_path)
dataset.to_pandas_dataframe().head()

Unnamed: 0,ProductName,EngineVersion,AppVersion,AvSigVersion,IsBeta,RtpStateBitfield,IsSxsPassiveMode,DefaultBrowsersIdentifier,AVProductStatesIdentifier,AVProductsInstalled,...,Census_FirmwareVersionIdentifier,Census_IsSecureBootEnabled,Census_IsWIMBootEnabled,Census_IsVirtualDevice,Census_IsTouchEnabled,Census_IsPenCapable,Census_IsAlwaysOnAlwaysConnectedCapable,Wdft_IsGamer,Wdft_RegionIdentifier,HasDetections
0,win8defender,1.1.15100.1,4.18.1807.18075,1.273.1735.0,0,7,0,,53447.0,1.0,...,36144,0,,0,0,0,0.0,0,10,0
1,win8defender,1.1.14600.4,4.13.17134.1,1.263.48.0,0,7,0,,53447.0,1.0,...,57858,0,,0,0,0,0.0,0,8,0
2,win8defender,1.1.15100.1,4.18.1807.18075,1.273.1341.0,0,7,0,,53447.0,1.0,...,52682,0,,0,0,0,0.0,0,3,0
3,win8defender,1.1.15100.1,4.18.1807.18075,1.273.1527.0,0,7,0,,53447.0,1.0,...,20050,0,,0,0,0,0.0,0,3,1
4,win8defender,1.1.15100.1,4.18.1807.18075,1.273.1379.0,0,7,0,,53447.0,1.0,...,19844,0,0.0,0,0,0,0.0,0,1,1


## Hyperdrive Configuration

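The training script `xgbtrain.py` is tuned over three hyperparameters, each passed as a command-line argument: `--max_depth` (integers 2 through 10), `--n_estimators` (one of 25, 50, 100, 250, 500, 750, or 1000), and `--learning_rate` (uniform between 0 and 1.0). Values are drawn by random sampling. A Bandit early termination policy (`evaluation_interval=2`, `slack_factor=0.1`) stops runs that fall too far behind the best run so far. The primary metric is `Accuracy`, which is maximized, with at most 100 total runs and 4 running concurrently.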

In [35]:
# Early termination policy (not required when using Bayesian sampling)
early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Hyperparameter search space, sampled at random
param_sampling = RandomParameterSampling({'--max_depth': choice(range(2, 11)),
                                          '--n_estimators': choice(25, 50, 100, 250, 500, 750, 1000),
                                          '--learning_rate': uniform(0, 1.0)})

# Training script, compute target, and environment used by every child run
src = ScriptRunConfig(source_directory=project_folder,
                      script='xgbtrain.py',
                      compute_target=cpu_cluster,
                      environment=hyperdrive_env)

hyperdrive_config = HyperDriveConfig(run_config=src,
                                    hyperparameter_sampling=param_sampling,
                                    policy=early_termination_policy,
                                    primary_metric_name='Accuracy',
                                    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                    max_total_runs=100,
                                    max_concurrent_runs=4)
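
To make the effect of `slack_factor=0.1` concrete, here is a minimal sketch of the Bandit rule for a maximized metric, as we understand it from the Azure ML documentation: a run is cancelled when its metric, scaled up by the slack factor, is still below the best metric reported so far. The function name `bandit_should_terminate` and the sample accuracies are hypothetical, and the sketch ignores `evaluation_interval`; it is not the service's implementation.

```python
def bandit_should_terminate(run_metric, best_metric, slack_factor):
    """Return True if a run falls outside the allowed slack around the best run.

    Assumes a positive primary metric that is being maximized (as with Accuracy here).
    """
    return run_metric * (1 + slack_factor) < best_metric


# Hypothetical accuracies compared against a best-so-far value of 0.62
best_so_far = 0.62
for accuracy in (0.60, 0.57, 0.55):
    decision = bandit_should_terminate(accuracy, best_so_far, slack_factor=0.1)
    print(accuracy, "terminate" if decision else "continue")
```

With a slack factor of 0.1 and a best accuracy of 0.62, only runs below roughly 0.564 would be cancelled under this rule, so the policy is fairly lenient for a metric in this range.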

In [36]:
# Submit the HyperDrive experiment
hyperdrive_run = experiment.submit(hyperdrive_config)

## Run Details

The `RunDetails` widget below shows the progress and metrics of the individual child runs.

In [37]:
RunDetails(hyperdrive_run).show()

_HyperDriveWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO'…

In [38]:
hyperdrive_run.wait_for_completion(show_output=True)

RunId: HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3
Web View: https://ml.azure.com/experiments/MS-Malware-Hyper/runs/HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3?wsid=/subscriptions/5a4ab2ba-6c51-4805-8155-58759ad589d8/resourcegroups/aml-quickstarts-133708/workspaces/quick-starts-ws-133708

Streaming azureml-logs/hyperdrive.txt

<START>[2021-01-06T21:47:11.557952][API][INFO]Experiment created<END>
<START>[2021-01-06T21:47:12.063006][GENERATOR][INFO]Trying to sample '4' jobs from the hyperparameter space<END>
<START>[2021-01-06T21:47:12.7808343Z][SCHEDULER][INFO]The execution environment is being prepared. Please be patient as it can take a few minutes.<END>
<START>[2021-01-06T21:47:12.390285][GENERATOR][INFO]Successfully sampled '4' jobs, they will soon be submitted to the execution target.<END>
<START>[2021-01-06T21:47:44.3026302Z][SCHEDULER][INFO]Scheduling job, id='HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3_0'<END>
<START>[2021-01-06T21:47:44.3131622Z][SCHEDULER][INFO]Scheduling job, id='

{'runId': 'HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3',
 'target': 'malware-compute',
 'status': 'Completed',
 'startTimeUtc': '2021-01-06T21:47:11.284317Z',
 'endTimeUtc': '2021-01-06T22:51:54.183378Z',
 'properties': {'primary_metric_config': '{"name": "Accuracy", "goal": "maximize"}',
  'resume_from': 'null',
  'runTemplate': 'HyperDrive',
  'azureml.runsource': 'hyperdrive',
  'platform': 'AML',
  'ContentSnapshotId': '6ab9c927-95d9-4eef-833e-aba276f792bd',
  'score': '0.6242000000000001',
  'best_child_run_id': 'HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3_54',
  'best_metric_status': 'Succeeded'},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/hyperdrive.txt': 'https://mlstrg133708.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_0396aa6e-9da7-4b82-8028-9ba4be1fb1e3/azureml-logs/hyperdrive.txt?sv=2019-02-02&sr=b&sig=reyGCQvSEUu6p0CDXr9kfkjeeVHlLfi01ocQfk1rUTA%3D&st=2021-01-06T22%3A42%3A03Z&se=2021-01-07T06%3A52%3A03Z&sp=r'}}

In [39]:
assert hyperdrive_run.get_status() == "Completed"

## Best Model

The cell below retrieves the best run from the HyperDrive experiment and displays its metrics and properties.

In [41]:
best_run = hyperdrive_run.get_best_run_by_primary_metric()
best_run.get_metrics()

{'max_depth:': 2,
 'n_estimators:': 100,
 'learning_rate:': 0.112330951408389,
 'Accuracy': 0.6242000000000001}

In [42]:
best_run.properties

{'_azureml.ComputeTargetType': 'amlcompute',
 'ContentSnapshotId': '6ab9c927-95d9-4eef-833e-aba276f792bd',
 'ProcessInfoFile': 'azureml-logs/process_info.json',
 'ProcessStatusFile': 'azureml-logs/process_status.json'}

Save the best model

In [45]:
print(best_run.get_details()['runDefinition'])

{'script': 'xgbtrain.py', 'command': '', 'useAbsolutePath': False, 'arguments': ['--learning_rate', '0.112330951408389', '--max_depth', '2', '--n_estimators', '100'], 'sourceDirectoryDataStore': None, 'framework': 'Python', 'communicator': 'None', 'target': 'malware-compute', 'dataReferences': {}, 'data': {}, 'outputData': {}, 'jobName': None, 'maxRunDurationSeconds': 2592000, 'nodeCount': 1, 'priority': None, 'credentialPassthrough': False, 'environment': {'name': 'hyperdrive-env', 'version': 'Autosave_2021-01-06T21:31:18Z_0dc07082', 'python': {'interpreterPath': 'python', 'userManagedDependencies': False, 'condaDependencies': {'dependencies': ['python=3.6.2', 'scikit-learn', 'pandas', 'numpy', {'pip': ['azureml-defaults', 'xgboost']}], 'name': 'azureml_523fb8d0d2520274d0060eaa29065899'}, 'baseCondaEnvironment': None}, 'environmentVariables': {'EXAMPLE_ENV_VAR': 'EXAMPLE_VALUE'}, 'docker': {'baseImage': 'mcr.microsoft.com/azureml/intelmpi2018.3-ubuntu16.04:20200821.v1', 'platform': {'
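
The `arguments` list in the run definition above shows how HyperDrive hands the sampled values to the training script: as plain command-line flags with string values. The script `xgbtrain.py` itself is not shown in this notebook, so the following is only a sketch of how such a script might parse those flags with `argparse`. The function name `parse_args` and the default values are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the three tuned hyperparameters from command-line flags."""
    parser = argparse.ArgumentParser(
        description="Hypothetical argument parsing for a script like xgbtrain.py")
    # Defaults below are illustrative assumptions, not values from the original script
    parser.add_argument("--max_depth", type=int, default=3)
    parser.add_argument("--n_estimators", type=int, default=100)
    parser.add_argument("--learning_rate", type=float, default=0.1)
    return parser.parse_args(argv)


# Arguments as they appear in the best run's definition
args = parse_args(["--learning_rate", "0.112330951408389",
                   "--max_depth", "2",
                   "--n_estimators", "100"])
print(args.max_depth, args.n_estimators, args.learning_rate)
```

The flag names must match the keys of the sampling dictionary exactly (including the leading `--`), since HyperDrive passes those keys through unchanged.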

In [46]:
print(best_run.get_file_names())

['azureml-logs/55_azureml-execution-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt', 'azureml-logs/65_job_prep-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt', 'azureml-logs/70_driver_log.txt', 'azureml-logs/75_job_post-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt', 'azureml-logs/process_info.json', 'azureml-logs/process_status.json', 'logs/azureml/102_azureml.log', 'logs/azureml/dataprep/backgroundProcess.log', 'logs/azureml/dataprep/backgroundProcess_Telemetry.log', 'logs/azureml/dataprep/engine_spans_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl', 'logs/azureml/dataprep/python_span_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl', 'logs/azureml/job_prep_azureml.log', 'logs/azureml/job_release_azureml.log']


In [47]:
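# Note: register_model requires model_path to exist among the run's files.
# The file list printed above contains no 'outputs/' entries, so this call
# fails with a UserErrorException until the training script saves the model there.
# (The error output below names 'outputs/hd_best.pkl', apparently from an
# earlier version of this cell.)
model = best_run.register_model(model_name='best_hyperdrive', model_path='outputs/model.joblib')
print(model.name, model.id, model.version, sep='\t')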

UserErrorException: UserErrorException:
	Message: File with path outputs/hd_best.pkl was not found,
available files include: azureml-logs/55_azureml-execution-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/65_job_prep-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/70_driver_log.txt,azureml-logs/75_job_post-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/process_info.json,azureml-logs/process_status.json,logs/azureml/102_azureml.log,logs/azureml/dataprep/backgroundProcess.log,logs/azureml/dataprep/backgroundProcess_Telemetry.log,logs/azureml/dataprep/engine_spans_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl,logs/azureml/dataprep/python_span_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl,logs/azureml/job_prep_azureml.log,logs/azureml/job_release_azureml.log.
	InnerException None
	ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "File with path outputs/hd_best.pkl was not found,\navailable files include: azureml-logs/55_azureml-execution-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/65_job_prep-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/70_driver_log.txt,azureml-logs/75_job_post-tvmps_05cfea2c292899d9542ec34aaf13dfbf40798f53bbcdcd0c67822180782073e8_d.txt,azureml-logs/process_info.json,azureml-logs/process_status.json,logs/azureml/102_azureml.log,logs/azureml/dataprep/backgroundProcess.log,logs/azureml/dataprep/backgroundProcess_Telemetry.log,logs/azureml/dataprep/engine_spans_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl,logs/azureml/dataprep/python_span_l_9e4c46ac-573f-4634-b98d-a24daa38a30a.jsonl,logs/azureml/job_prep_azureml.log,logs/azureml/job_release_azureml.log."
    }
}
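
The error above comes down to the requested path not being among the files the run uploaded. One way to catch this before calling `register_model` is to inspect the list returned by `best_run.get_file_names()`. The sketch below uses a shortened, hard-coded copy of that list so it runs without a workspace; the helper name `files_under` is hypothetical.

```python
def files_under(file_names, prefix="outputs/"):
    """Return the run files whose path starts with the given prefix."""
    return [name for name in file_names if name.startswith(prefix)]


# Shortened stand-in for best_run.get_file_names(); in the notebook, use the real call
file_names = [
    "azureml-logs/70_driver_log.txt",
    "azureml-logs/process_info.json",
    "azureml-logs/process_status.json",
    "logs/azureml/102_azureml.log",
]

model_path = "outputs/model.joblib"
if model_path in file_names:
    print("Model file found; safe to register:", model_path)
else:
    print("Model file missing. Files under outputs/:", files_under(file_names))
```

An empty `outputs/` listing, as here, suggests the training script never wrote a model file to the `outputs` folder, so the fix would belong in the script rather than in the registration call.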

## Model Deployment

We opted to deploy the AutoML model because it had a higher accuracy: 0.63, compared to the best accuracy here of 0.62.