# Hyperparameter Tuning of QRNN Models with AML SDK and HyperDrive

This notebook performs hyperparameter tuning of QRNN models with the AML SDK and HyperDrive. It selects the best model by cross validation using the training data of the 6th forecast round. Specifically, it splits the training data into sub-training data and validation data. It then trains QRNN models with different sets of hyperparameters on the sub-training data and evaluates the pinball loss of each model on the validation data. The set of hyperparameters that yields the lowest cross validation pinball loss will be used to train models and forecast energy load across all 6 forecast rounds.
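
To make the selection criterion concrete, here is a minimal sketch of the pinball (quantile) loss and of picking the hyperparameter set with the lowest average loss on the validation data. It is not taken from the notebook's training script: the function names, the quantile levels, and the validation numbers are all hypothetical and serve only as an illustration.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Pinball (quantile) loss for one observation at one quantile level."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q)
    return quantile * diff if diff >= 0 else (quantile - 1) * diff


def average_pinball_loss(y_true, quantile_preds):
    """Average the pinball loss over all observations and quantile levels.

    quantile_preds maps a quantile level (e.g. 0.5) to a list of
    predictions aligned with y_true.
    """
    total, count = 0.0, 0
    for q, preds in quantile_preds.items():
        for y, f in zip(y_true, preds):
            total += pinball_loss(y, f, q)
            count += 1
    return total / count


# Hypothetical validation targets and quantile forecasts for two settings
y_val = [10.0, 12.0, 11.0]
candidates = {
    "n_hidden_1=5": {0.1: [8.0, 10.0, 9.0],
                     0.5: [10.0, 12.0, 11.0],
                     0.9: [12.0, 14.0, 13.0]},
    "n_hidden_1=10": {0.1: [5.0, 7.0, 6.0],
                      0.5: [9.0, 11.0, 10.0],
                      0.9: [15.0, 17.0, 16.0]},
}

# Score every candidate on the validation data and keep the lowest loss
scores = {name: average_pinball_loss(y_val, preds)
          for name, preds in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Because a lower pinball loss means better calibrated quantile forecasts, the selection uses `min`, which matches the `MINIMIZE` goal set for HyperDrive later in this notebook.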

## Prerequisites
To run this notebook, you need to install the AML SDK and its widget extension in your environment by running the following commands in a terminal. Before running the commands, activate your environment by executing `activate <your env>`, or `source activate <your env>` in a Linux VM.  
`pip3 install --upgrade azureml-sdk[notebooks,automl]`  
`jupyter nbextension install --py --user azureml.train.widgets`  
`jupyter nbextension enable --py --user azureml.train.widgets`  

To add the environment to your Jupyter kernels, run `python3 -m ipykernel install --name <your env>`. You also need to create an Azure ML workspace and its configuration file (`config.json`) by following the `00.configuration.ipynb` notebook.
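
For orientation, the workspace configuration file is a small JSON document. The sketch below prints the shape it typically has, assuming the three keys the SDK commonly writes (`subscription_id`, `resource_group`, `workspace_name`). The values are placeholders, and the file produced by the configuration notebook remains the authoritative version.

```python
import json

# Placeholder values; replace them with the details of your own workspace.
config = {
    "subscription_id": "<your subscription id>",
    "resource_group": "<your resource group>",
    "workspace_name": "<your workspace name>",
}

# Render the dictionary the way it would appear in config.json
config_text = json.dumps(config, indent=4)
print(config_text)
```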

In [None]:
import azureml
from azureml.core import Workspace, Run

# Check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

In [None]:
from azureml.telemetry import set_diagnostics_collection

# Opt in to diagnostics collection to help improve future releases
set_diagnostics_collection(send_diagnostics=True)

## Initialize Workspace & Create an Azure ML Experiment

Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` below creates a workspace object from the details stored in `config.json`.

In [None]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

In [None]:
from azureml.core import Experiment

exp = Experiment(workspace=ws, name='tune_qrnn')

## Validate Script Locally

In [None]:
from azureml.core.runconfig import RunConfiguration

# Configure local, user managed environment
run_config_user_managed = RunConfiguration()
run_config_user_managed.environment.python.user_managed_dependencies = True
run_config_user_managed.environment.python.interpreter_path = '/anaconda/envs/py36/python.exe'

In [None]:
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='./', 
                      script='aml_estimator.py', 
                      arguments=['--path', './data/features/train',
                                 '--cv_path', './',
                                 '--n_hidden_1', '5',
                                 '--n_hidden_2', '5',
                                 '--iter_max', '3',
                                 '--penalty', '0'],
                      run_config=run_config_user_managed)

run_local = exp.submit(src)
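
The entry script `aml_estimator.py` is not shown in this notebook. As a rough sketch of how such a script might read the flags passed above, the example below parses them with `argparse`. The flag names come from the cell above. The types and the `parse_args` helper are assumptions made for illustration, not the script's actual implementation.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags passed to the entry script (types are assumptions)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--path', type=str)        # location of the training data
    parser.add_argument('--cv_path', type=str)     # path passed as --cv_path
    parser.add_argument('--n_hidden_1', type=int)
    parser.add_argument('--n_hidden_2', type=int)
    parser.add_argument('--iter_max', type=int)
    parser.add_argument('--penalty', type=float)
    return parser.parse_args(argv)


# Same argument list as in the ScriptRunConfig above
args = parse_args(['--path', './data/features/train',
                   '--cv_path', './',
                   '--n_hidden_1', '5',
                   '--n_hidden_2', '5',
                   '--iter_max', '3',
                   '--penalty', '0'])
print(vars(args))
```

Values arrive as strings on the command line, which is why the local run passes `'5'` rather than `5`. The parser converts them to the declared types.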

In [None]:
# Check job status (e.g. 'Running', 'Completed' or 'Failed')
run_local.get_status()

In [None]:
# Check results
run_local.get_details()

In [None]:
run_local.get_metrics()

## Run Script on Batch AI Compute Target

### Create Batch AI cluster as compute target

In [None]:
from azureml.core.compute import ComputeTarget, BatchAiCompute
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
cluster_name =  "zhouftsperfqrnn"
cluster_min_nodes = 0
cluster_max_nodes = 16

vm_size = "STANDARD_D3_V2"

try:
    # Look for the existing cluster by name
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    if type(compute_target) is BatchAiCompute:
        print('Found existing compute target {}.'.format(cluster_name))
    else:
        print('{} exists but it is not a Batch AI Compute target. Please choose a different name.'.format(cluster_name))
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = BatchAiCompute.provisioning_configuration(vm_size=vm_size,
                                                                #vm_priority='lowpriority', # optional
                                                                autoscale_enabled=True,
                                                                cluster_min_nodes=cluster_min_nodes, 
                                                                cluster_max_nodes=cluster_max_nodes)

    # Create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)
    
    # Can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it uses the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
    # Use the 'status' property to get a detailed status for the current cluster. 
    print(compute_target.status.serialize())

In [None]:
# If you have created the compute target, you should see an entry with the name stored in
# cluster_name, of type BatchAI, in the workspace's compute_targets property.
#compute_targets = ws.compute_targets
#for name, ct in compute_targets.items():
#    print(name, ct.type, ct.provisioning_state)

### Configure Docker environment

In [None]:
from azureml.core.runconfig import EnvironmentDefinition
from azureml.core.conda_dependencies import CondaDependencies

env = EnvironmentDefinition()

env.python.user_managed_dependencies = False
env.python.conda_dependencies = CondaDependencies.create(conda_packages=['pandas', 'r-base', 'r-data.table', 'r-rjson', 'r-optparse'],
                                                         python_version='3.6.2')
env.python.conda_dependencies.add_channel('conda-forge')
env.docker.enabled=True

### Upload data to default datastore

Upload the training data of the 6th forecast round of the Energy dataset to the workspace's default datastore, which will later be mounted on the Batch AI compute target for training.

In [None]:
ds = ws.get_default_datastore()
print(ds.datastore_type, ds.account_name, ds.container_name)

In [None]:
path_on_datastore = 'data'
ds.upload(src_dir='./data/features/train', target_path=path_on_datastore, overwrite=True, show_progress=True)

In [None]:
# Get data reference object for the data path
ds_data = ds.path(path_on_datastore)
print(ds_data)

### Create estimator

In [None]:
from azureml.core.runconfig import EnvironmentDefinition
from azureml.train.estimator import Estimator

script_folder = './'

script_params = {
    '--path': ds_data.as_mount(),
    '--cv_path': './',
    '--n_hidden_1': 5, 
    '--n_hidden_2': 5,
    '--iter_max': 3,
    '--penalty': 0
}

est = Estimator(source_directory=script_folder,
                script_params=script_params,
                compute_target=compute_target,
                use_docker=True,
                entry_script='aml_estimator.py',
                environment_definition=env)

### Submit job

In [None]:
# Submit job to Batch AI cluster
run_batchai = exp.submit(config=est)

### Check job status

In [None]:
from azureml.train.widgets import RunDetails

RunDetails(run_batchai).show()

In [None]:
run_batchai.get_details()

In [None]:
run_batchai.get_metrics()

## Tune Hyperparameters using HyperDrive

In [None]:
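# Import only the HyperDrive names used in this cell
from azureml.train.hyperdrive import (RandomParameterSampling, HyperDriveRunConfig,
                                      PrimaryMetricGoal, choice)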

script_folder = './'

script_params = {
    '--path': ds_data.as_mount(),
    '--cv_path': './'
}

est = Estimator(source_directory=script_folder,
                script_params=script_params,
                compute_target=compute_target,
                use_docker=True,
                entry_script='aml_estimator.py',
                environment_definition=env)

ps = RandomParameterSampling({
    '--n_hidden_1': choice(5, 10), 
    '--n_hidden_2': choice(5, 10),
    '--iter_max': choice(2, 4, 6, 8, 10),
    '--penalty': choice(0),
})

htc = HyperDriveRunConfig(estimator=est, 
                          hyperparameter_sampling=ps, 
                          primary_metric_name='average pinball loss', 
                          primary_metric_goal=PrimaryMetricGoal.MINIMIZE, 
                          max_total_runs=20,
                          max_concurrent_runs=4)

htr = exp.submit(config=htc)
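
To get a feel for the size of the search space defined above, the following sketch enumerates every combination of the `choice` values with plain Python. It only illustrates the space and does not reproduce how HyperDrive samples from it. The `search_space` dictionary is a hand-copied stand-in for the `RandomParameterSampling` definition.

```python
from itertools import product

# Hand-copied values from the choice(...) expressions in the cell above
search_space = {
    '--n_hidden_1': [5, 10],
    '--n_hidden_2': [5, 10],
    '--iter_max': [2, 4, 6, 8, 10],
    '--penalty': [0],
}

# Cartesian product of all value lists, one dict per combination
names = list(search_space)
combinations = [dict(zip(names, values))
                for values in product(*search_space.values())]
print(len(combinations))
```

The space contains 2 x 2 x 5 x 1 = 20 distinct combinations, the same number as `max_total_runs`. Whether random sampling visits each combination exactly once is not something this sketch assumes.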

In [None]:
RunDetails(htr).show()

In [None]:
htr.get_details()

In [None]:
htr.get_metrics()

In [None]:
best_run = htr.get_best_run_by_primary_metric()
parameter_values = best_run.get_details()['runDefinition']['Arguments']
print(parameter_values)
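
If the printed arguments form a flat list of alternating flags and values, like the `arguments` list used for the local run, they can be turned into a dictionary for easier lookup. The helper below is a hypothetical convenience function, and `example_arguments` is made-up data standing in for `parameter_values`. The actual structure returned by the service may differ.

```python
def arguments_to_dict(arguments):
    """Pair each '--flag' with the value that follows it."""
    params = {}
    i = 0
    while i < len(arguments):
        token = str(arguments[i])
        if token.startswith('--') and i + 1 < len(arguments):
            params[token] = arguments[i + 1]
            i += 2  # skip the value that was just consumed
        else:
            i += 1  # ignore tokens that are not flags
    return params


# Made-up example in the same shape as the local run's argument list
example_arguments = ['--path', '<mounted data path>',
                     '--cv_path', './',
                     '--n_hidden_1', '10',
                     '--n_hidden_2', '5',
                     '--iter_max', '8',
                     '--penalty', '0']
best_params = arguments_to_dict(example_arguments)
print(best_params)
```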