Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.


# Putting it all together into a pipeline

## Prerequisites

* Go through the [Configuration](../../../../configuration.ipynb) notebook to install the Azure Machine Learning Python SDK and create an Azure ML Workspace

In [3]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

SDK version: 1.16.0


Opt-in diagnostics for better experience, quality, and security of future releases.

## Initialize workspace

Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`.
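
As a rough illustration of what `Workspace.from_config()` reads, the sketch below writes and reloads a `config.json` containing the three keys such a file typically holds (`subscription_id`, `resource_group`, `workspace_name`). The values are placeholders and the helper name `write_workspace_config` is hypothetical; in practice the file is usually downloaded from the Azure portal or written by the Configuration notebook.

```python
import json
import os
import tempfile


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write a config.json with the keys a workspace config file typically holds."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = os.path.join(folder, "config.json")
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return path


# Placeholder values only; substitute your own workspace details.
demo_dir = tempfile.mkdtemp()
config_path = write_workspace_config(
    demo_dir, "<subscription-id>", "<resource-group>", "<workspace-name>")

# Read the file back to confirm its shape.
with open(config_path) as f:
    loaded = json.load(f)
print(sorted(loaded))
```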

In [4]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Workspace name: ws01ent
Azure region: westus2
Subscription id: 0e9bace8-7a81-4922-83b5-d995ff706507
Resource group: azureml


## Create AmlCompute

You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) for our remote training compute resource.

As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.

If a cluster with the given name cannot be found, the code below creates a new one: an `AmlCompute` cluster of `STANDARD_D12_V2` CPU VMs. This process is broken down into 3 steps:
1. create the configuration (this step is local and only takes a second)
2. create the cluster (this step will take about **20 seconds**)
3. provision the VMs to bring the cluster to its initial size (the configuration below sets no minimum node count, and the status output shows a minimum of 0). This step will take about **3-5 minutes** and produces only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell

In [5]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
cluster_name = "cluster-train"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D12_V2', 
                                                           max_nodes=4)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

    # can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it uses the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

# use get_status() to get a detailed status for the current cluster. 
print(compute_target.get_status().serialize())

Found existing compute target
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2020-10-26T18:21:05.010000+00:00', 'errors': None, 'creationTime': '2020-10-26T15:36:47.471716+00:00', 'modifiedTime': '2020-10-26T16:38:11.338154+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT1200S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D12_V2'}


The above code retrieves a CPU compute target. Scikit-learn does not support GPU computing.

## Train model on the remote compute

Now that you have your data and training script prepared, you are ready to train on your remote compute. You can take advantage of Azure compute to leverage a CPU cluster.

### Create a project directory

Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on.

In [6]:
import os

project_folder = './train'
os.makedirs(project_folder, exist_ok=True)

### Prepare training script

Now you will need to create your training script. In this tutorial, the training script is already provided for you at `train_iris.py`. In practice, you should be able to take any custom training script as is and run it with Azure ML without having to modify your code.

However, if you would like to use Azure ML's [tracking and metrics](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#metrics) capabilities, you will have to add a small amount of Azure ML code inside your training script.

In `train_iris.py`, we will log some metrics to our Azure ML run. To do so, we will access the Azure ML Run object within the script:

```python
from azureml.core.run import Run
run = Run.get_context()
```

Further within `train_iris.py`, we log the kernel and penalty parameters, and the highest accuracy the model achieves:

```python
# Cast to built-in types before logging (np.float was removed in NumPy 1.24)
run.log('Kernel type', str(args.kernel))
run.log('Penalty', float(args.penalty))

run.log('Accuracy', float(accuracy))
```

These run metrics will become particularly important when we begin hyperparameter tuning our model in the "Tune model hyperparameters" section.

Once your script is ready, copy the training script `train_iris.py` into your project directory.
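
The copy step can be sketched with the standard library. This is a minimal example, not the notebook's own cell: the helper name `copy_training_script` is hypothetical, and the demonstration uses a placeholder script in a temporary directory so that it runs anywhere.

```python
import os
import shutil
import tempfile


def copy_training_script(script_path, project_folder):
    """Copy a training script into the project folder and return the new path."""
    os.makedirs(project_folder, exist_ok=True)
    return shutil.copy(script_path, project_folder)


# Demonstration with a throwaway script in a temporary directory.
work_dir = tempfile.mkdtemp()
script_path = os.path.join(work_dir, "train_iris.py")
with open(script_path, "w") as f:
    f.write("# placeholder training script\n")

copied = copy_training_script(script_path, os.path.join(work_dir, "train"))
print(copied)
```

In the notebook itself, the equivalent call would take the real script name and the `project_folder` defined earlier.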

### Create an experiment

Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this Scikit-learn tutorial.

In [7]:
from azureml.core import Experiment

experiment_name = 'train_isd'
experiment = Experiment(ws, name=experiment_name)

### Create an environment

Define a conda environment YAML file with your training script dependencies and create an Azure ML environment.

In [8]:
%%writefile conda_dependencies.yml

dependencies:
- python=3.6.2
- scikit-learn
- pip:
  - azureml-defaults

Overwriting conda_dependencies.yml


In [9]:
from azureml.core import Environment

sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './conda_dependencies.yml')

### Configure the training job

Create an `Estimator` object to specify the configuration details of your training job, including your training script, the environment to use, and the compute target to run on.

In [10]:
from azureml.train.estimator import Estimator

estimator = Estimator(source_directory=project_folder,
                      entry_script='train_isd.py',
                      compute_target=compute_target,
                      environment_definition=sklearn_env)

In this notebook the estimator is not submitted on its own. It is passed to the hyperparameter sweep and submitted as part of the pipeline below. Note that the submit call is asynchronous.

## Tune model hyperparameters

Now that we've seen how to do a simple Scikit-learn training run using the SDK, let's see if we can further improve the accuracy of our model. We can optimize our model's hyperparameters using Azure Machine Learning's hyperparameter tuning capabilities.

### Start a hyperparameter sweep

First, we will define the hyperparameter space to sweep over. Let's tune the `--max_features` and `--n_estimators` parameters. In this example we will use random sampling to try different configuration sets of hyperparameters to minimize our primary metric, `MSE`.

In [11]:
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.parameter_expressions import choice
    

param_sampling = RandomParameterSampling( {
    "--max_features": choice('sqrt', 'log2'),
    "--n_estimators": choice(100, 200, 500)
    }
)

hyperdrive_config = HyperDriveConfig(estimator=estimator,
                                     hyperparameter_sampling=param_sampling, 
                                     primary_metric_name='MSE',
                                     primary_metric_goal=PrimaryMetricGoal.MINIMIZE,
                                     max_total_runs=12,
                                     max_concurrent_runs=12)
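
To make the size of this search space concrete, here is a plain-Python sketch of random sampling over the same two `choice` lists. It is an illustration only, not HyperDrive's implementation; the helper names are hypothetical, and whether HyperDrive repeats a configuration is not something this notebook states. The space has 2 × 3 = 6 distinct combinations, while `max_total_runs` is 12.

```python
import itertools
import random

# Same discrete choices as the sweep defined above.
search_space = {
    "--max_features": ["sqrt", "log2"],
    "--n_estimators": [100, 200, 500],
}


def all_combinations(space):
    """Enumerate every distinct configuration in a discrete search space."""
    names = list(space)
    return [dict(zip(names, values))
            for values in itertools.product(*(space[n] for n in names))]


def sample_configurations(space, n_samples, seed=0):
    """Draw n_samples configurations uniformly at random, with replacement."""
    rng = random.Random(seed)
    return [{name: rng.choice(options) for name, options in space.items()}
            for _ in range(n_samples)]


grid = all_combinations(search_space)
samples = sample_configurations(search_space, n_samples=12)
print(len(grid), "distinct configurations;", len(samples), "sampled")
```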

In [24]:
from azureml.pipeline.steps import HyperDriveStep, HyperDriveStepRun
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps.python_script_step import PythonScriptStep 

from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.runconfig import DEFAULT_CPU_IMAGE


metrics_output_name = 'metrics_output'
datastore = ws.get_default_datastore()

metrics_data = PipelineData(name='metrics_data',
                             datastore=datastore,
                             pipeline_output_name=metrics_output_name)

hd_step_name='hyper_param_training'

hd_step = HyperDriveStep(
    name=hd_step_name,
    hyperdrive_config=hyperdrive_config,
    estimator_entry_script_arguments=['--dataset_name', 'ISDWeatherDS'],
    metrics_output=metrics_data)

run_config = RunConfiguration()

# enable Docker 
run_config.environment.docker.enabled = True

# set Docker base image to the default CPU-based image
run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE

# use conda_dependencies.yml to create a conda environment in the Docker image for execution
run_config.environment.python.user_managed_dependencies = False

# specify CondaDependencies obj
run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn', 'pandas'], pip_packages = [ 'azureml-sdk','azureml-dataprep[pandas]'])


model_reg_step = PythonScriptStep(script_name = "test_register_model.py", name="test_register_model", 
                                  arguments=['--dataset_name', 'ISDWeatherDS','--hd_step_name', hd_step_name], 
                                  compute_target=compute_target, source_directory=project_folder, runconfig=run_config)
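
The steps above pass `--dataset_name` to the training script, and the sweep adds the sampled `--max_features` and `--n_estimators` values as further command-line arguments. The notebook does not show `train_isd.py`, so the following is only a sketch of how such a script might parse them; the defaults and help strings are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments the pipeline passes to a training script."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_name", type=str, required=True,
                        help="Name of the dataset to train on")
    # Hypothetical defaults; the sweep overrides them on each run.
    parser.add_argument("--max_features", type=str, default="sqrt",
                        help="Sampled by the sweep: 'sqrt' or 'log2'")
    parser.add_argument("--n_estimators", type=int, default=100,
                        help="Sampled by the sweep: 100, 200 or 500")
    return parser.parse_args(argv)


# Simulate the command line of one sweep run.
args = parse_args(["--dataset_name", "ISDWeatherDS",
                   "--max_features", "log2",
                   "--n_estimators", "200"])
print(args.dataset_name, args.max_features, args.n_estimators)
```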

In [25]:
from azureml.pipeline.core import Pipeline, StepSequence

step_sequence = StepSequence(steps=[hd_step, model_reg_step])

pipeline = Pipeline(workspace=ws, steps=step_sequence)
pipeline_run = experiment.submit(pipeline)


Created step hyper_param_training [c0a88a38][4cc4e7c3-bff6-4c3d-8a32-05d358bdc767], (This step is eligible to reuse a previous run's output)Created step test_register_model [70392e58][2c6ded23-7dd4-42ef-b44c-5a42829663f2], (This step is eligible to reuse a previous run's output)

Submitted PipelineRun a142c6be-3f7b-45a2-849a-87151350cfab
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/train_isd/runs/a142c6be-3f7b-45a2-849a-87151350cfab?wsid=/subscriptions/0e9bace8-7a81-4922-83b5-d995ff706507/resourcegroups/azureml/workspaces/ws01ent


The `experiment.submit(pipeline)` call above launches the hyperparameter tuning job as the first step of the pipeline.

You can monitor the progress of the runs with the following Jupyter widget.

In [26]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()



In [60]:
step_run = pipeline_run.find_step_run(hd_step_name)[0]

In [64]:
step_run.get_details()

{'runId': '3002f270-c4a2-48fe-b147-cb99ee3b1b9e',
 'status': 'Completed',
 'startTimeUtc': '2020-10-26T19:24:18.391031Z',
 'endTimeUtc': '2020-10-26T19:24:18.509679Z',
 'properties': {'azureml.reusedrunid': '937ccf5a-2b8b-408c-aae6-138c9f944e88',
  'azureml.reusednodeid': 'bd3c83b3',
  'azureml.reusedpipeline': '09a079cf-75aa-4cfd-a3d2-d3876c45c486',
  'azureml.reusedpipelinerunid': '09a079cf-75aa-4cfd-a3d2-d3876c45c486',
  'azureml.runsource': 'azureml.StepRun',
  'azureml.nodeid': 'c0a88a38',
  'ContentSnapshotId': '6d06148d-7513-4185-bb05-811c73e9edd8',
  'StepType': 'HyperDriveStep',
  'ComputeTargetType': 'HyperDrive',
  'azureml.moduleid': '4cc4e7c3-bff6-4c3d-8a32-05d358bdc767',
  'azureml.pipelinerunid': 'a142c6be-3f7b-45a2-849a-87151350cfab'},
 'inputDatasets': [],
 'logFiles': {'logs/azureml/executionlogs.txt': 'https://ws01ent3218162019.blob.core.windows.net/azureml/ExperimentRun/dcid.937ccf5a-2b8b-408c-aae6-138c9f944e88/logs/azureml/executionlogs.txt?sv=2019-02-02&sr=b&sig=%

In [69]:
steps = []
for step in  pipeline_run.get_steps():
    steps.append(step)

In [74]:
HyperDriveStepRun(steps[1]).get_best_run_by_primary_metric()


ERROR - Cannot find hyperdrive run from the given step run: hyper_param_training


ValueError: Cannot find hyperdrive run from the given step run: hyper_param_training
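
The details printed earlier for this step include an `azureml.reusedrunid` property, and the step's start and end times are less than a second apart. This suggests the step reused the output of a previous run instead of launching a new sweep. That may be related to the error above, although the notebook does not confirm the cause. Below is a minimal sketch for checking the property, assuming the dictionary shape shown in the `get_details()` output; the helper name `reused_run_id` is hypothetical.

```python
def reused_run_id(step_details):
    """Return the id of the earlier run a step reused, or None if it ran fresh."""
    properties = step_details.get("properties", {})
    return properties.get("azureml.reusedrunid")


# Trimmed copy of the details printed above.
details = {
    "runId": "3002f270-c4a2-48fe-b147-cb99ee3b1b9e",
    "status": "Completed",
    "properties": {
        "azureml.reusedrunid": "937ccf5a-2b8b-408c-aae6-138c9f944e88",
        "StepType": "HyperDriveStep",
    },
}
print(reused_run_id(details))
```

If reuse turns out to be the cause, one thing to try is constructing the `HyperDriveStep` with `allow_reuse=False` so that the sweep runs again.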

In [51]:
hd_step_run = HyperDriveStepRun(step_run=pipeline_run.find_step_run(hd_step_name)[0])
best_run = hd_step_run.get_best_run_by_primary_metric()
best_run


IndexError: list index out of range
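
The `IndexError` above is what indexing with `[0]` raises when `find_step_run` returns an empty list; the notebook does not say why no step run was found here. A defensive lookup could look like the sketch below. The helper name `first_step_run` is hypothetical, and the stub class only stands in for a real `PipelineRun` so that the example runs without Azure.

```python
def first_step_run(pipeline_run, step_name):
    """Return the first step run named step_name, or None when none is found."""
    matches = pipeline_run.find_step_run(step_name)
    return matches[0] if matches else None


class _StubPipelineRun:
    """Stand-in for a PipelineRun so the helper can be tried without Azure."""

    def __init__(self, step_runs):
        self._step_runs = step_runs

    def find_step_run(self, name):
        return [run for run_name, run in self._step_runs if run_name == name]


stub = _StubPipelineRun([("hyper_param_training", "step-run-object")])
print(first_step_run(stub, "hyper_param_training"))
print(first_step_run(stub, "missing_step"))
```

With a real pipeline run, a `None` result can be handled explicitly before wrapping the step run in `HyperDriveStepRun`.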

Now, let's list the model files uploaded during the run.

We can then register the model file `outputs/model.joblib` from the best run as a model named `isd_model_hyper_tuned` under the workspace for deployment.

In [38]:
model = best_run.register_model(model_name='isd_model_hyper_tuned', model_path='outputs/model.joblib')

In [35]:
best_run.download_file('outputs/model.joblib')