Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Train and hyperparameter tune with Scikit-learn

For background, see the [hyperparameter tuning documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters).

In this tutorial, we demonstrate how to use the Azure ML Python SDK to train a support vector machine (SVM) on a single-node CPU with Scikit-learn to perform classification on the popular [Iris dataset](https://archive.ics.uci.edu/ml/datasets/iris). We will also demonstrate how to perform hyperparameter tuning of the model using Azure ML's HyperDrive service.



## Part I - Creating single thread training run

A training job can be submitted in one of the following ways:
- SKLearn
- PythonScriptStep
- ScriptRunConfig

For hyperparameter tuning, this tutorial uses
- ScriptRunConfig: create a ScriptRunConfig object to specify the configuration details of your training job, including your training script, the environment to use, and the compute target to run on.


## Part II - Create parallel thread hyperparameter tuning run

We can optimize our model's hyperparameters using Azure Machine Learning's hyperparameter tuning capabilities.

This part uses
- HyperDriveConfig

# Prerequisites

* Go through the [Configuration](../../../../configuration.ipynb) notebook to install the Azure Machine Learning Python SDK and create an Azure ML Workspace.

In [None]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)



## Diagnostics

Opt-in diagnostics for better experience, quality, and security of future releases.

In [None]:
from azureml.telemetry import set_diagnostics_collection

set_diagnostics_collection(send_diagnostics=True)

## Initialize workspace

Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`.

In [None]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

## Create AmlCompute

You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) for our remote training compute resource.

As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.

If no cluster with the given name is found, a new one is created. We will create an `AmlCompute` cluster of `STANDARD_D2_V2` CPU VMs. This process is broken down into 3 steps:
1. Create the configuration (this step is local and only takes a second).
2. Create the cluster (this step takes about **20 seconds**).
3. Provision the VMs to bring the cluster to its initial size (the minimum node count configured in the code). This step takes about **3-5 minutes** and produces only sparse output. Please wait until the call returns before moving to the next cell.

In [None]:
# Attach Azure ML Compute as a cluster of low-cost nodes
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
import os

# choose a name for your cluster
compute_name = os.environ.get("AML_COMPUTE_CLUSTER_NAME", "automl-compute")
# environment variables are strings, so convert the node counts to int
compute_min_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MIN_NODES", 0))
compute_max_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MAX_NODES", 4))

# This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6
vm_size = os.environ.get("AML_COMPUTE_CLUSTER_SKU", "STANDARD_D2_V2")


if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print('found compute target. just use it. ' + compute_name)
else:
    print('creating a new compute target...')
    provisioning_config = AmlCompute.provisioning_configuration(vm_size=vm_size,
                                                                min_nodes=compute_min_nodes,
                                                                max_nodes=compute_max_nodes)

    # create the cluster
    compute_target = ComputeTarget.create(
        ws, compute_name, provisioning_config)

    # can poll for a minimum number of nodes and for a specific timeout.
    # if no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(
        show_output=True, min_node_count=None, timeout_in_minutes=20)

    # For a more detailed view of current AmlCompute status, use get_status()
    print(compute_target.get_status().serialize())

The above code retrieves a CPU compute target. Scikit-learn does not support GPU computing.
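
One detail is worth keeping in mind when the cluster settings are overridden through environment variables: values read from the environment are always strings, while node counts are integers. The following is a minimal sketch of a conversion helper, not part of the Azure ML SDK. The name `env_int` and the `environ` parameter are assumptions made for illustration, and a plain dictionary stands in for the real environment so that nothing on your machine is modified.

```python
import os


def env_int(name, default, environ=os.environ):
    """Read a setting as an int, falling back to `default` when unset or empty.

    Hypothetical helper for illustration; not part of the Azure ML SDK.
    """
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


# A stand-in for the process environment: only the maximum is overridden.
fake_environ = {"AML_COMPUTE_CLUSTER_MAX_NODES": "6"}

compute_min_nodes = env_int("AML_COMPUTE_CLUSTER_MIN_NODES", 0, fake_environ)
compute_max_nodes = env_int("AML_COMPUTE_CLUSTER_MAX_NODES", 4, fake_environ)

print(compute_min_nodes, compute_max_nodes)
```

Passing the mapping explicitly keeps the helper easy to test; in real use the default `os.environ` would be read instead.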

## Train model on the remote or local compute

Now that you have your data and training script prepared, you are ready to train on your remote compute. You can take advantage of Azure compute to leverage a CPU cluster.

### Create a project directory

Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on.

In [None]:
import os
script_file = "train_iris.py"
script_dir = './code'

os.makedirs(script_dir, exist_ok=True)

### Prepare training script

Now you will need to create your training script. In this tutorial, the training script is already provided for you at `train_iris.py`. In practice, you should be able to take any custom training script as is and run it with Azure ML without having to modify your code.

However, if you would like to use Azure ML's [tracking and metrics](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#metrics) capabilities, you will have to add a small amount of Azure ML code inside your training script.

In `train_iris.py`, we will log some metrics to our Azure ML run. To do so, we will access the Azure ML Run object within the script:

```python
from azureml.core.run import Run
run = Run.get_context()
```

Further within `train_iris.py`, we log the kernel and penalty parameters, and the accuracy the model achieves on the test set (all converted to plain Python types):

```python
run.log('Kernel type', str(args.kernel))
run.log('Penalty', float(args.penalty))

run.log('Accuracy', float(accuracy))
```

These run metrics will become particularly important when we begin hyperparameter tuning our model in the "Tune model hyperparameters" section.
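
The script also needs a way to receive the hyperparameter values it logs. The sketch below assumes the same two arguments that `train_iris.py` defines and parses an explicit argument list with `argparse` instead of reading `sys.argv`. The helper name `build_parser` is an assumption used only for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the two hyperparameters used in this tutorial."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--kernel', type=str, default='linear',
                        help='Kernel type to be used in the algorithm')
    parser.add_argument('--penalty', type=float, default=1.0,
                        help='Penalty parameter of the error term')
    return parser


parser = build_parser()

# No arguments: the script falls back to its defaults.
defaults = parser.parse_args([])

# Explicit arguments: string values are converted by the `type` callables.
tuned = parser.parse_args(['--kernel', 'rbf', '--penalty', '1.5'])

print(defaults.kernel, defaults.penalty)
print(tuned.kernel, tuned.penalty)
```

Since `--penalty` is declared with `type=float`, the string `'1.5'` reaches the training code as the number `1.5`.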

Once your script is ready, copy the training script `train_iris.py` into your project directory.

In [None]:
%%writefile ./code/train_iris.py
# Modified from https://www.geeksforgeeks.org/multiclass-classification-using-scikit-learn/
# Nov 2020

import argparse
import os

# importing necessary libraries
import numpy as np

from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

import joblib

from azureml.core.run import Run
run = Run.get_context()


def main():
    parser = argparse.ArgumentParser()

    parser.add_argument('--kernel', type=str, default='linear',
                        help='Kernel type to be used in the algorithm')
    parser.add_argument('--penalty', type=float, default=1.0,
                        help='Penalty parameter of the error term')

    args = parser.parse_args()
    run.log('Kernel type', str(args.kernel))
    run.log('Penalty', float(args.penalty))

    # loading the iris dataset
    iris = datasets.load_iris()

    # X -> features, y -> label
    X = iris.data
    y = iris.target

    # dividing X, y into train and test data
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # training an SVM classifier with the requested kernel and penalty
    from sklearn.svm import SVC
    svm_model_linear = SVC(kernel=args.kernel, C=args.penalty).fit(X_train, y_train)
    svm_predictions = svm_model_linear.predict(X_test)

    # model accuracy for X_test
    accuracy = svm_model_linear.score(X_test, y_test)
    print('Accuracy of SVM classifier on test set: {:.2f}'.format(accuracy))
    run.log('Accuracy', float(accuracy))
    # creating a confusion matrix
    cm = confusion_matrix(y_test, svm_predictions)
    print(cm)

    os.makedirs('outputs', exist_ok=True)
    # files saved in the "outputs" folder are automatically uploaded into run history
    joblib.dump(svm_model_linear, 'outputs/model.joblib')


if __name__ == '__main__':
    main()


In [None]:
# print the training script to confirm it was written correctly

script_file = "train_iris.py"
script_dir = './code'

# peek at contents
with open(os.path.join(script_dir, script_file)) as training_file:
    print(training_file.read())

### Create an experiment

Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this Scikit-learn tutorial.

In [None]:
from azureml.core import Experiment

experiment_name = 'train_iris'
experiment = Experiment(ws, name=experiment_name)

### Create an environment

Later in this tutorial, we define a conda environment YAML file with the training script dependencies and create an Azure ML environment from it. First, check the local Python version.

In [None]:
# Python version
import sys

sys.version_info

# Creating run

## Part I - Creating single thread training run

A training job can be submitted in one of the following ways:
- SKLearn
- PythonScriptStep
- ScriptRunConfig

For hyperparameter tuning, this tutorial uses
- ScriptRunConfig: create a ScriptRunConfig object to specify the configuration details of your training job, including your training script, the environment to use, and the compute target to run on.

### Version - SKLearn

In [None]:
# Create an SKLearn estimator.
# Works on local or remote compute; Azure ML creates the Docker image for the run.
# Note: the SKLearn estimator is deprecated in newer SDK releases in favor of ScriptRunConfig.
from azureml.train.sklearn import SKLearn
from azureml.widgets import RunDetails

script_file = "train_iris.py"
script_dir = './code'

experiment = Experiment(workspace=ws, name="iris_sklearn")

est = SKLearn(source_directory=script_dir,
              entry_script=script_file,
              # passing `arguments` did not work with this estimator,
              # so the script runs with its default kernel and penalty:
              # arguments=['--kernel', 'linear', '--penalty', 1.0],

              # extra pip packages for the run
              pip_packages=['azureml-sdk', 'azureml-dataprep[fuse,pandas]', 'matplotlib'],
              # no need to pass other packages:
              # conda_packages=['azureml-sdk', 'numpy', 'scikit-learn'],

              # run locally (works correctly)
              compute_target='local'
              # the remote cluster also works correctly:
              # compute_target=compute_target
              )

# Submit the estimator as part of your experiment run
experiment_run = experiment.submit(est)

RunDetails(experiment_run).show()

experiment_run.wait_for_completion(show_output=True)

### Version - PythonScriptStep

In [None]:
# python script configuration

from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies

# Create a new runconfig object
aml_run_config = RunConfiguration()

# Use the aml_compute you created above. 
aml_run_config.target = compute_target

# Enable Docker
aml_run_config.environment.docker.enabled = True

# Use conda_dependencies.yml to create a conda environment in the Docker image for execution
aml_run_config.environment.python.user_managed_dependencies = False

# Specify CondaDependencies obj, add necessary packages
aml_run_config.environment.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['pandas','scikit-learn','matplotlib'], 
    pip_packages=['azureml-sdk', 'azureml-dataprep[fuse,pandas]'], 
    pin_sdk_version=False)

In [None]:
# python step
from azureml.pipeline.steps import PythonScriptStep

script_file = "train_iris.py"
script_dir = './code'

helen_prep_step1 = PythonScriptStep(name='iris_python',
                                    script_name=script_file,
                                    source_directory=script_dir,
                                    arguments=['--kernel', 'linear', '--penalty', 1.0],
                                    # compute_target='local' did not work for a pipeline step;
                                    # the remote cluster works correctly
                                    compute_target=compute_target,
                                    runconfig=aml_run_config,
                                    allow_reuse=False)

In [None]:
# run pipeline with one step
from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

pipeline = Pipeline(workspace=ws, steps=[helen_prep_step1])

pipeline_run = Experiment(ws, 'iris_python_pipeline').submit(pipeline)

# wait for the pipeline to finish and stream the console logs
pipeline_run.wait_for_completion(show_output=True)

### Version - ScriptRunConfig

In [None]:
%%writefile conda_dependencies.yml

dependencies:
- python=3.6.2
- scikit-learn
- pip:
  - azureml-defaults

In [None]:
from azureml.core import Environment

sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './conda_dependencies.yml')
sklearn_env.docker.enabled = True
sklearn_env.python.user_managed_dependencies = False

In [None]:
from azureml.core import ScriptRunConfig
from azureml.widgets import RunDetails

script_file = "train_iris.py"
script_dir = './code'

experiment = Experiment(workspace=ws, name="iris_ScriptRunConfig")

est = ScriptRunConfig(source_directory=script_dir,
                      script=script_file,
                      arguments=['--kernel', 'linear', '--penalty', 1.0])

# run on the AmlCompute cluster
est.run_config.target = compute_target

# submitting locally also works:
# est.run_config.target = 'local'
est.run_config.environment = sklearn_env

# Submit the configuration as part of your experiment run
experiment_run = experiment.submit(est)

RunDetails(experiment_run).show()

experiment_run.wait_for_completion(show_output=True)

## Part II - Create parallel thread hyperparameter tuning run

Now that we've seen how to do a simple Scikit-learn training run using the SDK, let's see if we can further improve the accuracy of our model. We can optimize our model's hyperparameters using Azure Machine Learning's hyperparameter tuning capabilities.

This part uses
- HyperDriveConfig

### Start a hyperparameter sweep

First, we will define the hyperparameter space to sweep over. Let's tune the `kernel` and `penalty` parameters. In this example we will use random sampling to try different configuration sets of hyperparameters to maximize our primary metric, `Accuracy`.
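
To make the idea of random sampling concrete, here is a small standard-library sketch that draws configurations from a discrete search space and turns each one into a command-line argument list. It only illustrates the concept and is not how HyperDrive is implemented. The helper names `sample_configuration` and `to_arguments` are hypothetical, and the candidate values are the ones used in this tutorial.

```python
import random

# Discrete candidate values for each hyperparameter (from this tutorial).
search_space = {
    '--kernel': ['linear', 'rbf', 'poly', 'sigmoid'],
    '--penalty': [0.5, 1, 1.5],
}


def sample_configuration(space, rng):
    """Pick one value per hyperparameter, uniformly at random."""
    return {name: rng.choice(options) for name, options in space.items()}


def to_arguments(config):
    """Flatten a configuration into a command-line style argument list."""
    arguments = []
    for name, value in config.items():
        arguments.extend([name, str(value)])
    return arguments


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [sample_configuration(search_space, rng) for _ in range(12)]

for trial in trials[:3]:
    print(to_arguments(trial))
```

Because each draw in this sketch is independent, the same combination can appear more than once across the 12 trials.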

In [None]:
from azureml.core import Environment

sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './conda_dependencies.yml')
sklearn_env.docker.enabled = True
sklearn_env.python.user_managed_dependencies = False

In [None]:
from azureml.core import ScriptRunConfig

script_file = "train_iris.py"
script_dir = './code'

experiment = Experiment(workspace=ws, name="iris_ScriptRunConfig")



est = ScriptRunConfig(source_directory=script_dir,
                      script=script_file,
                      arguments=['--kernel', 'linear', '--penalty', 1.0])

est.run_config.target = compute_target
# Note: the 'local' compute target is not supported for hyperparameter tuning
est.run_config.environment = sklearn_env

In [None]:
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.parameter_expressions import choice
    

param_sampling = RandomParameterSampling( {
    "--kernel": choice('linear', 'rbf', 'poly', 'sigmoid'),
    "--penalty": choice(0.5, 1, 1.5)
    }
)

hyperdrive_config = HyperDriveConfig(run_config=est,
                                     hyperparameter_sampling=param_sampling, 
                                     primary_metric_name='Accuracy',
                                     primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                     max_total_runs=12,
                                     max_concurrent_runs=4)

Finally, launch the hyperparameter tuning job.

In [None]:
# start the HyperDrive run
hyperdrive_run = experiment.submit(hyperdrive_config)

## Monitor HyperDrive runs

You can monitor the progress of the runs with the following Jupyter widget.

In [None]:
RunDetails(hyperdrive_run).show()

In [None]:
hyperdrive_run.wait_for_completion(show_output=True)

In [None]:
assert hyperdrive_run.get_status() == "Completed"

# Improving training run time 

### Warm start a Hyperparameter Tuning experiment and resuming child runs
Finding the best hyperparameter values for your model is often an iterative process that needs multiple tuning runs, each learning from the previous ones. Reusing knowledge from previous runs accelerates the hyperparameter tuning process, which reduces the cost of tuning the model and can improve the primary metric of the resulting model. When warm starting a hyperparameter tuning experiment with Bayesian sampling, trials from the previous run are used as prior knowledge to pick new samples intelligently, so as to improve the primary metric. Additionally, when using Random or Grid sampling, any early termination decisions leverage metrics from the previous runs to identify poorly performing training runs.

Azure Machine Learning allows you to warm start your hyperparameter tuning run by leveraging knowledge from up to 5 previously completed hyperparameter tuning parent runs. 

Additionally, individual training runs of a hyperparameter tuning experiment may be cancelled due to budget constraints or fail for other reasons. It is possible to resume such individual training runs from the last checkpoint (assuming your training script handles checkpoints). Resuming an individual training run uses the same hyperparameter configuration and mounts the storage used for that run. The training script should accept the "--resume-from" argument, which contains the checkpoint or model files from which to resume the training run. You can also resume individual runs as part of an experiment that spends additional budget on hyperparameter tuning. Any budget left after resuming the specified training runs is used for exploring additional configurations.
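
As a rough illustration of what accepting `--resume-from` can look like inside a training script, the sketch below parses the argument and looks for a checkpoint file in the given location. The file name `checkpoint.txt`, the helper names, and the checkpoint contents are all assumptions made for this example. The SVM script in this tutorial trains in a single step and writes no checkpoints, so this is not taken from `train_iris.py`.

```python
import argparse
import os
import tempfile

CHECKPOINT_NAME = 'checkpoint.txt'  # hypothetical file name for this sketch


def find_checkpoint(resume_dir):
    """Return the checkpoint path inside resume_dir, or None if absent."""
    if not resume_dir:
        return None
    path = os.path.join(resume_dir, CHECKPOINT_NAME)
    return path if os.path.isfile(path) else None


def load_state(resume_dir):
    """Load saved state when a checkpoint exists, else start from scratch."""
    path = find_checkpoint(resume_dir)
    if path is None:
        return 'fresh start'
    with open(path) as handle:
        return handle.read()


parser = argparse.ArgumentParser()
parser.add_argument('--resume-from', type=str, default=None,
                    help='Location of checkpoint or model files to resume from')

# Case 1: the argument is not supplied, so training starts from scratch.
fresh_args = parser.parse_args([])
fresh_state = load_state(fresh_args.resume_from)

# Case 2: the argument points at a directory containing a checkpoint.
with tempfile.TemporaryDirectory() as checkpoint_dir:
    with open(os.path.join(checkpoint_dir, CHECKPOINT_NAME), 'w') as handle:
        handle.write('epoch=3')
    resumed_args = parser.parse_args(['--resume-from', checkpoint_dir])
    resumed_state = load_state(resumed_args.resume_from)

print(fresh_state)
print(resumed_state)
```

Note that `argparse` exposes `--resume-from` as the attribute `resume_from`, since hyphens are not valid in Python attribute names.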

For more information on warm starting and resuming hyperparameter tuning runs, please refer to the [Hyperparameter Tuning for Azure Machine Learning documentation](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters) 



# Find and register best model
When all runs finish, we can find the one with the highest accuracy.
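
Conceptually, selecting the best run amounts to taking the maximum of the primary metric over the finished child runs. The sketch below shows that idea on plain dictionaries. The helper name `best_by_metric` is hypothetical, and the accuracy values are made-up placeholders, not results from this tutorial.

```python
def best_by_metric(runs, metric_name, maximize=True):
    """Return the run with the best value of metric_name.

    Runs that never logged the metric are ignored.
    """
    scored = [run for run in runs if metric_name in run['metrics']]
    if not scored:
        raise ValueError('no run logged the metric ' + repr(metric_name))
    pick = max if maximize else min
    return pick(scored, key=lambda run: run['metrics'][metric_name])


# Placeholder data for illustration only (made-up numbers).
finished_runs = [
    {'arguments': ['--kernel', 'linear', '--penalty', '1'],
     'metrics': {'Accuracy': 0.90}},
    {'arguments': ['--kernel', 'rbf', '--penalty', '1.5'],
     'metrics': {'Accuracy': 0.95}},
    {'arguments': ['--kernel', 'sigmoid', '--penalty', '0.5'],
     'metrics': {}},  # a run that failed before logging its accuracy
]

best = best_by_metric(finished_runs, 'Accuracy')
print(best['arguments'])
```

With the real SDK objects, this selection is what `get_best_run_by_primary_metric()` provides for the HyperDrive run.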

In [None]:
best_run = hyperdrive_run.get_best_run_by_primary_metric()
print(best_run.get_details()['runDefinition']['arguments'])

Now, let's list the model files uploaded during the run.

In [None]:
print(best_run.get_file_names())

We can then register the model file `outputs/model.joblib` as a model named `sklearn-iris` under the workspace for deployment.

In [None]:
model = best_run.register_model(model_name='sklearn-iris', model_path='outputs/model.joblib')