# Hyperparameter tuning pipeline example

In this example, we'll build a pipeline for hyperparameter tuning. The pipeline tries several hyperparameter configurations and then registers the best model.

**Note:** This example requires that you've run the notebook from the first tutorial, so that the dataset and compute cluster are set up.

In [1]:
import os
import azureml.core
from azureml.core import Workspace, Experiment, Dataset, RunConfiguration
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps import PythonScriptStep, HyperDriveStep, HyperDriveStepRun
from azureml.data.dataset_consumption_config import DatasetConsumptionConfig
from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveConfig, PrimaryMetricGoal
from azureml.train.hyperdrive import choice, loguniform, uniform
from azureml.core import ScriptRunConfig

print("Azure ML SDK version:", azureml.core.VERSION)

Azure ML SDK version: 1.20.0


First, we connect to the workspace. The call `Workspace.from_config()` will either:
* read the local `config.json` that references the workspace, if that file exists, or
* use the `az` CLI to connect to the workspace that was attached to this folder via `az ml folder attach -g <resource group> -w <workspace name>`

In [2]:
ws = Workspace.from_config()
print(f'WS name: {ws.name}\nRegion: {ws.location}\nSubscription id: {ws.subscription_id}\nResource group: {ws.resource_group}')

WS name: demo-ent-ws
Region: westeurope
Subscription id: bcbf34a7-1936-4783-8840-8f324c37f354
Resource group: demo
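
To make the first option above more concrete, here is a minimal sketch of what a `config.json` typically contains and how such a file could be located. The three keys are the ones the SDK expects; the placeholder values, the `find_config` helper and the folder names are hypothetical, and the upward search is a simplified approximation of the SDK's lookup, not its actual implementation.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values (assumption): a real file holds your own identifiers.
example_config = {
    "subscription_id": "<subscription id>",
    "resource_group": "<resource group>",
    "workspace_name": "<workspace name>",
}


def find_config(start, filename="config.json"):
    """Hypothetical helper: walk from `start` up to the filesystem root and
    return the first config file found, or None if there is none."""
    for folder in [start, *start.parents]:
        for candidate in (folder / filename, folder / ".azureml" / filename):
            if candidate.is_file():
                return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a project whose notebooks live in a subfolder.
    project = Path(tmp) / "project"
    notebooks = project / "notebooks"
    notebooks.mkdir(parents=True)
    (project / "config.json").write_text(json.dumps(example_config))

    # Starting from the notebooks folder, the file in the parent is found.
    found = find_config(notebooks)
    loaded = json.loads(found.read_text())

print("Found:", found.name)
print("Keys:", sorted(loaded))
```

In practice you would not parse the file yourself; `Workspace.from_config()` does this for you.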


# Preparation

Let's reference the dataset from the first tutorial:

In [3]:
training_dataset = Dataset.get_by_name(ws, "german-credit-train-tutorial")
training_dataset_consumption = DatasetConsumptionConfig("training_dataset", training_dataset).as_download()

Here, we define the parameter sampling (the search space for the hyperparameters we want to try) and the early termination policy (which stops poorly performing runs early). We then put these together in a `HyperDriveConfig` and execute it in a `HyperDriveStep`. Lastly, a short step registers the best model.

In [10]:
runconfig = RunConfiguration.load("runconfig.yml")
script_run_config = ScriptRunConfig(source_directory="./",
                                    run_config=runconfig)
script_run_config.data_references = None

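# Search space: draw the script argument `--c` uniformly from [0.1, 1.9]
ps = RandomParameterSampling(
    {
        '--c': uniform(0.1, 1.9)
    }
)
# Early termination: at every 2nd metric report, stop runs that fall outside
# the 10% slack relative to the best-performing run
early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Maximize the 'Test accuracy' metric logged by the training script,
# running at most 4 trials, one at a time
hd_config = HyperDriveConfig(run_config=script_run_config,
                             hyperparameter_sampling=ps,
                             policy=early_termination_policy,
                             primary_metric_name='Test accuracy',
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                             max_total_runs=4,
                             max_concurrent_runs=1)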

hd_step = HyperDriveStep(name='hyperparameter-tuning',
                         hyperdrive_config=hd_config,
                         estimator_entry_script_arguments=['--data-path', training_dataset_consumption],
                         inputs=[training_dataset_consumption],
                         outputs=None)

register_step = PythonScriptStep(script_name='register.py',
                                 runconfig=runconfig,
                                 name="register-model",
                                 compute_target="cluster",
                                 arguments=['--model_name', 'best_model'],
                                 allow_reuse=False)

# Explicitly state that registration runs after training, as there is no direct dependency through inputs/outputs
register_step.run_after(hd_step)

steps = [hd_step, register_step]
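
To illustrate what the sampling and early termination settings in the cell above mean, here is a small plain-Python sketch that does not use the Azure ML SDK. It assumes the documented bandit rule for a maximized metric: a run is stopped when its metric is below `best / (1 + slack_factor)`. The helper names and the accuracy values are made up for illustration.

```python
import random

SLACK_FACTOR = 0.1  # same value as in the BanditPolicy above


def sample_c(rng, low=0.1, high=1.9):
    """Draw one value for `--c`, mimicking uniform(0.1, 1.9)."""
    return rng.uniform(low, high)


def bandit_keeps_run(run_metric, best_metric, slack_factor=SLACK_FACTOR):
    """Return True if a run survives the bandit check (metric is maximized)."""
    return run_metric >= best_metric / (1 + slack_factor)


# Four samples, matching max_total_runs=4.
rng = random.Random(42)
sampled_c = [sample_c(rng) for _ in range(4)]
print("Sampled values for --c:", [round(c, 3) for c in sampled_c])

# Made-up accuracies reported by three runs at the same evaluation interval.
reported_accuracy = {"run_a": 0.80, "run_b": 0.74, "run_c": 0.71}
best_so_far = max(reported_accuracy.values())

# Threshold is 0.80 / 1.1, roughly 0.727:
# run_b (0.74) stays above it, run_c (0.71) falls below it.
decisions = {
    name: bandit_keeps_run(accuracy, best_so_far)
    for name, accuracy in reported_accuracy.items()
}
for name, keep in decisions.items():
    print(name, "continues" if keep else "would be stopped")
```

With `evaluation_interval=2`, the real policy applies this check only at every second metric report, which the sketch does not model.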

Finally, we can create our pipeline object and validate it. Validation checks that the inputs and outputs are properly linked and that the pipeline graph is acyclic:

In [11]:
pipeline = Pipeline(workspace=ws, steps=steps, description="HyperDrive Pipeline")
pipeline.validate()

Step hyperparameter-tuning is ready to be created [7d092c7b]
Step register-model is ready to be created [f65584f0]


[]
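
The acyclicity check mentioned above can be illustrated with a topological sort over the step dependencies. This is only a sketch of the idea (Kahn's algorithm), not how the SDK implements `validate()`; the `topological_order` helper and the dictionary representation of the graph are assumptions made for this example.

```python
from collections import deque


def topological_order(run_after):
    """Return the steps in an executable order.

    `run_after` maps each step name to the steps that must finish first.
    Raises ValueError if the dependencies contain a cycle.
    """
    steps = set(run_after)
    for deps in run_after.values():
        steps.update(deps)

    # Number of unfinished prerequisites per step, and the reverse edges.
    remaining = {step: len(set(run_after.get(step, []))) for step in steps}
    dependents = {step: [] for step in steps}
    for step, deps in run_after.items():
        for dep in set(deps):
            dependents[dep].append(step)

    # Start with the steps that have no prerequisites.
    ready = deque(sorted(step for step, count in remaining.items() if count == 0))
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in sorted(dependents[step]):
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    # Steps that never became ready are part of a cycle.
    if len(order) != len(steps):
        raise ValueError("dependency cycle detected")
    return order


# The dependency declared with register_step.run_after(hd_step).
graph = {
    "hyperparameter-tuning": [],
    "register-model": ["hyperparameter-tuning"],
}
order = topological_order(graph)
print(order)
```

If two steps were declared to run after each other, the function would raise a `ValueError`, which is the kind of mistake pipeline validation is meant to catch before submission.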

Lastly, we can submit the pipeline to an experiment:

In [12]:
pipeline_run = Experiment(ws, 'mlops-workshop-pipelines').submit(pipeline)

Created step hyperparameter-tuning [7d092c7b][35de8035-5f0d-48bc-8b0a-84e78cb0594e], (This step will run and generate new outputs)
Created step register-model [f65584f0][aa9c2c5e-84a4-4f6e-8d04-c0b5ab06edd2], (This step will run and generate new outputs)
Submitted PipelineRun 571b74c2-6fd4-4bfe-bf79-005f53b9be8b
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/mlops-workshop-pipelines/runs/571b74c2-6fd4-4bfe-bf79-005f53b9be8b?wsid=/subscriptions/bcbf34a7-1936-4783-8840-8f324c37f354/resourcegroups/demo/workspaces/demo-ent-ws


In [13]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

_PipelineWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', …

In [14]:
pipeline_run.wait_for_completion()

PipelineRunId: 571b74c2-6fd4-4bfe-bf79-005f53b9be8b
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/mlops-workshop-pipelines/runs/571b74c2-6fd4-4bfe-bf79-005f53b9be8b?wsid=/subscriptions/bcbf34a7-1936-4783-8840-8f324c37f354/resourcegroups/demo/workspaces/demo-ent-ws
PipelineRun Status: Running


StepRunId: 30a9425f-fca5-4f1e-84ba-59c92b4a3e5d
Link to Azure Machine Learning Portal: https://ml.azure.com/experiments/mlops-workshop-pipelines/runs/30a9425f-fca5-4f1e-84ba-59c92b4a3e5d?wsid=/subscriptions/bcbf34a7-1936-4783-8840-8f324c37f354/resourcegroups/demo/workspaces/demo-ent-ws
StepRun( hyperparameter-tuning ) Status: NotStarted
StepRun( hyperparameter-tuning ) Status: Running

StepRun(hyperparameter-tuning) Execution Summary
StepRun( hyperparameter-tuning ) Status: Finished
{'runId': '30a9425f-fca5-4f1e-84ba-59c92b4a3e5d', 'status': 'Completed', 'startTimeUtc': '2021-01-19T17:07:08.358896Z', 'endTimeUtc': '2021-01-19T17:13:55.626544Z', 'properties': {'azureml.ru


Streaming azureml-logs/70_driver_log.txt
bash: /azureml-envs/azureml_49a60276f107e2ecad8c77bf12c460f1/lib/libtinfo.so.5: no version information available (required by bash)
bash: /azureml-envs/azureml_49a60276f107e2ecad8c77bf12c460f1/lib/libtinfo.so.5: no version information available (required by bash)
2021/01/19 17:14:32 Attempt 1 of http call to http://10.0.0.5:16384/sendlogstoartifacts/info
2021/01/19 17:14:32 Attempt 1 of http call to http://10.0.0.5:16384/sendlogstoartifacts/status
[2021-01-19T17:14:33.647286] Entering context manager injector.
[context_manager_injector.py] Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath', 'RunHistory:context_managers.RunHistory', 'TrackUserError:context_managers.TrackUserError'], invocation=['register.py', '--model_name', 'best_model'])
Script type = None
Starting the daemon thread to refresh tokens in background for process with pid = 108
Entering Run History Context Manager.
[2021-01-19T17:14:38.5



PipelineRun Execution Summary
PipelineRun Status: Finished
{'runId': '571b74c2-6fd4-4bfe-bf79-005f53b9be8b', 'status': 'Completed', 'startTimeUtc': '2021-01-19T17:06:49.984857Z', 'endTimeUtc': '2021-01-19T17:15:37.604181Z', 'properties': {'azureml.runsource': 'azureml.PipelineRun', 'runSource': 'SDK', 'runType': 'SDK', 'azureml.parameters': '{}'}, 'inputDatasets': [], 'outputDatasets': [], 'logFiles': {'logs/azureml/executionlogs.txt': 'https://demoentws5367325393.blob.core.windows.net/azureml/ExperimentRun/dcid.571b74c2-6fd4-4bfe-bf79-005f53b9be8b/logs/azureml/executionlogs.txt?sv=2019-02-02&sr=b&sig=s%2FStopA4yERMNjoPzQEBwn53I1dI5%2F7f3LO9UnnoA3c%3D&st=2021-01-19T16%3A57%3A10Z&se=2021-01-20T01%3A07%3A10Z&sp=r', 'logs/azureml/stderrlogs.txt': 'https://demoentws5367325393.blob.core.windows.net/azureml/ExperimentRun/dcid.571b74c2-6fd4-4bfe-bf79-005f53b9be8b/logs/azureml/stderrlogs.txt?sv=2019-02-02&sr=b&sig=p8dWVx1o0vI%2FB3l6tlVyH3GD037kOTRnfStS%2BOMSOx8%3D&st=2021-01-19T16%3A57%3A10Z

'Finished'