how-to-use-sweep-in-pipeline.md

title: How to do hyperparameter sweep in pipeline
titleSuffix: Azure Machine Learning
description: How to use sweep to do hyperparameter tuning in Azure Machine Learning pipeline using CLI v2 and Python SDK
services: machine-learning
ms.service: machine-learning
ms.subservice: mlops
ms.topic: how-to
author: lgayhardt
ms.author: lagayhar
ms.reviewer: zhanxia
ms.date: 05/26/2022
ms.custom: devx-track-python, sdkv2, cliv2, update-code1

How to do hyperparameter tuning in pipeline (v2)

[!INCLUDE dev v2]

In this article, you'll learn how to do hyperparameter tuning in an Azure Machine Learning pipeline.

Prerequisite

  1. Understand what hyperparameter tuning is and how to do hyperparameter tuning in Azure Machine Learning using SweepJob.
  2. Understand what an Azure Machine Learning pipeline is.
  3. Build a command component that takes hyperparameters as inputs.

How to do hyperparameter tuning in Azure Machine Learning pipeline

This section explains how to do hyperparameter tuning in an Azure Machine Learning pipeline using CLI v2 and the Python SDK. Both approaches share the same prerequisite: you already have a command component, and that component takes hyperparameters as inputs. If you don't have a command component yet, create one first.

CLI v2

The example used in this article can be found in the azureml-examples repo. Navigate to azureml-examples/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep to check the example.

Assume you already have a command component defined in train.yml. A YAML file for a two-step pipeline job (train and predict) looks like the following.

:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/pipeline.yml" highlight="7-48":::

The sweep_step is the step for hyperparameter tuning. Its type must be sweep, and trial refers to the command component defined in train.yml. The search_space field shows that three hyperparameters (c_value, kernel, and coef0) are added to the search space. After you submit this pipeline job, Azure Machine Learning runs the trial component multiple times to sweep over the hyperparameters, based on the search space and the early termination policy you defined in sweep_step. See the sweep job YAML schema for the full schema of a sweep job.
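To make the idea of "running the trial multiple times over a search space" concrete, here is a minimal, standard-library sketch of random sampling. It is only an illustration of the concept, not how Azure Machine Learning implements sweep jobs. The three hyperparameter names come from the example; the ranges, the choice list, and the helper names `sample_trial` and `sample_trials` are assumptions made for this sketch.

```python
import random

# Hypothetical search space for the three hyperparameters named in the article.
# The ranges and choices are illustrative assumptions, not values read from pipeline.yml.
SEARCH_SPACE = {
    "c_value": ("uniform", 0.5, 0.9),
    "kernel": ("choice", ["rbf", "linear", "poly"]),
    "coef0": ("uniform", 0.1, 1.0),
}


def sample_trial(space, rng):
    """Draw one hyperparameter combination from the search space."""
    params = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "uniform":
            _, low, high = spec
            params[name] = rng.uniform(low, high)
        elif kind == "choice":
            params[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"Unsupported distribution: {kind}")
    return params


def sample_trials(space, max_total_trials, seed=0):
    """Return one sampled configuration per trial, up to the trial limit."""
    rng = random.Random(seed)
    return [sample_trial(space, rng) for _ in range(max_total_trials)]


# Each dictionary corresponds to the inputs of one run of the trial component.
trials = sample_trials(SEARCH_SPACE, max_total_trials=5)
for trial in trials:
    print(trial)
```

Each sampled dictionary plays the role of one child job's inputs. In a real sweep job, limits and an early termination policy can stop the sweep before every sampled configuration has finished.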

Below is the trial component definition (train.yml file).

:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/train.yml" highlight="11-16,23-25,60":::

The hyperparameters added to the search space in pipeline.yml must be inputs of the trial component. The source code of the trial component is in the ./train-src folder. In this example, it's a single train.py file. This code runs in every trial of the sweep job. Make sure the trial component's source code logs a metric whose name exactly matches the primary_metric value in the pipeline.yml file. This example uses mlflow.autolog(), which is the recommended way to track your ML experiments. See more about MLflow here.
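The reason the metric name has to match can be illustrated with a small sketch. A sweep ranks trials by the primary metric, so a trial that never logged a metric under that exact name can't be ranked. The function `best_trial`, the trial dictionaries, and the metric name `training_f1_score` are hypothetical and exist only for this illustration; this is not the service's actual selection logic.

```python
def best_trial(trials, primary_metric, goal="maximize"):
    """Pick the best trial by the primary metric.

    Each trial is a dict with a "metrics" mapping of logged metric names to values.
    Trials that never logged the primary metric cannot be ranked and are skipped.
    """
    if goal not in ("maximize", "minimize"):
        raise ValueError("goal must be 'maximize' or 'minimize'")
    ranked = [t for t in trials if primary_metric in t["metrics"]]
    if not ranked:
        raise KeyError(f"No trial logged a metric named {primary_metric!r}")
    pick = max if goal == "maximize" else min
    return pick(ranked, key=lambda t: t["metrics"][primary_metric])


# Hypothetical results from three child jobs.
results = [
    {"name": "trial_0", "metrics": {"training_f1_score": 0.81}},
    {"name": "trial_1", "metrics": {"training_f1_score": 0.93}},
    {"name": "trial_2", "metrics": {"training_f1_score": 0.88}},
]

print(best_trial(results, "training_f1_score")["name"])
```

With a misspelled name such as `"f1"`, no trial qualifies and the function raises `KeyError`, which mirrors the kind of mismatch the paragraph above warns about.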

The following code snippet is the source code of the trial component.

:::code language="python" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/train-src/train.py" highlight="15":::
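As a rough illustration of how a trial script can receive the swept hyperparameters, the sketch below parses them from command-line arguments with `argparse`. It is not the train.py from the example repo. The argument names follow the hyperparameters named in this article; the default values and the function name `parse_args` are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyperparameters that the sweep passes to each trial."""
    parser = argparse.ArgumentParser(description="Sketch of a trial script entry point")
    # Default values are illustrative assumptions.
    parser.add_argument("--c_value", type=float, default=1.0)
    parser.add_argument("--kernel", type=str, default="rbf")
    parser.add_argument("--coef0", type=float, default=0.0)
    return parser.parse_args(argv)


# Simulate the command line that one trial might receive.
args = parse_args(["--c_value", "0.7", "--kernel", "linear", "--coef0", "0.3"])
print(vars(args))
```

Because every trial runs the same script with different argument values, the script itself needs no knowledge of the search space.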

Python SDK

The Python SDK example can be found in the azureml-examples repo. Navigate to azureml-examples/sdk/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep to check the example.

In the Azure Machine Learning Python SDK v2, you can enable hyperparameter tuning for any command component by calling the .sweep() method.

The following code snippet shows how to enable sweep for train_model.

[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep/pipeline_with_hyperparameter_sweep.ipynb?name=enable-sweep)]

We first load train_component_func, which is defined in the train.yml file. When creating train_model, we add c_value, kernel, and coef0 to the search space (lines 15-17). Lines 30-35 define the primary metric, the sampling algorithm, and so on.
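For readers without the notebook at hand, the pattern can be sketched roughly as follows. This is not a copy of the notebook. It assumes the `azure-ai-ml` package, a component whose data input is named `data` and whose remaining inputs have defaults, a compute target called `cpu-cluster`, and a logged metric named `training_f1_score`. The wrapper function `build_sweep_step` and all numeric values are hypothetical.

```python
def build_sweep_step(train_component_func, training_data, compute="cpu-cluster"):
    """Turn a loaded command component into a sweep step.

    Intended to be called inside a function decorated with azure.ai.ml.dsl.pipeline.
    The input names, ranges, metric name, and limits are assumptions for illustration.
    """
    # Imported lazily so the sketch can be loaded without the SDK installed.
    from azure.ai.ml.sweep import Choice, Uniform

    # Passing distributions instead of fixed values adds those inputs to the search space.
    train_model = train_component_func(
        data=training_data,
        c_value=Uniform(min_value=0.5, max_value=0.9),
        kernel=Choice(values=["rbf", "linear", "poly"]),
        coef0=Uniform(min_value=0.1, max_value=1),
    )

    # .sweep() wraps the command step in a sweep step.
    sweep_step = train_model.sweep(
        primary_metric="training_f1_score",
        goal="maximize",
        sampling_algorithm="random",
        compute=compute,
    )

    # Cap the number of trials, the concurrency, and the total run time in seconds.
    sweep_step.set_limits(max_total_trials=20, max_concurrent_trials=10, timeout=7200)
    return sweep_step
```

The returned step's outputs can then be wired into downstream steps in the same way as the outputs of any other pipeline step.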

Check pipeline job with sweep step in Studio

After you submit a pipeline job, the SDK or CLI widget gives you a web URL to the Studio UI. By default, the link opens the pipeline graph view.

To check the details of the sweep step, double-click the sweep step and navigate to the child job tab in the panel on the right.

:::image type="content" source="./media/how-to-use-sweep-in-pipeline/pipeline-view.png" alt-text="Screenshot of the pipeline with child job and the train_model node highlighted." lightbox= "./media/how-to-use-sweep-in-pipeline/pipeline-view.png":::

This takes you to the sweep job page, as shown in the following screenshot. Navigate to the child job tab, where you can see the list of all child jobs and their metrics.

:::image type="content" source="./media/how-to-use-sweep-in-pipeline/sweep-job.png" alt-text="Screenshot of the job page on the child jobs tab." lightbox= "./media/how-to-use-sweep-in-pipeline/sweep-job.png":::

If a child job failed, select the name of that child job to open its detail page (see the following screenshot). Useful debugging information is under Outputs + Logs.

:::image type="content" source="./media/how-to-use-sweep-in-pipeline/child-run.png" alt-text="Screenshot of the output + logs tab of a child run." lightbox= "./media/how-to-use-sweep-in-pipeline/child-run.png":::

Sample notebooks

Next steps