# Build pipeline with sweep node

**Requirements** - In order to benefit from this tutorial, you will need:
- A basic understanding of Machine Learning
- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
- An Azure ML workspace. [Check this notebook for creating a workspace](/sdk/resources/workspace/workspace.ipynb) 
- A Compute Cluster. [Check this notebook to create a compute cluster](/sdk/resources/compute/compute.ipynb)
- A Python environment
- The Azure Machine Learning Python SDK v2, installed - [install instructions](/sdk/README.md#getting-started)

**Learning Objectives** - By the end of this tutorial, you should be able to:
- Connect to your AML workspace from the Python SDK
- Create a sweep node with `commandComponent.sweep()`
- Create a `pipeline` with a sweep node

**Motivations** - This notebook explains how to create a sweep node with `commandComponent.sweep()` and use it in a pipeline. A sweep node enables hyperparameter tuning for a specific command component on a specified compute (either local or in the cloud). The command component accepts an `environment` to set up the required infrastructure. You define a `search_space` of candidate hyperparameter values and an `objective` (a metric and whether to maximize or minimize it), and the sweep searches for the values that give the best result.
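
To make the roles of the search space and the objective concrete, here is a minimal pure-Python sketch of what a random-sampling sweep does conceptually. It is not how Azure Machine Learning implements sweeps; `sample_trial`, `run_sweep` and `fake_accuracy` are hypothetical names made up for illustration, and the "accuracy" is a stand-in formula rather than a trained model.

```python
import random

# Hypothetical search space: each hyperparameter maps to a list of candidate
# values, similar in spirit to Choice([1, 2, 3]) used later in this notebook.
search_space = {"epochs": [1, 2, 3]}


def sample_trial(space, rng):
    """Draw one value per hyperparameter, as a random sampling algorithm would."""
    return {name: rng.choice(values) for name, values in space.items()}


def run_sweep(space, evaluate, max_total_trials, goal="maximize", seed=0):
    """Evaluate randomly sampled trials and return the best (params, metric) pair."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_total_trials):
        params = sample_trial(space, rng)
        trials.append((params, evaluate(params)))
    pick = max if goal == "maximize" else min
    return pick(trials, key=lambda trial: trial[1])


def fake_accuracy(params):
    # Stand-in for a real training run (assumption: more epochs score higher).
    return 0.90 + 0.02 * params["epochs"]


best_params, best_metric = run_sweep(search_space, fake_accuracy, max_total_trials=5)
print(best_params, round(best_metric, 2))
```

In the real pipeline, each trial is a run of the command component on the chosen compute, and the metric is whatever the training script logs under the objective's name.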

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

## 1.1. Import the required libraries

In [None]:
#import required libraries
from azure.identity import InteractiveBrowserCredential
from azure.ml import MLClient, dsl

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ml` to get a handle to the required Azure Machine Learning workspace. We use the default [interactive authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.interactivebrowsercredential?view=azure-python) for this tutorial. More advanced connection methods can be found [here](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).
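
If you would rather not hard-code the three identifiers in the notebook, one option is to read them from environment variables and fail early when any is missing. The sketch below assumes hypothetical variable names (`AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, `AZURE_ML_WORKSPACE`); they are not defined by the SDK, so rename them to match your setup.

```python
import os

# Hypothetical environment variable names; they are not part of the SDK.
ENV_NAMES = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group": "AZURE_RESOURCE_GROUP",
    "workspace": "AZURE_ML_WORKSPACE",
}


def load_workspace_details(env=None):
    """Return the three workspace identifiers, failing fast if any is missing."""
    env = os.environ if env is None else env
    details = {key: env.get(name, "").strip() for key, name in ENV_NAMES.items()}
    missing = [ENV_NAMES[key] for key, value in details.items() if not value]
    if missing:
        raise ValueError(f"Missing workspace settings: {', '.join(missing)}")
    return details


# Example with an explicit mapping instead of the real environment.
example = load_workspace_details(
    {
        "AZURE_SUBSCRIPTION_ID": "<SUBSCRIPTION_ID>",
        "AZURE_RESOURCE_GROUP": "<RESOURCE_GROUP>",
        "AZURE_ML_WORKSPACE": "<AML_WORKSPACE_NAME>",
    }
)
print(example["workspace"])
```

The returned values can then be passed to `MLClient` in the same order as the variables defined in the cells that follow.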

In [None]:
#Enter details of your AML workspace
subscription_id = '<SUBSCRIPTION_ID>'
resource_group = '<RESOURCE_GROUP>'
workspace = '<AML_WORKSPACE_NAME>'

In [None]:
#get a handle to the workspace
ml_client = MLClient(InteractiveBrowserCredential(), subscription_id, resource_group, workspace)

# 2. Create command component
We define a command component in [dsl_component.py](dsl_component.py) to serve as the trial component. It trains a neural network for MNIST image classification with TensorFlow. The component function is `tf_func`.


In [None]:
%load_ext autoreload
%autoreload 2

from dsl_component import tf_func
help(tf_func)

# 3. Pipeline job with sweep node

## 3.1. Build pipeline
Enable hyperparameter tuning for an ordinary command component by calling `sweep()` on it.

In [None]:
from azure.ml import dsl
from dsl_component import tf_func
from azure.ml.entities import BanditPolicy, Choice


def generate_dsl_pipeline_with_sweep_node():
    @dsl.pipeline(
        description="Tune hyperparameters using TF component",
    )
    def sample_pipeline():
        tf_job = tf_func(
            epochs=Choice([1, 2, 3]),
            steps_per_epoch=70,
            per_worker_batch_size=64
        )
        tf_job.outputs.trained_model_output.mode = "upload"
        
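        # Turn the command node into a sweep node that searches over `epochs`
        sweep_job = tf_job.sweep(
            objective_primary_metric="accuracy",
            objective_goal="maximize",
            sampling_algorithm="random",
            compute="cpu-cluster",
        )
        # Cap the number of trials, their concurrency, and the overall run time
        sweep_job.set_limits(max_total_trials=2, max_concurrent_trials=3, timeout=600)
        # Stop trials early when they fall too far behind the best one so far
        sweep_job.early_termination = BanditPolicy(evaluation_interval=2, slack_factor=0.1, delay_evaluation=1)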

    pipeline = sample_pipeline()
    return pipeline

pipeline = generate_dsl_pipeline_with_sweep_node()
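
The `BanditPolicy(evaluation_interval=2, slack_factor=0.1, delay_evaluation=1)` setting above controls early termination. The following sketch is a simplified model of that rule for a maximize goal only, based on the documented behaviour: the policy is applied at intervals that are multiples of `evaluation_interval` and at least `delay_evaluation`, and a trial is stopped when its metric falls below the best metric divided by `1 + slack_factor`. The function names are hypothetical, and this is not the service's actual implementation.

```python
def policy_applies(interval, evaluation_interval=2, delay_evaluation=1):
    """Return True if the policy would be checked at this reporting interval."""
    return interval >= delay_evaluation and interval % evaluation_interval == 0


def should_terminate(run_metric, best_metric, slack_factor=0.1):
    """Maximize goal only: stop a trial whose metric is outside the slack."""
    threshold = best_metric / (1 + slack_factor)
    return run_metric < threshold


# With the best trial at 0.80 accuracy and slack_factor=0.1, the cut-off is
# 0.80 / 1.1, roughly 0.727.
for interval, metric in [(1, 0.60), (2, 0.70), (2, 0.75)]:
    if policy_applies(interval):
        print(interval, metric, should_terminate(metric, best_metric=0.80))
    else:
        print(interval, metric, "policy not applied at this interval")
```

A larger `slack_factor` tolerates trials that lag further behind the leader, while a larger `delay_evaluation` gives every trial more time before the first check.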

## 3.2. Submit pipeline job with sweep node

In [None]:
# submit job to workspace
pipeline_job = ml_client.jobs.create_or_update(pipeline, experiment_name="tf_mnist_sweep")
print(f'Job link: {pipeline_job.services["Studio"].endpoint}')

In [None]:
# Wait until the job completes
# ml_client.jobs.stream(pipeline_job.name)

# Next Steps
You can find more examples of running pipeline jobs [here](/sdk/jobs/pipelines/).