# Builder functions in SDK v2

**Motivations** 

- Builder functions are a convenient and flexible way to define jobs and components in Python.
- Builder functions make it easy to switch from a command job to a sweep job, and from a command pipeline step to a sweep pipeline step.
- Builder functions use the same syntax to compose standalone jobs and pipeline jobs.

# 1. Define an MLClient to connect to the workspace

In [21]:
# Import required libraries
from azure.ml import MLClient, dsl, command, Input, Output
from azure.ml.entities import Environment, BuildContext
from azure.ml.sweep import Uniform
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    "d128f140-94e6-4175-87a7-954b9d27db16",  # subscription ID
    "ragargeastus2euap",  # resource group
    "hanchi-test2",  # workspace name
)
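
Hard-coding the workspace identifiers ties the notebook to a single workspace. A minimal sketch of one alternative is to read them from environment variables. The variable names and the `workspace_settings` helper below are hypothetical, chosen only for this illustration; the SDK does not define or read them.

```python
import os

# Hypothetical environment variable names chosen for this sketch;
# the SDK does not define or read them.
REQUIRED_VARS = {
    "subscription_id": "AZUREML_SUBSCRIPTION_ID",
    "resource_group_name": "AZUREML_RESOURCE_GROUP",
    "workspace_name": "AZUREML_WORKSPACE_NAME",
}


def workspace_settings(env=None):
    """Collect the three workspace identifiers from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in REQUIRED_VARS.items()}
```

The returned dictionary could then be unpacked into the client, for example `MLClient(DefaultAzureCredential(), **workspace_settings())`, assuming the constructor accepts these keyword names.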

# 2. A convenient way to create a command job using the `command()` function

In [22]:
train_data_with_r = command(
    description="Train data with R.",
    command="Rscript train.R --data_folder ${{inputs.iris}}",
    environment=Environment(build=BuildContext(path="docker_context")),
    code="./src",
    inputs=dict(
        iris=Input(type="uri_file", path="https://azuremlexamples.blob.core.windows.net/datasets/iris.csv"),
        lr=0.01
    ),
    compute="cpu-cluster"
)
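
The `${{inputs.iris}}` expression in the command string is a placeholder that is replaced with the resolved input value when the job runs. The sketch below models that substitution locally with a regular expression. It illustrates the idea only and is not how the service implements it; `render_command` and the local path are hypothetical.

```python
import re

# Matches placeholders such as ${{inputs.iris}} or ${{outputs.output_file}}.
PLACEHOLDER = re.compile(r"\$\{\{\s*(inputs|outputs)\.(\w+)\s*\}\}")


def render_command(template, inputs=None, outputs=None):
    """Replace each ${{inputs.x}} / ${{outputs.x}} with the matching value."""
    values = {"inputs": inputs or {}, "outputs": outputs or {}}

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        if name not in values[kind]:
            raise KeyError(f"no value for {kind}.{name}")
        return str(values[kind][name])

    return PLACEHOLDER.sub(substitute, template)


rendered = render_command(
    "Rscript train.R --data_folder ${{inputs.iris}}",
    inputs={"iris": "/mnt/data/iris.csv"},  # hypothetical local path
)
print(rendered)  # Rscript train.R --data_folder /mnt/data/iris.csv
```

An input reaches the script only if the command string references it. In the job above, `lr` is declared as an input, but the command shown references only `inputs.iris`.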

In [23]:
ml_client.create_or_update(train_data_with_r)

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| 1d_pipeline_with_non_python_components | icy_roti_mp4zl17fyf | command | Starting | Link to Azure Machine Learning studio |


# 3. Hyperparameter tuning made easy

In [24]:
# Calling train_data_with_r like a function creates a node with the given inputs overridden
node_to_sweep = train_data_with_r(lr=Uniform(min_value=0.01, max_value=0.9))

sweep_job = node_to_sweep.sweep(
    sampling_algorithm='random',
    primary_metric='test-multi_logloss',
    goal='Minimize'
)

sweep_job.display_name = "lightgbm-iris-sweep-example"
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=10, timeout=7200)
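
Calling `train_data_with_r(lr=...)` relies on the builder object being callable: the call yields a node whose inputs differ from the original only where overrides were passed. The sketch below shows that pattern with a toy class. `CommandNode` is a hypothetical stand-in, not the SDK class, and whether the real object copies or validates in exactly this way is an assumption.

```python
import copy


class CommandNode:
    """Toy stand-in for the object returned by a builder function (not the SDK class)."""

    def __init__(self, command, inputs):
        self.command = command
        self.inputs = dict(inputs)

    def __call__(self, **overrides):
        # Reject names that were never declared as inputs.
        unknown = set(overrides) - set(self.inputs)
        if unknown:
            raise TypeError(f"unknown inputs: {sorted(unknown)}")
        # Return a new node so the original definition stays reusable.
        node = copy.copy(self)
        node.inputs = {**self.inputs, **overrides}
        return node


base = CommandNode("Rscript train.R", {"iris": "iris.csv", "lr": 0.01})
tuned = base(lr=0.5)
print(base.inputs["lr"], tuned.inputs["lr"])  # 0.01 0.5
```

Because the original node is left untouched in this sketch, the same definition can serve as a standalone job, as the basis of a sweep, and as a pipeline step.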

In [25]:
ml_client.create_or_update(sweep_job)

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| 1d_pipeline_with_non_python_components | sharp_plate_fzthlrqmlz | sweep | Running | Link to Azure Machine Learning studio |
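
Conceptually, a random sweep with `goal='Minimize'` draws a value for each hyperparameter per trial, scores each trial on the primary metric, and keeps the lowest score. The sketch below is a local, sequential model of that loop. `run_sweep` and `toy_metric` are hypothetical names, and the metric is a stand-in rather than anything a real training run would report. A real sweep also runs trials concurrently and enforces a timeout, which this sketch omits.

```python
import random


def run_sweep(search_space, objective, max_total_trials, goal="Minimize", seed=None):
    """Randomly sample each hyperparameter, score every trial, and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_total_trials):
        # One uniform draw per hyperparameter, like Uniform(min_value, max_value).
        params = {name: rng.uniform(low, high) for name, (low, high) in search_space.items()}
        trials.append((objective(params), params))
    pick = min if goal == "Minimize" else max
    return pick(trials, key=lambda trial: trial[0]), trials


def toy_metric(params):
    """Stand-in for the primary metric; a real sweep reads it from the training run."""
    return (params["lr"] - 0.1) ** 2


best, trials = run_sweep({"lr": (0.01, 0.9)}, toy_metric, max_total_trials=20, seed=0)
print(len(trials), best[1])
```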


# 4. Reuse the same code to compose pipeline steps

In [29]:
download_url = "https://azuremlexamples.blob.core.windows.net/datasets/iris.csv"
environment = "AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:1"

# Create a command component that downloads the input data
download_data = command(
    display_name="Download Data",
    description="Download an input from a remote URL and return it as an output.",
    tags=dict(),
    command="curl ${{inputs.url}} > ${{outputs.output_file}}",
    environment=environment,
    inputs=dict(
        url=download_url # Default download URL
    ),
    outputs=dict(
        output_file=Output(type="uri_file", mode="rw_mount")
    )
)
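
For local debugging, it can help to reproduce what the `curl ${{inputs.url}} > ${{outputs.output_file}}` step does without submitting a job. The sketch below is a plain-Python equivalent that uses only the standard library. The `download` helper is hypothetical and not part of the SDK.

```python
import shutil
import urllib.request


def download(url, output_file):
    """Stream the resource at `url` into `output_file` and return the path."""
    with urllib.request.urlopen(url) as response, open(output_file, "wb") as target:
        shutil.copyfileobj(response, target)
    return output_file
```

Calling `download(download_url, "iris.csv")` would fetch the same file the component writes to its `output_file` output, assuming the URL is reachable from the local machine.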

# 5. Create the pipeline job using the components

In [32]:
compute = 'cpu-cluster'

@dsl.pipeline(
    description="The hello world pipeline job",
    tags={"owner": "sdkteam", "tag": "tagvalue"},
    default_compute=compute,
)
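def pipeline_with_non_python_components(url):
    # Step 1: download the data from the URL passed to the pipeline.
    download_data_node = download_data(url=url)
    # Step 2: feed the downloaded file to the R training step.
    train_node = train_data_with_r(iris=download_data_node.outputs.output_file)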

pipeline = pipeline_with_non_python_components(download_url)
print(pipeline)

name: goofy_ocean_mhqtdmdmnc
display_name: pipeline_with_non_python_components
description: The hello world pipeline job
type: pipeline
inputs:
  url: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
outputs: {}
tags:
  owner: sdkteam
  tag: tagvalue
compute: azureml:cpu-cluster
properties: {}
jobs:
  download_data_node:
    $schema: '{}'
    type: command
    inputs:
      url: ${{parent.inputs.url}}
    outputs:
      output_file:
        mode: rw_mount
        type: uri_file
    command: curl ${{inputs.url}} > ${{outputs.output_file}}
    environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:1
    code: /subscriptions/d128f140-94e6-4175-87a7-954b9d27db16/resourceGroups/ragargeastus2euap/providers/Microsoft.MachineLearningServices/workspaces/hanchi-test2/codes/df9605e9-a11d-4f66-b520-54a2aae24694/versions/1
    environment_variables: {}
    component:
      name: 07de804c-e1e6-8d74-2e44-a6750b179343
      version: '1'
      display_name: Download Data
      de

In [None]:
pipeline_job = ml_client.jobs.create_or_update(pipeline, experiment_name="pipeline_samples")
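
Inside the pipeline function, passing `download_data_node.outputs.output_file` to the next step does not move any data. It records a binding between the two steps, in the same way the printed job shows `url: ${{parent.inputs.url}}` for the first step. The sketch below models that wiring with a toy `Node` class. The class is hypothetical, and the `${{parent.jobs.<step>.outputs.<name>}}` form is an assumption here, since the printed output is cut off before the training step.

```python
class Node:
    """Toy pipeline step; `outputs` maps each output name to a binding expression."""

    def __init__(self, name, inputs, output_names=()):
        self.name = name
        self.inputs = inputs
        self.outputs = {
            out: "${{parent.jobs.%s.outputs.%s}}" % (name, out) for out in output_names
        }


def build_pipeline(url_binding="${{parent.inputs.url}}"):
    """Wire a download step into a training step and return each step's inputs."""
    download = Node("download_data_node", {"url": url_binding}, ["output_file"])
    train = Node("train_node", {"iris": download.outputs["output_file"]})
    return {node.name: node.inputs for node in (download, train)}


jobs = build_pipeline()
print(jobs["train_node"]["iris"])
# ${{parent.jobs.download_data_node.outputs.output_file}}
```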