# Builder functions in SDK v2

**Motivations** 

- Builder functions are a convenient and flexible way to define jobs and components in Python.
- Builder functions make it easy to switch from a command job to a sweep job, and from a command pipeline step to a sweep pipeline step.
- Builder functions let you use the same syntax to compose standalone jobs and pipeline jobs.

# 1. Define an MLClient to connect to your workspace

In [40]:
# Import the required libraries
from azure.ml import MLClient, dsl, command, Input, Output
from azure.ml.sweep import Uniform, Choice
from azure.identity import DefaultAzureCredential

# Positional arguments after the credential: subscription ID, resource group, workspace name
ml_client = MLClient(
    DefaultAzureCredential(),
    "d128f140-94e6-4175-87a7-954b9d27db16",
    "ragargeastus2euap",
    "hanchi-test2",
)
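
Hard-coding the subscription ID, resource group and workspace name ties the notebook to a single workspace. A minimal sketch of one alternative, assuming the three values are supplied through environment variables. The variable names `AZUREML_SUBSCRIPTION_ID`, `AZUREML_RESOURCE_GROUP` and `AZUREML_WORKSPACE_NAME` and the helper `workspace_settings_from_env` are hypothetical and not part of the SDK.

```python
import os


def workspace_settings_from_env(env=None):
    """Collect the three workspace identifiers from environment variables.

    The variable names below are hypothetical; use whatever your team prefers.
    """
    env = os.environ if env is None else env
    names = {
        "subscription_id": "AZUREML_SUBSCRIPTION_ID",
        "resource_group": "AZUREML_RESOURCE_GROUP",
        "workspace_name": "AZUREML_WORKSPACE_NAME",
    }
    # Fail early with a clear message instead of connecting with empty values.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned values can then be passed to `MLClient` after the credential, in the same order as in the cell above.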

# 2. A convenient way to create a command job using the `command()` function

In [46]:
train_job=command(
    code='./src',
    command = 'python main.py --iris-csv ${{inputs.iris_csv}} --learning-rate ${{inputs.learning_rate}} --boosting ${{inputs.boosting}} --output ${{outputs.my_model}}',
    environment='AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu@latest',
    inputs={
        'iris_csv': Input(type='uri_file', path='https://azuremlexamples.blob.core.windows.net/datasets/iris.csv'),
        'learning_rate': 0.9,
        'boosting': 'gbdt'},
    outputs={
        'my_model': Output(type='uri_file')
    },
    compute='cpu-cluster',
    display_name='lightgbm-iris-example',
    experiment_name='lightgbm-iris-example',
    description='Train a LightGBM model on the Iris dataset.'
)

In [47]:
ml_client.create_or_update(train_job)

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| lightgbm-iris-example | elated_shark_1sttbd5v17 | command | Starting | Link to Azure Machine Learning studio |
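
The `command` string above refers to its inputs and outputs through `${{inputs.<name>}}` and `${{outputs.<name>}}` placeholders. To make the substitution concrete, here is a plain-Python sketch of how such a template could be filled in. It is an illustration only, not the service's actual resolver; the helper `render_command` and the `/mnt/...` paths are hypothetical.

```python
import re

# Matches ${{inputs.name}} or ${{outputs.name}}
_PLACEHOLDER = re.compile(r"\$\{\{\s*(inputs|outputs)\.(\w+)\s*\}\}")


def render_command(template, inputs, outputs):
    """Replace each ${{inputs.x}} / ${{outputs.y}} with a concrete value."""
    values = {"inputs": inputs, "outputs": outputs}

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        # Raises KeyError if the template uses a name that was never declared.
        return str(values[kind][name])

    return _PLACEHOLDER.sub(substitute, template)


template = (
    "python main.py --iris-csv ${{inputs.iris_csv}} "
    "--learning-rate ${{inputs.learning_rate}} "
    "--boosting ${{inputs.boosting}} --output ${{outputs.my_model}}"
)
rendered = render_command(
    template,
    # Hypothetical local paths standing in for mounted data locations.
    inputs={"iris_csv": "/mnt/in/iris.csv", "learning_rate": 0.9, "boosting": "gbdt"},
    outputs={"my_model": "/mnt/out/model"},
)
print(rendered)
```

Every placeholder must match a key declared in `inputs` or `outputs`; in this sketch an undeclared name raises a `KeyError`.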


# 3. Hyperparameter tuning made easy

In [33]:
# Reuse train_job from above by calling it as a function with the inputs to override.
# 'iris_csv' is not passed, so the value defined above is used again.
command_job_for_sweep = train_job(
    learning_rate=Uniform(min_value=0.01, max_value=0.9),
    boosting=Choice(values=['gbdt', 'dart']),
)

sweep_job = command_job_for_sweep.sweep(
    compute='cpu-cluster',
    sampling_algorithm='random',
    primary_metric='test-multi_logloss',
    goal='Minimize'
)

sweep_job.display_name = "lightgbm-iris-sweep-example"
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=10, timeout=7200)
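
Calling `train_job(...)` with only some of its inputs, as in the cell above, keeps the remaining inputs at their earlier values. The toy class below sketches that call-to-override pattern in plain Python. `CommandSketch` is a hypothetical stand-in written for this illustration and does not reflect how the SDK implements it.

```python
import copy


class CommandSketch:
    """Toy stand-in for a command builder; not the SDK class."""

    def __init__(self, command, inputs):
        self.command = command
        self.inputs = dict(inputs)

    def __call__(self, **overrides):
        # Reject names that were never declared as inputs.
        unknown = set(overrides) - set(self.inputs)
        if unknown:
            raise TypeError(f"Unknown inputs: {sorted(unknown)}")
        # Return a new object so the original definition stays reusable.
        node = copy.copy(self)
        node.inputs = {**self.inputs, **overrides}
        return node


base = CommandSketch(
    "python main.py",
    inputs={"iris_csv": "iris.csv", "learning_rate": 0.9, "boosting": "gbdt"},
)
node = base(learning_rate=0.01)
print(node.inputs)
```

In this sketch `base` is left untouched, so the same definition can be reused for a standalone job, a sweep, or a pipeline step.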

In [34]:
ml_client.create_or_update(sweep_job)

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| 1d_pipeline_with_non_python_components | ca76b928-e00f-44ee-a362-5e2a6cb4525e | sweep | Running | Link to Azure Machine Learning studio |
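
With `sampling_algorithm='random'`, each trial gets its hyperparameter values drawn at random from the search space. The sketch below mimics that idea locally for the `Uniform` and `Choice` ranges used above. It is an assumption-laden illustration: the dictionary format and the helper `sample_trials` are hypothetical, and the service's own sampler is not shown in this notebook.

```python
import random


def sample_trials(space, max_total_trials, seed=0):
    """Draw one random configuration per trial from a simple search space."""
    rng = random.Random(seed)  # Seeded so that repeated runs give the same draws.
    trials = []
    for _ in range(max_total_trials):
        trial = {}
        for name, spec in space.items():
            if spec["type"] == "uniform":
                trial[name] = rng.uniform(spec["min_value"], spec["max_value"])
            elif spec["type"] == "choice":
                trial[name] = rng.choice(spec["values"])
            else:
                raise ValueError(f"Unsupported distribution: {spec['type']}")
        trials.append(trial)
    return trials


# Same ranges as the sweep above, written in a hypothetical dictionary format.
space = {
    "learning_rate": {"type": "uniform", "min_value": 0.01, "max_value": 0.9},
    "boosting": {"type": "choice", "values": ["gbdt", "dart"]},
}
trials = sample_trials(space, max_total_trials=20)
print(len(trials))
```

Each entry in `trials` corresponds to one run of `main.py` with a different `--learning-rate` and `--boosting` combination.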


# 4. Reuse the same code to compose pipeline steps

In [55]:
train_job=command(
    code='./src',
    command = 'python main.py --iris-csv ${{inputs.iris_csv}} --output ${{outputs.my_model}}',
    environment='AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu@latest',
    inputs={
        'iris_csv': Input(type='uri_file', path='https://azuremlexamples.blob.core.windows.net/datasets/iris.csv')
    },
    outputs={
        'my_model': Output(type='uri_file')
    },
    compute='cpu-cluster',
    display_name='lightgbm-iris-example',
    experiment_name='lightgbm-iris-example',
    description='Train a LightGBM model on the Iris dataset.'
)

In [56]:
download_url = "https://azuremlexamples.blob.core.windows.net/datasets/iris.csv"

# Create a command component that downloads the input data
download_data = command(
    display_name="Download Data",
    description="Download a input from remote URL and return it in output.",
    tags=dict(),
    command="curl ${{inputs.url}} > ${{outputs.output_file}}",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:1",
    inputs=dict(
        url=download_url # Default download URL
    ),
    outputs=dict(
        output_file=Output(type="uri_file", mode="rw_mount")
    )
)

# 5. Create the pipeline job using the components

In [57]:
compute = 'cpu-cluster'

@dsl.pipeline(
    description="The hello world pipeline job",
    tags={"owner": "sdkteam", "tag": "tagvalue"},
    default_compute=compute,
)
def pipeline_with_non_python_components(url):
    download_data_node = download_data(url=url)
    train_node = train_job(iris_csv=download_data_node.outputs.output_file)
    
    return {
        "model_file": train_node.outputs.my_model
    }

pipeline = pipeline_with_non_python_components(download_url)
print(pipeline)

name: upbeat_panda_nr6vmsvdpd
display_name: pipeline_with_non_python_components
description: The hello world pipeline job
type: pipeline
inputs:
  url: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
outputs:
  model_file: null
tags:
  owner: sdkteam
  tag: tagvalue
compute: azureml:cpu-cluster
experiment_name: 1d_pipeline_with_non_python_components
settings:
  force_rerun: false
properties: {}
jobs:
  download_data_node:
    $schema: '{}'
    type: command
    inputs:
      url: ${{parent.inputs.url}}
    outputs:
      output_file:
        mode: rw_mount
        type: uri_file
    command: curl ${{inputs.url}} > ${{outputs.output_file}}
    environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:1
    component:
      name: 7fae2fe1-db85-4a4e-bb8e-351df8122fc0
      version: '1'
      display_name: Download Data
      description: Download a input from remote URL and return it in output.
      type: command
      inputs:
        url:
          type: string
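
In the pipeline function above, `train_node` consumes the output of `download_data_node`, so the download step has to finish first. The sketch below shows how such an execution order can be derived from data dependencies with the standard library's `graphlib` (Python 3.9+). It is a local illustration only; how the service schedules steps is not covered in this notebook.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes.
dependencies = {
    "download_data_node": set(),
    "train_node": {"download_data_node"},
}

# static_order() yields every step after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['download_data_node', 'train_node']
```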
    

In [58]:
pipeline_job = ml_client.jobs.create_or_update(pipeline, experiment_name="pipeline_samples")

In [59]:
pipeline_job

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| pipeline_samples | upbeat_panda_nr6vmsvdpd | pipeline | Preparing | Link to Azure Machine Learning studio |


In [60]:
ml_client.components.get("4307b8d5-904b-419b-bd2b-0e8b25217d5a", "1")

ResourceNotFoundError: (UserError) Not found component 4307b8d5-904b-419b-bd2b-0e8b25217d5a.
Code: UserError
Message: Not found component 4307b8d5-904b-419b-bd2b-0e8b25217d5a.
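
A lookup such as the one above raises an exception when the component does not exist. One way to handle this is a small wrapper that turns the "not found" case into `None`. The sketch below is generic and self-contained: `get_or_none`, the `registry` dictionary and `lookup` are hypothetical stand-ins, and the exception type to catch is passed in by the caller.

```python
def get_or_none(getter, *args, not_found=(LookupError,)):
    """Call getter(*args); return None if it raises a 'not found' error."""
    try:
        return getter(*args)
    except not_found:
        return None


# Hypothetical in-memory registry standing in for a remote component store.
registry = {("download_data", "1"): "component-object"}


def lookup(name, version):
    return registry[(name, version)]  # Raises KeyError when the entry is absent.


found = get_or_none(lookup, "download_data", "1")
missing = get_or_none(lookup, "4307b8d5-904b-419b-bd2b-0e8b25217d5a", "1")
print(found, missing)
```

With the SDK, the getter would presumably be `ml_client.components.get` and `not_found` the `ResourceNotFoundError` class named in the traceback above (available as `azure.core.exceptions.ResourceNotFoundError`); treat that pairing as an assumption to verify against your installed version.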