# Build a Pipeline with Components from YAML

**Requirements** - In order to benefit from this tutorial, you will need:
- A basic understanding of Machine Learning
- An Azure account with an active subscription - [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
- An Azure ML workspace with a compute cluster - [Configure workspace](../../configuration.ipynb)
- A Python environment
- Installed Azure Machine Learning Python SDK v2 - [install instructions](../../../README.md) - check the getting started section

**Learning Objectives** - By the end of this tutorial, you should be able to:
- Connect to your AML workspace from the Python SDK
- Define and load a `CommandComponent` from YAML, or load a `CommandComponent` defined with `mldesigner`
- Create a `Pipeline` using the loaded components

**Motivations** - This notebook covers the scenario in which a user defines components in YAML and then uses those components to build a pipeline.

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

### 1.1 Import the required libraries

In [None]:
# Import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

from azure.ai.ml import MLClient, Input
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import load_component

### 1.2 Configure credential

We use `DefaultAzureCredential` to get access to the workspace.
`DefaultAzureCredential` should be capable of handling most Azure SDK authentication scenarios.

If it does not work for you, see the following references for other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).

In [None]:
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

### 1.3 Get a handle to the workspace

We use a config file to connect to the workspace. The Azure ML workspace should be configured with a compute cluster. [See this notebook for how to configure a workspace](../../configuration.ipynb)
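
For orientation, here is a minimal sketch of the config file that `MLClient.from_config` reads: a `config.json` containing the keys `subscription_id`, `resource_group`, and `workspace_name`. The helper name `write_workspace_config` and the placeholder values are hypothetical and only serve the illustration; the sketch writes to a temporary folder so it does not touch a real configuration.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write a config.json holding the three workspace identifiers (hypothetical helper)."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(folder) / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


# Placeholder values for illustration only; substitute your own identifiers.
demo_dir = tempfile.mkdtemp()
config_path = write_workspace_config(
    demo_dir, "<subscription-id>", "<resource-group>", "<workspace-name>"
)
print(config_path.read_text())
```

With a file like this in the working directory (or a parent directory), `MLClient.from_config(credential=credential)` can locate the workspace without the identifiers being hard-coded in the notebook.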

In [None]:
# Get a handle to workspace
ml_client = MLClient.from_config(credential=credential)

# Retrieve an already attached Azure Machine Learning Compute.
cluster_name = "cpu-cluster"
print(ml_client.compute.get(cluster_name))

# 2. Define and Load Components from YAML
(The primitive outputs are defined in the YAML file)
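
The YAML file and the script it runs are not reproduced in this notebook. As a rough, hypothetical sketch of what such a script could look like, the code below assumes that each primitive output reaches the command as a file path and that the script writes the value into that file as text. The argument names `--input_int` and `--output_int` mirror the port names used in the pipeline in this notebook; the functions `parse_args`, `write_primitive`, and `main` are illustrative and are not taken from the sample's actual script.

```python
import argparse
import tempfile
from pathlib import Path


def parse_args(argv=None):
    """Parse one primitive input and the path of the matching output (assumed layout)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_int", type=int)
    parser.add_argument("--output_int", type=str)  # assumed to be a file path
    return parser.parse_args(argv)


def write_primitive(path, value):
    """Write a primitive value as plain text to the given output path."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(str(value))


def main(argv=None):
    args = parse_args(argv)
    # Pass the input straight through so a downstream node can read the value.
    write_primitive(args.output_int, args.input_int)


# Local demonstration with a temporary path standing in for the output location.
demo_path = Path(tempfile.mkdtemp()) / "output_int"
main(["--input_int", "1", "--output_int", str(demo_path)])
print(demo_path.read_text())
```

Under these assumptions, a node that skips the write step would leave its output empty, which is why the input has to be written to the output explicitly before a downstream node can consume it.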

### 2.1 Load components from YAML

In [None]:
command_component_fun = load_component(source="./command_component_with_primitive_output.yaml")

In [None]:
print(command_component_fun)

### 2.2 Build pipeline

In [None]:
@pipeline()
def pipeline_with_components_use_primitive_output():
    component_node1 = command_component_fun(input_int=1, input_float=1.1,
                                            input_str='str', input_bool=True)
    
    # Note: the component must write the input received from the previous node to its
    # output so that the next node obtains the actual value.
    # See "primary.py" for details.
    component_node2 = command_component_fun(input_int=component_node1.outputs.output_int,
                                            input_float=component_node1.outputs.output_float,
                                            input_str=component_node1.outputs.output_str,
                                            input_bool=component_node1.outputs.output_bool)


pipeline_job = pipeline_with_components_use_primitive_output()

# set pipeline level compute
pipeline_job.settings.default_compute = "cpu-cluster"

In [None]:
# Inspect built pipeline
print(pipeline_job)

### 2.3 Submit pipeline job

In [None]:
# Submit pipeline job to workspace
pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="pipeline_samples"
)

pipeline_job

In [None]:
# Wait until the job completes
ml_client.jobs.stream(pipeline_job.name)

# 3. Define and Load Components with `@command_component`
### 3.1 Define the component functions and their primitive outputs
(`@pipeline`, `@dynamic` and `@command_component` are used the same way when working with primitive outputs)

#### 3.1.1 Define a group of primitive outputs using `@group`

In [None]:
# Please review the definition of the "component_return_annotated_group_outputs" function
from components_with_annotated import component_return_annotated_group_outputs
print(component_return_annotated_group_outputs)

#### 3.1.2 Define a single primitive output
(`int`, `float`, `string`, `bool` are used the same way when defining a primitive output)

In [None]:
# Please review the definitions of the "component_return_annotated_output",
# "component_return_int_output" and "component_return_integer_output" functions
from components_with_annotated import (
    component_return_annotated_output,
    component_return_int_output,
    component_return_integer_output,
)

print(component_return_annotated_output)
print(component_return_int_output)
print(component_return_integer_output)

### 3.2 Build pipeline

In [None]:
@pipeline()
def pipeline2_with_components_use_primitive_output():
    component_node1 = component_return_annotated_group_outputs(input_int=1, input_float=1.1,
                                                               input_str='str', input_bool=True)
    component_node2 = component_return_annotated_group_outputs(input_int=component_node1.outputs.output_int,
                                                               input_float=component_node1.outputs.output_float,
                                                               input_str=component_node1.outputs.output_str,
                                                               input_bool=component_node1.outputs.output_bool)
    
    component_node3 = component_return_annotated_output(input_int=2)
    component_node4 = component_return_annotated_output(input_int=component_node3.outputs.output)
    
    component_node5 = component_return_int_output(input_int=3)
    component_node6 = component_return_int_output(input_int=component_node5.outputs.output)

    component_node7 = component_return_integer_output(input_int=4)
    component_node8 = component_return_integer_output(input_int=component_node7.outputs.output)

pipeline_job2 = pipeline2_with_components_use_primitive_output()

# set pipeline level compute
pipeline_job2.settings.default_compute = "cpu-cluster"

In [None]:
# Inspect built pipeline
print(pipeline_job2)

### 3.3 Submit pipeline job

In [None]:
# Submit pipeline job to workspace
pipeline_job2 = ml_client.jobs.create_or_update(
    pipeline_job2, experiment_name="pipeline_samples"
)

pipeline_job2

In [None]:
# Wait until the job completes
ml_client.jobs.stream(pipeline_job2.name)