# Build Pipeline with Azure OpenAI CommandComponents from registry

**Requirements** - To benefit from this tutorial, you need:
- A basic understanding of Machine Learning
- An Azure account with an active subscription - [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
- An Azure ML workspace with a compute cluster - [Configure workspace](../../configuration.ipynb)
- A Python environment
- The Azure Machine Learning Python SDK v2, installed - [install instructions](../../../README.md) - check the getting started section

**Learning Objectives** - By the end of this tutorial, you should be able to:
- Connect to your Azure Machine Learning (AML) workspace using the Python SDK v2
- Load Azure OpenAI `CommandComponent` objects from the registry
- Create a `Pipeline` using the loaded components

**Motivations** - This notebook covers the scenario where a user loads OpenAI components from the registry, builds a pipeline from them, and submits the job using the Python SDK v2.

# 1. Connect to Azure Machine Learning Registry

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we configure credentials and connect to the registry that hosts the Azure OpenAI components. The connection to the workspace in which the job will run is made in section 3.2.

## 1.1 Import the required libraries

In [None]:
# Import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

from azure.ai.ml import MLClient, Input
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import load_component

## 1.2 Configure credential

We use `DefaultAzureCredential` to authenticate. It should be capable of handling most Azure SDK authentication scenarios.

If it does not work for you, these references list other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).
Check the [configuration notebook](https://github.com/Azure/azureml-examples/blob/6142c51451561447befa665e8dd6fb3ff80bdb62/sdk/python/jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [None]:
try:
    credential = DefaultAzureCredential()
    # Check whether the credential can obtain a token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()
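
The try/except fallback above can be generalized to any ordered list of credential types. The following is a minimal sketch, not part of the SDK: `first_working_credential` and `ARM_SCOPE` are hypothetical names introduced here, and the helper only assumes that each credential object exposes `get_token(scope)`, as the credentials used in this notebook do.

```python
from typing import Any, Callable, Iterable

# Scope used in the cell above to verify that a token can be obtained.
ARM_SCOPE = "https://management.azure.com/.default"


def first_working_credential(
    factories: Iterable[Callable[[], Any]], scope: str = ARM_SCOPE
) -> Any:
    """Return the first credential that can obtain a token for `scope`.

    `factories` is an ordered iterable of zero-argument callables
    (for example credential classes). Hypothetical helper for illustration.
    """
    errors = []
    for factory in factories:
        try:
            candidate = factory()
            candidate.get_token(scope)  # raises if authentication fails
            return candidate
        except Exception as ex:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{getattr(factory, '__name__', factory)}: {ex}")
    raise RuntimeError("No credential could obtain a token:\n" + "\n".join(errors))
```

With `azure-identity` installed, this could be called as `first_working_credential([DefaultAzureCredential, InteractiveBrowserCredential])`. Unlike the cell above, this sketch also requests a token from the last candidate, so an interactive credential would open the browser immediately.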

## 1.3 Get a handle to the registry

We need to initialize an `MLClient` pointed at the registry where the OpenAI components are available. See the [MLClient API documentation](https://learn.microsoft.com/en-us/python/api/azure-ai-ml/azure.ai.ml.mlclient?view=azure-python) for more details.

In [None]:
# Get a handle to the registry
ml_client = MLClient(
    credential=credential,
    registry_name="azure-openai-preview",
    registry_location="eastus",
)

# 2. Load and inspect components
## 2.1 Load components from registry

In [None]:
# Load pinned versions of the Azure OpenAI components from the registry
model_import_component = load_component(
    client=ml_client, name="openai_model_import", version="0.2.3"
)
finetune_component = load_component(
    client=ml_client, name="openai_completions_finetune", version="0.2.4"
)
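
Each component loaded above is identified by a registry name, a component name, and a version. The sketch below builds the corresponding asset ID string, assuming the usual `azureml://registries/<registry>/components/<name>/versions/<version>` format. `registry_component_id` is a hypothetical helper written for this notebook, not an SDK function.

```python
def registry_component_id(registry: str, name: str, version: str) -> str:
    """Build an asset ID for a component stored in a registry.

    Assumes the azureml://registries/... asset ID format; hypothetical helper.
    """
    for label, value in (("registry", registry), ("name", name), ("version", version)):
        # Reject empty parts and parts that would break the path structure.
        if not value or "/" in value:
            raise ValueError(f"Invalid {label}: {value!r}")
    return f"azureml://registries/{registry}/components/{name}/versions/{version}"


print(registry_component_id("azure-openai-preview", "openai_model_import", "0.2.3"))
# azureml://registries/azure-openai-preview/components/openai_model_import/versions/0.2.3
```

Keeping the name and version together in one string makes it easier to record exactly which component versions a pipeline run used.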

## 2.2 Inspect loaded components

In [None]:
print(model_import_component)
print(finetune_component)
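
Printing a component shows its full definition, which can be long. As a lighter alternative, the sketch below summarizes only the input or output ports. `summarize_ports` is a hypothetical helper; it assumes the mapping it receives (for example `finetune_component.inputs`) maps port names to objects that may expose `type` and `optional` attributes, and it falls back to defaults when they are missing.

```python
from typing import Any, List, Mapping, Tuple


def summarize_ports(ports: Mapping[str, Any]) -> List[Tuple[str, Any, bool]]:
    """Return (name, type, optional) for each port, sorted by name.

    Hypothetical helper: reads `type` and `optional` defensively via getattr.
    """
    rows = []
    for name in sorted(ports):
        port = ports[name]
        port_type = getattr(port, "type", None)
        # Treat a missing or None `optional` flag as "required".
        is_optional = bool(getattr(port, "optional", False))
        rows.append((name, port_type, is_optional))
    return rows
```

For example, `for row in summarize_ports(finetune_component.inputs): print(row)` would list the inputs that the pipeline in the next section has to supply.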

# 3. Sample pipeline job
## 3.1 Build pipeline

In [None]:
# Construct pipeline
@pipeline()
def pipeline_with_registered_components(
    train_dataset, validation_dataset, training_max_epochs=20, model="ada"
):
    """End-to-end pipeline that imports an OpenAI model and fine-tunes it using CommandComponents"""
    # Call each component object as a function: the given inputs and parameters create a node in the pipeline
    # Note: the import step is pinned to "ada" and does not use the `model` pipeline argument
    selected_model = model_import_component(model="ada")
    selected_model.outputs.output_model.mode = "mount"
    finetune_results = finetune_component(
        train_dataset=train_dataset,
        validation_dataset=validation_dataset,
        base_model=selected_model.outputs.output_model,
        n_epochs=training_max_epochs,
        model=model,
        registered_model_name="mrpc_test_model",
    )
    finetune_results.outputs.output_model.mode = "mount"

    return finetune_results


pipeline_job = pipeline_with_registered_components(
    train_dataset=Input(type="uri_folder", path="data/"),
    validation_dataset=Input(type="uri_folder", path="data/"),
    training_max_epochs=1,
    model="ada",
)

# Set the pipeline-level default compute
pipeline_job.settings.default_compute = "serverless"
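
The pipeline above reads both datasets from the local `data/` folder. Before submitting, it can help to confirm that the folder exists and is not empty. The following is a minimal sketch using only the standard library; `list_input_files` is a hypothetical helper, and it makes no assumption about which file names or formats the components expect.

```python
from pathlib import Path
from typing import List


def list_input_files(folder: str) -> List[str]:
    """Return the relative paths of all files under `folder`, sorted.

    Raises if the folder is missing or contains no files. Hypothetical helper.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Input folder not found: {root}")
    files = sorted(
        path.relative_to(root).as_posix() for path in root.rglob("*") if path.is_file()
    )
    if not files:
        raise ValueError(f"Input folder is empty: {root}")
    return files
```

Calling `list_input_files("data/")` before building the pipeline job surfaces a missing or empty folder locally instead of at job run time.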

## 3.2 Configure workspace details and get a handle to the workspace

To connect to a workspace, we need three identifiers: a subscription ID, a resource group, and a workspace name. We pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We authenticate with [DefaultAzureCredential](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python), as configured at the beginning of the notebook.

In [None]:
# Get a handle to the workspace (this replaces the registry client created in section 1.3)
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace_name = "<AML_WORKSPACE_NAME>"

    ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)
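
`MLClient.from_config` reads the workspace identifiers from a `config.json` file. The sketch below writes such a file with the three keys used above, so that the `try` branch succeeds on later runs. `write_workspace_config` is a hypothetical helper, and the placeholder values must be replaced with your own details.

```python
import json
from pathlib import Path


def write_workspace_config(
    directory: str, subscription_id: str, resource_group: str, workspace_name: str
) -> Path:
    """Write a config.json with the workspace identifiers and return its path.

    Hypothetical helper for illustration.
    """
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_path = Path(directory) / "config.json"
    config_path.write_text(json.dumps(config, indent=4))
    return config_path
```

For example, `write_workspace_config(".", "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>")` places the file next to the notebook. Avoid committing it to source control if the identifiers are sensitive in your environment.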

## 3.3 Submit pipeline job

In [None]:
# Submit pipeline job to workspace
pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="mrpc_pipeline_test"
)
pipeline_job

In [None]:
# Wait until the job completes
ml_client.jobs.stream(pipeline_job.name)
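
`jobs.stream` blocks while it prints logs. When only the final state matters, for example in a script, a polling loop is an alternative. The sketch below is a minimal version under stated assumptions: `wait_for_terminal_status` is a hypothetical helper, and the set of terminal status strings is an assumption that should be checked against the statuses your jobs actually report.

```python
import time
from typing import Callable

# Assumption: these strings mark a finished job; adjust if your jobs report others.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_status(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `get_status` until it returns a terminal status, then return it.

    Raises TimeoutError if no terminal status is seen within `timeout_seconds`.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in status {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)
```

With the workspace client from section 3.2, this could be called as `wait_for_terminal_status(lambda: ml_client.jobs.get(pipeline_job.name).status)`.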