# Build a model import pipeline with components registered in a registry

**Requirements** - In order to benefit from this tutorial, you will need:
- A basic understanding of Machine Learning
- An Azure account with an active subscription - [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
- An Azure ML workspace with a compute cluster - [Configure workspace](../../configuration.ipynb)
- A Python environment
- The Azure Machine Learning Python SDK v2 installed - [install instructions](../../../README.md) - check the getting started section

**Learning Objectives** - By the end of this tutorial, you should be able to:
- Connect to your AML workspace from the Python SDK.
- Import components from a registry.
- Create a `Pipeline` using registered components.
- Submit a `Pipeline Job` to the workspace.
- Import a model into a workspace or registry from Hugging Face or Azure Storage using that pipeline.

**Motivations** - This notebook explains how to create a model import/publishing pipeline job in a workspace using components registered in a registry.

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

## 1.1 Import the required libraries

In [2]:
# Import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

from azure.ai.ml import MLClient, Input, Output
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import load_component

## 1.2 Configure credential

We use `DefaultAzureCredential` to get access to the workspace.
`DefaultAzureCredential` should be capable of handling most Azure SDK authentication scenarios.

If it does not work for you, see these references for other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).

In [None]:
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

## 1.3 Get a handle to the workspace and the registry

We use a config file to connect to the workspace. The Azure ML workspace should be configured with a compute cluster. [See this notebook to configure a workspace](../../configuration.ipynb)

In [None]:
# Get a handle to workspace
ml_client_ws = MLClient.from_config(credential=credential)

ml_client_registry = MLClient(credential, registry_name="azureml-preview-test1")

# Name of an existing Azure Machine Learning compute cluster that has a managed identity (MSI) attached.
cluster_name = "cpu-standard-d13-msi"

# Note: the compute above has an MSI attached, which is necessary to get credentials and run CLI/SDK v2 on that compute
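
`MLClient.from_config` looks for a `config.json` file that describes the workspace. As a minimal sketch, assuming you do not have that file yet, the snippet below renders one with the three keys the SDK reads (`subscription_id`, `resource_group`, `workspace_name`). The `render_workspace_config` helper and the placeholder values are hypothetical; substitute your own details.

```python
import json


def render_workspace_config(subscription_id, resource_group, workspace_name):
    """Return the JSON text of a workspace config file (hypothetical helper)."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    return json.dumps(config, indent=2)


# Placeholder values, not real identifiers.
config_text = render_workspace_config(
    "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>"
)
print(config_text)

# To persist it next to the notebook:
# from pathlib import Path
# Path("config.json").write_text(config_text)
```

Saving the output as `config.json` in the notebook's directory (or a parent directory) should let `MLClient.from_config(credential=credential)` find it.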

## 1.4 Ensure cluster has an MSI attached to it

In [None]:
compute_cluster = ml_client_ws.compute.get(name=cluster_name)
if (
    compute_cluster.identity
    and compute_cluster.identity.type == "user_assigned"
    and compute_cluster.identity.user_assigned_identities[0].type == "managed_identity"
):
    print("Compute cluster has MSI attached to it")
else:
    raise ValueError(
        "Compute does not have an MSI attached to it. Please use a compute cluster that has an MSI attached."
    )
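
The check above indexes `user_assigned_identities[0]` directly, which fails with an exception if that list is empty or missing. A slightly more defensive variant could look like the sketch below. The `has_user_assigned_msi` helper is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the attributes read by the check; they are not real SDK entities.

```python
from types import SimpleNamespace


def has_user_assigned_msi(compute):
    """Return True when the compute object carries a user-assigned managed identity."""
    identity = getattr(compute, "identity", None)
    if identity is None or identity.type != "user_assigned":
        return False
    # Treat a missing or empty identity list as "no MSI" instead of raising.
    identities = getattr(identity, "user_assigned_identities", None) or []
    return any(getattr(item, "type", None) == "managed_identity" for item in identities)


# Stand-in objects for illustration only.
with_msi = SimpleNamespace(
    identity=SimpleNamespace(
        type="user_assigned",
        user_assigned_identities=[SimpleNamespace(type="managed_identity")],
    )
)
without_identity = SimpleNamespace(identity=None)
empty_identities = SimpleNamespace(
    identity=SimpleNamespace(type="user_assigned", user_assigned_identities=[])
)

print(has_user_assigned_msi(with_msi))
print(has_user_assigned_msi(without_identity))
print(has_user_assigned_msi(empty_identities))
```

With a real workspace client, the same helper could be called as `has_user_assigned_msi(ml_client_ws.compute.get(name=cluster_name))`.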

# 2. Load components from registry to create pipeline

## 2.1 Use the registry handle to load the components

### Components
- download_model - Downloads the model from Hugging Face or Azure Storage to the compute disk
- mlflow_converter - Converts the model from its custom format to MLflow format
- local_validation - Validates the MLflow model
- register_model - Registers the model in the workspace or registry

In [None]:
model_downloader = ml_client_registry.components.get(
    name="download_model", version="0.0.9"
)
# Pin each component with `version`; `label` is meant for labels such as "latest"
mlflow_converter = ml_client_registry.components.get(
    name="convert_model_to_mlflow", version="0.0.8"
)
local_validation = ml_client_registry.components.get(
    name="mlflow_model_local_validation", version="0.0.3"
)
model_registration = ml_client_registry.components.get(
    name="register_model", version="0.1.6"
)
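
When several components are pinned to specific versions, it can be convenient to keep the pins in one place. The sketch below assumes a client object that exposes `components.get(name=..., version=...)`, as `ml_client_registry` does; the `load_pinned_components` helper and the `COMPONENT_VERSIONS` mapping are hypothetical names introduced for illustration, with version numbers copied from the cell above.

```python
def load_pinned_components(client, versions):
    """Fetch each component at its pinned version via client.components.get."""
    return {
        name: client.components.get(name=name, version=version)
        for name, version in versions.items()
    }


# Component names and versions as used in this notebook.
COMPONENT_VERSIONS = {
    "download_model": "0.0.9",
    "convert_model_to_mlflow": "0.0.8",
    "mlflow_model_local_validation": "0.0.3",
    "register_model": "0.1.6",
}

# With a real registry client:
# components = load_pinned_components(ml_client_registry, COMPONENT_VERSIONS)
# model_downloader = components["download_model"]
```

Keeping the versions in a single mapping makes an upgrade a one-line change and avoids mixing different lookup styles across calls.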

# 3. Build pipeline using above components

## 3.1 Connect components and build pipeline framework

In [3]:
@pipeline(
    description="Publishing models into registry/workspace using the registered components",
)
def model_publishing_pipeline(model_id, registry_name, license_file_path, model_metadata):
    # Download the model from its source to the compute disk
    model_downloading_node = model_downloader(model_id=model_id)
    # Convert the downloaded model to MLflow format
    mlflow_converter_node = mlflow_converter(
        model_path=model_downloading_node.outputs.model_output,
    )
    # Validate the converted MLflow model
    local_validation_node = local_validation(
        model_path=mlflow_converter_node.outputs.mlflow_model_folder
    )
    # Register the validated model in the workspace or registry
    model_registration_node = model_registration(
        model_path=local_validation_node.outputs.mlflow_model_folder,
        license_file_path=license_file_path,
        model_metadata=model_metadata,
        model_download_metadata=model_downloading_node.outputs.model_download_metadata,
        registry_name=registry_name,
    )
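
The pipeline function does not run the steps itself; wiring one node's outputs into another node's inputs defines a dependency graph, and the steps run in an order that respects it. As a rough illustration using only the standard library (Python 3.9+), the sketch below encodes the same dependencies and derives an execution order. The step names are shorthand for the four components, not SDK objects.

```python
from graphlib import TopologicalSorter

# Each key depends on the steps in its set, mirroring the inputs wired in the pipeline function.
dependencies = {
    "download_model": set(),
    "mlflow_converter": {"download_model"},
    "local_validation": {"mlflow_converter"},
    # Registration consumes the validated model and the download metadata.
    "register_model": {"local_validation", "download_model"},
}

execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
# ['download_model', 'mlflow_converter', 'local_validation', 'register_model']
```

Because each step needs the output of the one before it, this graph has exactly one valid order.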

## 3.2 Pass parameters to pipeline and set up compute

In [None]:
# The model ID must be a plain string, not a tuple
MODEL_ID = "bert-base-uncased"

pipeline_job = model_publishing_pipeline(
    model_id=MODEL_ID,
    registry_name="azureml-preview-test1",
    model_metadata=Input(
        type="uri_file", path="import_model_data/bert-base-uncase-metadata.json"
    ),
    license_file_path=Input(type="uri_file", path="import_model_data/LICENSE.txt"),
)

# set pipeline level compute
pipeline_job.settings.default_compute = cluster_name
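
The metadata and license inputs point at local files, so a wrong path tends to surface only when the job is submitted. A small pre-flight check, sketched below with the standard library, can catch that earlier. The `missing_inputs` helper is a hypothetical name; the paths are the ones passed to the pipeline above.

```python
from pathlib import Path


def missing_inputs(paths, base_dir="."):
    """Return the relative paths that do not exist as files under base_dir."""
    return [p for p in paths if not (Path(base_dir) / p).is_file()]


INPUT_FILES = [
    "import_model_data/bert-base-uncase-metadata.json",
    "import_model_data/LICENSE.txt",
]

missing = missing_inputs(INPUT_FILES)
if missing:
    print("Missing input files:", missing)
else:
    print("All input files found.")
```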

## 3.3 Submit job to workspace

In [None]:
pipeline_job = ml_client_ws.jobs.create_or_update(
    pipeline_job, experiment_name="Model Importing Pipeline"
)
pipeline_job

In [None]:
# Wait until the job completes
ml_client_ws.jobs.stream(pipeline_job.name)
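
Once streaming ends, it can be useful to confirm the final state before relying on the imported model. The sketch below assumes a client that exposes `jobs.get(name)` returning an object with a `status` attribute, as `ml_client_ws` does; the `ensure_completed` helper is hypothetical, and the `"Completed"` status string is the value Azure ML typically reports for a successful job.

```python
def ensure_completed(client, job_name):
    """Fetch the job and raise if it did not finish with status 'Completed'."""
    job = client.jobs.get(job_name)
    if job.status != "Completed":
        raise RuntimeError(f"Job {job_name} finished with status {job.status}")
    return job


# With a real workspace client:
# finished_job = ensure_completed(ml_client_ws, pipeline_job.name)
# print(finished_job.studio_url)
```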

# 4. Test the deployment using sample data

## 4.1 Invoke the endpoint using sample data

In [None]:
# ENDPOINT_NAME and DEPLOYMENT_NAME must name an existing online endpoint and a
# deployment that serves the imported model; this notebook does not define them.
ml_client_ws.online_endpoints.invoke(
    endpoint_name=ENDPOINT_NAME,
    deployment_name=DEPLOYMENT_NAME,
    request_file="./data/bert-base-uncased-test.json",
)

# 5. Delete the endpoints created

## 5.1 Delete the endpoint created

In [None]:
ml_client_ws.online_endpoints.begin_delete(name=ENDPOINT_NAME)