
## Scenarios for sharing models, environments and components

There are two scenarios where you'd want to use the same set of models, components and environments in multiple workspaces. The first is cross-workspace MLOps: you train a model in a dev workspace and need to deploy it to test and prod workspaces. In this case you want end-to-end lineage between the endpoints to which the model is deployed in the test or prod workspaces and the training job, metrics, code, data and environment that were used to train the model in the dev workspace. The second is sharing and reusing models and pipelines across different teams in your organization, which improves collaboration and productivity. In this scenario, you may want to publish a trained model, along with the components and environments used to train it, to a central catalog where colleagues from other teams can search for and reuse the assets you shared in their own experiments. In this notebook, you will learn how to create models, components and environments in an Azure Machine Learning registry and use them in any workspace within your organization.

In [None]:
# TODO: prerequisites and goals

In [None]:
# Import required libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

from azure.ai.ml import MLClient, Input, Output
from azure.ai.ml.dsl import pipeline
from azure.ai.ml import load_component
from azure.ai.ml.entities import Environment, BuildContext

!pip show azure-ai-ml

In [None]:
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

## Connect to a workspace and registry

If you do not have a workspace, refer to this notebook to create a workspace: [workspace.ipynb](../workspace/workspace.ipynb)
If you do not have a registry, refer to this notebook to create a registry: todo

We will initialize two clients: one to connect to the workspace and the other to connect to the registry.

In [None]:
ml_client_workspace = MLClient(
    credential=credential,
    subscription_id="21d8f407-c4c4-452e-87a4-e609bfb86248",
    resource_group_name="rg-contoso-819prod",
    workspace_name="mlw-contoso-819prod",
)
print(ml_client_workspace)

ml_client_registry = MLClient(
    credential=credential,
    subscription_id="ea4faa5b-5e44-4236-91f6-5483d5b17d14",
    resource_group_name="bug-bash-rg1",
    registry_name="ContosoMLjun14",
)
print(ml_client_registry)



### Create environment in registry

In [61]:
env_docker_context = Environment(
    build=BuildContext(path="../../../cli/jobs/pipelines-with-components/nyc_taxi_data_regression/env_train/"),
    name="SKLearnEnv",
    version=str(1),
    description="Scikit Learn environment",
)
ml_client_registry.environments.create_or_update(env_docker_context)

Environment({'is_anonymous': False, 'auto_increment_version': False, 'name': 'SKLearnEnv', 'description': 'Scikit Learn environment', 'tags': {}, 'properties': {}, 'id': 'azureml://registries/ContosoMLjun14/environments/SKLearnEnv/versions/1', 'Resource__source_path': None, 'base_path': '/mnt/c/CODE/REPOS/azureml-examples/sdk/resources/registry', 'creation_context': <azure.ai.ml._restclient.v2021_10_01_dataplanepreview.models._models_py3.SystemData object at 0x7ff2b4184580>, 'serialize': <msrest.serialization.Serializer object at 0x7ff2b4196d00>, 'version': '1', 'latest_version': None, 'conda_file': None, 'image': 'mlregcsix4.azurecr.io/sklearnenv_1_89f98d1d-9796-59d4-8ffb-5985ab0a2237', 'build': None, 'inference_config': None, 'os_type': 'Linux', 'arm_type': 'environment_version', 'conda_file_path': None, 'path': None, 'upload_hash': None, 'translated_conda_file': None})

### Fetch environment from registry

In [None]:
env_from_registry = ml_client_registry.environments.get(name="SKLearnEnv", version=str(1))
print(env_from_registry)
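
The `id` in the environment output above follows the pattern `azureml://registries/<registry-name>/environments/<name>/versions/<version>`. As a minimal sketch, the helper below builds such an ID from its parts. `registry_asset_id` is a hypothetical name introduced here for illustration and is not part of `azure-ai-ml`; the sketch also assumes that components and models in a registry use the same pattern with a different asset-type segment.

```python
def registry_asset_id(registry: str, asset_type: str, name: str, version: str) -> str:
    """Build an azureml:// ID for a versioned asset stored in a registry.

    Hypothetical helper for illustration only; not part of azure-ai-ml.
    asset_type is the plural path segment, for example "environments".
    """
    return f"azureml://registries/{registry}/{asset_type}/{name}/versions/{version}"


# Values taken from the cells above
env_id = registry_asset_id("ContosoMLjun14", "environments", "SKLearnEnv", "1")
print(env_id)
```

Building the ID in one place avoids typos when the same registry asset is referenced from several notebooks or workspaces.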

### Create component in registry

We will use the component YAMLs defined in `cli/jobs/pipelines-with-components/nyc_taxi_data_regression` for this. 

Load the component from YAML

In [None]:
parent_dir = "../../../cli/jobs/pipelines-with-components/nyc_taxi_data_regression"
train_model = load_component(path=parent_dir + "/train.yml")
# print the component as yaml
print(train_model)

### Register the component in the registry

Note that we use the `ml_client_registry` handle because we plan to create the component in the registry. Creating components in a registry allows us to use them in many workspaces.

A similar sample notebook shows how to create these components in workspaces instead of registry, in which case you can use those components only in the specific workspace: https://github.com/Azure/azureml-examples/blob/main/sdk/jobs/pipelines/1e_pipeline_with_registered_components/pipeline_with_registered_components.ipynb

We are using a dynamic version number so that there is no conflict with versions that already exist. 

We are also using the environment created in the previous section instead of using the curated environment in the original `train.yml`


In [None]:
# Get or create the component

# Dynamic version number based on epoch time
import time

version_timestamp = str(int(time.time()))
print(version_timestamp)

# train_model.environment = "azureml://registries/ContosoMLjun14/environments/SKLearnEnv/versions/1"
train_model.environment = env_from_registry
train_model.version = version_timestamp

# kwargs_version = {"version": version_timestamp}

ml_client_registry.components.create_or_update(train_model)
print(train_model)


### Fetch component from Registry

In [None]:
# Fetch the version registered above (it was created with version_timestamp, not "1")
train_component_from_registry = ml_client_registry.components.get(
    name="train_linear_regression_model", version=version_timestamp
)
print(train_component_from_registry)

## Create a pipeline job


In [None]:
@pipeline()
def pipeline_with_registered_components(training_data):
    train_job = train_component_from_registry(
        training_data=training_data,
    )


pipeline_job = pipeline_with_registered_components(
    training_data=Input(type="uri_folder", path=parent_dir + "/data_transformed/"),
)
pipeline_job.settings.default_compute = "cpu-cluster"
print(pipeline_job)

### Submit pipeline job

Submit pipeline job and wait for it to complete


In [None]:
pipeline_job = ml_client_workspace.jobs.create_or_update(
    pipeline_job, experiment_name="sdk_job_component_from_registry", skip_validation=True
)
ml_client_workspace.jobs.stream(pipeline_job.name)
pipeline_job = ml_client_workspace.jobs.get(pipeline_job.name)
pipeline_job

## Create model in registry

### Option 1: Create model in registry from local files



### Download the model from the output of training job

In [None]:
jobs = ml_client_workspace.jobs.list(parent_job_name=pipeline_job.name)
for job in jobs:
    if job.display_name == "train_job":
        print(job.name)
        # Download the outputs of the training step; the next cell reads ./artifacts/model/
        ml_client_workspace.jobs.download(job.name)

### Create model from local files



In [None]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
import time

mlflow_model = Model(
    path="./artifacts/model/",
    type=AssetTypes.MLFLOW_MODEL,
    name="nyc-taxi-model",
    version=str(1),  # use str(int(time.time())) if you want a unique version number
    description="MLflow model created from local path",
)
# The operations group is `models` (plural)
ml_client_registry.models.create_or_update(mlflow_model)

### Option 2: Copy model from workspace to registry

Get the job name of train_job and build the path pointing to model output 

In [None]:
jobs = ml_client_workspace.jobs.list(parent_job_name=pipeline_job.name)
for job in jobs:
    if job.display_name == "train_job":
        print(job.name)
        model_path_from_job = "azureml://jobs/{job_name}/outputs/default/model".format(job_name=job.name)

print(model_path_from_job)
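
If no child job has the display name `train_job`, the loop above never assigns `model_path_from_job`, and the `print` call fails with a `NameError`. As a minimal sketch, the lookup can be wrapped in a function that fails with a clearer message. `model_output_path` is a hypothetical helper introduced here, not part of `azure-ai-ml`, and the `SimpleNamespace` objects are stand-ins for the job objects returned by `ml_client_workspace.jobs.list(...)`.

```python
from types import SimpleNamespace


def model_output_path(jobs, display_name="train_job", output_path="default/model"):
    """Return the azureml:// path to an output of the first child job
    whose display name matches. Hypothetical helper for illustration only."""
    for job in jobs:
        if job.display_name == display_name:
            return f"azureml://jobs/{job.name}/outputs/{output_path}"
    raise LookupError(f"no child job with display name {display_name!r}")


# Stand-in objects; in the notebook, pass
# ml_client_workspace.jobs.list(parent_job_name=pipeline_job.name) instead.
fake_jobs = [
    SimpleNamespace(name="job_a", display_name="prep_job"),
    SimpleNamespace(name="job_b", display_name="train_job"),
]
print(model_output_path(fake_jobs))
```

The function only reads the `name` and `display_name` attributes, which are the same two attributes the loop above uses.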
     

### Create model in workspace from job output

In [None]:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
import time
version_timestamp = str(int(time.time()))
print(version_timestamp)

mlflow_model = Model(
    path=model_path_from_job,
    type=AssetTypes.MLFLOW_MODEL,
    name="nyc-taxi-model",
    version=version_timestamp,
    description="MLflow model created from job output",
)
ml_client_workspace.create_or_update(mlflow_model)

### Copy the model from workspace to registry

In [None]:

# Template:
# model_path_from_workspace = "azureml://subscriptions/<subscription-id-of-workspace>/resourceGroups/<resource-group-of-workspace>/workspaces/<workspace-name>/models/<model-name>/versions/<model-version>"

model_path_from_workspace = "azureml://subscriptions/21d8f407-c4c4-452e-87a4-e609bfb86248/resourceGroups/rg-contoso-819prod/workspaces/mlw-contoso-819prod/models/nyc-taxi-model/versions/1663099093"

print(model_path_from_workspace)

mlflow_model = Model(
    path=model_path_from_workspace,
    # type=AssetTypes.MLFLOW_MODEL,
    # name="nyc-taxi-model",
    # version=version_timestamp,
    # description="MLflow model created from job output",
)
ml_client_registry.create_or_update(mlflow_model)
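
The workspace model path used above has a fixed shape, so it can be assembled from the same values passed to `MLClient` instead of being typed by hand. A minimal sketch, assuming a hypothetical helper named `workspace_model_path` (not part of `azure-ai-ml`):

```python
def workspace_model_path(subscription_id, resource_group, workspace, model_name, version):
    """Build the azureml:// path of a model version registered in a workspace.

    Hypothetical helper for illustration only; the segments mirror the
    template shown in the cell above.
    """
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/models/{model_name}/versions/{version}"
    )


# Values taken from the cells above
path = workspace_model_path(
    subscription_id="21d8f407-c4c4-452e-87a4-e609bfb86248",
    resource_group="rg-contoso-819prod",
    workspace="mlw-contoso-819prod",
    model_name="nyc-taxi-model",
    version="1663099093",
)
print(path)
```

Passing the `version_timestamp` from the previous cell as `version` would keep the path in sync with the model that was just created in the workspace.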
     

## Deploy model from registry to online endpoint in workspace

### Get model from registry

In [None]:
mlflow_model_from_registry = ml_client_registry.models.get(name="nyc-taxi-model", version=str(1))
print(mlflow_model_from_registry)

### Create an online endpoint 

In [None]:
import datetime

from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)

online_endpoint_name = "endpoint-" + datetime.datetime.now().strftime("%m%d%H%M%f")

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is a sample online endpoint for mlflow model",
    auth_mode="key",
)
# In azure-ai-ml 1.x, begin_* calls return a poller; .result() blocks until the endpoint exists
ml_client_workspace.begin_create_or_update(endpoint).result()

### Deploy the model from registry to the online endpoint

In [None]:
# Create a demo deployment
demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=mlflow_model_from_registry,
    instance_type="Standard_F4s_v2",
    instance_count=1,
)
# Wait for the deployment to finish before routing traffic to it (azure-ai-ml 1.x returns a poller)
ml_client_workspace.online_deployments.begin_create_or_update(demo_deployment).result()

# Send all traffic to the new deployment
endpoint.traffic = {"demo": 100}
ml_client_workspace.begin_create_or_update(endpoint).result()


### Test the deployment

In [None]:
# Test the deployment with some sample data
ml_client_workspace.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file=parent_dir + "/scoring-data.json"
)

### Clean up resources - delete online endpoint

In [None]:
ml_client_workspace.online_endpoints.begin_delete(name=online_endpoint_name)