Finally, we need to specify a version for the data and components we will create while running this notebook. This should be unique for the workspace, but the specific value doesn't matter:

## Configure workspace details and get a handle to the workspace

To connect to a workspace, we need three identifiers: a subscription ID, a resource group, and a workspace name. We pass these to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace.

In [2]:
# Enter details of your AML workspace
subscription_id = "757c4165-0823-49f7-9678-5a85fe5e17cc"
resource_group = "amlss"
workspace = "amlss"
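
Hard-coding the identifiers is fine for a demo, but the same three values can also be read from the environment. The sketch below shows one way to do that. The variable names `AML_SUBSCRIPTION_ID`, `AML_RESOURCE_GROUP`, and `AML_WORKSPACE` and the helper `load_workspace_details` are assumptions made for this example, not names the SDK looks up on its own.

```python
import os


def load_workspace_details(env=None):
    """Read workspace identifiers from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    keys = {
        "subscription_id": "AML_SUBSCRIPTION_ID",
        "resource_group": "AML_RESOURCE_GROUP",
        "workspace": "AML_WORKSPACE",
    }
    # Fail early with a clear message if any identifier is missing or empty
    missing = [var for var in keys.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[var] for name, var in keys.items()}


# Example with an explicit mapping instead of the real environment
details = load_workspace_details(
    {
        "AML_SUBSCRIPTION_ID": "757c4165-0823-49f7-9678-5a85fe5e17cc",
        "AML_RESOURCE_GROUP": "amlss",
        "AML_WORKSPACE": "amlss",
    }
)
print(details["workspace"])
```

Calling `load_workspace_details()` with no argument reads from `os.environ` and raises `KeyError` if any of the three variables is unset.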

In [3]:
# Handle to the workspace
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    workspace_name=workspace,
)
print(ml_client)

MLClient(credential=<azure.identity._credentials.default.DefaultAzureCredential object at 0x7f67f2369090>,
         subscription_id=757c4165-0823-49f7-9678-5a85fe5e17cc,
         resource_group_name=amlss,
         workspace_name=amlss)


In [4]:
# Get handle to azureml registry for the RAI built in components
registry_name = "amlss"
ml_client_registry = MLClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    registry_name=registry_name,
)
print(ml_client_registry)

MLClient(credential=<azure.identity._credentials.default.DefaultAzureCredential object at 0x7f67f2369090>,
         subscription_id=757c4165-0823-49f7-9678-5a85fe5e17cc,
         resource_group_name=amlss,
         workspace_name=None)


By default, MLflow uses the workspace's model registry URI, not the URI of a standalone registry.

In [5]:
import mlflow

mlf_client = mlflow.tracking.MlflowClient()

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)

azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/workspaces/amlss


If you want to use the standalone registry or another workspace, you can set the registry with `mlflow.set_registry_uri`.
The same method can be used to switch between registries during the **DEV to TEST** or **TEST to PROD** promotion process.

In [9]:

# Dev Workspace
devws_tracking_uri="azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/workspaces/amlss"

# Registry
reg_tracking_uri="azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/registries/amlss"
# Prod Workspace:
prodws_tracking_uri="azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlssprod/providers/Microsoft.MachineLearningServices/workspaces/amlssprod"
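
The three URIs above differ only in the resource group, the resource kind (`workspaces` or `registries`), and the resource name. As a sketch, they could be composed from those parts. The helper `azureml_mlflow_uri` is hypothetical, and the pattern is inferred from the URIs in this notebook rather than from a documented contract.

```python
def azureml_mlflow_uri(region, subscription_id, resource_group, name, kind="workspaces"):
    """Compose an azureml:// MLflow URI following the pattern of the URIs above."""
    if kind not in ("workspaces", "registries"):
        raise ValueError("kind must be 'workspaces' or 'registries'")
    return (
        f"azureml://{region}.api.azureml.ms/mlflow/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices/{kind}/{name}"
    )


sub = "757c4165-0823-49f7-9678-5a85fe5e17cc"
dev_uri = azureml_mlflow_uri("northeurope", sub, "amlss", "amlss")
reg_uri = azureml_mlflow_uri("northeurope", sub, "amlss", "amlss", kind="registries")
prod_uri = azureml_mlflow_uri("northeurope", sub, "amlssprod", "amlssprod")
print(dev_uri)
```

Composing the string by hand is mainly useful for seeing the structure. Where a workspace handle is available, reading the URI from the workspace object is less error-prone.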


To change which workspace or registry MLflow uses for model registration, call `mlflow.set_registry_uri` with the target URI:


In [43]:
# switch to prod workspace
mlflow.set_registry_uri(prodws_tracking_uri)

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)

# switch to registry
mlflow.set_registry_uri(reg_tracking_uri)

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)


# switch to dev workspace
mlflow.set_registry_uri(devws_tracking_uri)

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)

azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlssprod/providers/Microsoft.MachineLearningServices/workspaces/amlssprod
azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/registries/amlss
azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/workspaces/amlss
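
Switching the registry by hand makes it easy to forget to switch back. One possible pattern is a context manager that restores the previous URI on exit. The sketch below is an illustration only: `temporary_registry` is a hypothetical helper, and it takes the getter and setter as arguments so that it runs here without a workspace connection.

```python
from contextlib import contextmanager


@contextmanager
def temporary_registry(get_uri, set_uri, target_uri):
    """Switch to target_uri for the duration of the block, then restore."""
    previous = get_uri()
    set_uri(target_uri)
    try:
        yield target_uri
    finally:
        # Restore the previous registry even if the block raised
        set_uri(previous)


# Stand-in getter/setter so the sketch runs without MLflow or Azure access
_state = {"uri": "dev-uri"}
with temporary_registry(lambda: _state["uri"], lambda u: _state.update(uri=u), "prod-uri"):
    inside = _state["uri"]
after = _state["uri"]
print(inside, after)
```

With MLflow, the call would be `temporary_registry(mlflow.get_registry_uri, mlflow.set_registry_uri, prodws_tracking_uri)`.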


**Development Operations**

To keep the work environment clean and the logs easy to find, it is good practice to define an experiment explicitly. An experiment can contain multiple runs, and the model is eventually logged under a run ID.

In [11]:
experiment = mlflow.set_experiment("Promotion Experiments")
#experiment = mlflow.get_experiment_by_name("Default")
print("Experiment_id: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Tags: {}".format(experiment.tags))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
print("Creation timestamp: {}".format(experiment.creation_time))


Experiment_id: f763f5aa-41c6-45b0-8dfb-848e2c04b538
Artifact Location: 
Tags: {}
Lifecycle_stage: active
Creation timestamp: 1675034554845
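
The creation timestamp printed above is in milliseconds since the Unix epoch. A small conversion makes it readable. `ms_to_utc` is a helper name chosen for this sketch.

```python
from datetime import datetime, timedelta, timezone


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


created = ms_to_utc(1675034554845)
print(created.isoformat())
```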


Let's train a simple model so that we have something to register. We use the iris dataset for simplicity.

In [12]:
import mlflow.sklearn

from sklearn.datasets import load_iris
from sklearn import tree

iris = load_iris()

You can either wrap every step in a `with` clause (Option 1) or create a run and keep a reference to the run object (Option 2). For notebook use, the second option is more practical, because the run stays active across cells until you end it.

In [13]:
import time
run_name_suffix = int(time.time())
run_name = f"flower_runs_{run_name_suffix}"
print(run_name)

flower_runs_1675084863


In [8]:
# Option 1

with mlflow.start_run(run_name=run_name) as run:
    mlflow.log_metric('mymetric', 1)
    mlflow.log_metric('anothermetric',1)

In [9]:
mlflow.end_run()

In [14]:
# Option 2

mlflow_run = mlflow.start_run(run_name=run_name)

You can access the information about the run from the run object instance:

In [15]:
print("Active run_id: {}".format(mlflow_run.info.run_id))

print("Active run_name: {}".format(mlflow_run.info.run_name))

Active run_id: f40b4c44-d60e-4af1-aba7-5007f1d44006
Active run_name: flower_runs_1675084863


If you need to end the active run you can do it as below:

In [None]:
# mlflow.end_run()

**Methods of registering a model**

**1. Simplest way: using autologging capabilities**

We can log all produced metrics and artifacts, including the model itself, by enabling MLflow autologging. This can be done either with the generic `mlflow.autolog` or with the autologging functions specialized for integrated libraries such as sklearn or xgboost.

If you want more control over model registration, you can turn off model logging with `log_models=False` and use one of the other methods.

All parameters that `mlflow.autolog` accepts, with their default values:

```
mlflow.autolog(log_input_examples: bool = False, 
               log_model_signatures: bool = True, log_models: bool = True, 
               disable: bool = False, exclusive: bool = False, 
               disable_for_unsupported_versions: bool = False, 
               silent: bool = False)
```
All parameters that `mlflow.sklearn.autolog` accepts, with their default values:

```
mlflow.sklearn.autolog(log_input_examples=False, log_model_signatures=True,
                       log_models=True, disable=False, exclusive=False,
                       disable_for_unsupported_versions=False, silent=False,
                       max_tuning_runs=5, log_post_training_metrics=True,
                       serialization_format='cloudpickle',
                       registered_model_name=None, pos_label=None)
```

In [16]:
# If log_models True, trained models are logged as MLflow model artifacts. 
# If False, trained models are not logged. Input examples and model signatures, 
# which are attributes of MLflow models, are also omitted when log_models is False.
mlflow.autolog(log_models=False)


2023/01/30 13:21:26 INFO mlflow.tracking.fluent: Autologging successfully enabled for sklearn.


The cell below trains a showcase model and logs a metric and two parameters.
Because we turned off model logging for autologging, the model itself is not logged when this cell runs:

In [17]:

from sklearn.model_selection import train_test_split

# X -> features, y -> label
X = iris.data
y = iris.target

# dividing X, y into train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

from sklearn.metrics import confusion_matrix, accuracy_score

sk_model = tree.DecisionTreeClassifier()
# Note: the model is fit on the full dataset, so it has already seen the test rows
sk_model = sk_model.fit(iris.data, iris.target)
# Predict on the held-out split to compute the metrics below
y_pred = sk_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

# Signature
signature = mlflow.models.signature.infer_signature(X_test, y_pred)  

mlflow.log_metric('Accuracy', float(accuracy))
# creating a confusion matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
mlflow.log_param("criterion", sk_model.criterion)
mlflow.log_param("splitter", sk_model.splitter)


[[13  0  0]
 [ 0 16  0]
 [ 0  0  9]]


'best'
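
The confusion matrix printed above has non-zero entries only on its diagonal, which means every test row was classified correctly. Keep in mind that the classifier in the cell above is fit on the full dataset, so the test rows were part of training. As a small worked example, accuracy can be recomputed from the matrix as the diagonal sum over the total. The helper name is an assumption for this sketch.

```python
def accuracy_from_confusion_matrix(cm):
    """Accuracy is the diagonal (correct predictions) over all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Matrix copied from the output above
cm = [[13, 0, 0], [0, 16, 0], [0, 0, 9]]
print(accuracy_from_confusion_matrix(cm))
```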

**2. Manually logging with `log_model`**

In [18]:
model_name="flower_model"

In [19]:

registered_model_name=f"{model_name}_MANUAL_LOG_MODEL"

# Register the model to the workspace, within an MLflow run.
# Not every parameter of the native mlflow.sklearn.log_model can be used;
# for example, we cannot use metadata to log additional information about the model.
print("Registering the model via MLFlow")
model_log=mlflow.sklearn.log_model(
    sk_model=sk_model,
    registered_model_name=registered_model_name,
    artifact_path=registered_model_name
)

Registering the model via MLFlow


Registered model 'flower_model_MANUAL_LOG_MODEL' already exists. Creating a new version of this model...
2023/01/30 13:21:54 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: flower_model_MANUAL_LOG_MODEL, version 2
Created version '2' of model 'flower_model_MANUAL_LOG_MODEL'.


![mlflow.sklearn.log_model execution result view in workspace ui](./media/ws_model_registration_UI.png)

You can fetch details such as the model URI from the logging operation for use in later steps:

In [20]:
print(model_log.model_uri)

runs:/f40b4c44-d60e-4af1-aba7-5007f1d44006/flower_model_MANUAL_LOG_MODEL
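
The URI printed above has the form `runs:/<run_id>/<artifact_path>`. If the two parts are needed separately, they can be split with plain string handling. `parse_runs_uri` is a hypothetical helper written for this sketch, not an MLflow function.

```python
def parse_runs_uri(model_uri):
    """Split a 'runs:/<run_id>/<artifact_path>' URI into its two parts."""
    prefix = "runs:/"
    if not model_uri.startswith(prefix):
        raise ValueError(f"Not a runs:/ URI: {model_uri}")
    # Everything up to the first slash is the run ID, the rest is the artifact path
    run_id, _, artifact_path = model_uri[len(prefix):].partition("/")
    return run_id, artifact_path


run_id, artifact_path = parse_runs_uri(
    "runs:/f40b4c44-d60e-4af1-aba7-5007f1d44006/flower_model_MANUAL_LOG_MODEL"
)
print(run_id, artifact_path)
```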


**3. Register the model with REGISTER_MODEL by URI**

You can explicitly register an already logged model by its URI:

In [21]:
# Registering the model to the workspace with tagging (log_model does not allow us to use tags) 
print("Registering the model via MLFlow")

registered_model_name=f"{model_name}_REGISTER_WITH_MODEL_URI"
# options : (model_uri, name, await_registration_for=300, *, tags: Optional[Dict[str, Any]] = None)
result = mlflow.register_model(
    model_uri = model_log.model_uri,
    name = registered_model_name,
)

print(result)

Registering the model via MLFlow
<ModelVersion: creation_timestamp=1675084942146, current_stage='None', description='', last_updated_timestamp=1675084942146, name='flower_model_REGISTER_WITH_MODEL_URI', run_id='f40b4c44-d60e-4af1-aba7-5007f1d44006', run_link='', source=('azureml://experiments/Promotion '
 'Experiments/runs/f40b4c44-d60e-4af1-aba7-5007f1d44006/artifacts/flower_model_MANUAL_LOG_MODEL'), status='READY', status_message='', tags={}, user_id='', version='2'>


In [22]:
print(result.version)

2


You can set additional tags to record where a model version came from, which helps when tracing it back through earlier workspaces or registries. Here we add a `DW` tag (dev workspace) and a `DV` tag (dev version), plus a `status` tag, assuming that model names stay the same in every workspace and registry.

In [23]:
mlf_client.set_model_version_tag(registered_model_name, result.version, key="DW", value="amlss")
mlf_client.set_model_version_tag(registered_model_name, result.version, key="DV", value="2")
mlf_client.set_model_version_tag(registered_model_name, result.version, key="status", value="draft")
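
When several tags are set together, the repeated calls can be folded into a loop. The sketch below assumes a client object that exposes `set_model_version_tag(name, version, key=..., value=...)`, as `mlf_client` does above. `tag_model_version` and `RecordingClient` are hypothetical names, and the recording client is only a stand-in so the example runs without a workspace.

```python
def tag_model_version(client, name, version, tags):
    """Apply several tags to one model version, converting values to strings."""
    for key, value in tags.items():
        client.set_model_version_tag(name, version, key=key, value=str(value))


class RecordingClient:
    """Stand-in for the MLflow client that only records calls (illustration only)."""

    def __init__(self):
        self.calls = []

    def set_model_version_tag(self, name, version, key, value):
        self.calls.append((name, version, key, value))


client = RecordingClient()
tag_model_version(
    client,
    "flower_model_REGISTER_WITH_MODEL_URI",
    "2",
    {"DW": "amlss", "DV": 2, "status": "draft"},
)
print(client.calls)
```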

You can search for an existing model in a specific workspace or registry once MLflow points at it. Here we first move the version's `status` tag to `test`, then search for versions carrying that tag.

In [26]:
mlf_client.set_model_version_tag(registered_model_name, result.version, key="status", value="test")

In [32]:
import ast
my_string = "{'key':'val','key2':2}"
my_dict = ast.literal_eval(my_string)

print(my_dict["key2"])

2


In [47]:
ready_to_promote_model_name=registered_model_name
ready_to_promote_run_id=""
ready_to_promote_m_version=0

search_result=mlf_client.search_registered_models(f"name='{registered_model_name}'")
for res in search_result:
    for mv in res.latest_versions:
        if mv.tags.get("status") == "test":
            print("name={}; run_id={}; version={}; tags={}".format(mv.name, mv.run_id, mv.version, mv.tags))
            ready_to_promote_run_id=mv.run_id
            ready_to_promote_m_version = mv.version
print(f"model name:{registered_model_name}, run_id:{ready_to_promote_run_id}, version:{ready_to_promote_m_version}")

name=flower_model_REGISTER_WITH_MODEL_URI; run_id=f40b4c44-d60e-4af1-aba7-5007f1d44006; version=2; tags={'DW': 'amlss', 'DV': '2', 'status': 'test'}
model name:flower_model_REGISTER_WITH_MODEL_URI, run_id:f40b4c44-d60e-4af1-aba7-5007f1d44006, version:2
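
The search above filters model versions by a tag value. A version without a `status` tag would raise a `KeyError` on direct indexing, so a tolerant lookup is safer. The sketch below shows the filtering step on its own, using `SimpleNamespace` objects as stand-ins for the model versions returned by the registry. `versions_with_tag` is a hypothetical helper.

```python
from types import SimpleNamespace


def versions_with_tag(model_versions, key, value):
    """Return versions whose tag `key` equals `value`; untagged versions are skipped."""
    return [mv for mv in model_versions if (mv.tags or {}).get(key) == value]


# Stand-ins for the model version objects returned by the registry
versions = [
    SimpleNamespace(name="flower_model", version="1", tags={}),
    SimpleNamespace(name="flower_model", version="2", tags={"status": "test"}),
]
ready = versions_with_tag(versions, "status", "test")
print([mv.version for mv in ready])
```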


**4. Register the model with REGISTER_MODEL from a saved local copy**

First we should have a saved copy of our model:

In [39]:
import os
# Save the model to a local directory to create a local copy
model_local_path = os.path.abspath("./trained_model")
print("Saving the model via MLFlow")
mlflow.sklearn.save_model(
    sk_model=sk_model,
    path=model_local_path
)


Saving the model via MLFlow


In [40]:
registered_model_name=f"{model_name}_REGISTER_WITH_LOCAL_MODEL_FILE"
result=mlflow.register_model(f"file://{model_local_path}", name=registered_model_name)

print(result)

Registered model 'flower_model_REGISTER_WITH_LOCAL_MODEL_FILE' already exists. Creating a new version of this model...
2023/01/30 13:37:23 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: flower_model_REGISTER_WITH_LOCAL_MODEL_FILE, version 2
Created version '2' of model 'flower_model_REGISTER_WITH_LOCAL_MODEL_FILE'.
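
The cell above builds the `file://` URI by string concatenation, which works for POSIX-style absolute paths. As an alternative sketch, `pathlib` can produce the URI and also handles Windows drive letters. `local_model_uri` is a hypothetical helper.

```python
from pathlib import Path


def local_model_uri(path):
    """Return a file:// URI for a local model directory."""
    # resolve() makes the path absolute, which as_uri() requires
    return Path(path).resolve().as_uri()


uri = local_model_uri("./trained_model")
print(uri)
```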


We can also use this saved copy to promote the model to prod:

In [41]:
# first we need to change our workspace/registry
# switch to prod workspace
mlflow.set_registry_uri(prodws_tracking_uri)

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)

azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlssprod/providers/Microsoft.MachineLearningServices/workspaces/amlssprod


In [42]:
# then register the model to prod:

registered_model_name=f"{model_name}_REGISTER_WITH_LOCAL_MODEL_FILE"
result=mlflow.register_model(f"file://{model_local_path}", name=registered_model_name)

print(result)

Successfully registered model 'flower_model_REGISTER_WITH_LOCAL_MODEL_FILE'.
2023/01/30 13:41:10 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: flower_model_REGISTER_WITH_LOCAL_MODEL_FILE, version 1
Created version '1' of model 'flower_model_REGISTER_WITH_LOCAL_MODEL_FILE'.


In [None]:
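# Add the tags in prod so that it is easy to trace the model back to dev.
# An MlflowClient resolves its registry URI when it is created, and mlf_client
# was created while the dev workspace was active, so use a client bound to prod.
prod_client = mlflow.tracking.MlflowClient(registry_uri=prodws_tracking_uri)
prod_client.set_model_version_tag(registered_model_name, result.version, key="DW", value="amlss")
prod_client.set_model_version_tag(registered_model_name, result.version, key="DV", value="2")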


In [48]:
# after promotion mark the dev model status tag
# switch to dev workspace
mlflow.set_registry_uri(devws_tracking_uri)

ws_reg_uri=mlflow.get_registry_uri()
print(ws_reg_uri)

mlf_client.set_model_version_tag(ready_to_promote_model_name, ready_to_promote_m_version, key="status", value="prod")



azureml://northeurope.api.azureml.ms/mlflow/v1.0/subscriptions/757c4165-0823-49f7-9678-5a85fe5e17cc/resourceGroups/amlss/providers/Microsoft.MachineLearningServices/workspaces/amlss


In [49]:
search_result=mlf_client.search_registered_models(f"name='{ready_to_promote_model_name}'")
for res in search_result:
    for mv in res.latest_versions:
        if mv.run_id==ready_to_promote_run_id:
            print("name={}; run_id={}; version={}; tags={}".format(mv.name, mv.run_id, mv.version, mv.tags))

name=flower_model_REGISTER_WITH_MODEL_URI; run_id=f40b4c44-d60e-4af1-aba7-5007f1d44006; version=2; tags={'DW': 'amlss', 'DV': '2', 'status': 'prod'}
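
Across this notebook the `status` tag moves from `draft` to `test` to `prod`. If that order should be enforced rather than left to convention, a small check before each tag update is one option. The transition table below is an assumption based on the values used here, and `next_status` is a hypothetical helper, not part of MLflow.

```python
# Hypothetical promotion order, based on the status values used in this notebook
ALLOWED_TRANSITIONS = {
    None: {"draft"},
    "draft": {"test"},
    "test": {"prod"},
    "prod": set(),
}


def next_status(current, requested):
    """Return `requested` if the move is allowed, otherwise raise ValueError."""
    if requested not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Cannot move status from {current!r} to {requested!r}")
    return requested


# Walk a model version through the full promotion path
status = None
for step in ("draft", "test", "prod"):
    status = next_status(status, step)
print(status)
```

The returned value can then be passed as the `value` argument when updating the `status` tag.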


In [None]:
mlflow.end_run()