# 04a. Model Registration to Unity Catalog

**Purpose**: Register the trained model from its MLflow run to Unity Catalog

**Prerequisites**:
* Model training completed (03_model_training.ipynb)
* Training run exists in MLflow experiment

**Outputs**:
* Model registered in Unity Catalog
* Model version number
* Tags and descriptions applied

In [0]:
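# Install dependencies
# Quote version specifiers so that ">" is not interpreted as shell output redirection
%pip install --upgrade "typing_extensions>=4.6.0" "pydantic>=2.0.0" --quiet
%pip install azure-storage-file-datalake azure-identity azure-core --quiet
# Use %pip rather than !pip so the packages land in the notebook-scoped environment
%pip install -r /Workspace/Users/ashish.kamboj@tigeranalytics.com/home-credit-hyperpersonalization/requirements.txt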

dbutils.library.restartPython()

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
import sys
import os
import mlflow
import pandas as pd

# Dynamically determine project root
project_root = os.path.dirname(os.getcwd()) if os.getcwd().endswith('notebooks') else os.getcwd()
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils.common_utils import load_config, setup_logging, get_mlflow_client, print_section_header
from utils.model_deployment import (
    fetch_run_metrics, apply_model_tags, update_model_descriptions
)

print("✅ Imports successful")

✅ Imports successful
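
The root detection in the import cell only works when the working directory is the project root or a directory whose path ends in `notebooks`. A more tolerant variant is sketched below, under the assumption that the project root is the nearest ancestor directory containing a `utils` folder. The name `find_project_root` is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "utils") -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) that contains `marker`.

    Assumption: the project root is identified by a `utils` directory.
    Returns None when no ancestor contains the marker.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


# Usage: resolve from the current working directory, whatever its depth.
root = find_project_root(Path.cwd())
print(f"Project root: {root}")
```

If a root is found, `str(root)` could be inserted into `sys.path` in the same way the cell above does with `project_root`.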


In [0]:
config = load_config('../config/config.yaml')
setup_logging(config)
mlflow_client = get_mlflow_client(config)

print_section_header("Model Registration to Unity Catalog")


                      Model Registration to Unity Catalog                       



In [0]:
# Get latest training run from MLflow
experiment_name = config['mlflow']['databricks']['experiment_name']
print(f"Looking for training runs in experiment: {experiment_name}")

experiment = mlflow.get_experiment_by_name(experiment_name)
if experiment is None:
    raise ValueError(f"Experiment '{experiment_name}' not found. Run training first.")

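# Restrict the search to finished runs so a failed or still-running run is never registered
runs = mlflow.search_runs(
    experiment_ids=[experiment.experiment_id],
    filter_string="attributes.status = 'FINISHED'",
    order_by=["start_time DESC"],
    max_results=1
)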

if len(runs) == 0:
    raise ValueError("No training runs found. Please run model training first.")

run_id = runs.iloc[0]['run_id']
model_uri = f"runs:/{run_id}/model"

print(f"✅ Found latest model run: {run_id}")
print(f"🔗 Model URI: {model_uri}")

# Fetch run metrics
run_metrics = fetch_run_metrics(run_id)
print("\n📊 Run Metrics:")
for k, v in run_metrics.items():
    print(f"  {k}: {v:.4f}")

Looking for training runs in experiment: /Users/ashish.kamboj@tigeranalytics.com/next-best-product-recommendation
✅ Found latest model run: b5b7dcadd50b4c23aad9ace5dc793cf4
🔗 Model URI: runs:/b5b7dcadd50b4c23aad9ace5dc793cf4/model

📊 Run Metrics:
  accuracy: 0.1053
  f1_weighted: 0.1057
  precision_weighted: 0.1155
  recall_weighted: 0.1053
  roc_auc_ovr: 0.4963
  top_1_accuracy: 0.0000
  top_3_accuracy: 0.0000
  top_5_accuracy: 0.0000
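
`fetch_run_metrics` comes from `utils.model_deployment` and its body is not shown in this notebook. The following is a minimal sketch of what such a helper might do, assuming it reads the metrics dictionary from the run returned by an MLflow client's `get_run`. The names `fetch_run_metrics_sketch`, `format_metrics`, and `StubClient` are hypothetical. The stub stands in for `MlflowClient` so the example runs without a tracking server.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def fetch_run_metrics_sketch(client: Any, run_id: str) -> Dict[str, float]:
    """Hypothetical stand-in for utils.model_deployment.fetch_run_metrics.

    Assumes `client` exposes get_run(run_id) like mlflow.tracking.MlflowClient,
    whose result carries the latest metric values in `run.data.metrics`.
    """
    run = client.get_run(run_id)
    return dict(run.data.metrics)


def format_metrics(metrics: Dict[str, Any]) -> List[str]:
    """Format metrics for printing; non-numeric values are shown unformatted."""
    lines = []
    for name in sorted(metrics):
        value = metrics[name]
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            lines.append(f"  {name}: {value:.4f}")
        else:
            lines.append(f"  {name}: {value}")
    return lines


class StubClient:
    """Test double that mimics the single client method used above."""

    def __init__(self, metrics: Dict[str, float]):
        self._metrics = metrics

    def get_run(self, run_id: str) -> SimpleNamespace:
        return SimpleNamespace(data=SimpleNamespace(metrics=self._metrics))


stub = StubClient({"accuracy": 0.1053, "roc_auc_ovr": 0.4963})
metrics = fetch_run_metrics_sketch(stub, "b5b7dcadd50b4c23aad9ace5dc793cf4")
for line in format_metrics(metrics):
    print(line)
```

Passing the client in as an argument keeps the helper easy to test. The project's own helper takes only `run_id`, so it presumably obtains its client elsewhere.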


In [0]:
# Register model to Unity Catalog
model_name = config['mlflow']['databricks']['registered_model_name']

print("\nRegistering model to Unity Catalog...")
print(f"Model name: {model_name}")
print(f"Model URI: {model_uri}")

result = mlflow.register_model(
    model_uri=model_uri,
    name=model_name
)

model_version = result.version
print("\n✅ Model registered successfully!")
print(f"Version: {model_version}")


Registering model to Unity Catalog...
Model name: datafabric_catalog.customer_hc_silver.next_best_product_model
Model URI: runs:/b5b7dcadd50b4c23aad9ace5dc793cf4/model


Registered model 'datafabric_catalog.customer_hc_silver.next_best_product_model' already exists. Creating a new version of this model...



✅ Model registered successfully!
Version: 5


🔗 Created version '5' of model 'datafabric_catalog.customer_hc_silver.next_best_product_model': https://adb-1364099644588382.2.azuredatabricks.net/explore/data/models/datafabric_catalog/customer_hc_silver/next_best_product_model/version/5?o=1364099644588382
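
Unity Catalog addresses models by a three-level name of the form `catalog.schema.model`, as in the registered name above. MLflow also needs its registry URI to point at Unity Catalog (for example via `mlflow.set_registry_uri("databricks-uc")`); this notebook presumably handles that inside `get_mlflow_client`, which is an assumption since the helper is not shown. The sketch below checks the name format before registration. `split_uc_model_name` is a hypothetical helper, not part of the project or of MLflow.

```python
from typing import Tuple


def split_uc_model_name(full_name: str) -> Tuple[str, str, str]:
    """Split 'catalog.schema.model' into its parts, or raise ValueError.

    Hypothetical pre-flight check: it only validates the shape of the name,
    not whether the catalog or schema actually exists.
    """
    parts = full_name.split(".")
    if len(parts) != 3 or any(not part.strip() for part in parts):
        raise ValueError(
            f"Expected a three-level name 'catalog.schema.model', got: {full_name!r}"
        )
    catalog, schema, model = parts
    return catalog, schema, model


catalog, schema, model = split_uc_model_name(
    "datafabric_catalog.customer_hc_silver.next_best_product_model"
)
print(f"catalog={catalog} schema={schema} model={model}")
```

Running such a check before `mlflow.register_model` surfaces a misconfigured `registered_model_name` before any artifacts are copied.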


In [0]:
# Apply descriptions
workflow_cfg = config.get('deployment', {})
model_desc = workflow_cfg.get('model_descriptions', {}).get('registered', 'Next Best Product Recommendation Model')
version_desc = f"Trained with current feature set & includes Top-K metrics | Run: {run_id[:15]}"

update_model_descriptions(model_name, model_version, model_desc, version_desc)

# Apply tags
base_tags = workflow_cfg.get('tags', {})
metric_tags = {f"metric_{k}": f"{v:.4f}" for k, v in run_metrics.items() if isinstance(v, (int, float))}
all_tags = {
    **base_tags, 
    **metric_tags, 
    'source_run_id': run_id, 
    'environment': config['environment']['mode']
}
apply_model_tags(model_name, model_version, all_tags)

print("✅ Tags & descriptions applied")

✅ Tags & descriptions applied
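
`apply_model_tags` is also defined in `utils.model_deployment` and is not shown here. As a minimal sketch, assuming it sets one model-version tag per key through an MLflow client's `set_model_version_tag`, the tagging step could look like the following. `build_metric_tags`, `apply_model_tags_sketch`, and `RecordingClient` are hypothetical names. The recording client replaces `MlflowClient` so the example runs offline.

```python
from typing import Any, Dict, List, Tuple


def build_metric_tags(run_metrics: Dict[str, Any], prefix: str = "metric_") -> Dict[str, str]:
    """Turn numeric metrics into string tags, mirroring the cell above.

    Unlike the cell above, booleans are skipped explicitly, because
    isinstance(True, int) is True in Python.
    """
    return {
        f"{prefix}{name}": f"{value:.4f}"
        for name, value in run_metrics.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def apply_model_tags_sketch(client: Any, model_name: str, version: Any, tags: Dict[str, Any]) -> None:
    """Hypothetical stand-in for apply_model_tags: one tag call per entry."""
    for key, value in tags.items():
        client.set_model_version_tag(
            name=model_name, version=str(version), key=key, value=str(value)
        )


class RecordingClient:
    """Test double that records calls instead of contacting a registry."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, str, str]] = []

    def set_model_version_tag(self, name: str, version: str, key: str, value: str) -> None:
        self.calls.append((name, version, key, value))


client = RecordingClient()
tags = {
    **build_metric_tags({"accuracy": 0.1053, "label": "n/a"}),
    "source_run_id": "b5b7dcadd50b4c23aad9ace5dc793cf4",
}
apply_model_tags_sketch(client, "catalog.schema.model", 5, tags)
print(client.calls)
```

Tag values are converted to strings before the call, since MLflow stores tag values as text.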


In [0]:
print_section_header("Registration Summary")

print(f"""
✅ Model Registration Complete!

Model: {model_name}
Version: {model_version}
Run ID: {run_id}

Metrics:
""")

for metric, value in run_metrics.items():
    print(f"  {metric}: {value:.4f}")

print("""
👉 Next Steps:
1. Run 04b_set_staging_alias.ipynb to set 'staging' alias
2. Test the model in staging environment
3. Run 04c_set_production_alias.ipynb to promote to production
4. Run 04d_deploy_endpoint.ipynb to create serving endpoint
""")


                              Registration Summary                              


✅ Model Registration Complete!

Model: datafabric_catalog.customer_hc_silver.next_best_product_model
Version: 5
Run ID: b5b7dcadd50b4c23aad9ace5dc793cf4

Metrics:

  accuracy: 0.1053
  f1_weighted: 0.1057
  precision_weighted: 0.1155
  recall_weighted: 0.1053
  roc_auc_ovr: 0.4963
  top_1_accuracy: 0.0000
  top_3_accuracy: 0.0000
  top_5_accuracy: 0.0000

👉 Next Steps:
1. Run 04b_set_staging_alias.ipynb to set 'staging' alias
2. Test the model in staging environment
3. Run 04c_set_production_alias.ipynb to promote to production
4. Run 04d_deploy_endpoint.ipynb to create serving endpoint

