In [0]:
%run ../demo_setup/_init 

# 4. 🚀 Model Serving

<div style="text-align: center;">
  <img src="../demo_setup/images/model_serving.png" width="1200px"/> 
</div>

#### Simplified deployment for all AI models and agents

Deploy any model type, from pretrained open source models to custom models built on your own data, on both CPUs and GPUs. Automated container builds and infrastructure management reduce maintenance costs and speed up deployment, so you can focus on building your AI agent systems and delivering value to your business faster.

#### Unified management for all models

Manage all models, including custom ML models like PyFunc, scikit-learn and LangChain, foundation models (FMs) on Databricks like Llama 3, MPT and BGE, and foundation models hosted elsewhere like ChatGPT, Claude 3, Cohere and Stable Diffusion. Model Serving makes all models accessible through a unified user interface and API, whether they are hosted by Databricks or by another model provider on Azure or AWS.


#### Governance built-in

Integrate with Mosaic AI Gateway to meet stringent security and advanced governance requirements. You can enforce proper permissions, monitor model quality, set rate limits, and track lineage across all models whether they are hosted by Databricks or on any other model provider.

![](https://www.databricks.com/sites/default/files/2023-09/simplified-deployment.png?v=1696033263)

# 4.1 🚀 CI/CD with Deployment Jobs

A key final step in the end-to-end machine learning lifecycle is to deploy our model as a REST API endpoint using **Databricks Model Serving.**
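
As a rough illustration of what "a REST API endpoint" means in practice, the sketch below assembles (but does not send) a scoring request. The `/serving-endpoints/<name>/invocations` path and the `dataframe_records` payload key follow the Model Serving request format as we understand it; the host, endpoint name, feature names, and token are placeholders invented for this example and should be replaced with your own.

```python
import json


def build_invocation_request(workspace_host: str, endpoint_name: str,
                             records: list, token: str) -> dict:
    """Assemble the URL, headers, and JSON body for a scoring request.

    Nothing is sent over the network; pass the result to an HTTP client
    of your choice once the endpoint exists.
    """
    return {
        "url": f"https://{workspace_host}/serving-endpoints/{endpoint_name}/invocations",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        # One dict per input row, keyed by feature name.
        "body": json.dumps({"dataframe_records": records}),
    }


request = build_invocation_request(
    workspace_host="example.cloud.databricks.com",   # placeholder host
    endpoint_name="si_model_endpoint",               # hypothetical endpoint name
    records=[{"feature_a": 1.0, "feature_b": 2.0}],  # hypothetical features
    token="<personal-access-token>",                 # placeholder credential
)
print(request["url"])
```

Keeping request construction separate from the HTTP call makes it easy to inspect or log the payload before any traffic reaches the endpoint.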

To deploy the model, we follow **CI/CD** (continuous integration, continuous deployment) best practices and automatically deploy it whenever a new version is registered in Unity Catalog. We do this with **Deployment Jobs.**

![](../demo_setup/images/simple-deployment-job.png)

Our deployment job has the following steps:
- Approve model deployment
- Deploy new version of the model as a REST endpoint

This process is automated end to end. The only manual step is approving the deployment after the job detects that a new model version has been registered.

![](../demo_setup/images/deployment_approval.png)

See documentation here: [Deployment Jobs](https://docs.databricks.com/aws/en/mlflow/deployment-job)
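
The two steps above (approve, then deploy) can be sketched in plain Python. This is a simplified illustration, not the job's actual implementation: the approval tag key `Approval_Check` and value `Approved` are assumptions about how approval is recorded on the model version, the `served_entities` field names follow the serving endpoint configuration format as we understand it, and the model name `main.demo.si_model` uses a hypothetical catalog and schema. Check the Deployment Jobs documentation for the exact conventions in your workspace.

```python
def is_approved(version_tags: dict, approval_tag_key: str = "Approval_Check") -> bool:
    """Return True if the model version carries the approval tag.

    The tag key and the "approved" value are assumptions for this sketch.
    """
    return version_tags.get(approval_tag_key, "").strip().lower() == "approved"


def build_endpoint_config(model_full_name: str, model_version: str) -> dict:
    """Build a serving endpoint config for one model version (nothing is deployed here)."""
    return {
        "served_entities": [
            {
                "entity_name": model_full_name,   # catalog.schema.model
                "entity_version": model_version,
                "workload_size": "Small",         # assumed default size
                "scale_to_zero_enabled": True,
            }
        ]
    }


def run_deployment_job(model_full_name: str, model_version: str, version_tags: dict) -> dict:
    """Step 1: approval gate. Step 2: produce the endpoint config to deploy."""
    if not is_approved(version_tags):
        raise PermissionError(
            f"Version {model_version} of {model_full_name} has not been approved"
        )
    return build_endpoint_config(model_full_name, model_version)


config = run_deployment_job(
    "main.demo.si_model",                  # hypothetical catalog and schema
    "2",
    {"Approval_Check": "Approved"},
)
print(config["served_entities"][0]["entity_version"])
```

Failing loudly when the approval tag is missing mirrors the behavior described above: the job stops at the approval step until someone signs off, and only then proceeds to deployment.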


In [0]:
# Generate workspace URL using current notebook context
workspace_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().browserHostName().get()

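# Build the Catalog Explorer URL for the registered si_model, where its deployment job can be viewed
# (catalog_name and schema_name are expected to come from the setup notebook run at the top)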
deployment_job_url = f"https://{workspace_url}/explore/data/models/{catalog_name}/{schema_name}/si_model"

displayHTML(f"""
<div style="padding: 10px; border-left: 4px solid #0073e6; background-color: #f0f8ff; margin: 10px 0;">
    <h4 style="margin: 0; color: #0073e6;">🚀 Deployment Job for Si Model</h4>
    <p style="margin: 5px 0 0 0;">
        <a href="{deployment_job_url}" target="_blank" style="font-size: 16px; color: #0073e6; text-decoration: none;">
            Click here to view the deployment job →
        </a>
    </p>
</div>
""")