# Model deployment
You've trained a model and evaluated it on the test dataset to make sure it meets your performance criteria. The next step is to put the model into production, where it can receive input from the real world and make inferences. Model deployment is the process of integrating your model into a production environment so that its inferences are available to users.

This week's tutorials are split into several sections. We'll start with the different patterns for deploying a model: you'll see how to embed a model directly into an application and how to deploy a model as a separate inference service using KServe. Next, you'll get hands-on experience with more KServe features, including canary deployment, horizontal scaling, and inference graphs. In the overview architecture of our MLOps platform, shown below, the focus of this week is KServe.

<img src="./images/overview-kserve-focus.jpg" width=700 />
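
To make the two deployment patterns concrete, here is a minimal sketch using only the standard library. It is not the code used in the tutorials: `TinyModel`, `embedded_predict`, `build_v1_request`, and the host name `wine.example.local` are hypothetical names chosen for illustration. The request shape assumes KServe's V1 inference protocol, where a prediction request is a `POST` to `/v1/models/<model_name>:predict` with an `instances` list in the JSON body.

```python
import json


class TinyModel:
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, rows):
        # One score per input row: dot(weights, row) + bias
        return [
            sum(w * x for w, x in zip(self.weights, row)) + self.bias
            for row in rows
        ]


def embedded_predict(model, rows):
    """Pattern 1 (embedded): the application holds the model and calls it in-process."""
    return model.predict(rows)


def build_v1_request(host, model_name, rows):
    """Pattern 2 (inference service): the application only builds an HTTP request.

    The model itself runs behind the endpoint. The URL and body follow the
    KServe V1 protocol; the host is an assumption made for this sketch.
    """
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": rows})
    return url, body


model = TinyModel(weights=[0.5, 2.0], bias=1.0)
print(embedded_predict(model, [[2.0, 1.0]]))

url, body = build_v1_request("wine.example.local", "wine-quality", [[2.0, 1.0]])
print(url)
print(body)
```

In the embedded pattern, updating the model means redeploying the application. In the service pattern, the application depends only on the request format, so the model can be updated or scaled independently.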

## Preparation
### 1. Create a new Conda environment
Before getting started, let's create a new Conda environment:
```bash
# Run these commands from the same directory as this notebook
conda env create -f mlops_eng2_environment.yml
conda activate mlops_eng2
```
**Please run the tutorials and do the assignments using this `mlops_eng2` environment.**
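
If you want to confirm from inside a notebook that the right environment is active, a small check like the following may help. It is only a sketch: it assumes Jupyter was started from a shell where `conda activate mlops_eng2` had already been run, so that Conda's `CONDA_DEFAULT_ENV` environment variable is visible to the kernel. The helper names `active_conda_env` and `is_expected_env` are hypothetical.

```python
import os

EXPECTED_ENV = "mlops_eng2"


def active_conda_env(environ=os.environ):
    """Return the name of the active Conda environment, or None if it is not set."""
    return environ.get("CONDA_DEFAULT_ENV")


def is_expected_env(environ=os.environ, expected=EXPECTED_ENV):
    """Return True if the active Conda environment matches the expected name."""
    return active_conda_env(environ) == expected


print(f"Active Conda environment: {active_conda_env()!r}")
if not is_expected_env():
    print(f"Warning: expected the {EXPECTED_ENV!r} environment to be active.")
```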

### 2. Train a model
Next, let's train a model that we'll use in the tutorials. We'll use the red wine quality dataset to train an sklearn model. 

In [1]:
import os
import logging

import mlflow
import mlflow.sklearn
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

# Set the "MLFLOW_S3_ENDPOINT_URL" environment variable so that the MLflow client knows where to save artifacts.
# The MinIO storage service can be accessed via http://mlflow-minio.local
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://mlflow-minio.local"

# Configure the credentials needed for accessing the MinIO storage service.
# "AWS_ACCESS_KEY_ID" was configured in a ConfigMap and "AWS_SECRET_ACCESS_KEY" in a Secret in your K8s cluster when you set up the MLOps platform.
os.environ["AWS_ACCESS_KEY_ID"] = "minioadmin"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minioadmin"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

MLFLOW_TRACKING_URI = "http://mlflow-server.local" # This is the URL of the MLflow service
MLFLOW_EXPERIMENT_NAME = "week4-red-wine-quality"


def main():
    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )

    data = pd.read_csv(csv_url, sep=";")

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality", an integer score (3 to 8 in the red wine dataset)
    train_x = train.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    
    # Just use hard-coded hyperparameters
    alpha = 0.5
    l1_ratio = 0.5

    logger.info(f"Using MLflow tracking URI: {MLFLOW_TRACKING_URI}")

    # Configure the MLflow client to connect to the MLflow service
    mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

    logger.info(f"Using MLflow experiment: {MLFLOW_EXPERIMENT_NAME}")
    mlflow.set_experiment(MLFLOW_EXPERIMENT_NAME)

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)

        logger.info("Fitting model...")

        lr.fit(train_x, train_y)

        logger.info("Finished fitting")

        logger.info("Logging parameters to MLflow")
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)

        logger.info("Logging trained model")
        artifact_name = "wine-quality"
        mlflow.sklearn.log_model(
            lr, artifact_name, registered_model_name="Week4ElasticnetWineModel")
        print("The S3 URI of the logged model:", mlflow.get_artifact_uri(artifact_path=artifact_name))

main()

INFO:__main__:Using MLflow tracking URI: http://mlflow-server.local
INFO:__main__:Using MLflow experiment: week4-red-wine-quality
INFO:__main__:Fitting model...
INFO:__main__:Finished fitting
INFO:__main__:Logging parameters to MLflow
INFO:__main__:Logging trained model
INFO:botocore.credentials:Found credentials in environment variables.
Registered model 'Week4ElasticnetWineModel' already exists. Creating a new version of this model...
2024/11/28 14:28:13 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: Week4ElasticnetWineModel, version 5


The S3 URI of the logged model: s3://mlflow/6/15e32724433c43a884bd9462758a5e64/artifacts/wine-quality


Created version '5' of model 'Week4ElasticnetWineModel'.
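
The training cell above splits off a test set and imports `mean_squared_error`, but it does not evaluate the model. One way to add that step is sketched below. The helper name `regression_metrics` is hypothetical, and it uses plain Python rather than scikit-learn so that it stays self-contained. The commented lines show how it could be wired into `main()`, assuming the variable names from the cell above.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE and MAE for two equal-length sequences of numbers."""
    y_true = list(y_true)
    y_pred = list(y_pred)
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"rmse": rmse, "mae": mae}


# Inside main(), after lr.fit(...), this could be used roughly as follows
# (an assumption, not part of the original cell):
#   test_x = test.drop(["quality"], axis=1)
#   actual = test["quality"].tolist()
#   predicted = lr.predict(test_x).ravel().tolist()
#   mlflow.log_metrics(regression_metrics(actual, predicted))

# Small standalone example with made-up values
print(regression_metrics([5, 6, 7], [5, 5, 9]))
```

Logging these metrics alongside the parameters makes it easier to compare model versions in the MLflow UI before deciding which one to deploy.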


This week's tutorials consist of several notebooks:

1. [Model deployment patterns](./tutorials/1_deployment_patterns.ipynb)
2. [Canary deployment](./tutorials/2_canary_deployment.ipynb)
3. [Horizontal scaling](./tutorials/3_horizontal_scaling.ipynb)
4. [Inference graph](./tutorials/4_inference_graph.ipynb)