# 🚀 Module 3: Model Packaging and Deployment on Kubernetes

In this module, we will:
1. Load the trained model from MLflow
2. Build a REST API for model inference
3. Create a Dockerfile to containerize the service
4. Deploy the container to a Kubernetes cluster (Minikube or OpenShift)
5. Optionally, expose and test the deployed service

## 📦 Import Required Libraries

Before we load the trained model from MLflow and package it for deployment, we import the libraries this module needs.


In [1]:
# Import necessary modules
import os
import joblib

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

import pandas as pd
import numpy as np

import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient

## 🧳 Select and Load a Trained Model Version from MLflow

In this step, we interact with the MLflow Model Registry to:

1. **List all available versions** of a registered model (`BikeSharingModel`) along with their metadata, such as version number, stage, and run ID.
2. **Prompt the user** to choose a specific version to use for deployment or analysis.
3. **Download the selected model** from the MLflow tracking server using the model URI.

This makes it easy to manage multiple iterations of a model and ensures reproducibility when deploying or testing specific versions.


In [2]:
# Initialize the MLflow client against the workshop tracking server
MLFLOW_TRACKING_URL = 'https://mlflow-mlflow.apps.cluster-x5r72.dynamic.redhatworkshops.io'
mlflow.set_tracking_uri(MLFLOW_TRACKING_URL)
client = MlflowClient()

model_name = "BikeSharingModel"

# List available versions
versions = client.search_model_versions(filter_string=f"name='{model_name}'", order_by=["version_number DESC"])

print("📦 Available versions for model:", model_name)
for v in versions:
    print(f"Version: {v.version}, Stage: {v.current_stage}, Status: {v.status}, Run ID: {v.run_id}")

# Ask the user to select a version
selected_version = input("Enter the version number you want to download: ").strip()

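# Load the selected version as a native scikit-learn estimator so the pickled
# file can be served without MLflow (assumes it was logged with the sklearn flavor)
model_uri = f"models:/{model_name}/{selected_version}"
model = mlflow.sklearn.load_model(model_uri=model_uri)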

print(f"✅ Model version {selected_version} loaded successfully from MLflow.")

📦 Available versions for model: BikeSharingModel
Version: 1, Stage: None, Status: READY, Run ID: a7750d532dd641e78c6c7879cc1b79ac


Enter the version number you want to download:  1


✅ Model version 1 loaded successfully from MLflow.
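The cell above passes whatever the user types straight into the model URI, so a typo only surfaces as an MLflow error. A minimal sketch of checking the input against the listed versions first is shown here; `choose_version` is a hypothetical helper, not part of MLflow.

```python
def choose_version(available, requested):
    """Return `requested` as a stripped string if it is one of `available`.

    `available` is any iterable of version identifiers (str or int), for
    example [v.version for v in versions]. Raises ValueError otherwise.
    """
    known = {str(v) for v in available}
    candidate = str(requested).strip()
    if candidate not in known:
        raise ValueError(
            f"Version {candidate!r} not found; available: {sorted(known)}"
        )
    return candidate


# Possible use in the notebook (hypothetical wiring):
# selected_version = choose_version(
#     [v.version for v in versions],
#     input("Enter the version number you want to download: "),
# )
```

Comparing everything as strings keeps the helper independent of whether the registry reports versions as `"1"` or `1`.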


## 💾 Save the Selected Model Locally

After downloading the desired model version from MLflow, we save it to the local `models/` directory using the `joblib` format.

This step is essential for:
- Packaging the model into a Docker container
- Making the model available to inference services (e.g., FastAPI or Flask)
- Versioning models on disk for offline use or audit trails

The model file is named using the selected version number to avoid confusion and maintain clarity (e.g., `bike_model_v3.pkl`).


In [3]:
# Save the model locally so it can be copied into the container image
os.makedirs("./models", exist_ok=True)
model_path = f"./models/bike_model_v{selected_version}.pkl"
joblib.dump(model, model_path)

print(f"✅ Model version {selected_version} downloaded from MLflow and saved to {model_path}")

✅ Model version 1 downloaded from MLflow and saved to ./models/bike_model_v1.pkl
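For the audit-trail use case mentioned above, it can help to record a checksum next to each saved model file. The following is a sketch using only the standard library; `write_checksum` and the `.sha256` sidecar naming are assumptions for illustration, not something the notebook or MLflow prescribes.

```python
import hashlib
from pathlib import Path


def write_checksum(model_path):
    """Write '<file>.sha256' next to the model file and return the hex digest."""
    path = Path(model_path)
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    # "<digest>  <filename>" is the line layout that `sha256sum -c` reads
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(f"{digest}  {path.name}\n")
    return digest


# Possible use after the save step (hypothetical wiring):
# write_checksum(model_path)
```

Reading the whole file into memory is fine for a small model like this one; a very large artifact would be better hashed in chunks.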


## 🛠️ Create a REST API using FastAPI
This API will load the model and expose an endpoint for predictions.

In [4]:
%%writefile ./models/app.py
from fastapi import FastAPI
import joblib
import pandas as pd

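app = FastAPI()
# The container image places the model at bike_model.pkl in the working directory
model = joblib.load("bike_model.pkl")

@app.post("/predict")
def predict(features: dict):
    # Wrap the single JSON object in a list to build a one-row DataFrame
    df = pd.DataFrame([features])
    # Convert the NumPy scalar to a built-in float for the JSON response
    prediction = float(model.predict(df)[0])
    return {"prediction": prediction}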

Writing ./models/app.py
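The `/predict` handler accepts any JSON object, so a missing or non-numeric field only fails once it reaches the model. A minimal sketch of validating the payload beforehand follows. The feature names are an assumption taken from the sample request used later in this module; the real list must match the columns the model was trained on, and `validate_features` is a hypothetical helper.

```python
from numbers import Real

# Assumed feature names; replace with the model's actual training columns.
EXPECTED_FEATURES = ("temp", "hum", "windspeed")


def validate_features(features, expected=EXPECTED_FEATURES):
    """Return only the expected features as floats, or raise ValueError."""
    missing = [name for name in expected if name not in features]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    cleaned = {}
    for name in expected:
        value = features[name]
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(
                f"Feature {name!r} must be numeric, got {type(value).__name__}"
            )
        cleaned[name] = float(value)
    return cleaned
```

Inside the handler, a `ValueError` from this helper could be converted into a client error by raising `fastapi.HTTPException` with `status_code=422`. Unexpected extra keys are dropped so the DataFrame contains only known columns.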


## 📦 Containerize with Docker
Create a Dockerfile for the FastAPI app.

In [5]:
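%%writefile ./models/Dockerfile
FROM python:3.9-slim
WORKDIR /app
# Model version to package; override with --build-arg MODEL_VERSION=<n>
ARG MODEL_VERSION=1
# Copy the versioned model under the fixed name that app.py loads
COPY bike_model_v${MODEL_VERSION}.pkl ./bike_model.pkl
COPY app.py ./
# scikit-learn is required to unpickle the trained model
RUN pip install --no-cache-dir "fastapi[all]" joblib pandas scikit-learn
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Writing ./models/Dockerfile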


## 🧱 Build and Run Docker Container Locally

In [5]:
!docker build -t bike-api ./models
!docker run -d -p 8000:8000 --name bike-api bike-api

/usr/bin/sh: line 1: docker: command not found
/usr/bin/sh: line 1: docker: command not found


## ☸️ Deploy to Kubernetes
Create a Kubernetes deployment and service manifest.

In [None]:
%%writefile ./models/k8s_deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bike-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bike-api
  template:
    metadata:
      labels:
        app: bike-api
    spec:
      containers:
      - name: bike-api
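        image: bike-api:latest
        # Use the locally built image instead of pulling (the default for :latest is Always);
        # the image must already exist on the node, e.g. via `minikube image load bike-api`
        imagePullPolicy: IfNotPresent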
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: bike-api-service
spec:
  selector:
    app: bike-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: NodePort

## 🚀 Apply the Manifest to the Cluster

In [None]:
!kubectl apply -f ./models/k8s_deployment.yaml

## 🧪 Test the API Endpoint

In [None]:
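# Replace <NodePort> with the port shown by `kubectl get service bike-api-service`.
# On Minikube, also replace localhost with the address printed by `minikube ip`.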
!curl -X POST "http://localhost:<NodePort>/predict" -H "Content-Type: application/json" -d '{"temp": 25, "hum": 0.8, "windspeed": 0.1}'
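The same request can be sent from Python, which is convenient when the response feeds into further notebook cells. This is a sketch using only the standard library; `build_predict_request` and `predict` are hypothetical helper names, and the base URL still has to be filled in with the actual host and NodePort.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build a POST request for the /predict endpoint without sending it."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, features, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Possible use once the service is reachable:
# predict("http://localhost:<NodePort>", {"temp": 25, "hum": 0.8, "windspeed": 0.1})
```

Splitting request construction from sending makes the payload and URL easy to inspect before any network call is made.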

## ✅ Summary
- Loaded a registered model version from MLflow and saved it locally
- Built a FastAPI service for prediction
- Containerized the API using Docker
- Deployed the container to Kubernetes
- Exposed and tested the endpoint