## MODEL DEPLOYMENT WITH FASTAPI

# 🎯 Aim of the Notebook: Deploying a Machine Learning Model as a Web Service with FastAPI and Docker

This notebook walks through deploying a trained machine learning model as a RESTful web service using **FastAPI**, **Uvicorn**, and **Docker**, with dependencies installed by `pip` from a `requirements.txt` file. Clients send ride features as JSON over HTTP and receive a predicted trip duration in minutes. The service listens on port `9696`, both when run locally and inside the container.
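
As a rough illustration of that request/response cycle, the sketch below builds the JSON `POST` request a client might send and parses the reply. It assumes the service is reachable at `http://localhost:9696` and uses the `/predict` route and field names defined in `main.py` later in this notebook; the helper names `build_predict_request` and `parse_prediction` are hypothetical.

```python
import json
import urllib.request

# Assumed base URL: the service running locally on the notebook's port.
BASE_URL = "http://localhost:9696"


def build_predict_request(ride: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /predict route."""
    body = json.dumps(ride).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(raw: bytes) -> float:
    """Extract the predicted duration (minutes) from a /predict response body."""
    payload = json.loads(raw)
    return float(payload["predicted_duration"])


ride = {"PULocationID": 75, "DOLocationID": 235, "trip_distance": 5.93}
request = build_predict_request(ride)

# Sending requires the server to be running, so it is left commented out:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(parse_prediction(response.read()))
```

Only the standard library is used here, so the client side needs no extra dependencies.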

---

## ✅ Steps Overview


1. **Write a Python script to handle model predictions**
2. **Embed the prediction logic in a FastAPI application**
3. **Serve the application with Uvicorn**
4. **Package the application with Docker**
5. **Run the container and test the endpoint**

---

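## 📦 Build and Run the Docker Image

With the `Dockerfile`, `requirements.txt`, and `main.py` created later in this notebook in place, build the image and start the container from the project directory:

```bash
docker build -t ride-duration-prediction-service:v1 .
docker run -it --rm -p 9696:9696 ride-duration-prediction-service:v1
```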

In [1]:
import joblib
import pandas as pd



In [2]:
preprocessor = joblib.load("preprocessing.pkl")

In [3]:
from xgboost import XGBRegressor

# Load the trained regression model from its saved .ubj file
model = XGBRegressor()
model.load_model('my_model.ubj')

In [4]:
# Build a single-row DataFrame for a new ride
df = pd.DataFrame([{
    "PULocationID": 75,
    "DOLocationID": 235,
    "trip_distance": 5.93
}])

# Transform the new data
X_processed = preprocessor.transform(df)

In [5]:
X_processed

array([[ 1.18426239, 14.71795397, 14.71795397]])

In [6]:
prediction = model.predict(X_processed)

In [8]:
prediction[0]

19.054037

In [19]:
%%writefile predict.py

import joblib
import pandas as pd
from xgboost import XGBRegressor

def load_preprocessor(path: str):
    try:
        preprocessor = joblib.load(path)
        print("[INFO] Preprocessor loaded successfully.")
        return preprocessor
    except FileNotFoundError:
        print(f"[ERROR] Preprocessor file not found: {path}")
        raise
    except Exception as e:
        print(f"[ERROR] Failed to load preprocessor: {e}")
        raise

def load_model(path: str):
    try:
        model = XGBRegressor()
        model.load_model(path)
        print("[INFO] Model loaded successfully.")
        return model
    except FileNotFoundError:
        print(f"[ERROR] Model file not found: {path}")
        raise
    except Exception as e:
        print(f"[ERROR] Failed to load model: {e}")
        raise

def predict_duration(preprocessor, model, ride_df):
    try:
        X_processed = preprocessor.transform(ride_df)
        prediction = model.predict(X_processed)
        return prediction[0]
    except Exception as e:
        print(f"[ERROR] Prediction failed: {e}")
        raise

def predict_from_dict(ride: dict):
    """Predict the trip duration for a single ride dictionary; return None on failure."""
    try:
        preprocessor = load_preprocessor("preprocessing.pkl")
        model = load_model("my_model.ubj")

        df = pd.DataFrame([ride])
        return predict_duration(preprocessor, model, df)
    except Exception as e:
        print(f"[ERROR] Failed to predict from dict: {e}")
        return None

if __name__ == "__main__":
    # Example ride; predict_duration expects a one-row DataFrame, not a dict
    ride = {
        "PULocationID": 75,
        "DOLocationID": 235,
        "trip_distance": 5.93
    }
    try:
        preprocessor = load_preprocessor("preprocessing.pkl")
        model = load_model("my_model.ubj")
        predicted_duration = predict_duration(preprocessor, model, pd.DataFrame([ride]))
        print(f"[RESULT] Predicted trip duration: {predicted_duration:.2f} minutes")
    except Exception:
        print("[FAILED] Prediction pipeline could not complete.")


Overwriting predict.py


In [26]:
%%writefile test.py


import predict

ride = {
    "PULocationID": 75,
    "DOLocationID": 40,
    "trip_distance": 5
}

duration = predict.predict_from_dict(ride)

if duration is not None:
    print(f"Predicted duration: {duration:.2f} minutes")
else:
    print("Prediction failed.")



Overwriting test.py


In [27]:
!python test.py

[INFO] Preprocessor loaded successfully.
[INFO] Model loaded successfully.
Predicted duration: 18.37 minutes


## CREATE THE FASTAPI APP

In [35]:
import os
os.makedirs("templates", exist_ok=True)

In [37]:
%%writefile templates/predict_form.html

<!DOCTYPE html>
<html>
<head>
    <title>Trip Duration Predictor</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background: #f4f6f9;
            padding: 20px;
        }
        .container {
            background: white;
            padding: 30px;
            border-radius: 12px;
            max-width: 500px;
            margin: auto;
            box-shadow: 0 0 10px rgba(0,0,0,0.1);
        }
        input[type="number"] {
            width: 100%;
            padding: 8px;
            margin: 8px 0 20px;
            border: 1px solid #ccc;
            border-radius: 6px;
        }
        button {
            background: #007BFF;
            color: white;
            border: none;
            padding: 10px 20px;
            border-radius: 6px;
            cursor: pointer;
        }
        button:hover {
            background: #0056b3;
        }
        .result {
            margin-top: 20px;
            background: #e6f7ff;
            padding: 10px;
            border-left: 4px solid #007BFF;
        }
    </style>
</head>
<body>
    <div class="container">
        <h2>Trip Duration Predictor</h2>
        <form method="post">
            <label>PULocationID:</label>
            <input type="number" name="PULocationID" required>

            <label>DOLocationID:</label>
            <input type="number" name="DOLocationID" required>

            <label>Trip Distance (miles):</label>
            <input type="number" step="0.01" name="trip_distance" required>

            <button type="submit">Predict Duration</button>
        </form>

        {% if result %}
        <div class="result">
            <strong>Prediction:</strong> {{ result }} minutes
        </div>
        {% endif %}
    </div>
</body>
</html>



Overwriting templates/predict_form.html


In [38]:
%%writefile main.py


from fastapi import FastAPI, HTTPException, Request, Form
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from pydantic import BaseModel
import joblib
import pandas as pd
from xgboost import XGBRegressor
import uvicorn
import logging
import warnings
from typing import List

# Suppress category_encoders warning
warnings.filterwarnings('ignore', category=FutureWarning, module='category_encoders')

# Logging config
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Initialize FastAPI app
app = FastAPI(
    title="Trip Duration Prediction API",
    description="API and Web UI for predicting NYC taxi trip duration using XGBoost",
    version="1.0.0"
)

# Templates setup
templates = Jinja2Templates(directory="templates")

# Globals
preprocessor = None
model = None

# Input schema
class RideData(BaseModel):
    PULocationID: int
    DOLocationID: int
    trip_distance: float

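    # Pydantic v2 renamed `schema_extra` to `json_schema_extra`; the old
    # `class Config: schema_extra` form only triggers a rename warning at startup.
    model_config = {
        "json_schema_extra": {
            "example": {
                "PULocationID": 75,
                "DOLocationID": 235,
                "trip_distance": 5.93
            }
        }
    }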

# Output schema
class PredictionResponse(BaseModel):
    predicted_duration: float
    status: str
    message: str

# Load preprocessor
def load_preprocessor(path: str):
    try:
        preprocessor = joblib.load(path)
        logger.info("Preprocessor loaded.")
        return preprocessor
    except Exception as e:
        logger.error(f"Error loading preprocessor: {e}")
        raise

# Load model
def load_model(path: str):
    try:
        model = XGBRegressor()
        model.load_model(path)
        logger.info("Model loaded.")
        return model
    except Exception as e:
        logger.error(f"Error loading model: {e}")
        raise

# Prediction logic
def predict_duration(preprocessor, model, ride_df):
    try:
        X_processed = preprocessor.transform(ride_df)
        prediction = model.predict(X_processed)
        return float(prediction[0])
    except Exception as e:
        logger.error(f"Prediction error: {e}")
        raise

# Load on startup
@app.on_event("startup")
async def startup_event():
    global preprocessor, model
    preprocessor = load_preprocessor("preprocessing.pkl")
    model = load_model("my_model.ubj")

# Health check
@app.get("/")
async def root():
    return {"message": "Trip Duration Prediction API is running"}

@app.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "preprocessor_loaded": preprocessor is not None,
        "model_loaded": model is not None
    }

# Single prediction
@app.post("/predict", response_model=PredictionResponse)
async def predict(ride_data: RideData):
    if preprocessor is None or model is None:
        raise HTTPException(status_code=500, detail="Models not loaded.")
    try:
        # model_dump() is the Pydantic v2 replacement for the deprecated .dict()
        df = pd.DataFrame([ride_data.model_dump()])
        duration = predict_duration(preprocessor, model, df)
        return PredictionResponse(
            predicted_duration=duration,
            status="success",
            message="Prediction completed successfully"
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Prediction failed: {str(e)}")

# Batch prediction
@app.post("/predict_batch")
async def predict_batch(rides: List[RideData]):
    if preprocessor is None or model is None:
        raise HTTPException(status_code=500, detail="Models not loaded.")
    try:
        results = []
        for ride in rides:
            # model_dump() is the Pydantic v2 replacement for the deprecated .dict()
            df = pd.DataFrame([ride.model_dump()])
            duration = predict_duration(preprocessor, model, df)
            results.append(duration)
        return {
            "predictions": results,
            "status": "success",
            "message": f"Predicted durations for {len(rides)} rides"
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Batch prediction failed: {str(e)}")

# Web form GET
@app.get("/form", response_class=HTMLResponse)
async def form_get(request: Request):
    return templates.TemplateResponse("predict_form.html", {"request": request, "result": None})

# Web form POST
@app.post("/form", response_class=HTMLResponse)
async def form_post(
    request: Request,
    PULocationID: int = Form(...),
    DOLocationID: int = Form(...),
    trip_distance: float = Form(...)
):
    try:
        if preprocessor is None or model is None:
            raise HTTPException(status_code=500, detail="Model not loaded.")
        df = pd.DataFrame([{
            "PULocationID": PULocationID,
            "DOLocationID": DOLocationID,
            "trip_distance": trip_distance
        }])
        result = predict_duration(preprocessor, model, df)
        return templates.TemplateResponse("predict_form.html", {
            "request": request,
            "result": f"{result:.2f}"
        })
    except Exception as e:
        logger.error(f"Form prediction error: {e}")
        return templates.TemplateResponse("predict_form.html", {
            "request": request,
            "result": "Error during prediction"
        })

# Run server
if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=9696, reload=True)


Overwriting main.py
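
The `/predict_batch` route in `main.py` takes a JSON array of rides and returns a `predictions` list built in the same order. A minimal client-side sketch, assuming that response shape and a hypothetical helper name `pair_predictions`, might match each ride to its prediction like this:

```python
import json


def pair_predictions(rides: list[dict], response_body: bytes) -> list[tuple[dict, float]]:
    """Match each submitted ride with its predicted duration in minutes.

    Assumes the response carries a "predictions" list in request order,
    as the /predict_batch handler builds it.
    """
    payload = json.loads(response_body)
    predictions = payload["predictions"]
    if len(predictions) != len(rides):
        raise ValueError(
            f"expected {len(rides)} predictions, got {len(predictions)}"
        )
    return [(ride, float(p)) for ride, p in zip(rides, predictions)]


rides = [
    {"PULocationID": 75, "DOLocationID": 235, "trip_distance": 5.93},
    {"PULocationID": 75, "DOLocationID": 40, "trip_distance": 5},
]

# The request body is simply the JSON-encoded list of rides.
request_body = json.dumps(rides).encode("utf-8")

# Illustrative response only; the numbers are made up, not model output.
sample_response = json.dumps(
    {"predictions": [12.5, 20.0], "status": "success", "message": "example"}
).encode("utf-8")

for ride, minutes in pair_predictions(rides, sample_response):
    print(f"{ride['PULocationID']} -> {ride['DOLocationID']}: {minutes:.1f} min")
```

Checking the list lengths before zipping guards against silently dropping rides if the server ever returns a partial result.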


In [39]:
!uvicorn main:app --host 0.0.0.0 --port 9696 --reload
# Interactive API docs: http://localhost:9696/docs
# Web form:             http://localhost:9696/form

INFO:     Will watch for changes in these directories: ['/Users/gabriel/Documents/Mlops_zoomcamp/02-Deployment/web-server']
INFO:     Uvicorn running on http://0.0.0.0:9696 (Press CTRL+C to quit)
INFO:     Started reloader process [52699] using StatReload
* 'schema_extra' has been renamed to 'json_schema_extra'
INFO:     Started server process [52701]
INFO:     Waiting for application startup.
INFO:main:Preprocessor loaded.
INFO:main:Model loaded.
INFO:     Application startup complete.
INFO:     127.0.0.1:58588 - "GET /docs HTTP/1.1" 200 OK
INFO:     127.0.0.1:58588 - "GET /openapi.json HTTP/1.1" 200 OK
INFO:     127.0.0.1:58591 - "GET /form HTTP/1.1" 200 OK
INFO:     127.0.0.1:58593 - "GET /apple-touch-icon-precomposed.png HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:58594 - "

## DOCKERFILE

In [1]:
%%writefile requirements.txt

fastapi==0.115.12
uvicorn==0.34.2
pandas==2.2.2
scikit-learn==1.5.1
xgboost==3.0.1
category-encoders==2.6.3
jinja2==3.1.3
joblib==1.4.2
email-validator==2.1.1
python-multipart


Overwriting requirements.txt


In [2]:
%%writefile Dockerfile

# Base image
FROM python:3.12-slim

# Set working directory
WORKDIR /app

# Install dependencies first so this layer stays cached until requirements.txt changes
COPY requirements.txt /app/
RUN pip install --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt

# Copy the application code, model, and preprocessor
COPY . /app

# Document the port the service listens on
EXPOSE 9696

# Run the app
ENTRYPOINT ["uvicorn", "main:app", "--host=0.0.0.0", "--port=9696"]



Overwriting Dockerfile
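
Once the image is built and the container is running, it can be useful to wait until the model artifacts have loaded before sending traffic. The sketch below polls the `/health` route from `main.py`; the helper names, retry count, and delay are assumptions chosen for illustration, and the fetch function is injectable so the logic can be exercised without a live server.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:9696/health"  # assumes the port mapping -p 9696:9696


def fetch_health(url: str = HEALTH_URL) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read())


def is_ready(payload: dict) -> bool:
    """True when the service reports healthy with both artifacts loaded."""
    return (
        payload.get("status") == "healthy"
        and payload.get("preprocessor_loaded") is True
        and payload.get("model_loaded") is True
    )


def wait_until_ready(
    fetch: Callable[[], dict] = fetch_health,
    retries: int = 10,
    delay_seconds: float = 1.0,
) -> bool:
    """Poll until the service is ready or the retries run out."""
    for attempt in range(retries):
        try:
            if is_ready(fetch()):
                return True
        except OSError:
            # Connection refused or timed out while the container starts.
            pass
        if attempt < retries - 1:
            time.sleep(delay_seconds)
    return False


# Requires a running container, so the live call is left commented out:
# print("ready" if wait_until_ready() else "service did not become ready")
```

Passing a stub in place of `fetch_health` makes it possible to check the retry logic in isolation before pointing it at the container.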
