# 05 | Deployment - LoanVet Credit Risk Model

This notebook documents the deployment pipeline for LoanVet's XGBoost credit risk model.

Key components:
- Model and metadata loading
- Batch and single-record prediction functions
- FastAPI backend (`src/api/app.py`)
- Streamlit frontend (`src/streamlit_app.py`)
- Deployment considerations and testing

In [None]:
import os
import json
import joblib
import logging
import numpy as np
import pandas as pd
from xgboost import XGBClassifier
from typing import Union, Dict

logging.basicConfig(level=logging.INFO)  # Enables logging to console

## Load Final Model & Metadata

In [None]:
MODEL_PATH = '../models/final/xgb_final_model.joblib'
METADATA_PATH = '../models/final/xgb_final_metadata.json'

# Load model
model: XGBClassifier = joblib.load(MODEL_PATH)
print("Model loaded.")

# Load metadata
with open(METADATA_PATH, 'r') as f:
    metadata = json.load(f)

threshold = metadata.get("threshold", 0.5)
feature_list = metadata.get("features", [])

print(f"Threshold loaded: {threshold}")
print(f"Features loaded: {len(feature_list)} features")

Model loaded.
Threshold loaded: 0.2268
Features loaded: 19 features
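
The cell above falls back to a threshold of 0.5 and an empty feature list when a key is absent, which could hide a malformed metadata file until prediction time. A minimal sketch of a stricter loader follows, assuming only the two keys read in this notebook (`threshold` and `features`). The `load_metadata` helper, the temporary file, and the demo feature names are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile
from numbers import Real


def load_metadata(path: str) -> dict:
    """Load model metadata and fail fast if required keys are missing or invalid."""
    with open(path, "r") as f:
        metadata = json.load(f)

    # Only the two keys used in this notebook are checked; other keys pass through.
    if "threshold" not in metadata or "features" not in metadata:
        raise KeyError("metadata must contain 'threshold' and 'features'")

    threshold = metadata["threshold"]
    features = metadata["features"]

    # A classification threshold on a probability must lie in [0, 1]
    if not isinstance(threshold, Real) or not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be a number in [0, 1], got {threshold!r}")
    if not isinstance(features, list) or not features:
        raise ValueError("features must be a non-empty list of column names")

    return metadata


# Demonstration with a throwaway file and made-up feature names
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_metadata.json")
    with open(demo_path, "w") as f:
        json.dump({"threshold": 0.2268, "features": ["feature_a", "feature_b"]}, f)
    demo_metadata = load_metadata(demo_path)

print(demo_metadata["threshold"], len(demo_metadata["features"]))
```

Failing at load time rather than defaulting silently means a missing threshold cannot quietly change the decision boundary from 0.2268 to 0.5.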


## Prediction Functions

This section defines reusable prediction functions for the LoanVet credit risk model:

- **`predict_single`**: Takes a single input record as a dictionary, validates and orders features according to the trained model, predicts the probability of default, and applies the classification threshold to return a binary risk label alongside the probability score.

- **`predict_batch`**: Accepts a pandas DataFrame of multiple input records, ensures the correct feature order, computes prediction probabilities for all records, applies the threshold to generate binary labels, and returns the original DataFrame augmented with prediction probabilities and labels.

Encapsulating prediction logic in these functions promotes modularity and enables straightforward integration with batch processing pipelines, API endpoints, or downstream applications.

In [30]:
def predict_single(record: Dict[str, Union[float, int]]) -> Dict[str, Union[int, float]]:
    try:
        # Convert input dict to DataFrame with one row
        df = pd.DataFrame([record])

        # Select features in training order; raises KeyError if any are missing
        df = df[feature_list]

        # Probability of the positive class (1), cast from a NumPy scalar to a
        # built-in float so the returned dict is JSON-serializable
        proba = float(model.predict_proba(df)[:, 1][0])

        # Apply classification threshold
        label = int(proba >= threshold)

        return {"label": label, "probability": proba}

    except Exception as e:
        logging.error(f"Error in single prediction: {e}")
        raise

def predict_batch(df: pd.DataFrame) -> pd.DataFrame:
    try:
        # Select features in training order; raises KeyError if any are missing
        X = df[feature_list]

        # Predict probabilities for positive class
        proba = model.predict_proba(X)[:, 1]

        # Apply threshold
        labels = (proba >= threshold).astype(int)

        # Append results to a copy of the original input, so the caller's
        # DataFrame is not mutated and any extra columns (e.g. IDs) are kept
        df_result = df.copy()
        df_result["probability"] = proba
        df_result["label"] = labels

        return df_result

    except Exception as e:
        logging.error(f"Error in batch prediction: {e}")
        raise
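
Both functions rely on pandas raising a `KeyError` when a feature column is absent. A more explicit check could report every missing or non-numeric field at once before the model is called. The sketch below is one possible approach, not part of the LoanVet codebase: `validate_record` and the demo feature names are hypothetical, and it assumes all model inputs are numeric.

```python
from numbers import Real
from typing import Dict, List


def validate_record(record: Dict[str, Real], features: List[str]) -> List[Real]:
    """Return the record's values in training order, or raise ValueError."""
    # Report all missing features together rather than failing on the first one
    missing = [name for name in features if name not in record]
    if missing:
        raise ValueError(f"Missing features: {missing}")

    # Assumption: every model input is numeric (int, float, or a NumPy scalar)
    non_numeric = [name for name in features if not isinstance(record[name], Real)]
    if non_numeric:
        raise ValueError(f"Non-numeric values for: {non_numeric}")

    # Keys that are not model features are ignored, as with df[feature_list]
    return [record[name] for name in features]


# Hypothetical feature names, for illustration only
demo_features = ["income", "loan_amount"]
ordered = validate_record(
    {"loan_amount": 5000, "income": 42000.0, "extra": 1}, demo_features
)
print(ordered)  # [42000.0, 5000]
```

Raising `ValueError` with the full list of offending fields gives API callers a single actionable message instead of one error per retry.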

## Deployment Architecture

LoanVet is deployed with a clean modular structure:

- `src/api/app.py`: FastAPI app exposing a `/predict` endpoint
- `src/api/utils.py`: Reusable prediction logic (e.g., `predict_single`, `predict_batch`)
- `src/streamlit_app.py`: Streamlit frontend for business users to interactively predict credit risk
- `models/final/`: Trained XGBoost model and feature metadata

This keeps backend inference (the API and its prediction utilities) separate from frontend interaction (the Streamlit app), so either side can change without touching the other.
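
To illustrate that separation, the following is a framework-agnostic sketch of the logic a `/predict` endpoint might wrap. It is not the contents of `src/api/app.py`: `handle_predict`, `stub_predict`, the status-code choices, and the `loan_amount` field are all assumptions made for the example, and the stub stands in for `predict_single` so the block runs without a trained model.

```python
import json
from typing import Any, Callable, Dict, Tuple

PredictFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def handle_predict(payload: Any, predict_fn: PredictFn) -> Tuple[int, str]:
    """Core of a /predict handler: returns (HTTP status code, JSON body)."""
    if not isinstance(payload, dict):
        return 422, json.dumps({"detail": "Request body must be a JSON object"})
    try:
        # Serialize inside the try block so a non-JSON-serializable result
        # (e.g. a NumPy scalar) is reported as a server error
        body = json.dumps(predict_fn(payload))
    except (KeyError, ValueError) as exc:
        # Missing or invalid input fields are treated as a client error
        return 422, json.dumps({"detail": f"Invalid input: {exc}"})
    except Exception:
        # Anything else is a server-side failure; internals are not exposed
        return 500, json.dumps({"detail": "Prediction failed"})
    return 200, body


def stub_predict(record: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for predict_single; the scoring rule here is made up."""
    proba = 0.9 if record["loan_amount"] > 10000 else 0.1
    return {"label": int(proba >= 0.2268), "probability": proba}


status_ok, body_ok = handle_predict({"loan_amount": 20000}, stub_predict)
status_bad, body_bad = handle_predict({}, stub_predict)
print(status_ok, body_ok)
print(status_bad, body_bad)
```

Because the prediction function is passed in as an argument, the same handler logic can be exercised in tests with a stub and wired to the real model in the API layer.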