<a href="https://www.kaggle.com/code/angelchaudhary/case-study-from-model-to-api?scriptVersionId=290189105" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Deploying ML Models with FastAPI: An End-to-End Case Study

# Introduction

Training a machine learning model is only half the job. In real-world systems, models must be served reliably, respond quickly and integrate cleanly with applications through APIs.

This case study focuses on bridging the gap between offline model training and online inference, highlighting the practical challenges that arise when a model is exposed to real users and real inputs.

**What This Case Study Covers**

In this notebook, we'll:
- Train a machine learning model locally
- Wrap the trained model inside a FastAPI inference service
- Expose a /predict endpoint for real-time predictions
- Test the API with realistic inputs
- Analyze how inference-time constraints differ from training-time assumptions

The goal is not just to make predictions but to understand what it takes to move a model into a production-like environment.

## Approach
- Train a simple, reproducible ML model
- Serialize the model for reuse
- Load the model inside an API service
- Validate inputs and handle inference safely
- Test the service as an external client would

# LET'S DO IT!!!
![FUNNY GIF](https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExc2J0amEwcmM5NmRmbTB1NHM2NW5oMGg2Y3FtamFlNjVnZzN2dWN0OCZlcD12MV9zdGlja2Vyc19zZWFyY2gmY3Q9cw/F73KLZL9eAfDcDQFAt/giphy.gif)

### Dataset and Model Choice
We'll use a tabular classification dataset whose inputs are numerical, whose output is a binary prediction, and which is easy to validate through the API later: the Breast Cancer Wisconsin dataset (from sklearn).
This avoids dataset loading issues and keeps the case study focused on system integration, not data wrangling.
For the model, we'll use Logistic Regression: it offers fast inference and deterministic behavior, is easy to serialize, and is commonly used in production for binary decisions.
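
As a quick check of the properties listed above, the sketch below loads the same dataset and prints its shape and class names. The variable names are illustrative. The feature count matters later, because every API request has to carry exactly that many values.

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# Rows are samples, columns are numeric features
n_samples, n_features = data.data.shape
print(f"samples: {n_samples}, features: {n_features}")

# Class names are listed in label order (label 0 first, then label 1)
print("classes:", list(data.target_names))
print("first feature names:", list(data.feature_names[:3]))
```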

In [1]:
# Load the data and train model
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib

In [2]:
# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

In [3]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [7]:
# Train model
model = LogisticRegression(max_iter=500)
model.fit(X_train, y_train)

STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


In [8]:
# Evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
accuracy

0.956140350877193

## Observation
The model achieves strong accuracy on held-out data, confirming that it has learned a meaningful decision boundary. At this point, the model performs well inside a notebook, where inputs are clean, controlled, and preprocessed correctly. However, none of this guarantees the model will behave safely or reliably when exposed to real-world inputs via an API.
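
The training cell above emitted a convergence warning that recommends scaling the data. One possible way to act on it, sketched here as an assumption and not something this notebook does, is to wrap a `StandardScaler` and the classifier in a single scikit-learn pipeline. The names `scaled_model` and `scaled_accuracy` are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is fitted on the training split only and is stored
# together with the classifier as one estimator object
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=500))
scaled_model.fit(X_train, y_train)

scaled_accuracy = accuracy_score(y_test, scaled_model.predict(X_test))
print(f"accuracy with scaling: {scaled_accuracy:.3f}")
```

A design point worth noting: if a pipeline like this were the object that gets serialized, the API would apply the same preprocessing at inference time without any extra code in the service.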

In [9]:
# saving the model
joblib.dump(model, "breast_cancer_model.joblib")

['breast_cancer_model.joblib']
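
Before relying on the saved file, it can help to confirm that a reloaded model reproduces the original predictions. The following is a minimal round-trip sketch that retrains a model with the same settings and writes to a temporary directory (an assumption made so the example does not touch the notebook's own artifact). Since joblib files are pickle-based, the service should load them with a scikit-learn version compatible with the one used for training.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Same settings as the notebook; may emit the same convergence warning
model = LogisticRegression(max_iter=500).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "breast_cancer_model.joblib"
    joblib.dump(model, path)       # serialize to disk
    restored = joblib.load(path)   # deserialize into a new object

# The restored model should predict exactly what the original predicts
same_predictions = bool(np.array_equal(model.predict(X), restored.predict(X)))
print("round trip preserves predictions:", same_predictions)
```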

## Building the Inference API with FastAPI

Next, we expose the trained ML model through a REST API so that external applications can consume it. At this stage, the focus shifts to:
- Input validation
- Model loading
- Safe and consistent inference
- Clear API contracts

FastAPI is widely used for ML inference because it is fast and lightweight, enforces input schemas, auto-generates API docs, and fits naturally into backend systems.

### Structure of the API
ml_api/
- app.py
- breast_cancer_model.joblib

In [11]:
import joblib

In [13]:
# Load trained model
model = joblib.load("breast_cancer_model.joblib")

The rest of this step, the FastAPI service code itself, lives in the deployment repository linked at the end of this notebook.

## Testing the API & Understanding Failure Cases

### a. Successful Inference (Baseline)
Request
{
  "features": [30 numeric values in correct order]
}

Response
{
  "prediction": 0
}

### Observation

When inputs match the exact format expected by the model, inference works smoothly. The model behaves deterministically and returns predictions with low latency. This confirms that the trained model can be successfully served and consumed via an API.
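
To test the service as an external client would, a request like the one above can be sent from a separate process. The sketch below uses only the standard library. The URL assumes a server running locally on uvicorn's default address, and `build_request` and `call_predict` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the API is running locally on uvicorn's default host and port
API_URL = "http://127.0.0.1:8000/predict"


def build_request(features, url=API_URL):
    """Build a POST request whose JSON body matches the /predict contract."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(features, url=API_URL, timeout=5.0):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(features, url), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `call_predict(values)` on a list of 30 numbers would return a dictionary such as the baseline response shown above.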

### b. Invalid Input Type
Request
{
  "features": "not a list"
}


Response
{
  "detail": [
    {
      "loc": ["body", "features"],
      "msg": "value is not a valid list",
      "type": "type_error.list"
    }
  ]
}


### Observation
FastAPI automatically rejects malformed requests before they reach the model. Input validation at the API layer prevents unnecessary model execution and guards against unexpected crashes.
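
This rejection comes from the Pydantic schema attached to the endpoint, so it can be reproduced without starting a server. The sketch below assumes a request model with a single `features` field. `PredictRequest` and `try_parse` are illustrative names, and the exact wording of the error message depends on the installed Pydantic version.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class PredictRequest(BaseModel):
    features: List[float]


def try_parse(payload: dict) -> str:
    """Report whether a payload passes schema validation."""
    try:
        PredictRequest(**payload)
        return "accepted"
    except ValidationError as exc:
        return f"rejected with {len(exc.errors())} error(s)"


print(try_parse({"features": [1.0, 2.5, 3.0]}))   # a list of numbers passes
print(try_parse({"features": "not a list"}))      # a string does not
```

A type-only schema like this accepts a numeric list of any length, which is why the next case still needs attention.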

### c. Wrong Feature Length (Critical Failure)
Request
{
  "features": [1.0, 2.0, 3.0]
}

Response
Model inference fails internally due to shape mismatch.

### Observation

Unlike training, where data pipelines enforce consistent shapes, API-based inference relies entirely on the correctness of incoming requests. Even a valid JSON payload can cause inference failure if the feature vector length does not match the training schema.
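
One way to close this gap is an explicit length check that runs before the model is called. The function below is a sketch under the assumption that the service expects exactly 30 features, matching the dataset used here. `validate_features` and `N_FEATURES` are hypothetical names, not part of the deployed service.

```python
import math

# Assumption: the model was trained on the 30-feature Breast Cancer dataset
N_FEATURES = 30


def validate_features(features, expected=N_FEATURES):
    """Return the features as floats, or raise ValueError if unusable."""
    if len(features) != expected:
        raise ValueError(f"expected {expected} features, got {len(features)}")
    values = [float(v) for v in features]
    # NaN and infinity would otherwise reach the model unnoticed
    if not all(math.isfinite(v) for v in values):
        raise ValueError("features must be finite numbers")
    return values


try:
    validate_features([1.0, 2.0, 3.0])
except ValueError as err:
    print("rejected:", err)
```

Inside a FastAPI handler, the `ValueError` could be translated into a client error response, for example by raising `HTTPException` with status code 422, so the caller sees a clear message instead of an internal failure.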

# Conclusion
This case study demonstrated the complete journey from model training to production-style inference using an API. By serving the model through FastAPI, we uncovered practical challenges related to environment setup, input validation, and inference constraints that are invisible during training.

The key lesson is that machine learning models live inside systems, and their real-world reliability depends as much on integration, validation, and tooling as on model accuracy.

🔗 **Deployment Code**
The FastAPI inference service for this project is available here:
https://github.com/Angell-14/model-to-api-fastapi-Case-Study-