🚚 Truck Delay Prediction — End-to-End ML Pipeline

A production-grade machine learning pipeline that predicts truck shipment delays, built for deployment on Lightning.ai with a Flask REST API.

🏗️ Architecture

MySQL DB ──┐
           ├─→ ETL Pipeline ─→ Feature Engineering ─→ Model Training ─→ Flask API
Postgres ──┘                                           (RF / XGB / LGBM)

📁 Project Structure

truck_delay_ml/
├── config.yaml                  # Central config (no hardcoded values)
├── run_pipeline.py              # One command: ETL + Training
├── requirements.txt
├── .env.example                 # Secret template (never commit .env!)
│
├── ml_pipeline/
│   ├── etl/
│   │   ├── db_connector.py      # MySQL + PostgreSQL connections + mock data
│   │   ├── extractor.py         # Extract & merge from both DBs
│   │   ├── transformer.py       # Feature engineering & cleaning
│   │   └── loader.py            # Save/load parquet files
│   ├── modeling/
│   │   └── trainer.py           # Multi-model training + MLflow tracking
│   └── utils/
│       ├── config_loader.py     # YAML + env var loader
│       └── logger.py            # Rotating file + console logger
│
├── deployment/
│   └── flask_app.py             # REST API with /predict and /predict/batch
│
└── tests/
    └── test_pipeline.py         # pytest unit tests

🚀 Quick Start on Lightning.ai

1. Clone & setup

git clone https://github.com/YOUR_USERNAME/truck_delay_ml.git
cd truck_delay_ml
pip install -r requirements.txt

2. Configure environment

cp .env.example .env
# Edit .env with your DB credentials
# Or set MOCK_DATA=true to skip DB and use synthetic data
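
For orientation, mock mode could generate records along these lines. This is a minimal sketch, not the repository's actual generator in db_connector.py: the field names are taken from the /predict example in this README, while the function name `make_mock_shipments`, the category lists, and the value ranges are assumptions made for illustration.

```python
import random

# Field names mirror the /predict example payload; the category lists and
# numeric ranges below are illustrative assumptions, not the project's values.
TRUCK_TYPES = ["Small", "Medium", "Large"]
WEATHER = ["Clear", "Rain", "Snow", "Fog"]
ROUTE_TYPES = ["Urban", "Highway", "Rural"]
ROAD_QUALITY = ["Good", "Fair", "Poor"]


def make_mock_shipments(n, seed=42):
    """Return n synthetic shipment records as a list of dicts."""
    rng = random.Random(seed)  # local RNG keeps the output reproducible
    rows = []
    for _ in range(n):
        rows.append({
            "distance_km": rng.randint(50, 1500),
            "truck_type": rng.choice(TRUCK_TYPES),
            "truck_age_years": rng.randint(0, 15),
            "driver_experience": rng.randint(0, 30),
            "cargo_weight_kg": rng.randint(500, 25000),
            "weather_condition": rng.choice(WEATHER),
            "route_type": rng.choice(ROUTE_TYPES),
            "traffic_index": round(rng.random(), 2),
            "road_quality": rng.choice(ROAD_QUALITY),
            "num_stops": rng.randint(0, 8),
        })
    return rows


if __name__ == "__main__":
    for row in make_mock_shipments(3):
        print(row)
```

Seeding a local `random.Random` instance rather than the global generator keeps repeated runs comparable without affecting other code that uses `random`.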

3. Run the full pipeline

# With mock data (no database needed):
python run_pipeline.py --mock

# With real databases:
python run_pipeline.py
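
The `--mock` flag and the `MOCK_DATA` variable both switch the pipeline to synthetic data. One way the two could be combined is sketched below; the helper names `parse_args` and `use_mock_data`, and the set of accepted truthy strings, are assumptions and may not match what run_pipeline.py actually does.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Run ETL and model training.")
    parser.add_argument(
        "--mock",
        action="store_true",
        help="use synthetic data instead of the MySQL/PostgreSQL databases",
    )
    return parser.parse_args(argv)


def use_mock_data(args, env=None):
    """Return True if either --mock or MOCK_DATA=true was supplied."""
    env = os.environ if env is None else env
    # Assumption: these spellings count as "true"; the project may differ.
    env_flag = env.get("MOCK_DATA", "").strip().lower() in {"1", "true", "yes"}
    return args.mock or env_flag


if __name__ == "__main__":
    args = parse_args()
    print("mock mode:", use_mock_data(args))
```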

4. Start the Flask API

python deployment/flask_app.py

5. Test the API

curl -X POST http://localhost:5000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "distance_km": 850,
    "truck_type": "Large",
    "truck_age_years": 9,
    "driver_experience": 2,
    "cargo_weight_kg": 15000,
    "weather_condition": "Rain",
    "route_type": "Rural",
    "traffic_index": 0.85,
    "road_quality": "Poor",
    "num_stops": 4
  }'
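
The same request can be sent from Python. The sketch below uses only the standard library and assumes the API is running locally on port 5000, as in the curl command above; `build_predict_request` and `predict` are illustrative helper names, not part of the repository.

```python
import json
import urllib.request

# Same payload as the curl example above.
SAMPLE = {
    "distance_km": 850,
    "truck_type": "Large",
    "truck_age_years": 9,
    "driver_experience": 2,
    "cargo_weight_kg": 15000,
    "weather_condition": "Rain",
    "route_type": "Rural",
    "traffic_index": 0.85,
    "road_quality": "Poor",
    "num_stops": 4,
}


def build_predict_request(payload, base_url="http://localhost:5000"):
    """Build (but do not send) a JSON POST request for /predict."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload, base_url="http://localhost:5000", timeout=10):
    """Send the request and return the decoded JSON response."""
    req = build_predict_request(payload, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(predict(SAMPLE))  # requires the Flask API to be running
```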

Expected response:

{
  "prediction": 1,
  "label": "Delayed",
  "probability": 0.8231,
  "confidence": "82.3%",
  "risk_level": "High"
}
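
The fields in this response are related: `confidence` is the probability expressed as a percentage, and `prediction` and `risk_level` are derived from it. The sketch below shows one plausible mapping. The 0.5 decision threshold, the 0.7 and 0.4 risk cut-offs, and the "On Time" label are assumptions chosen for illustration; the actual values in flask_app.py may differ.

```python
def format_prediction(probability, threshold=0.5):
    """Turn a delay probability into a response dict like the one above.

    Assumed, not taken from the project: the decision threshold, the
    risk-level cut-offs, and the label used for the non-delayed class.
    """
    prediction = int(probability >= threshold)
    if probability >= 0.7:
        risk = "High"
    elif probability >= 0.4:
        risk = "Medium"
    else:
        risk = "Low"
    return {
        "prediction": prediction,
        "label": "Delayed" if prediction else "On Time",
        "probability": round(probability, 4),
        "confidence": f"{probability * 100:.1f}%",
        "risk_level": risk,
    }


if __name__ == "__main__":
    print(format_prediction(0.8231))
```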

🧪 Run Tests

pytest tests/ -v

📊 API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Health check |
| POST | `/predict` | Single prediction |
| POST | `/predict/batch` | Batch predictions (max 1000) |
| GET | `/model/info` | Feature list & model type |
| POST | `/reload` | Hot-reload model after retraining |
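
Because `/predict/batch` accepts at most 1000 records per call, a client with a larger dataset has to split it first. A minimal sketch of that splitting follows; `chunk_records` is a hypothetical helper, and the shape of the batch request body is not specified in this README, so only the chunking is shown.

```python
def chunk_records(records, max_batch=1000):
    """Yield consecutive slices of records, each at most max_batch long."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    for start in range(0, len(records), max_batch):
        yield records[start:start + max_batch]


if __name__ == "__main__":
    records = [{"id": i} for i in range(2500)]
    for batch in chunk_records(records):
        # Each batch would be sent to /predict/batch in its own request.
        print(len(batch))
```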

🔬 Models Compared

| Model | CV F1 | Notes |
|-------|-------|-------|
| Random Forest | ~0.84 | Robust, good baseline |
| XGBoost | ~0.86 | Fast, handles missing values |
| LightGBM | ~0.87 | Best; used in production |
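
For readers unfamiliar with the metric, "CV F1" is read here as the F1 score averaged over cross-validation folds. The pure-Python sketch below shows that calculation, assuming F1 is computed for the positive class (1 = Delayed); the trainer itself presumably relies on scikit-learn rather than code like this.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_cv_f1(folds):
    """Average F1 over a list of (y_true, y_pred) pairs, one pair per fold."""
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in folds]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    folds = [([1, 0, 1, 1], [1, 0, 0, 1]), ([1, 1], [1, 1])]
    print(mean_cv_f1(folds))
```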

✨ Key Features

- No hardcoded values: everything lives in config.yaml
- MLflow experiment tracking: compare all runs visually
- Mock data mode: test the full pipeline without any database
- Production Flask API: /predict and /predict/batch endpoints
- Automatic logging: predictions are logged to CSV for monitoring
- Unit tests: pytest coverage for all pipeline stages
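
As an illustration of the CSV prediction logging mentioned above, the sketch below appends one row per prediction using the standard library. The function name `log_prediction`, the timestamp column, and the file layout are assumptions; the project's real log format is not documented here.

```python
import csv
import os
from datetime import datetime, timezone


def log_prediction(path, features, result):
    """Append one prediction to a CSV file, writing a header for a new file.

    Assumes every call passes the same keys in the same order, so that
    all rows line up with the header written on the first call.
    """
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **features,
        **result,
    }
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    log_prediction(
        "predictions.csv",
        {"distance_km": 850, "weather_condition": "Rain"},
        {"prediction": 1, "probability": 0.8231},
    )
```

A flat append-only CSV is easy to load into pandas later for drift or accuracy monitoring, at the cost of having no schema enforcement.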

🛠️ Tech Stack

Python · scikit-learn · XGBoost · LightGBM · MLflow · Flask · SQLAlchemy · pandas · pytest
