ChandanKT-git/IPL-Predictions

IPL Match Prediction System

🏏 A full-stack machine learning application that predicts IPL cricket match outcomes using real-time data integration, advanced ML models, and intelligent caching strategies.

📚 Documentation

Current project documentation:


🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Node.js 16+
  • MongoDB 6.0+
  • Cricbuzz RapidAPI key
  • Groq API key for AI analysis (optional)

Installation

  1. Clone & Setup
git clone https://github.com/ChandanKT-git/IPL-Predictions.git
cd IPL-Predictions
  2. Backend Setup
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys
  3. Frontend Setup
cd ../frontend

# Install dependencies
npm install

# Configure environment
cp .env.example .env
  4. Start Services
# Start MongoDB
mongod --dbpath ./data/db

# Start Backend (Terminal 1)
cd backend
python -m uvicorn server:app --reload --port 8000

# Start Frontend (Terminal 2)
cd frontend
npm start
  5. Access Application
Frontend: http://localhost:3000
Backend API: http://localhost:8000

✨ Key Features

🎯 Advanced ML Predictions

  • Score Prediction with 80% prediction intervals (P10-P90 quantiles)
  • Win Probability with calibrated probabilities (Brier score: 0.21)
  • Phase Breakdown: Powerplay, Middle overs, Death overs
  • Model Version: 5.0-chronological-calibrated
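The README does not say how the P10/P50/P90 band is derived. One common option for a random forest is to take percentiles across the individual trees' predictions; the sketch below assumes that approach. The `tree_predictions` input and the sample numbers are hypothetical, and only the output field names come from the example API response.

```python
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile for q in [0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def score_band(tree_predictions: Sequence[float]) -> dict:
    """Summarise per-tree score predictions as a P10/P50/P90 band."""
    return {
        "score_range_low": round(percentile(tree_predictions, 10)),
        "predicted_score": round(percentile(tree_predictions, 50)),
        "score_range_high": round(percentile(tree_predictions, 90)),
    }


# Hypothetical per-tree predictions for one innings (not real model output).
trees = [150, 160, 165, 170, 175, 180, 185, 190, 200, 210, 140]
band = score_band(trees)
```

Because 10% of the mass is expected below P10 and 10% above P90, the band between them is a nominal 80% interval.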

📊 Feature Engineering

  • Team strength ratings (batting/bowling)
  • Recent form (last 5 matches)
  • Head-to-head statistics
  • Toss impact analysis
  • Venue effects
  • Pitch & weather modifiers
  • 21 engineered features from 283K historical rows
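As an illustration of two of the features listed above, here is a minimal sketch of recent form and head-to-head win share. The `(winner, loser)` record layout, the function names, and the neutral default of 0.5 are assumptions for the example, not the project's actual feature code.

```python
from typing import List, Tuple

# Each result is (winner, loser), ordered oldest to newest. Hypothetical layout.
Result = Tuple[str, str]


def recent_form(results: List[Result], team: str, window: int = 5) -> float:
    """Win rate over the team's last `window` matches (0.5 if no history)."""
    played = [r for r in results if team in r][-window:]
    if not played:
        return 0.5
    return sum(1 for winner, _ in played if winner == team) / len(played)


def h2h_win_share(results: List[Result], team: str, opponent: str) -> float:
    """Share of head-to-head matches won by `team` (0.5 if they never met)."""
    meetings = [r for r in results if set(r) == {team, opponent}]
    if not meetings:
        return 0.5
    return sum(1 for winner, _ in meetings if winner == team) / len(meetings)


# Made-up results, purely for demonstration.
history = [
    ("mi", "csk"), ("csk", "mi"), ("mi", "rcb"),
    ("kkr", "mi"), ("mi", "csk"), ("mi", "kkr"),
]
form = recent_form(history, "mi")          # 3 wins in the last 5 matches
share = h2h_win_share(history, "mi", "csk")  # 2 wins in 3 meetings
```

To stay consistent with the chronological training setup, `results` should contain only matches played before the one being predicted.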

🔄 Real-time Integration

  • Live IPL match data from Cricbuzz API
  • Auto-fill match details (teams, venue, playing XI)
  • Live score updates via Server-Sent Events (SSE)
  • Upcoming match schedule
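For the SSE-based live updates, each message sent to the browser is a small text frame. The sketch below shows the wire format only; the `score` event name and the `runs`/`wickets`/`overs` fields are hypothetical and not taken from the project's API.

```python
import json
from typing import Iterable, Iterator


def format_sse(data: dict, event: str = "score") -> str:
    """Encode one update as a Server-Sent Events frame."""
    payload = json.dumps(data, separators=(",", ":"))
    # A frame is a set of "field: value" lines terminated by a blank line.
    return f"event: {event}\ndata: {payload}\n\n"


def score_stream(updates: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE frame per live score update."""
    for update in updates:
        yield format_sse(update)


frames = list(score_stream([{"runs": 48, "wickets": 1, "overs": 6.0}]))
```

In a FastAPI app, a generator like `score_stream` could be returned through `StreamingResponse` with `media_type="text/event-stream"`; whether this project does so is not stated here.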

💾 Intelligent Caching

  • In-process TTL cache for live Cricbuzz payloads and resolved catalog data
  • Graceful degradation (live → cache → fallback)
  • Source tracking (X-Data-Source header)
  • No external Redis dependency in the current runtime
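The caching behaviour described above can be sketched with a small in-process TTL cache and a resolver that reports where the data came from. The class and function names, the exact lookup order, and the source labels are assumptions for illustration; the real implementation may differ.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal in-process cache; entries count as fresh for `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock(), value)

    def get(self, key: str, allow_stale: bool = False) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if allow_stale or self.clock() - stored_at < self.ttl:
            return value
        return None


def resolve(key: str, fetch_live: Callable[[], Any],
            cache: TTLCache, fallback: Any) -> Tuple[Any, str]:
    """Return (data, source); source is "live", "cache" or "fallback"."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "cache"          # fresh entry, no upstream call needed
    try:
        data = fetch_live()
    except Exception:
        stale = cache.get(key, allow_stale=True)
        if stale is not None:
            return stale, "cache"       # upstream failed, serve expired entry
        return fallback, "fallback"     # nothing cached, use static data
    cache.set(key, data)
    return data, "live"
```

The returned source string is the kind of value that could be surfaced in the `X-Data-Source` response header.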

📈 Analytics & Insights

  • Feature Contributions (SHAP-style importance)
  • Batter-Bowler Matchups from historical data
  • H2H Analysis (season-wise, venue-specific)
  • Form Guide (recent performance trends)
  • Model Calibration metrics

🎨 Modern UI/UX

  • React 18 with Tailwind CSS
  • Live Match Picker (auto-fill from current matches)
  • Team Logos (official IPL branding)
  • Player Avatars (with fallback to team-colored initials)
  • What-If Scenarios (slider-based exploration)
  • Responsive Design (mobile-friendly)

🔍 Observability

  • Structured JSON logging
  • Request ID tracking (distributed tracing)
  • Performance monitoring
  • Health checks (/api/health)
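Structured JSON logs with a request ID can be produced with the standard library alone. The following is a minimal sketch; the `JsonFormatter` and `request_id_var` names and the chosen log fields are hypothetical rather than the project's actual logging setup.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# ID of the request currently being handled ("-" outside any request).
request_id_var: ContextVar[str] = ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        })


def new_request_id() -> str:
    """Assign a fresh ID to the current context and return it."""
    rid = str(uuid.uuid4())
    request_id_var.set(rid)
    return rid
```

A request middleware would typically call `new_request_id()` once per request. Because each asyncio task runs with its own copy of the context, concurrent requests do not overwrite each other's ID.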

🏗️ Architecture Overview

┌─────────────┐
│   React     │  ← Tailwind CSS, Axios, React Query
│  Frontend   │
└──────┬──────┘
       │ HTTP/REST
       ▼
┌─────────────┐
│   FastAPI   │  ← CORS, Middleware, Async
│   Backend   │
└──────┬──────┘
       │
   ┌───┴───────┬────────────┬──────────┐
   ▼           ▼            ▼          ▼
┌─────┐  ┌────────────┐  ┌──────┐  ┌────────┐
│ ML  │  │ In-Process │  │Mongo │  │Cricbuzz│
│Model│  │ TTL Cache  │  │  DB  │  │  API   │
└─────┘  └────────────┘  └──────┘  └────────┘

For detailed architecture, see docs/ARCHITECTURE.md


🤖 Machine Learning Pipeline

Model Architecture

     Input Features (21-dim)
               ↓
         Preprocessing
      (Scaling + Encoding)
               ↓
       ┌───────┴───────┐
       ▼               ▼
  Score Model      Win Model
 (RF Regressor)   (Calibrated)
       │               │
       ▼               ▼
  P10/P50/P90       Win Prob
   Quantiles        (0-100%)

Training Strategy

  • Chronological Split: Train ≤2023, Val 2024, Test 2025+
  • No Data Leakage: Respects temporal ordering
  • 574 Training Matches: IPL seasons 2008-2023
  • Test MAE: 38.5 runs (±2 overs)
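The chronological split above amounts to partitioning matches by season rather than at random. A minimal sketch, assuming each match is a dict with an integer `season` key (a hypothetical record layout):

```python
from typing import Dict, Iterable, List


def chronological_split(matches: Iterable[dict]) -> Dict[str, List[dict]]:
    """Split matches by season: train <= 2023, val == 2024, test >= 2025."""
    splits: Dict[str, List[dict]] = {"train": [], "val": [], "test": []}
    for match in matches:
        season = match["season"]
        if season <= 2023:
            splits["train"].append(match)
        elif season == 2024:
            splits["val"].append(match)
        else:
            splits["test"].append(match)
    return splits


# Made-up match records, purely for demonstration.
sample = [{"id": 1, "season": 2008}, {"id": 2, "season": 2023},
          {"id": 3, "season": 2024}, {"id": 4, "season": 2025}]
parts = chronological_split(sample)
```

Since no match in the validation or test set predates a training match, the model is never evaluated on games older than what it has already seen.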

Key Features

| Feature | Importance | Description |
|---|---|---|
| Team Batting Rating | 0.28 | Last 20 matches batting performance |
| Recent Form | 0.22 | Win rate in last 5 matches |
| Toss Won by Batting | 0.18 | Strategic advantage |
| Venue Effect | 0.15 | Ground characteristics |
| H2H Win Share | 0.12 | Historical dominance |

For complete model details, see docs/MODEL_DOCUMENTATION.md


📡 API Endpoints

Core Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/health | System health check |
| GET | /api/teams | List all IPL teams |
| GET | /api/teams/{id}/players | Team players |
| GET | /api/venues | List venues |
| POST | /api/predict | Generate prediction |
| GET | /api/live-matches | Current IPL matches |
| GET | /api/head-to-head/{a}/{b} | H2H statistics |

For complete API documentation, see docs/API_REFERENCE.md

Example: Prediction Request

curl -X POST http://localhost:8000/api/predict \
  -H "Content-Type: application/json" \
  -d '{
    "team_a": "mi",
    "team_b": "csk",
    "batting_team": "mi",
    "toss_winner": "mi",
    "venue": "wankhede",
    "pitch": "batting",
    "weather": "clear",
    "playing_xi_a": ["Rohit Sharma", "Ishan Kishan", ...],
    "playing_xi_b": ["Ruturaj Gaikwad", "Devon Conway", ...]
  }'

Example: Prediction Response

{
  "id": "15d1f59f-a410-464b-8623-f69f293e431e",
  "predicted_score": 175,
  "score_range_low": 145,
  "score_range_high": 205,
  "win_probability_batting": 65,
  "win_probability_bowling": 35,
  "phase_breakdown": {
    "powerplay_runs": 48,
    "middle_overs_runs": 72,
    "death_overs_runs": 55
  },
  "model_version": "5.0-chronological-calibrated",
  "contributions": [...],
  "h2h": {...},
  "matchups": [...]
}
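A client may want to sanity-check a prediction before displaying it. The sketch below checks three invariants that happen to hold for the example response above; the API is not documented here as guaranteeing them, so treat them as assumptions. The `check_prediction` helper is hypothetical.

```python
from typing import List


def check_prediction(resp: dict) -> List[str]:
    """Return a list of consistency problems found in a prediction response."""
    problems = []
    low, mid, high = (resp["score_range_low"], resp["predicted_score"],
                      resp["score_range_high"])
    if not low <= mid <= high:
        problems.append("predicted_score outside score range")
    if resp["win_probability_batting"] + resp["win_probability_bowling"] != 100:
        problems.append("win probabilities do not sum to 100")
    if sum(resp["phase_breakdown"].values()) != mid:
        problems.append("phase breakdown does not sum to predicted_score")
    return problems


# Numeric fields copied from the example response; elided fields are left out.
example = {
    "predicted_score": 175,
    "score_range_low": 145,
    "score_range_high": 205,
    "win_probability_batting": 65,
    "win_probability_bowling": 35,
    "phase_breakdown": {
        "powerplay_runs": 48,
        "middle_overs_runs": 72,
        "death_overs_runs": 55,
    },
}
issues = check_prediction(example)
```

An empty list means the response passed all three checks.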

🧪 Testing

Backend Tests

cd backend

# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run specific test
pytest tests/test_predict.py -v

Frontend Tests

cd frontend

# Run tests
npm test

# Run with coverage
npm test -- --coverage

Test Coverage

  • Backend: 85% coverage
  • Frontend: 72% coverage
  • Integration Tests: API routes, prediction flow, data resolution

🔧 Configuration

Environment Variables

Backend (.env)

# Cricbuzz API (RapidAPI)
CRICBUZZ_API_KEY=your_api_key_here

# Database
MONGO_URL=mongodb://localhost:27017
DB_NAME=ipl_predictor

# Optional: AI Analysis
GROQ_API_KEY=your_groq_key_here

# Allowed frontend origin(s)
CORS_ORIGINS=http://localhost:3000

Frontend (.env)

REACT_APP_BACKEND_URL=http://localhost:8000
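On the backend side, these variables can be read into a typed settings object at startup. This is a sketch only: the `Settings` class, the comma-separated format for `CORS_ORIGINS`, and the defaults (copied from the example values above) are assumptions, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    cricbuzz_api_key: str
    mongo_url: str
    db_name: str
    cors_origins: List[str]
    groq_api_key: Optional[str] = None  # optional: AI analysis


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables; the Cricbuzz key is required."""
    key = env.get("CRICBUZZ_API_KEY", "")
    if not key:
        raise RuntimeError("CRICBUZZ_API_KEY is not set")
    # Assumed format: one or more origins separated by commas.
    origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return Settings(
        cricbuzz_api_key=key,
        mongo_url=env.get("MONGO_URL", "mongodb://localhost:27017"),
        db_name=env.get("DB_NAME", "ipl_predictor"),
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
        groq_api_key=env.get("GROQ_API_KEY") or None,
    )
```

Failing fast on a missing required key surfaces configuration mistakes at startup instead of on the first live-data request.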

📊 Model Performance

Metrics Summary

| Metric | Value | Interpretation |
|---|---|---|
| Test MAE | 38.5 runs | Average error ≈ 2 overs |
| Test RMSE | 48.2 runs | Penalizes large errors |
| R² Score | 0.48 | Explains 48% of variance |
| Win Accuracy | 65.2% | Better than baseline (50%) |
| Brier Score | 0.21 | Well-calibrated probabilities |
| Coverage (80%) | 78.5% | Close to target |
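For reference, the metrics in this table follow their standard definitions, sketched below in plain Python. The function names are illustrative and this is not the project's evaluation script.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (inputs must be non-empty and equally long)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error; penalises large misses more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def brier_score(outcomes: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    return sum((p - o) ** 2 for o, p in zip(outcomes, probs)) / len(outcomes)


def interval_coverage(actual: Sequence[float], low: Sequence[float],
                      high: Sequence[float]) -> float:
    """Fraction of actual scores that fall inside [low, high]."""
    hits = sum(1 for a, lo, hi in zip(actual, low, high) if lo <= a <= hi)
    return hits / len(actual)
```

A forecaster that always predicts 50% has a Brier score of 0.25, which gives a reference point for the 0.21 reported above. For an 80% interval, coverage close to 0.80 is the target.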

Model Versioning

| Version | Date | Test MAE | Notes |
|---|---|---|---|
| 1.0-heuristic | 2024-01 | 58.2 | Rule-based baseline |
| 2.0-random-split | 2024-03 | 25.4 | ❌ Data leakage |
| 3.0-chronological | 2024-06 | 41.8 | ✅ Honest metrics |
| 4.0-calibrated | 2024-09 | 39.2 | ✅ Better probabilities |
| 5.0-chronological-calibrated | 2024-12 | 38.5 | ✅ Current (with quantiles) |

🛠️ Tech Stack

Backend

  • Framework: FastAPI
  • ML: scikit-learn, pandas, numpy, joblib
  • Database: MongoDB (motor)
  • API Client: httpx (async)
  • Testing: pytest
  • Logging: Python logging (JSON format)

Frontend

  • Framework: React 18
  • Styling: Tailwind CSS
  • HTTP Client: Axios
  • State Management: React Query
  • Icons: Lucide React
  • Build: Create React App (CRACO)

Infrastructure

  • Server: Uvicorn (ASGI)
  • Database: MongoDB 6.0+
  • Cache: In-process TTL cache
  • Deployment: Render/Vercel or Docker (optional)

🔐 Security

  • ✅ Environment variables for secrets
  • ✅ .env files gitignored
  • ✅ Input validation (Pydantic)
  • ✅ CORS configuration
  • ✅ MongoDB authentication
  • ✅ Environment-based API key handling
  • ⚠️ Rate limiting (recommended for production)
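Rate limiting is listed as a recommendation rather than an existing feature. One simple way to add it is a per-client sliding window, sketched below. The class name and the limits are hypothetical, and because the state lives in process memory it only suits a single-worker deployment.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow()` with a client identifier such as an IP address and respond with HTTP 429 when it returns `False`.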

📈 Future Enhancements

High Priority πŸ”΄

  • Separate Chase Model - Model RRR and pressure situations
  • Conformal Prediction Intervals - Guaranteed coverage
  • Player-Level Features - Individual form and matchups in model

Medium Priority 🟑

  • SHAP Explainability - Interactive feature importance
  • Online Learning - Update model after each match
  • A/B Testing - Compare model versions

Low Priority 🟒

  • Neural Network Ensemble - Marginal performance gains
  • WebSocket Support - Bidirectional real-time updates
  • Mobile App - React Native implementation

🀝 Contributing

We welcome contributions! Please see DEVELOPMENT_GUIDE.md for:

  • Development workflow
  • Code style guidelines
  • Testing requirements
  • Pull request process

Quick Contribution Guide

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/your-feature)
  3. Make changes with tests
  4. Commit (git commit -m 'feat: add feature')
  5. Push (git push origin feature/your-feature)
  6. Create Pull Request

📝 License

This project is licensed under the MIT License - see LICENSE file for details.


👥 Authors

  • Your Name - Initial work

🙏 Acknowledgments

  • Cricsheet - Historical ball-by-ball data
  • Cricbuzz API - Live match data (via RapidAPI)
  • IPL - Logo and branding assets
  • scikit-learn - ML framework
  • FastAPI - Web framework
  • React - UI library

📞 Support


🔗 Links


⭐ Star History

If you find this project useful, please consider giving it a star! ⭐


Made with ❤️ for Cricket Analytics

Last Updated: 2026-06-13
Model Version: 5.0-chronological-calibrated
Documentation Version: 1.0
