Chess MLOps Pipeline
From raw PGN ingestion to Transformer-powered next-move prediction, with full MLOps lifecycle coverage.
chessMania is an end-to-end MLOps project that demonstrates every stage of the machine learning lifecycle using chess game data. It starts with a tabular ML baseline (XGBoost) and scales into Transformer-based sequence modelling, showcasing research engineering rigor and modern ML operations practices.
| Component | Implementation |
|---|---|
| Storage | MinIO (PGN / JSON / Parquet) |
| Orchestration | Apache Airflow (PGN parsing → Feature & Sequence extraction) |
| Data Validation | Great Expectations |
| Data Warehouse | PostgreSQL |
| ML Framework | XGBoost → GPT-style Transformer with LoRA/QLoRA |
| Experiment Tracking | MLflow (Accuracy, F1, AUC, Perplexity, MFU) |
| Serving | FastAPI + Uvicorn (Next-move prediction, Win probability) |
| Monitoring | Evidently AI (Feature drift + Structural/Move-sequence drift) |
| Containerisation | Docker + Docker Compose |
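Perplexity, one of the Transformer metrics listed in the table, is the exponential of the mean per-token negative log-likelihood. The sketch below assumes natural-log probabilities for the true next moves; the function name is illustrative and is not taken from the repository.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Return exp(mean negative log-likelihood) over the predicted tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every true next move is as
# uncertain as a uniform choice among four moves, so perplexity is about 4.
uniform_quarter = [math.log(0.25)] * 10
print(round(perplexity(uniform_quarter), 6))
```

Lower values mean the model concentrates more probability on the moves that were actually played.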
chessMania/
│
├── README.md                       # ← You are here
├── pyproject.toml                  # Dependencies & build config
├── Makefile                        # Developer shortcuts
├── Dockerfile                      # API container image
├── .dockerignore
├── .gitignore
├── .env.example                    # Environment variable template
├── .pre-commit-config.yaml         # Linting & formatting hooks
├── alembic.ini                     # Database migration config
├── LICENSE
│
├── src/                            # ─── APPLICATION CODE ───
│   ├── __init__.py
│   │
│   ├── config/                     # Centralised configuration
│   │   ├── __init__.py             # Hydra/OmegaConf loader
│   │   └── config.yaml             # All settings (storage, models, serving…)
│   │
│   ├── ingestion/                  # ── Stage 1: Data Ingestion ──
│   │   ├── __init__.py
│   │   ├── minio_client.py         # MinIO upload / download helpers
│   │   ├── pgn_parser.py           # Parse PGN → structured dicts
│   │   ├── db.py                   # SQLAlchemy models (PostgreSQL)
│   │   ├── ingest_pgn.py           # End-to-end ingestion entry-point
│   │   └── validate.py             # Great Expectations quality checks
│   │
│   ├── preprocessing/              # ── Stage 2: Preprocessing ──
│   │   ├── __init__.py
│   │   ├── tabular_features.py     # XGBoost feature extraction
│   │   ├── sequence_tokenizer.py   # SAN → integer token IDs
│   │   ├── dataset.py              # PyTorch Dataset / DataLoader
│   │   └── splits.py               # Train / Val / Test splitting
│   │
│   ├── models/                     # ── Stage 3: Model Development ──
│   │   ├── __init__.py
│   │   ├── mlflow_utils.py         # MLflow tracking helpers
│   │   ├── xgboost_trainer.py      # XGBoost classifier training
│   │   ├── transformer_model.py    # GPT-style causal LM architecture
│   │   ├── transformer_trainer.py  # Training loop with LoRA/QLoRA
│   │   └── registry.py             # Save / load model artefacts
│   │
│   ├── serving/                    # ── Stage 4: Deployment & Serving ──
│   │   ├── __init__.py
│   │   ├── app.py                  # FastAPI application & routes
│   │   ├── schemas.py              # Pydantic request / response models
│   │   └── inference.py            # Inference helpers
│   │
│   └── monitoring/                 # ── Stage 5: Model Monitoring ──
│       ├── __init__.py
│       ├── generate_report.py      # Evidently drift report generator
│       └── drift_detectors.py      # Custom sequence-level drift checks
│
├── airflow/                        # ─── ORCHESTRATION ───
│   └── dags/
│       ├── chess_ingestion_dag.py  # Daily: ingest → validate → features
│       └── chess_training_dag.py   # Manual: train models → monitor
│
├── infra/                          # ─── INFRASTRUCTURE ───
│   └── docker-compose.yml          # MinIO, PostgreSQL, MLflow, Airflow, API
│
├── tests/                          # ─── TEST SUITE ───
│   ├── __init__.py
│   ├── test_pgn_parser.py
│   ├── test_tokenizer.py
│   ├── test_features.py
│   ├── test_transformer.py
│   ├── test_api.py
│   └── test_drift.py
│
├── notebooks/                      # ─── EXPLORATION ───
│   └── .gitkeep
│
├── data/                           # ─── DATA (git-ignored) ───
│   ├── raw/                        # Raw PGN files
│   ├── interim/                    # Intermediate artefacts
│   └── processed/                  # Model-ready features & splits
│
├── artefacts/                      # ─── MODEL ARTEFACTS (git-ignored) ───
│   ├── models/                     # Trained model checkpoints
│   └── tokenizers/                 # Fitted tokenizer JSON
│
└── reports/                        # ─── MONITORING REPORTS (git-ignored) ───
    └── .gitkeep
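The layout lists a tokenizer that maps SAN moves to integer token IDs. A minimal sketch of that idea follows, assuming a word-level vocabulary with one token per move. The class name, the special tokens, and the ID assignment are hypothetical and may differ from what `sequence_tokenizer.py` actually does.

```python
from __future__ import annotations

# Assumed special tokens; the real tokenizer may use different ones.
PAD, BOS, EOS, UNK = "<pad>", "<bos>", "<eos>", "<unk>"


class SanTokenizer:
    """Hypothetical word-level tokenizer: one token per SAN move."""

    def __init__(self) -> None:
        self.token_to_id: dict[str, int] = {
            tok: i for i, tok in enumerate([PAD, BOS, EOS, UNK])
        }

    def fit(self, games: list[list[str]]) -> "SanTokenizer":
        """Assign the next free ID to every move not seen before."""
        for game in games:
            for move in game:
                if move not in self.token_to_id:
                    self.token_to_id[move] = len(self.token_to_id)
        return self

    def encode(self, moves: list[str]) -> list[int]:
        """Wrap the move IDs in BOS/EOS; unseen moves map to UNK."""
        unk = self.token_to_id[UNK]
        ids = [self.token_to_id.get(move, unk) for move in moves]
        return [self.token_to_id[BOS], *ids, self.token_to_id[EOS]]

    def decode(self, ids: list[int]) -> list[str]:
        """Map IDs back to tokens, dropping PAD, BOS and EOS."""
        id_to_token = {i: tok for tok, i in self.token_to_id.items()}
        return [id_to_token[i] for i in ids if id_to_token[i] not in (PAD, BOS, EOS)]


tokenizer = SanTokenizer().fit([["e4", "e5", "Nf3"], ["d4", "d5"]])
ids = tokenizer.encode(["e4", "e5", "Qh5"])  # "Qh5" was never seen
print(ids)  # [1, 4, 5, 3, 2]
```

A per-move vocabulary keeps sequences short; a character-level or square-based scheme would be an equally valid design.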
git clone https://github.com/nabin2004/chessMania.git
cd chessMania
# Create virtual environment
python -m venv .venv && source .venv/bin/activate
# Install with dev extras
make dev
cp .env.example .env
# Edit .env with your credentials
make infra-up # Starts MinIO, PostgreSQL, MLflow, Airflow

| Service | URL |
|---|---|
| MinIO Console | http://localhost:9001 |
| PostgreSQL | localhost:5432 |
| MLflow UI | http://localhost:5000 |
| Airflow UI | http://localhost:8080 |
| API | http://localhost:8000 |
Place Lichess PGN files in data/raw/, then:
make ingest # Parse PGNs → PostgreSQL + MinIO
make validate # Run Great Expectations checks
make train-xgb # XGBoost baseline
make train-transformer # Transformer sequence model
make serve # Start FastAPI at localhost:8000
make monitor # Generate Evidently drift reports

{ "status": "ok", "models_loaded": ["xgboost", "transformer"] }

Predict game outcome probabilities from tabular features.
{
"white_elo": 1500,
"black_elo": 1450,
"time_control": "300+3",
"eco": "B20",
"moves_played": 10
}
→ { "white_win": 0.52, "draw": 0.28, "black_win": 0.20, "predicted_result": "1-0" }
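A client can sanity-check a win-probability response before using it. The sketch below assumes that `predicted_result` is the outcome with the highest probability and that the three probabilities sum to 1. Both are assumptions about the service, and the helper name is hypothetical.

```python
# "1-0" appears in the sample response; "1/2-1/2" and "0-1" are the
# standard PGN result strings for a draw and a black win.
RESULT_BY_KEY = {"white_win": "1-0", "draw": "1/2-1/2", "black_win": "0-1"}


def predicted_result(probs: dict[str, float], tol: float = 1e-6) -> str:
    """Validate the three outcome probabilities and return the likeliest result."""
    missing = RESULT_BY_KEY.keys() - probs.keys()
    if missing:
        raise ValueError(f"missing probabilities: {sorted(missing)}")
    total = sum(probs[key] for key in RESULT_BY_KEY)
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    best = max(RESULT_BY_KEY, key=lambda key: probs[key])
    return RESULT_BY_KEY[best]


sample = {"white_win": 0.52, "draw": 0.28, "black_win": 0.20}
print(predicted_result(sample))  # 1-0
```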
Suggest next moves from a partial game sequence.
{
"moves": ["e4", "e5", "Nf3"],
"num_suggestions": 3,
"temperature": 1.0
}
→ { "suggestions": [{"move": "Nc6", "probability": 0.35}, ...] }
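The `temperature` and `num_suggestions` fields are commonly implemented as a temperature-scaled softmax followed by a top-k cut. The sketch below illustrates that technique on made-up logits. Whether the service ranks or samples, and how it treats illegal moves, is not stated here, so read this as an assumption and not as the project's inference code.

```python
import math


def top_suggestions(
    logits: dict[str, float],
    num_suggestions: int = 3,
    temperature: float = 1.0,
) -> list[dict]:
    """Return the top-k moves under a temperature-scaled softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not logits:
        raise ValueError("logits must not be empty")
    scaled = {move: value / temperature for move, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {move: math.exp(value - peak) for move, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [
        {"move": move, "probability": weight / total}
        for move, weight in ranked[:num_suggestions]
    ]


# Made-up logits for the position after 1. e4 e5 2. Nf3.
logits = {"Nc6": 2.0, "Nf6": 1.2, "d6": 0.5, "f5": -1.0}
for suggestion in top_suggestions(logits, num_suggestions=3, temperature=1.0):
    print(suggestion["move"], round(suggestion["probability"], 3))
```

A temperature below 1.0 sharpens the distribution toward the top move; a temperature above 1.0 flattens it.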
make test # pytest with coverage
make lint # ruff + mypy

┌──────────────────────────────────────────────────────────────────────┐
│                              DATA LAYER                              │
│  Lichess PGNs ──► MinIO ──► Airflow ETL ──► PostgreSQL               │
│                                  │                                   │
│                   Great Expectations (validation)                    │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
┌──────────────────────────────────▼───────────────────────────────────┐
│                            PREPROCESSING                             │
│  Tabular Features (ELO, ECO, TC) │  Sequence Tokenizer (SAN)         │
│  ──► XGBoost feature matrix      │  ──► Integer token IDs            │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
┌──────────────────────────────────▼───────────────────────────────────┐
│                          MODEL DEVELOPMENT                           │
│  XGBoost Classifier              │  GPT-style Transformer            │
│  (Accuracy, F1, AUC)             │  (Perplexity, MFU, Acc)           │
│                                  │  + LoRA / QLoRA adapters          │
│                           MLflow Tracking                            │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
┌──────────────────────────────────▼───────────────────────────────────┐
│                               SERVING                                │
│                      FastAPI + Uvicorn + Docker                      │
│  /predict/win (XGBoost)          │  /predict/next-move (Transformer) │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
┌──────────────────────────────────▼───────────────────────────────────┐
│                              MONITORING                              │
│                             Evidently AI                             │
│  • Tabular drift (ELO distributions, accuracy)                       │
│  • Sequence drift (invalid tokens, structural anomalies)             │
└──────────────────────────────────────────────────────────────────────┘
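The "invalid tokens" part of sequence drift can be approximated by comparing the share of out-of-vocabulary moves in a reference window against a current window. The sketch below uses plain Python and does not call Evidently. The function names and the 0.05 threshold are hypothetical choices for illustration.

```python
def unknown_token_rate(games: list[list[str]], vocabulary: set[str]) -> float:
    """Fraction of moves across all games that are not in the vocabulary."""
    total = sum(len(game) for game in games)
    if total == 0:
        return 0.0
    unknown = sum(1 for game in games for move in game if move not in vocabulary)
    return unknown / total


def sequence_drift_detected(
    reference: list[list[str]],
    current: list[list[str]],
    vocabulary: set[str],
    max_increase: float = 0.05,  # hypothetical alert threshold
) -> bool:
    """Flag drift when the unknown-token rate rises by more than max_increase."""
    increase = unknown_token_rate(current, vocabulary) - unknown_token_rate(
        reference, vocabulary
    )
    return increase > max_increase


vocabulary = {"e4", "e5", "Nf3", "Nc6"}
reference = [["e4", "e5", "Nf3"]]
current = [["e4", "e5", "Zz9", "Nf3"]]  # "Zz9" is not a valid SAN move
print(unknown_token_rate(current, vocabulary))                   # 0.25
print(sequence_drift_detected(reference, current, vocabulary))   # True
```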
MIT. See LICENSE.