A full-stack machine learning application that predicts football match outcomes across multiple betting markets. It uses XGBoost classifiers trained on 40,000+ historical matches from 9 European leagues.
- Frontend: football-match-predictor-pearl.vercel.app
- API: Azure Container Apps (scale-to-zero — first request after idle takes ~5–10 sec to spin up)
- Hybrid cloud architecture — Vercel for frontend, Azure for backend, ML, and data
- Container Apps with managed identity — zero-trust auth to Storage, Key Vault, ACR (no secrets in env vars)
- Infrastructure as Code: full Bicep; `az deployment group create` rebuilds the entire stack
- CI/CD via GitHub Actions + OIDC: no long-lived credentials in GitHub
- MLOps with validation gates — nightly retraining job; new model must hold AUC within 0.02 of production before promotion
- Observability — Application Insights with structured logs and custom metrics
- Multi-market predictions — Match result (H/D/A), Double Chance, BTTS, Over/Under 2.5, Half-Time result, Corners, Cards
- Combo bets — Result+BTTS, Result+O/U, BTTS+O/U combinations with combined probabilities
- Smart bet recommendations — "Best Bet" (highest edge) and "Safest Bet" (highest probability) with reasoning
- Accumulator builder — Select bets across matches, calculates combined odds and potential returns
- Match tagging — High Confidence, Upset Pick, Banker classifications
- Auto-refresh — Daily fixture updates via APScheduler (06:00 UTC)
- 9 leagues — Premier League, Championship, La Liga, Bundesliga, Serie A, Ligue 1, Eredivisie, Primeira Liga, Champions League
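
The accumulator arithmetic mentioned above can be sketched as follows. This is a minimal illustration, assuming decimal odds; the function name `accumulator_return` is hypothetical and not taken from the project code.

```python
from math import prod


def accumulator_return(decimal_odds, stake):
    """Return (combined_odds, potential_return) for an accumulator.

    Assumes decimal odds, so the combined price is the product of the legs
    and the potential return is the stake multiplied by that price.
    """
    if not decimal_odds:
        raise ValueError("an accumulator needs at least one selection")
    if any(odds <= 1.0 for odds in decimal_odds):
        raise ValueError("decimal odds must be greater than 1.0")
    combined = prod(decimal_odds)
    return combined, stake * combined


combined, payout = accumulator_return([1.5, 2.0, 3.0], stake=10.0)
print(f"combined odds {combined:.2f}, potential return {payout:.2f}")
```

The potential return here includes the stake, which is the usual convention for decimal odds.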
```mermaid
graph TB
    Vercel[Vercel<br/>React + Vite frontend]
    GHA[GitHub Actions<br/>OIDC, no long-lived secrets]
    subgraph Azure
        CA[Container Apps<br/>Flask + gunicorn<br/>scale-to-zero]
        Job[Container Apps Job<br/>nightly retrain<br/>+ AUC validation gate]
        PG[(PostgreSQL Flexible<br/>+ pgBouncer)]
        Blob[(Blob Storage<br/>models/production<br/>models/candidate)]
        KV[Key Vault<br/>DB string, API keys]
        AI[Application Insights<br/>logs + metrics]
        ACR[Container Registry]
    end
    Vercel -->|HTTPS| CA
    GHA -->|OIDC + az acr build| ACR
    GHA -->|az containerapp update| CA
    GHA -->|az deployment group create| Azure
    ACR -->|managed identity pull| CA
    ACR -->|managed identity pull| Job
    CA -->|managed identity| Blob
    CA -->|managed identity| KV
    CA --> PG
    CA --> AI
    Job --> Blob
    Job --> PG
    Job --> AI
    Job -->|hot reload webhook| CA
```
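
The AUC validation gate in the retrain job could look roughly like the sketch below. It assumes the gate is one-sided (a candidate may score up to 0.02 below production, and any improvement passes); the function name `should_promote` and its parameters are hypothetical, not the actual contents of `retrain.py`.

```python
def should_promote(candidate_auc, production_auc, tolerance=0.02):
    """Decide whether a freshly trained candidate may replace production.

    Assumption: the candidate passes if its AUC is no more than
    `tolerance` below the production model's AUC.
    """
    for name, value in (("candidate", candidate_auc), ("production", production_auc)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} AUC must be between 0 and 1, got {value}")
    return candidate_auc >= production_auc - tolerance


print(should_promote(0.71, 0.72))  # small drop, inside the tolerance
print(should_promote(0.69, 0.72))  # drop larger than the tolerance
```

A gate like this keeps a noisy nightly run from replacing a good model while still letting slightly different but comparable models through.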
`docker compose up --build` brings up the backend, the frontend, and a Postgres container in three commands.
| Market | Outcomes | Description |
|---|---|---|
| Match Result | Home / Draw / Away | Main 1X2 prediction |
| Double Chance | 1X / X2 / 12 | Combined outcome probabilities |
| BTTS | Yes / No | Both teams to score |
| Over/Under 2.5 | Over / Under | Total goals threshold |
| Half-Time Result | Home / Draw / Away | First half prediction |
| Corners O/U | Over / Under 9.5 | Total corners prediction |
| Cards O/U | Over / Under 3.5 | Total cards prediction |
| Combos | 18 combinations | Result+BTTS, Result+O/U, BTTS+O/U |
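
Because home win, draw, and away win are mutually exclusive, Double Chance probabilities follow directly from the 1X2 probabilities by addition. The sketch below shows that arithmetic; whether the project derives the market this way or models it separately is not stated, and the function name `double_chance` is hypothetical.

```python
def double_chance(p_home, p_draw, p_away):
    """Derive Double Chance probabilities from 1X2 probabilities.

    The three inputs must sum to 1 (within a small tolerance).
    """
    total = p_home + p_draw + p_away
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"probabilities must sum to 1, got {total}")
    return {
        "1X": p_home + p_draw,  # home win or draw
        "X2": p_draw + p_away,  # draw or away win
        "12": p_home + p_away,  # either team wins
    }


for market, probability in double_chance(0.5, 0.3, 0.2).items():
    print(f"{market}: {probability:.2f}")
```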
- Python 3.12, Flask 3.1, Gunicorn
- SQLAlchemy + PostgreSQL (Azure Flexible Server with pgBouncer)
- XGBoost, Scikit-learn, Pandas, NumPy
- APScheduler (daily fixture refresh)
- football-data.org API (match data)
- azure-identity + azure-storage-blob (model storage via Managed Identity)
- azure-monitor-opentelemetry (structured logs + custom metrics to Application Insights)
- Azure Container Apps (scale-to-zero) + Container Apps Jobs (cron retrain)
- Azure PostgreSQL Flexible Server (Burstable B1ms with pgBouncer enabled)
- Azure Blob Storage (`models/production/` + `models/candidate/`)
- Azure Key Vault (secrets via managed identity references)
- Azure Container Registry
- Bicep modules in `infra/`: full stack reproducible via `az deployment group create`
- GitHub Actions workflows in `.github/workflows/` (`backend.yml` + `infra.yml`, both via OIDC)
- React 19, Vite 6
- Tailwind CSS 4
- React Router 7, Axios
- Frontend: Vercel (auto-deploy from GitHub `main`)
- Backend: Azure Container Apps (deployed via Bicep + GitHub Actions OIDC)
- Database: Azure PostgreSQL Flexible Server (Burstable B1ms)
- Models: Azure Blob Storage (`models/production/`, hot-swappable via webhook)
- Secrets: Azure Key Vault (no secrets in env vars or git)
- Python 3.12, Node.js 20+, PostgreSQL 16, API key from football-data.org
```bash
# Backend
cd backend
python -m venv venv
source venv/bin/activate            # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env                # fill in DATABASE_URL + FOOTBALL_API_KEY
python wsgi.py                      # auto-creates schema if missing

# Frontend (new terminal)
cd frontend
npm install
npm run dev
```

The app will be available at http://localhost:5174 (frontend) and http://localhost:5000 (API).
If your local database is empty, populate it from the CSVs:
```bash
cd backend
python src/load_data.py              # historical matches → database
python src/load_external_csv.py      # standings, external league data
python src/feature_engineering.py    # compute ML features
python src/model_training.py         # train models (optional; pre-trained .pkl included)
```

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | API status and model info |
| GET | `/api/teams` | List all teams |
| GET | `/api/teams/:id` | Team details with stats and recent form |
| GET | `/api/competitions` | List of leagues in database |

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/predict` | Predict match result (H/D/A) |
| POST | `/api/predict/markets` | Full multi-market prediction |
| GET | `/api/predictions/upcoming` | Dashboard: batch predictions for next 3 days |
| GET | `/api/predictions/history` | Past predictions with accuracy |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/matches` | List matches (filter by season, team, status) |
| GET | `/api/matches/:id` | Match details with features |
| GET | `/api/matches/upcoming` | Raw upcoming matches |
| POST | `/api/fixtures/refresh` | Fetch new fixtures from API |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/statistics/overview` | League-wide match and goal stats |
| GET | `/api/statistics/head-to-head` | H2H record between two teams |
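
As a usage sketch, the health endpoint can be called with the Python standard library alone. The example assumes the API returns JSON and uses the local development address; the helper names `build_health_request` and `fetch_health` are hypothetical. The 30-second default timeout is an arbitrary choice meant to leave room for the 5 to 10 second cold start of the scale-to-zero deployment.

```python
import json
from urllib.request import Request, urlopen

API_BASE = "http://localhost:5000"  # local dev address; replace with the deployed FQDN


def build_health_request(base_url=API_BASE):
    """Build a GET request for /api/health without sending it."""
    url = f"{base_url.rstrip('/')}/api/health"
    return Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_health(base_url=API_BASE, timeout=30):
    """Send the request and decode the body, assuming a JSON response."""
    with urlopen(build_health_request(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running locally, `fetch_health()` returns the decoded status payload; its exact fields depend on the API and are not assumed here.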
| Feature | Description |
|---|---|
| Home/Away Form | Win rate over last 5 matches |
| Goals Scored/Conceded | Average per match |
| Home/Away Win Rate | Historical win percentage at venue |
| Head-to-Head Record | Historical record between teams |
| Days Since Last Match | Rest days before the match |
| Elo Rating | Team strength rating |
| League Standings | Current league position and points |
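
Two of the simpler features in the table can be sketched as below. This is only an illustration of the definitions given (win rate over the last 5 matches, rest days before a match); the function names, the `'W'`/`'D'`/`'L'` encoding, and the choice to return 0.0 for a team with no history are assumptions, not the project's actual pipeline.

```python
from datetime import date


def recent_form(results, window=5):
    """Win rate over the last `window` results.

    `results` is a list of 'W', 'D', or 'L' strings, oldest first.
    Assumption: a team with no previous matches gets a form of 0.0.
    """
    recent = results[-window:]
    if not recent:
        return 0.0
    return sum(result == "W" for result in recent) / len(recent)


def days_since_last_match(match_date, previous_match_date):
    """Rest days between the previous match and the upcoming one."""
    return (match_date - previous_match_date).days


print(recent_form(["L", "W", "W", "D", "W", "L", "W"]))
print(days_since_last_match(date(2025, 3, 8), date(2025, 3, 1)))
```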
```
FootballMatchPredictor/
├── backend/
│   ├── Dockerfile                    # Multi-stage build for Container Apps
│   ├── models/                       # Pre-trained ML models (.pkl, also in Blob)
│   │   ├── best_model.pkl            # Main XGBoost match result model
│   │   └── multi_market_models.pkl   # BTTS, O/U, corners, cards models
│   ├── src/
│   │   ├── app.py                    # Flask API
│   │   ├── database.py               # SQLAlchemy models & DB manager
│   │   ├── data_collection.py        # football-data.org API client
│   │   ├── feature_engineering.py    # Feature computation pipeline
│   │   ├── prediction_service.py     # Multi-market prediction engine
│   │   ├── cache.py                  # LRU prediction cache (6h TTL)
│   │   ├── load_data.py              # CSV → database loader
│   │   ├── load_external_csv.py      # External league data loader
│   │   ├── model_training.py         # Model training & evaluation
│   │   ├── model_storage.py          # Blob storage abstraction (Managed Identity)
│   │   └── telemetry.py              # Application Insights wiring
│   ├── jobs/
│   │   ├── Dockerfile                # Retrain job container
│   │   └── retrain.py                # Nightly retrain + AUC validation gate
│   ├── requirements.txt
│   ├── wsgi.py
│   └── gunicorn.conf.py
├── frontend/
│   ├── Dockerfile                    # Multi-stage build (used for parity, prod is Vercel)
│   ├── nginx.conf
│   ├── src/
│   │   ├── components/               # MatchCard, MatchDetail, FilterBar, etc.
│   │   ├── pages/Dashboard.jsx
│   │   ├── services/api.js           # Axios client (uses VITE_API_URL)
│   │   ├── utils/constants.js
│   │   ├── App.jsx
│   │   └── main.jsx
│   ├── vercel.json
│   └── package.json
├── infra/                            # Bicep IaC
│   ├── main.bicep                    # Composes all modules
│   ├── main.parameters.prod.json
│   └── modules/
│       ├── acr.bicep
│       ├── appInsights.bicep
│       ├── containerApp.bicep        # Includes RBAC role assignments
│       ├── containerAppsEnv.bicep
│       ├── keyVault.bicep
│       ├── logAnalytics.bicep
│       ├── postgres.bicep
│       └── storage.bicep
├── .github/workflows/
│   ├── backend.yml                   # Test + build + push + deploy via OIDC
│   └── infra.yml                     # Bicep deploy via OIDC
├── docker-compose.yml                # Local dev convenience (postgres + backend + frontend)
├── docs/
│   └── azure-runbook.md              # Step-by-step Azure deployment commands
├── LICENSE
└── README.md
```
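
The tree lists `cache.py` as an LRU prediction cache with a 6-hour TTL. One common way to combine the two policies is sketched below using `collections.OrderedDict`; the class name `TTLCache`, its parameters, and the default size are hypothetical, and the real module may be implemented differently.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Least-recently-used cache whose entries also expire after a fixed TTL."""

    def __init__(self, max_size=256, ttl_seconds=6 * 3600, clock=time.monotonic):
        self._data = OrderedDict()  # key -> (stored_at, value), oldest use first
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._data[key]  # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict the least recently used entry
```

A prediction service would typically key such a cache on the fixture (for example, a match ID) so that repeated dashboard loads reuse one model call per match until the entry expires.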
| Component | Service | Notes |
|---|---|---|
| Frontend | Vercel | Auto-deploys from main |
| Backend API | Azure Container Apps | Scale-to-zero, system-assigned managed identity |
| Database | Azure PostgreSQL Flexible Server | Burstable B1ms |
| Model artifacts | Azure Blob Storage | models/production/, models/candidate/ |
| Secrets | Azure Key Vault | DB connection string, API keys, hot-reload token |
| Container registry | Azure Container Registry | Basic SKU |
| Observability | Azure Application Insights | Structured logs, custom metrics |
| Nightly retrain | Azure Container Apps Job | Cron 0 3 * * *, AUC validation gate |
| CI/CD | GitHub Actions (OIDC) | No long-lived credentials |
| IaC | Azure Bicep | Full stack reproducible from main.bicep |
See docs/azure-runbook.md for the complete step-by-step runbook.
Short version:
```powershell
# 1. Provision everything
az deployment group create `
  --resource-group rg-fotballpred-prod `
  --template-file infra/main.bicep `
  --parameters infra/main.parameters.prod.json `
  --parameters postgresAdminPassword=<...> footballApiKey=<...>

# 2. Build + push backend image
docker build -t fotballpred-backend:v1 ./backend
az acr login --name acrfotballpredprod
docker tag fotballpred-backend:v1 acrfotballpredprod.azurecr.io/fotballpred-backend:latest
docker push acrfotballpredprod.azurecr.io/fotballpred-backend:latest

# 3. Migrate DB + upload models (one-time)
pg_restore ...
az storage blob upload ...

# 4. Vercel: set VITE_API_URL=https://<container-app-fqdn>/api, redeploy
```

Push to `main` triggers `.github/workflows/backend.yml`:
- Lint + test against ephemeral Postgres
- Build Docker image, push to ACR with commit SHA tag
- Update Container App revision
- Smoke-test `/api/health`
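
Because the Container App scales to zero, a smoke test right after deployment usually needs a few retries while the new revision starts. The sketch below shows one way to structure that; `wait_until_healthy` and its parameters are hypothetical and are not taken from `backend.yml`.

```python
import time


def wait_until_healthy(check, attempts=5, delay=2.0, sleep=time.sleep):
    """Call `check()` until it returns True or the attempts run out.

    `check` is any zero-argument callable, for example a function that
    requests /api/health and returns True on an HTTP 200 response.
    Returns the attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except OSError:
            # Connection refused or timed out while the container starts.
            pass
        if attempt < attempts:
            sleep(delay)
    return None
```

In a pipeline, a `None` result would be turned into a non-zero exit code so the workflow fails and the broken revision is noticed.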
Adrian — @Gamsty
- Football-Data.org for the football data API
- Data from 9 European leagues (2021–2025 seasons)
Built with Python, React, and machine learning.