Gamsty/FootballMatchPredictor

Football Match Predictor

A full-stack machine learning application that predicts football match outcomes across multiple betting markets. Uses XGBoost classification trained on 9 European leagues with 40,000+ historical matches.

Python Flask React ML Azure Vercel IaC CI/CD

Live Demo

Highlights

  • Hybrid cloud architecture — Vercel for frontend, Azure for backend, ML, and data
  • Container Apps with managed identity — zero-trust auth to Storage, Key Vault, ACR (no secrets in env vars)
  • Infrastructure as Code — full Bicep, az deployment group create rebuilds the entire stack
  • CI/CD via GitHub Actions + OIDC — no long-lived credentials in GitHub
  • MLOps with validation gates — nightly retraining job; new model must hold AUC within 0.02 of production before promotion
  • Observability — Application Insights with structured logs and custom metrics
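The validation gate could be expressed roughly as below. This is a minimal sketch, not the contents of jobs/retrain.py: the function name is hypothetical, and it assumes "within 0.02" means the candidate may trail production by at most 0.02 AUC (a candidate that beats production also passes).

```python
def passes_auc_gate(candidate_auc: float, production_auc: float,
                    tolerance: float = 0.02) -> bool:
    """Return True when the candidate trails production by at most `tolerance` AUC.

    Assumption: a candidate that scores higher than production always passes.
    """
    return candidate_auc >= production_auc - tolerance


if __name__ == "__main__":
    # Candidate is 0.01 below production, so it would be promoted.
    print(passes_auc_gate(candidate_auc=0.71, production_auc=0.72))
    # Candidate is 0.03 below production, so it would be rejected.
    print(passes_auc_gate(candidate_auc=0.69, production_auc=0.72))
```

Under this reading the gate guards against regressions only; a stricter policy could also require a minimum improvement before promotion.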

Features

  • Multi-market predictions — Match result (H/D/A), Double Chance, BTTS, Over/Under 2.5, Half-Time result, Corners, Cards
  • Combo bets — Result+BTTS, Result+O/U, BTTS+O/U combinations with combined probabilities
  • Smart bet recommendations — "Best Bet" (highest edge) and "Safest Bet" (highest probability) with reasoning
  • Accumulator builder — Select bets across matches, calculates combined odds and potential returns
  • Match tagging — High Confidence, Upset Pick, Banker classifications
  • Auto-refresh — Daily fixture updates via APScheduler (06:00 UTC)
  • 9 leagues — Premier League, Championship, La Liga, Bundesliga, Serie A, Ligue 1, Eredivisie, Primeira Liga, Champions League
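As a rough illustration of the "edge" and accumulator arithmetic mentioned above, the sketch below assumes decimal bookmaker odds and defines edge as the model probability minus the odds-implied probability. Both the definitions and the function names are assumptions; the application may compute these differently.

```python
from math import prod


def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, ignoring the bookmaker margin."""
    return 1.0 / decimal_odds


def edge(model_probability: float, decimal_odds: float) -> float:
    """Assumed definition: model probability minus odds-implied probability."""
    return model_probability - implied_probability(decimal_odds)


def accumulator(decimal_odds: list[float], stake: float) -> tuple[float, float]:
    """Return (combined odds, potential return) for an accumulator.

    Combined decimal odds are the product of the individual selections' odds;
    the potential return is the stake multiplied by the combined odds.
    """
    combined = prod(decimal_odds)
    return combined, stake * combined


if __name__ == "__main__":
    print(edge(0.6, 2.0))                # positive: model rates it above the market
    print(accumulator([2.0, 1.5], 10.0))
```

A positive edge means the model rates the outcome as more likely than the odds imply; under these assumptions the "Best Bet" would be the selection with the largest such value.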

Architecture

graph TB
    Vercel[Vercel<br/>React + Vite frontend]
    GHA[GitHub Actions<br/>OIDC, no long-lived secrets]

    subgraph Azure
        CA[Container Apps<br/>Flask + gunicorn<br/>scale-to-zero]
        Job[Container Apps Job<br/>nightly retrain<br/>+ AUC validation gate]
        PG[(PostgreSQL Flexible<br/>+ pgBouncer)]
        Blob[(Blob Storage<br/>models/production<br/>models/candidate)]
        KV[Key Vault<br/>DB string, API keys]
        AI[Application Insights<br/>logs + metrics]
        ACR[Container Registry]
    end

    Vercel -->|HTTPS| CA
    GHA -->|OIDC + az acr build| ACR
    GHA -->|az containerapp update| CA
    GHA -->|az deployment group create| Azure
    ACR -->|managed identity pull| CA
    ACR -->|managed identity pull| Job
    CA -->|managed identity| Blob
    CA -->|managed identity| KV
    CA --> PG
    CA --> AI
    Job --> Blob
    Job --> PG
    Job --> AI
    Job -->|hot reload webhook| CA

Local development

docker compose up --build brings up the backend, the frontend, and a Postgres container with a single command.

Prediction Markets

| Market | Outcomes | Description |
| --- | --- | --- |
| Match Result | Home / Draw / Away | Main 1X2 prediction |
| Double Chance | 1X / X2 / 12 | Combined outcome probabilities |
| BTTS | Yes / No | Both teams to score |
| Over/Under 2.5 | Over / Under | Total goals threshold |
| Half-Time Result | Home / Draw / Away | First half prediction |
| Corners O/U | Over / Under 9.5 | Total corners prediction |
| Cards O/U | Over / Under 3.5 | Total cards prediction |
| Combos | 18 combinations | Result+BTTS, Result+O/U, BTTS+O/U |
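Double Chance probabilities follow directly from the 1X2 probabilities, because home win, draw, and away win are mutually exclusive. The sketch below shows that arithmetic, plus a combo probability under a simplifying independence assumption. The function names are hypothetical, and the real prediction engine may model combos jointly rather than multiplying.

```python
def double_chance(p_home: float, p_draw: float, p_away: float) -> dict[str, float]:
    """Derive Double Chance probabilities from mutually exclusive 1X2 outcomes."""
    return {
        "1X": p_home + p_draw,   # home win or draw
        "X2": p_draw + p_away,   # draw or away win
        "12": p_home + p_away,   # home win or away win (no draw)
    }


def combo_probability(p_first: float, p_second: float) -> float:
    """Combined probability of two markets, ASSUMING they are independent.

    Result and BTTS are correlated in practice, so this is only an approximation.
    """
    return p_first * p_second


if __name__ == "__main__":
    print(double_chance(0.5, 0.25, 0.25))
    print(combo_probability(0.5, 0.5))
```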

Tech Stack

Backend

  • Python 3.12, Flask 3.1, Gunicorn
  • SQLAlchemy + PostgreSQL (Azure Flexible Server with pgBouncer)
  • XGBoost, Scikit-learn, Pandas, NumPy
  • APScheduler (daily fixture refresh)
  • football-data.org API (match data)
  • azure-identity + azure-storage-blob (model storage via Managed Identity)
  • azure-monitor-opentelemetry (structured logs + custom metrics to Application Insights)
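For illustration, loading a model from Blob Storage with managed identity might look roughly like the sketch below. It is not the contents of model_storage.py: the function names, the account URL, the container name, and the blob path are all hypothetical. DefaultAzureCredential resolves to the managed identity when the code runs inside Azure, and to developer credentials (for example an az login session) locally.

```python
import pickle


def make_container_client(account_url: str, container: str):
    """Build a container client authenticated without any stored secret."""
    # Imported lazily so load_model() below can be exercised without the Azure SDK.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(account_url=account_url,
                                credential=DefaultAzureCredential())
    return service.get_container_client(container)


def load_model(container_client, blob_name: str):
    """Download a pickled model and deserialize it.

    Only unpickle artifacts from a trusted source: pickle can execute code.
    """
    data = container_client.download_blob(blob_name).readall()
    return pickle.loads(data)


# Hypothetical usage; the account URL, container, and blob path are placeholders:
# client = make_container_client("https://<account>.blob.core.windows.net", "models")
# model = load_model(client, "production/best_model.pkl")
```

Passing the container client in as an argument keeps the download logic easy to test with a stub.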

Infrastructure

  • Azure Container Apps (scale-to-zero) + Container Apps Jobs (cron retrain)
  • Azure PostgreSQL Flexible Server (Burstable B1ms with pgBouncer enabled)
  • Azure Blob Storage (models/production/ + models/candidate/)
  • Azure Key Vault (secrets via managed identity references)
  • Azure Container Registry
  • Bicep modules in infra/ — full stack reproducible via az deployment group create
  • GitHub Actions workflows in .github/workflows/ (backend.yml + infra.yml, both via OIDC)

Frontend

  • React 19, Vite 6
  • Tailwind CSS 4
  • React Router 7, Axios

Deployment

  • Frontend: Vercel (auto-deploy from GitHub main)
  • Backend: Azure Container Apps (deployed via Bicep + GitHub Actions OIDC)
  • Database: Azure PostgreSQL Flexible Server (Burstable B1ms)
  • Models: Azure Blob Storage (models/production/, hot-swappable via webhook)
  • Secrets: Azure Key Vault (no secrets in env vars or git)

Local development

Prerequisites

Run locally

# Backend
cd backend
python -m venv venv
source venv/bin/activate    # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env        # fill in DATABASE_URL + FOOTBALL_API_KEY
python wsgi.py              # auto-creates schema if missing

# Frontend (new terminal)
cd frontend
npm install
npm run dev

The app will be available at http://localhost:5174 (frontend) and http://localhost:5000 (API).

Bootstrap data

If your local database is empty, populate it from the CSVs:

cd backend
python src/load_data.py            # historical matches → database
python src/load_external_csv.py    # standings, external league data
python src/feature_engineering.py  # compute ML features
python src/model_training.py       # train models (optional — pre-trained .pkl included)

API Endpoints

Core

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/health | API status and model info |
| GET | /api/teams | List all teams |
| GET | /api/teams/:id | Team details with stats and recent form |
| GET | /api/competitions | List of leagues in database |

Predictions

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/predict | Predict match result (H/D/A) |
| POST | /api/predict/markets | Full multi-market prediction |
| GET | /api/predictions/upcoming | Dashboard: batch predictions for the next 3 days |
| GET | /api/predictions/history | Past predictions with accuracy |

Matches

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/matches | List matches (filter by season, team, status) |
| GET | /api/matches/:id | Match details with features |
| GET | /api/matches/upcoming | Raw upcoming matches |
| POST | /api/fixtures/refresh | Fetch new fixtures from API |

Statistics

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/statistics/overview | League-wide match and goal stats |
| GET | /api/statistics/head-to-head | H2H record between two teams |

Features Used by ML Model

| Feature | Description |
| --- | --- |
| Home/Away Form | Win rate over last 5 matches |
| Goals Scored/Conceded | Average per match |
| Home/Away Win Rate | Historical win percentage at venue |
| Head-to-Head Record | Historical record between teams |
| Days Since Last Match | Rest days before the match |
| Elo Rating | Team strength rating |
| League Standings | Current league position and points |
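Two of these features can be sketched in a few lines. The code below uses the standard Elo formulas and a simple last-five win rate. It is an illustration, not the pipeline in feature_engineering.py: the K-factor of 20, the "W"/"D"/"L" result encoding, and the function names are assumptions.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """New rating after a match; `actual` is 1 for a win, 0.5 for a draw, 0 for a loss.

    The K-factor of 20 is an assumed value, not taken from the project.
    """
    return rating + k * (actual - expected)


def form_win_rate(results: list[str], window: int = 5) -> float:
    """Share of wins among the most recent `window` results (oldest first)."""
    recent = results[-window:]
    if not recent:
        return 0.0  # no history yet
    return sum(r == "W" for r in recent) / len(recent)


if __name__ == "__main__":
    expected = elo_expected(1500, 1500)
    print(expected, elo_update(1500, expected, actual=1.0))
    print(form_win_rate(["W", "L", "D", "W", "W", "L", "W"]))
```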

Project Structure

FootballMatchPredictor/
├── backend/
│   ├── Dockerfile                  # Multi-stage build for Container Apps
│   ├── models/                     # Pre-trained ML models (.pkl, also in Blob)
│   │   ├── best_model.pkl          # Main XGBoost match result model
│   │   └── multi_market_models.pkl # BTTS, O/U, corners, cards models
│   ├── src/
│   │   ├── app.py                  # Flask API
│   │   ├── database.py             # SQLAlchemy models & DB manager
│   │   ├── data_collection.py      # football-data.org API client
│   │   ├── feature_engineering.py  # Feature computation pipeline
│   │   ├── prediction_service.py   # Multi-market prediction engine
│   │   ├── cache.py                # LRU prediction cache (6h TTL)
│   │   ├── load_data.py            # CSV → database loader
│   │   ├── load_external_csv.py    # External league data loader
│   │   ├── model_training.py       # Model training & evaluation
│   │   ├── model_storage.py        # Blob storage abstraction (Managed Identity)
│   │   └── telemetry.py            # Application Insights wiring
│   ├── jobs/
│   │   ├── Dockerfile              # Retrain job container
│   │   └── retrain.py              # Nightly retrain + AUC validation gate
│   ├── requirements.txt
│   ├── wsgi.py
│   └── gunicorn.conf.py
├── frontend/
│   ├── Dockerfile                  # Multi-stage build (used for parity, prod is Vercel)
│   ├── nginx.conf
│   ├── src/
│   │   ├── components/             # MatchCard, MatchDetail, FilterBar, etc.
│   │   ├── pages/Dashboard.jsx
│   │   ├── services/api.js         # Axios client (uses VITE_API_URL)
│   │   ├── utils/constants.js
│   │   ├── App.jsx
│   │   └── main.jsx
│   ├── vercel.json
│   └── package.json
├── infra/                          # Bicep IaC
│   ├── main.bicep                  # Composes all modules
│   ├── main.parameters.prod.json
│   └── modules/
│       ├── acr.bicep
│       ├── appInsights.bicep
│       ├── containerApp.bicep      # Includes RBAC role assignments
│       ├── containerAppsEnv.bicep
│       ├── keyVault.bicep
│       ├── logAnalytics.bicep
│       ├── postgres.bicep
│       └── storage.bicep
├── .github/workflows/
│   ├── backend.yml                 # Test + build + push + deploy via OIDC
│   └── infra.yml                   # Bicep deploy via OIDC
├── docker-compose.yml              # Local dev convenience (postgres + backend + frontend)
├── docs/
│   └── azure-runbook.md            # Step-by-step Azure deployment commands
├── LICENSE
└── README.md
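The tree lists cache.py as an LRU prediction cache with a 6-hour TTL. One way such a cache could work is sketched below. The class name, the default size of 256, and the injectable clock are assumptions made for illustration; the actual module may differ.

```python
import time
from collections import OrderedDict


class TTLCache:
    """LRU cache whose entries expire `ttl_seconds` after they are written."""

    def __init__(self, max_size: int = 256, ttl_seconds: float = 6 * 3600,
                 clock=time.monotonic):
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]       # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)   # mark as most recently used
        return value

    def set(self, key, value) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Keying the cache on something like a (home team, away team, match date) tuple would let repeated dashboard requests reuse predictions until the TTL lapses or new fixtures arrive; that key choice is also an assumption.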

Deployment

Architecture

| Component | Service | Notes |
| --- | --- | --- |
| Frontend | Vercel | Auto-deploys from main |
| Backend API | Azure Container Apps | Scale-to-zero, system-assigned managed identity |
| Database | Azure PostgreSQL Flexible Server | Burstable B1ms |
| Model artifacts | Azure Blob Storage | models/production/, models/candidate/ |
| Secrets | Azure Key Vault | DB connection string, API keys, hot-reload token |
| Container registry | Azure Container Registry | Basic SKU |
| Observability | Azure Application Insights | Structured logs, custom metrics |
| Nightly retrain | Azure Container Apps Job | Cron 0 3 * * *, AUC validation gate |
| CI/CD | GitHub Actions (OIDC) | No long-lived credentials |
| IaC | Azure Bicep | Full stack reproducible from main.bicep |

Deploy from scratch

See docs/azure-runbook.md for the complete step-by-step runbook.

Short version:

# 1. Provision everything (the trailing ` is PowerShell line continuation; in bash use \ instead)
az deployment group create `
  --resource-group rg-fotballpred-prod `
  --template-file infra/main.bicep `
  --parameters infra/main.parameters.prod.json `
  --parameters postgresAdminPassword=<...> footballApiKey=<...>

# 2. Build + push backend image
docker build -t fotballpred-backend:v1 ./backend
az acr login --name acrfotballpredprod
docker tag fotballpred-backend:v1 acrfotballpredprod.azurecr.io/fotballpred-backend:latest
docker push acrfotballpredprod.azurecr.io/fotballpred-backend:latest

# 3. Migrate DB + upload models (one-time)
pg_restore ...
az storage blob upload ...

# 4. Vercel: set VITE_API_URL=https://<container-app-fqdn>/api, redeploy

Continuous deployment

Push to main triggers .github/workflows/backend.yml:

  1. Lint + test against ephemeral Postgres
  2. Build Docker image, push to ACR with commit SHA tag
  3. Update Container App revision
  4. Smoke-test /api/health

Author

Adrian (@Gamsty)

Acknowledgments

  • Football-Data.org for the football data API
  • Data from 9 European leagues (2021–2025 seasons)

Built with Python, React, and machine learning.
