World Cup 2026 sports-betting analytics powered by Elo ratings, Poisson modelling, and Monte Carlo simulation.
GoallineIQ is a multi-page Streamlit application that aggregates historical World Cup data (2010–2026), trains a live Elo + Poisson prediction model, and surfaces actionable insights for sports-betting research. It covers the full tournament lifecycle — from pre-match deep dives and value-bet detection to full tournament simulations and multi-bookmaker odds comparison.
| Page | Description |
|---|---|
| Predictions (main) | Live countdown, upcoming match probability cards, value-bet alerts, Elo rankings chart, group standings |
| Match Hub | Season selector, full schedule table, group standings, and a visual knockout-bracket diagram |
| Pre-Match Analysis | Win/Draw/Loss gauge charts, head-to-head record, form guide, expected-goals scoreline distribution |
| Odds Comparison | Multi-bookmaker odds (live via API-Football), implied probability charts, bookmaker overround analysis, value-bet threshold slider |
| Tournament Simulator | Monte Carlo simulation (up to 25 000 runs) of the full 48-team / 12-group 2026 format; confederation filter; results as bar chart, table, or heatmap |
| Statistics | Scoring leaders, team stats, xG analysis, historical WC trends (goals/match, result distributions, all-time wins) |
| Team Deep Dive | Nation profile with Elo gauge, WC match history, playing-style radar, current squad, and head-to-head record vs. any opponent |
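The Tournament Simulator row above refers to Monte Carlo simulation of the group stage. As a rough illustration only, the sketch below simulates a single four-team round-robin group using the standard library. The team names, the per-team expected-goals values, the `simulate_group` and `sample_poisson` helpers, and the simplified tie-break (points, then goal difference, then a random draw) are all assumptions made for this example, not the app's actual code. It counts top-two finishes only and ignores qualification via best third-placed teams.

```python
import math
import random
from collections import Counter
from itertools import combinations


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_group(attack: dict, runs: int = 5_000, seed: int = 42) -> dict:
    """Estimate each team's probability of a top-two finish in one group.

    `attack` maps team name -> HYPOTHETICAL expected goals per match.
    Goals are opponent-independent here, which is a deliberate simplification.
    """
    rng = random.Random(seed)
    advanced = Counter()
    for _ in range(runs):
        points = dict.fromkeys(attack, 0)
        goal_diff = dict.fromkeys(attack, 0)
        # Every pair of teams meets exactly once.
        for team_a, team_b in combinations(attack, 2):
            goals_a = sample_poisson(attack[team_a], rng)
            goals_b = sample_poisson(attack[team_b], rng)
            goal_diff[team_a] += goals_a - goals_b
            goal_diff[team_b] += goals_b - goals_a
            if goals_a > goals_b:
                points[team_a] += 3
            elif goals_b > goals_a:
                points[team_b] += 3
            else:
                points[team_a] += 1
                points[team_b] += 1
        # Rank by points, then goal difference; remaining ties break at random.
        ranking = sorted(
            attack,
            key=lambda t: (points[t], goal_diff[t], rng.random()),
            reverse=True,
        )
        advanced.update(ranking[:2])
    return {team: advanced[team] / runs for team in attack}


# Illustrative inputs only; these are not real ratings.
group = {"Team A": 2.0, "Team B": 1.4, "Team C": 1.0, "Team D": 0.6}
top_two_probability = simulate_group(group, runs=2_000)
```

Because exactly two teams advance in every run, the returned probabilities sum to 2. Fixing the seed keeps repeated runs reproducible.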
| Source | Tournaments | Notes |
|---|---|---|
| Openfootball (GitHub JSON) | 2010, 2014 | Free, no key required |
| BALLDONTLIE FIFA API | 2018, 2022, 2026 | Requires API key; cursor-paginated |
| API-Football v3 | 2026 live | Fixtures, standings, odds, H2H; 100 calls/day free tier |
Data is cached aggressively (24 h for historical, 5 min for live fixtures) to stay within free-tier limits.
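The two cache lifetimes above can be illustrated with a small time-to-live cache. This is a standard-library sketch, not the app's caching layer (which may well rely on Streamlit's own caching instead). The `ttl_cache` decorator and the `fetch_live_fixtures` stand-in are hypothetical names introduced for this example.

```python
import time
from functools import wraps

HISTORICAL_TTL_S = 24 * 60 * 60  # 24 h, as stated above
LIVE_TTL_S = 5 * 60              # 5 min, as stated above


def ttl_cache(ttl_seconds: float, clock=time.monotonic):
    """Cache results per positional-argument tuple for `ttl_seconds`."""
    def decorator(func):
        store = {}  # args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the underlying call
            value = func(*args)
            store[args] = (now + ttl_seconds, value)
            return value

        return wrapper
    return decorator


calls = {"live": 0}


@ttl_cache(LIVE_TTL_S)
def fetch_live_fixtures(season: int) -> dict:
    """Stand-in for a real API request; counts calls instead of using the network."""
    calls["live"] += 1
    return {"season": season, "fixtures": []}


fetch_live_fixtures(2026)
fetch_live_fixtures(2026)  # served from the cache, so no second "API call"
```

Passing the clock as a parameter makes expiry testable without sleeping.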
This repo includes a GitHub Actions workflow that pulls fresh 2026 World Cup data snapshots on a nightly basis.
- Workflow: .github/workflows/nightly-world-cup-data-refresh.yml
- Snapshot script: scripts/pull_world_cup_data.py
- Output directory: data_files/nightly_snapshots
- Consumption note: docs/worldcup-snapshot-consumption.md
The workflow is scheduled for 03:00 UTC every day, but only performs the snapshot job during the tournament support window:
- Start: 2026-06-04 UTC (one week before kickoff)
- End: 2026-07-26 UTC (one week after the final)
This keeps pre-tournament and post-tournament data pulls available without running year-round.
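The gating described above amounts to a date-range check at the start of the job. A minimal sketch follows, assuming both bounds are inclusive; `in_support_window` is a hypothetical helper name, and the real workflow may implement the check differently (for example, in a shell step).

```python
from datetime import date, datetime, timezone
from typing import Optional

WINDOW_START = date(2026, 6, 4)   # one week before kickoff
WINDOW_END = date(2026, 7, 26)    # one week after the final


def in_support_window(now: Optional[datetime] = None) -> bool:
    """Return True when the UTC date of `now` falls inside the snapshot window.

    ASSUMPTION: both the start and end dates are treated as inclusive.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    today = now.astimezone(timezone.utc).date()
    return WINDOW_START <= today <= WINDOW_END


# Example: the scheduled 03:00 UTC run on the first day of the window.
first_run = datetime(2026, 6, 4, 3, 0, tzinfo=timezone.utc)
should_snapshot = in_support_window(first_run)
```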
Add these repository secrets in GitHub before enabling the workflow:
- BALLDONTLIE_API_KEY
- API_FOOTBALL_KEY
- THE_SPORTS_DB_KEY
Each nightly run writes a dated folder under data_files/nightly_snapshots/YYYY-MM-DD/ containing:
- matches_all.csv
- upcoming_matches.csv
- standings.csv
- top_scorers.csv
- manifest.json
You can also trigger the workflow manually with workflow_dispatch from the GitHub Actions UI.
GoallineIQ uses a two-component approach (Elo ratings plus a Poisson goal model), described in the project docs, and derives outcome probabilities and value bets from it:
- EloRatingSystem: trains on all World Cup results from 2010 to 2026, applying a K-factor of 32 and a home advantage of +40 Elo points.
- PoissonModel: converts the Elo differential between two teams into expected goals (λ_home, λ_away) using a logistic scaling function.
- Win / Draw / Loss probabilities: computed by summing the bivariate Poisson scoreline distribution over every scoreline up to the cap of 9 goals per side.
- Value-bet detection: compares model-implied probabilities against the probabilities implied by bookmaker odds to flag edges above a configurable threshold.
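The steps above can be sketched end to end with the standard library. The Elo expected-score and update formulas are the standard ones, and the K-factor, home advantage, and score cap come from the text. Everything else is an assumption for illustration: the source does not specify the logistic scaling, so `expected_goals` simply splits a hypothetical total of 2.6 goals by the Elo expected score, and the scoreline grid uses independent Poisson marginals rather than any correlated bivariate form. Function names do not mirror the project's classes.

```python
import math

K_FACTOR = 32          # from the text above
HOME_ADVANTAGE = 40    # Elo points, from the text above
MAX_GOALS = 9          # score cap per side, from the text above
TOTAL_GOALS = 2.6      # ASSUMPTION: illustrative average goals per match


def expected_score(elo_home: float, elo_away: float) -> float:
    """Standard Elo expected score for the home side, home advantage included."""
    diff = elo_home + HOME_ADVANTAGE - elo_away
    return 1.0 / (1.0 + 10 ** (-diff / 400))


def update_elo(elo_home: float, elo_away: float, home_result: float):
    """One Elo update; home_result is 1.0 (win), 0.5 (draw) or 0.0 (loss)."""
    delta = K_FACTOR * (home_result - expected_score(elo_home, elo_away))
    return elo_home + delta, elo_away - delta


def expected_goals(elo_home: float, elo_away: float):
    """ASSUMED logistic scaling: split TOTAL_GOALS by the Elo expected score."""
    share = expected_score(elo_home, elo_away)
    return TOTAL_GOALS * share, TOTAL_GOALS * (1.0 - share)


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home: float, lam_away: float):
    """Sum the capped scoreline grid into (home win, draw, away win)."""
    home = draw = away = 0.0
    for h in range(MAX_GOALS + 1):
        for a in range(MAX_GOALS + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away  # renormalise the mass lost to the score cap
    return home / total, draw / total, away / total


def value_edge(model_prob: float, decimal_odds: float) -> float:
    """Model probability minus the bookmaker's raw implied probability."""
    return model_prob - 1.0 / decimal_odds


# Illustrative ratings, odds, and threshold; none of these are real data.
lam_h, lam_a = expected_goals(1900, 1800)
p_home, p_draw, p_away = outcome_probabilities(lam_h, lam_a)
edge = value_edge(p_home, decimal_odds=2.10)
is_value_bet = edge > 0.05
```

Note that `1 / decimal_odds` still contains the bookmaker's margin, so this edge is conservative; removing the overround first would be a reasonable refinement.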
The docs recommend adding XGBoost, an ensemble model, and live in-match features in later iterations. These layers are scaffolded but not yet trained.
- Odds API / OddsPapi not used: no keys for these services were found in .env. Multi-bookmaker odds are sourced from API-Football; when the API is unavailable, illustrative demo odds are generated for UI demonstration.
- XGBoost (Layer 2) not trained: the project docs recommend starting with the Elo + Poisson baseline and adding model layers incrementally. XGBoost is referenced in the architecture but not yet fit to data.
- 2026 group draw is approximate: the 12 groups (A–L) and their team assignments are hard-coded as a fallback in utils/models.py (WC2026_GROUPS). Live draw data from BALLDONTLIE will override this when available.
- BALLDONTLIE endpoint structure: the FIFA-specific base URL (https://fifa.balldontlie.io/api/v1) and Authorization: <key> header format are assumed from the provider's standard REST patterns, as full endpoint docs were not public at build time.
- TheSportsDB: free public key 123 is configured. Team logos from TheSportsDB are not currently rendered; the integration point exists in api_client.py for a future enhancement.
- Playing-style radar: trait scores (Attacking, Defensive, Elo Strength, Tournament Experience, Consistency) are derived from historical WC match data and current Elo ratings. Live club metrics (e.g., from Understat) are not yet integrated.
- Pre-tournament state: at build time (May 2026) the tournament had not yet started (kick-off June 11, 2026). All pages handle this gracefully with countdown timers and placeholder messages.
| Package | Version | Purpose |
|---|---|---|
| streamlit | ≥ 1.35 | App framework |
| pandas | ≥ 2.0 | Data wrangling |
| numpy | ≥ 1.24 | Numerics |
| scipy | ≥ 1.11 | Poisson distributions |
| xgboost | ≥ 2.0 | ML scaffold (Layer 2) |
| scikit-learn | ≥ 1.3 | Preprocessing utilities |
| plotly | ≥ 5.18 | Interactive charts |
| requests | ≥ 2.31 | API calls |
| python-dotenv | ≥ 1.0 | .env key loading |
| Pillow | ≥ 10.0 | Logo rendering |
This application is for entertainment and research purposes only. Predictions and value-bet alerts do not constitute financial advice. Please gamble responsibly. If you need support, contact the National Problem Gambling Helpline: 1-800-522-4700. Must be 21+ in applicable jurisdictions.
