## 📈 Predicting Premier League Final Positions Using Betting Odds, Probabilistic Modelling & Simulation

**Competition:** English Premier League 2025/26  
**Purpose:** Estimate probabilities of final league positions using betting market information and simulation  
**Methods:** Odds-implied probabilities, Monte Carlo simulation, scenario analysis  
**Author:** [Victoria Friss de Kereki](https://www.linkedin.com/in/victoria-friss-de-kereki/)  
**Medium Articles:**  
[Predicting Premier League Final Positions Using Betting Odds, Probabilistic Modelling & Simulation](https://medium.com/p/2720ec335c3c)  
[Building a Probabilistic Premier League Simulator in Python](https://medium.com/@vickyfrissdekereki/building-a-probabilistic-premier-league-simulator-in-python-34b5248f81b9)

---

**Notebook first written:** `17/01/2026`  
**Last updated:** `27/01/2026`  

> This notebook develops a probabilistic framework to predict final Premier League positions using betting odds as market-based expectations.
>
> Betting odds are transformed into implied probabilities and adjusted for bookmaker margin. These probabilities are then used to simulate the remainder of the season via Monte Carlo methods, generating distributions over final points totals and league positions.
>
> The analysis focuses on estimating the likelihood of key outcomes such as title wins, top-four finishes, relegation, and mid-table placements. Results are presented at team level with uncertainty intervals, and the framework can be extended to incorporate form, fixture difficulty, or alternative predictive inputs beyond betting markets.
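
As a rough illustration of the simulation step described above, here is a minimal sketch of a Monte Carlo season simulator. The function name `simulate_final_positions`, the team labels, the points and the fixture probabilities are all hypothetical, and ties on points are broken at random rather than by goal difference, so treat it as a sketch of the idea rather than the notebook's actual implementation.

```python
import numpy as np

def simulate_final_positions(current_points, fixtures, n_sims=5_000, seed=0):
    """Return (teams, probs) where probs[i, k] = P(team i finishes in position k + 1)."""
    rng = np.random.default_rng(seed)
    teams = list(current_points)
    idx = {team: i for i, team in enumerate(teams)}
    base = np.array([current_points[t] for t in teams], dtype=float)
    counts = np.zeros((len(teams), len(teams)))

    for _ in range(n_sims):
        pts = base.copy()
        for home, away, p_home, p_draw, p_away in fixtures:
            outcome = rng.choice(3, p=[p_home, p_draw, p_away])
            if outcome == 0:        # home win
                pts[idx[home]] += 3
            elif outcome == 1:      # draw
                pts[idx[home]] += 1
                pts[idx[away]] += 1
            else:                   # away win
                pts[idx[away]] += 3
        # Assumption: ties on points are broken at random (tiny jitter),
        # not by goal difference as in the real league table.
        order = np.argsort(-(pts + rng.random(len(teams)) * 1e-6))
        counts[order, np.arange(len(teams))] += 1

    return teams, counts / n_sims

# Hypothetical three-team league with two remaining fixtures
teams, probs = simulate_final_positions(
    {"A": 10, "B": 9, "C": 4},
    [("A", "B", 0.5, 0.3, 0.2), ("C", "A", 0.2, 0.3, 0.5)],
)
print(teams)
print(probs.round(3))
```

Each row of `probs` is one team's distribution over final positions, so rows and columns both sum to 1. In this toy league team C can reach at most 7 points, so it finishes last in every simulation.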


<div style="text-align: left;">
    <img src="Images and others/Predicting Premier League Final Positions Using Betting Odds, Probabilistic Modelling & Simulation.png" alt="Predicting Premier League Final Positions Using Betting Odds, Probabilistic Modelling & Simulation" width="600">
</div>

In [21]:
# Core
from datetime import datetime, timedelta
import os

# Data manipulation
import numpy as np
import pandas as pd

# APIs & environment
import requests
from dotenv import load_dotenv

# Statistics
from scipy.stats import poisson

# Visualisation
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# Nicer printing of tables, no wrapping
pd.set_option("display.width", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.expand_frame_repr", False)

## 1. Premier League Final Standings (ESPN Scraping)
##### Using the ESPN scraper I built in my previous project.

## <span style="background-color: yellow;">FIRST CHANGE: Loop to run through the 5 URLs and generate 5 datasets (1 per league).</span>



In [4]:
import pandas as pd

year = 2025  # current season start year

leagues = {
    "ENG.1": "premierleague_england",
    "ITA.1": "seriea_italy",
    "ESP.1": "laliga_spain",
    "GER.1": "bundesliga_germany",
    "FRA.1": "ligue1_france",
}

for league_code, df_name in leagues.items():
    url = f"https://www.espn.com/soccer/standings/_/league/{league_code}/season/{year}"
    tables = pd.read_html(url)

    teams_raw = tables[0]
    stats = tables[1]

    teams = pd.DataFrame()
    teams["position"] = teams_raw.iloc[:, 0].str.extract(r"^(\d+)").astype(int)
    teams["team"] = (
        teams_raw.iloc[:, 0]
        .str.replace(r"^\d+", "", regex=True)
        .str.replace(r"^[A-Z]{2,3}", "", regex=True)
        .str.strip()
    )

    stats.columns = ["gp", "w", "d", "l", "gf", "ga", "gd", "pts"]
    stats = stats.apply(
        lambda c: c.astype(str)
                  .str.replace("+", "", regex=False)
                  .astype(int)
    )

    globals()[df_name] = pd.concat([teams, stats], axis=1)

In [22]:
print("\nPremier League (England)")
print(premierleague_england.head(3))

print("\nSerie A (Italy)")
print(seriea_italy.head(3))

print("\nLa Liga (Spain)")
print(laliga_spain.head(3))

print("\nBundesliga (Germany)")
print(bundesliga_germany.head(3))

print("\nLigue 1 (France)")
print(ligue1_france.head(3))


Premier League (England)
   position             team  gp   w  d  l  gf  ga  gd  pts
0         1          Arsenal  25  17  5  3  49  17  32   56
1         2  Manchester City  24  14  5  5  49  23  26   47
2         3      Aston Villa  25  14  5  6  36  27   9   47

Serie A (Italy)
   position            team  gp   w  d  l  gf  ga  gd  pts
0         1  Internazionale  23  18  1  4  52  19  33   55
1         2        AC Milan  23  14  8  1  38  17  21   50
2         3          Napoli  24  15  4  5  36  23  13   49

La Liga (Spain)
   position             team  gp   w  d  l  gf  ga  gd  pts
0         1        Barcelona  23  19  1  3  63  23  40   58
1         2      Real Madrid  22  17  3  2  47  18  29   54
2         3  Atlético Madrid  22  13  6  3  38  17  21   45

Bundesliga (Germany)
   position               team  gp   w  d  l  gf  ga  gd  pts
0         1      Bayern Munich  20  16  3  1  74  18  56   51
1         2  Borussia Dortmund  21  14  6  1  43  20  23   48
2         3    

## 2. Get betting odds using API

In [8]:
# Load variables from API_KEY.env
load_dotenv("API_KEY.env")

API_KEY = os.getenv("ODDS_DATA_API_KEY")

if API_KEY is None:
    raise ValueError("ODDS_DATA_API_KEY not found. Check API_KEY.env")

print("API key loaded successfully")

API key loaded successfully


In [9]:
# `requests` and API_KEY are already available from the cells above

leagues = {
    "soccer_epl": "odds_premierleague_england",
    "soccer_italy_serie_a": "odds_seriea_italy",
    "soccer_spain_la_liga": "odds_laliga_spain",
    "soccer_germany_bundesliga": "odds_bundesliga_germany",
    "soccer_france_ligue_one": "odds_ligue1_france",
}

base_url = "https://api.the-odds-api.com/v4/sports/{}/odds"

params = {
    "apiKey": API_KEY,
    "regions": "uk",
    "markets": "h2h",
    "oddsFormat": "decimal",
    "dateFormat": "iso",
    "days": 365
}

for sport_key, var_name in leagues.items():
    url = base_url.format(sport_key)

    response = requests.get(url, params=params)
    response.raise_for_status()

    globals()[var_name] = response.json()



In [10]:
print("Premier League (England):", len(odds_premierleague_england))
print("Serie A (Italy):", len(odds_seriea_italy))
print("La Liga (Spain):", len(odds_laliga_spain))
print("Bundesliga (Germany):", len(odds_bundesliga_germany))
print("Ligue 1 (France):", len(odds_ligue1_france))

Premier League (England): 22
Serie A (Italy): 17
La Liga (Spain): 17
Bundesliga (Germany): 11
Ligue 1 (France): 20


In [11]:
def flatten_odds(data):
    rows = []

    for match in data:
        match_id = match["id"]
        home = match["home_team"]
        away = match["away_team"]
        time = match["commence_time"]

        for book in match["bookmakers"]:
            bookmaker = book["title"]

            # Find head-to-head (h2h) market. Find the market where key == 'h2h' (win/draw/win odds). If not found, skip this bookmaker.
            h2h = next((m for m in book["markets"] if m["key"] == "h2h"), None)
            if not h2h:
                continue

            outcomes = {o["name"]: o["price"] for o in h2h["outcomes"]}

            rows.append({
                "match_id": match_id,
                "commence_time": time,
                "home_team": home,
                "away_team": away,
                "bookmaker": bookmaker,
                "home_odds": outcomes.get(home),
                "draw_odds": outcomes.get("Draw"),
                "away_odds": outcomes.get(away),
            })

    return pd.DataFrame(rows)

In [12]:
# Flatten odds into DataFrames
df_premierleague_england = flatten_odds(odds_premierleague_england)
df_seriea_italy = flatten_odds(odds_seriea_italy)
df_laliga_spain = flatten_odds(odds_laliga_spain)
df_bundesliga_germany = flatten_odds(odds_bundesliga_germany)
df_ligue1_france = flatten_odds(odds_ligue1_france)

In [23]:
print("\nPremier League (England)")
print(df_premierleague_england.head(3))

print("\nSerie A (Italy)")
print(df_seriea_italy.head(3))

print("\nLa Liga (Spain)")
print(df_laliga_spain.head(3))

print("\nBundesliga (Germany)")
print(df_bundesliga_germany.head(3))

print("\nLigue 1 (France)")
print(df_ligue1_france.head(3))


Premier League (England)
                           match_id         commence_time                 home_team       away_team    bookmaker  home_odds  draw_odds  away_odds
0  a7f9683fe58c4fc6a5ac52396f279456  2026-02-08T14:00:00Z  Brighton and Hove Albion  Crystal Palace  Unibet (UK)       2.00        3.6       3.75
1  a7f9683fe58c4fc6a5ac52396f279456  2026-02-08T14:00:00Z  Brighton and Hove Albion  Crystal Palace      Sky Bet       1.95        3.5       3.75
2  a7f9683fe58c4fc6a5ac52396f279456  2026-02-08T14:00:00Z  Brighton and Hove Albion  Crystal Palace  Paddy Power       1.95        3.4       3.75

Serie A (Italy)
                           match_id         commence_time home_team away_team     bookmaker  home_odds  draw_odds  away_odds
0  14cea9dda59eac8cbb063c8d777171bd  2026-02-08T11:30:00Z   Bologna     Parma  William Hill       1.65        3.7        5.0
1  14cea9dda59eac8cbb063c8d777171bd  2026-02-08T11:30:00Z   Bologna     Parma      888sport       1.61        3.7        5.

In [14]:
def bookmaker_implied_probs(df):
    # Convert odds to implied probabilities per bookmaker
    df = df.assign(
        p_home_raw=1 / df["home_odds"],
        p_draw_raw=1 / df["draw_odds"],
        p_away_raw=1 / df["away_odds"],
    )

    # Remove bookmaker margin (normalise)
    total = (
        df["p_home_raw"] +
        df["p_draw_raw"] +
        df["p_away_raw"]
    )

    df = df.assign(
        p_home_book=df["p_home_raw"] / total,
        p_draw_book=df["p_draw_raw"] / total,
        p_away_book=df["p_away_raw"] / total,
    )

    # Average normalised probabilities across bookmakers
    betting_odds_avg = (
        df.groupby(["match_id", "home_team", "away_team"], as_index=False)
          .agg(
              p_home_book=("p_home_book", "mean"),
              p_draw_book=("p_draw_book", "mean"),
              p_away_book=("p_away_book", "mean"),
          )
    )

    # Keep only required fields
    betting_odds_avg = betting_odds_avg[
        [
            "home_team",
            "away_team",
            "p_home_book",
            "p_draw_book",
            "p_away_book",
        ]
    ]

    return betting_odds_avg
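
To make the margin-removal step concrete, here is a small worked example using the first bookmaker row from the Premier League output above (home 2.00, draw 3.6, away 3.75). It uses the same proportional normalisation as `bookmaker_implied_probs`; other ways of removing the margin exist, and this sketch does not compare them.

```python
# Decimal odds for a single bookmaker: home win, draw, away win
home_odds, draw_odds, away_odds = 2.00, 3.60, 3.75

# Raw implied probabilities: these sum to more than 1 because of the margin
raw = [1 / home_odds, 1 / draw_odds, 1 / away_odds]
overround = sum(raw)

# Proportional normalisation: divide each raw probability by the total
fair = [p / overround for p in raw]

print(f"overround: {overround:.4f}")
print("normalised probabilities:", [round(p, 4) for p in fair])
```

Here the raw probabilities add up to about 1.044, i.e. a margin of roughly 4.4%, and the normalised values are about 0.479, 0.266 and 0.255.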

In [15]:
betting_odds_premierleague_england = bookmaker_implied_probs(df_premierleague_england)
betting_odds_seriea_italy = bookmaker_implied_probs(df_seriea_italy)
betting_odds_laliga_spain = bookmaker_implied_probs(df_laliga_spain)
betting_odds_bundesliga_germany = bookmaker_implied_probs(df_bundesliga_germany)
betting_odds_ligue1_france = bookmaker_implied_probs(df_ligue1_france)

In [24]:
print("\nPremier League (England)")
print(betting_odds_premierleague_england.head(3))

print("\nSerie A (Italy)")
print(betting_odds_seriea_italy.head(3))

print("\nLa Liga (Spain)")
print(betting_odds_laliga_spain.head(3))

print("\nBundesliga (Germany)")
print(betting_odds_bundesliga_germany.head(3))

print("\nLigue 1 (France)")
print(betting_odds_ligue1_france.head(3))


Premier League (England)
        home_team                away_team  p_home_book  p_draw_book  p_away_book
0  Crystal Palace                  Burnley     0.580818     0.248314     0.170867
1  Crystal Palace  Wolverhampton Wanderers     0.509145     0.275321     0.215534
2     Aston Villa             Leeds United     0.517140     0.260411     0.222449

Serie A (Italy)
     home_team away_team  p_home_book  p_draw_book  p_away_book
0      Bologna     Parma     0.567562     0.252250     0.180187
1  Inter Milan  Juventus     0.449615     0.283706     0.266680
2         Pisa  AC Milan     0.154256     0.237874     0.607870

La Liga (Spain)
     home_team      away_team  p_home_book  p_draw_book  p_away_book
0       Getafe     Villarreal     0.327233     0.311560     0.361207
1   Villarreal       Espanyol     0.559892     0.237987     0.202121
2  Real Madrid  Real Sociedad     0.691161     0.180069     0.128771

Bundesliga (Germany)
       home_team       away_team  p_home_book  p_draw_book

## 3. Get fixtures for upcoming games (all five leagues)

In [25]:
# Load variables from API_KEY.env
load_dotenv("API_KEY.env")

API_KEY = os.getenv("FOOTBALL_DATA_API_KEY")

if API_KEY is None:
    raise ValueError("FOOTBALL_DATA_API_KEY not found. Check API_KEY.env")

print("API key loaded successfully")

API key loaded successfully


In [28]:
competitions = {
    "PL": "fixtures_premierleague_england",
    "SA": "fixtures_seriea_italy",
    "PD": "fixtures_laliga_spain",
    "BL1": "fixtures_bundesliga_germany",
    "FL1": "fixtures_ligue1_france",
}

headers = {
    "X-Auth-Token": API_KEY
}

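from datetime import timezone

# datetime.utcnow() is deprecated (Python 3.12+); take today's date in UTC explicitly
today = datetime.now(timezone.utc).date()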
end_of_season = today + timedelta(days=365)

params = {
    "status": "SCHEDULED",
    "dateFrom": today.isoformat(),
    "dateTo": end_of_season.isoformat()
}

for comp_code, df_name in competitions.items():
    url = f"https://api.football-data.org/v4/competitions/{comp_code}/matches"

    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()

    data = response.json()
    fixtures = data["matches"]

    df_fixtures = pd.DataFrame(fixtures)

    df_fixtures_clean = df_fixtures[
        ["utcDate", "status", "homeTeam", "awayTeam"]
    ].copy()  # copy avoids SettingWithCopyWarning

    # Extract team names
    df_fixtures_clean["homeTeam"] = df_fixtures_clean["homeTeam"].apply(lambda x: x["name"])
    df_fixtures_clean["awayTeam"] = df_fixtures_clean["awayTeam"].apply(lambda x: x["name"])

    globals()[df_name] = df_fixtures_clean



In [29]:
print("Premier League (England):", len(fixtures_premierleague_england))
print("Serie A (Italy):", len(fixtures_seriea_italy))
print("La Liga (Spain):", len(fixtures_laliga_spain))
print("Bundesliga (Germany):", len(fixtures_bundesliga_germany))
print("Ligue 1 (France):", len(fixtures_ligue1_france))

Premier League (England): 131
Serie A (Italy): 147
La Liga (Spain): 156
Bundesliga (Germany): 120
Ligue 1 (France): 122


In [31]:
print("Premier League (England):", fixtures_premierleague_england.head(3))
print("Serie A (Italy):", fixtures_seriea_italy.head(3))
print("La Liga (Spain):", fixtures_laliga_spain.head(3))
print("Bundesliga (Germany):", fixtures_bundesliga_germany.head(3))
print("Ligue 1 (France):", fixtures_ligue1_france.head(3))

Premier League (England):                 utcDate status                   homeTeam            awayTeam
0  2026-02-08T14:00:00Z  TIMED  Brighton & Hove Albion FC   Crystal Palace FC
1  2026-02-08T16:30:00Z  TIMED               Liverpool FC  Manchester City FC
2  2026-02-10T19:30:00Z  TIMED                 Chelsea FC     Leeds United FC
Serie A (Italy):                 utcDate status            homeTeam                  awayTeam
0  2026-02-08T11:30:00Z  TIMED     Bologna FC 1909         Parma Calcio 1913
1  2026-02-08T14:00:00Z  TIMED            US Lecce            Udinese Calcio
2  2026-02-08T17:00:00Z  TIMED  US Sassuolo Calcio  FC Internazionale Milano
La Liga (Spain):                 utcDate status                 homeTeam             awayTeam
0  2026-02-08T13:00:00Z  TIMED         Deportivo Alavés            Getafe CF
1  2026-02-08T15:15:00Z  TIMED            Athletic Club           Levante UD
2  2026-02-08T17:30:00Z  TIMED  Club Atlético de Madrid  Real Betis Balompié
Bundeslig

## 4. Get this season (2025/26) and last season (2024/25) results

In [33]:
competitions = {
    "PL": "premierleague_england",
    "SA": "seriea_italy",
    "PD": "laliga_spain",
    "BL1": "bundesliga_germany",
    "FL1": "ligue1_france",
}

seasons = [2025, 2024]  # season start years: 2025/26 (in progress) and 2024/25 (completed)

headers = {
    "X-Auth-Token": API_KEY
}

for comp_code, league_name in competitions.items():
    for season in seasons:
        url = f"https://api.football-data.org/v4/competitions/{comp_code}/matches"
        params = {
            "season": season,
            "status": "FINISHED"
        }

        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()

        matches = response.json()["matches"]

        clean_rows = []
        for m in matches:
            clean_rows.append({
                "utcDate": m["utcDate"],
                "matchday": m["matchday"],
                "status": m["status"],
                "homeTeam": m["homeTeam"]["name"],
                "awayTeam": m["awayTeam"]["name"],
                "homeGoals": m["score"]["fullTime"]["home"],
                "awayGoals": m["score"]["fullTime"]["away"],
                "winner": m["score"]["winner"],
            })

        df_clean = pd.DataFrame(clean_rows)

        globals()[f"past_matches_{league_name}_{season}_clean"] = df_clean

In [36]:
for league in [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]:
    for season in [2025, 2024]:
        df = globals()[f"past_matches_{league}_{season}_clean"]
        print(f"\n{league.replace('_', ' ').title()} – Season {season}")
        print(df.tail(2))


Premierleague England – Season 2025
                  utcDate  matchday    status                    homeTeam      awayTeam  homeGoals  awayGoals     winner
246  2026-02-07T15:00:00Z        25  FINISHED  Wolverhampton Wanderers FC    Chelsea FC          1          3  AWAY_TEAM
247  2026-02-07T17:30:00Z        25  FINISHED         Newcastle United FC  Brentford FC          2          3  AWAY_TEAM

Premierleague England – Season 2024
                  utcDate  matchday    status                    homeTeam                   awayTeam  homeGoals  awayGoals     winner
378  2025-05-25T15:00:00Z        38  FINISHED        Tottenham Hotspur FC  Brighton & Hove Albion FC          1          4  AWAY_TEAM
379  2025-05-25T15:00:00Z        38  FINISHED  Wolverhampton Wanderers FC               Brentford FC          1          1       DRAW

Seriea Italy – Season 2025
                  utcDate  matchday    status        homeTeam    awayTeam  homeGoals  awayGoals     winner
231  2026-02-07T17:0

## 5. Combine and calculate probabilities of W/D/L for each match

In [37]:
leagues = [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]

for league in leagues:
    # Load DataFrames
    df_current = globals()[f"past_matches_{league}_2025_clean"]
    df_prev = globals()[f"past_matches_{league}_2024_clean"]
    df_future = globals()[f"fixtures_{league}"]

    # Combine all past fixtures together
    df_all = pd.concat([df_prev, df_current], ignore_index=True)

    # Store results
    globals()[f"past_matches_{league}_all"] = df_all
    globals()[f"future_matches_{league}"] = df_future

In [38]:
leagues = [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]

for league in leagues:
    df_all = globals()[f"past_matches_{league}_all"].copy()

    # Convert date
    df_all["date"] = pd.to_datetime(df_all["utcDate"])

    # Sort so newer matches get higher weight
    df_all = df_all.sort_values("date").reset_index(drop=True)

    # Add linear weights (oldest → newest)
    df_all["weight"] = np.linspace(1, 2, len(df_all))

    # Store weighted dataset
    globals()[f"past_matches_{league}_weighted"] = df_all
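
To show what the linear weights do, here is a toy example for a single team. The goal counts are made up for illustration; only the `np.linspace(1, 2, n)` weighting mirrors the cell above.

```python
import numpy as np
import pandas as pd

# Toy goals scored by one hypothetical team, oldest match first (not real data)
goals = pd.Series([0, 1, 1, 3, 2])

# Same scheme as above: the oldest match gets weight 1, the newest gets weight 2
weights = np.linspace(1, 2, len(goals))

unweighted = goals.mean()
weighted = (goals * weights).sum() / weights.sum()

print(f"unweighted goals per match: {unweighted:.2f}")
print(f"weighted goals per match:   {weighted:.2f}")
```

Because the higher-scoring matches are the more recent ones, the weighted average (1.60) is above the plain average (1.40). With this scheme the newest match counts exactly twice as much as the oldest one.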

In [40]:
for league in [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]:
    df = globals()[f"past_matches_{league}_weighted"]
    print(f"\n{league.replace('_', ' ').title()} – weighted past matches")
    print(df.tail(2))


Premierleague England – weighted past matches
                  utcDate  matchday    status                    homeTeam      awayTeam  homeGoals  awayGoals     winner                      date    weight
626  2026-02-07T15:00:00Z        25  FINISHED  Wolverhampton Wanderers FC    Chelsea FC          1          3  AWAY_TEAM 2026-02-07 15:00:00+00:00  1.998405
627  2026-02-07T17:30:00Z        25  FINISHED         Newcastle United FC  Brentford FC          2          3  AWAY_TEAM 2026-02-07 17:30:00+00:00  2.000000

Seriea Italy – weighted past matches
                  utcDate  matchday    status        homeTeam    awayTeam  homeGoals  awayGoals     winner                      date    weight
611  2026-02-07T17:00:00Z        24  FINISHED       Genoa CFC  SSC Napoli          2          3  AWAY_TEAM 2026-02-07 17:00:00+00:00  1.998366
612  2026-02-07T19:45:00Z        24  FINISHED  ACF Fiorentina   Torino FC          2          2       DRAW 2026-02-07 19:45:00+00:00  2.000000

Laliga Spa

In [41]:
home_advantage_by_league = {}

for league in [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]:
    df = globals()[f"past_matches_{league}_weighted"]

    home_advantage = df["homeGoals"].mean() - df["awayGoals"].mean()
    home_advantage_by_league[league] = home_advantage

    print(
        f"{league.replace('_', ' ').title()}: "
        f"{home_advantage:.3f}"
    )

Premierleague England: 0.186
Seriea Italy: 0.124
Laliga Spain: 0.332
Bundesliga Germany: 0.181
Ligue1 France: 0.317


In [49]:
leagues = [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]

for league in leagues:
    df_all = globals()[f"past_matches_{league}_weighted"]

    # All teams in the league
    teams = pd.unique(df_all[["homeTeam", "awayTeam"]].values.ravel("K"))

    attack = pd.Series(1.0, index=teams)
    defense = pd.Series(1.0, index=teams)

    team_stats = {}

    for team in teams:
        home_games = df_all[df_all["homeTeam"] == team]
        away_games = df_all[df_all["awayTeam"] == team]

        goals_scored = (
            (home_games["homeGoals"] * home_games["weight"]).sum() +
            (away_games["awayGoals"] * away_games["weight"]).sum()
        )

        goals_against = (
            (home_games["awayGoals"] * home_games["weight"]).sum() +
            (away_games["homeGoals"] * away_games["weight"]).sum()
        )

        matches = home_games["weight"].sum() + away_games["weight"].sum()

        team_stats[team] = {
            "scored": goals_scored / matches,
            "against": goals_against / matches
        }

    # League average goals per team per match
    league_avg_scored = (
        df_all["homeGoals"].mean() + df_all["awayGoals"].mean()
    ) / 2

    for team in teams:
        attack[team] = team_stats[team]["scored"] / league_avg_scored
        defense[team] = team_stats[team]["against"] / league_avg_scored

    # Store outputs
    globals()[f"attack_{league}"] = attack
    globals()[f"defense_{league}"] = defense
    globals()[f"league_avg_scored_{league}"] = league_avg_scored

🔥 Summary

This function:
+ Calculates expected goals for each team
+ Uses Poisson distribution to compute goal probabilities
+ Converts score probabilities into match outcome probabilities
+ Returns probabilities for:
  + home win
  + draw
  + away win

The Poisson distribution models the number of goals a team scores in a match based on an expected goal rate (λ). Using the formula \(P(X=k)=e^{-\lambda}\lambda^k/k!\), it calculates the probability of scoring 0, 1, 2, … goals, where λ is estimated from team attack/defense strengths and league averages. In the model, I compute separate Poisson probabilities for home and away goals, then combine them to get the probabilities of each possible scoreline and therefore the probabilities of a home win, draw, or away win.
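
As a compact sketch of the scoreline step just described, the snippet below builds the full score matrix with an outer product and sums its regions. The helper name `outcome_probs` and the two λ values are assumptions for illustration, not values taken from the fitted model.

```python
import numpy as np
from scipy.stats import poisson

def outcome_probs(lam_home, lam_away, max_goals=6):
    """Return (p_home_win, p_draw, p_away_win) assuming independent Poisson goals."""
    goals = np.arange(max_goals + 1)
    # score_matrix[i, j] = P(home scores i) * P(away scores j)
    score_matrix = np.outer(poisson.pmf(goals, lam_home), poisson.pmf(goals, lam_away))
    p_home = np.tril(score_matrix, k=-1).sum()  # below the diagonal: i > j
    p_draw = np.trace(score_matrix)             # on the diagonal: i == j
    p_away = np.triu(score_matrix, k=1).sum()   # above the diagonal: i < j
    return p_home, p_draw, p_away

# Hypothetical expected goals for the home and away side
p_home, p_draw, p_away = outcome_probs(1.6, 1.1)
print(p_home, p_draw, p_away, p_home + p_draw + p_away)
```

The three probabilities add up to slightly less than 1 because scorelines above `max_goals` are left out. Dividing each value by their sum would restore a proper distribution if that matters downstream.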


In [53]:
def extract_team_name(x):
    if isinstance(x, dict):
        return x.get("name")
    return x

for league in leagues:
    fixtures = globals()[f"fixtures_{league}"]

    df = pd.DataFrame(fixtures)[
        ["utcDate", "status", "homeTeam", "awayTeam"]
    ]

    df["homeTeam"] = df["homeTeam"].apply(extract_team_name)
    df["awayTeam"] = df["awayTeam"].apply(extract_team_name)

    globals()[f"df_fixtures_clean_{league}"] = df

for league in leagues:
    print(league)
    print(globals()[f"df_fixtures_clean_{league}"][["homeTeam", "awayTeam"]].head(2))


premierleague_england
                    homeTeam            awayTeam
0  Brighton & Hove Albion FC   Crystal Palace FC
1               Liverpool FC  Manchester City FC
seriea_italy
          homeTeam           awayTeam
0  Bologna FC 1909  Parma Calcio 1913
1         US Lecce     Udinese Calcio
laliga_spain
           homeTeam    awayTeam
0  Deportivo Alavés   Getafe CF
1     Athletic Club  Levante UD
bundesliga_germany
            homeTeam             awayTeam
0         1. FC Köln           RB Leipzig
1  FC Bayern München  TSG 1899 Hoffenheim
ligue1_france
      homeTeam              awayTeam
0     OGC Nice          AS Monaco FC
1  Le Havre AC  RC Strasbourg Alsace


In [43]:
# Calculate probabilities for each future match

def match_probabilities(home, away):
    # expected goals
    exp_home = np.exp(np.log(league_avg_scored) + np.log(attack[home]) + np.log(defense[away]) + home_advantage)
    exp_away = np.exp(np.log(league_avg_scored) + np.log(attack[away]) + np.log(defense[home]))

    # compute probabilities up to 6 goals (I tested this number, and 6 produces the smallest RMSE when compared with the bookmaker odds)
    max_goals = 6
    p_home = poisson.pmf(range(max_goals + 1), exp_home)
    p_away = poisson.pmf(range(max_goals + 1), exp_away)

    # result probabilities
    p_win = 0
    p_draw = 0
    p_loss = 0

    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            prob = p_home[i] * p_away[j]
            if i > j:
                p_win += prob
            elif i == j:
                p_draw += prob
            else:
                p_loss += prob

    return p_win, p_draw, p_loss

In [54]:
def make_match_probabilities(league):
    attack = globals()[f"attack_{league}"]
    defense = globals()[f"defense_{league}"]
    league_avg_scored = globals()[f"league_avg_scored_{league}"]
    home_advantage = home_advantage_by_league[league]  # stored per league in the dict built earlier

    def match_probabilities(home, away):
        exp_home = np.exp(
            np.log(league_avg_scored)
            + np.log(attack[home])
            + np.log(defense[away])
            + home_advantage
        )

        exp_away = np.exp(
            np.log(league_avg_scored)
            + np.log(attack[away])
            + np.log(defense[home])
        )

        max_goals = 6
        p_home = poisson.pmf(range(max_goals + 1), exp_home)
        p_away = poisson.pmf(range(max_goals + 1), exp_away)

        p_win = p_draw = p_loss = 0

        for i in range(max_goals + 1):
            for j in range(max_goals + 1):
                prob = p_home[i] * p_away[j]
                if i > j:
                    p_win += prob
                elif i == j:
                    p_draw += prob
                else:
                    p_loss += prob

        return p_win, p_draw, p_loss

    return match_probabilities
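
The expected-goals expression in `match_probabilities` is written in log space, which can hide what the home-advantage term does. The short check below shows it is equivalent to a plain product in which home advantage acts as the multiplier `exp(home_advantage)`. The league average and strength values are hypothetical; 0.186 is the Premier League home advantage printed earlier.

```python
import numpy as np

# Hypothetical inputs, except home_advantage (Premier League value printed earlier)
league_avg_scored = 1.4
attack_home = 1.2
defense_away = 0.9
home_advantage = 0.186

# Log-space form, as used in match_probabilities
log_form = np.exp(
    np.log(league_avg_scored)
    + np.log(attack_home)
    + np.log(defense_away)
    + home_advantage
)

# Equivalent product form
product_form = league_avg_scored * attack_home * defense_away * np.exp(home_advantage)

print(log_form, product_form, np.exp(home_advantage))
```

So a home advantage of 0.186 scales the home side's expected goals by about 1.20 rather than adding 0.186 goals. That is worth keeping in mind, since the value itself was measured as a difference in average goals.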

In [55]:
leagues = [
    "premierleague_england",
    "seriea_italy",
    "laliga_spain",
    "bundesliga_germany",
    "ligue1_france",
]

for league in leagues:
    fixtures = globals()[f"fixtures_{league}"]

    df = pd.DataFrame(fixtures)[
        ["utcDate", "status", "homeTeam", "awayTeam"]
    ]

    # Fixtures were already flattened in section 3, so team names may be plain
    # strings; extract_team_name handles both dicts and strings.
    df["homeTeam"] = df["homeTeam"].apply(extract_team_name)
    df["awayTeam"] = df["awayTeam"].apply(extract_team_name)

    globals()[f"df_fixtures_clean_{league}"] = df

## 6. Compare calculated probabilities to bookmaker ones

In [23]:
unique_bet_home = betting_odds_avg["home_team"].unique()
unique_model_home = df_odds["homeTeam"].unique()

In [24]:
print(unique_bet_home)
print(unique_model_home)

['Arsenal' 'Brighton and Hove Albion' 'Tottenham Hotspur' 'Leeds United'
 'Fulham' 'Newcastle United' 'Sunderland' 'Manchester United' 'Liverpool'
 'Burnley' 'Bournemouth' 'Aston Villa' 'Chelsea' 'Wolverhampton Wanderers'
 'Nottingham Forest']
['Brighton & Hove Albion FC' 'Leeds United FC'
 'Wolverhampton Wanderers FC' 'Chelsea FC' 'Liverpool FC' 'Aston Villa FC'
 'Manchester United FC' 'Nottingham Forest FC' 'Tottenham Hotspur FC'
 'Sunderland AFC' 'AFC Bournemouth' 'Arsenal FC' 'Burnley FC' 'Fulham FC'
 'Newcastle United FC' 'Everton FC' 'West Ham United FC'
 'Crystal Palace FC' 'Manchester City FC' 'Brentford FC']


In [25]:
def normalize_team(name):
    name = name.lower()
    name = name.replace(" fc", "")
    name = name.replace(" afc", "")
    name = name.replace("&", "and")
    name = name.replace("afc ", "")   # drops a leading "AFC " (e.g. "AFC Bournemouth"); note it matches anywhere in the name
    name = name.strip()
    return name


In [26]:
df_odds["home_norm"] = df_odds["homeTeam"].apply(normalize_team)
df_odds["away_norm"] = df_odds["awayTeam"].apply(normalize_team)

betting_odds_avg["home_norm"] = betting_odds_avg["home_team"].apply(normalize_team)
betting_odds_avg["away_norm"] = betting_odds_avg["away_team"].apply(normalize_team)


In [27]:
unique_model_norm = df_odds["home_norm"].unique()
unique_bet_norm = betting_odds_avg["home_norm"].unique()

set(unique_model_norm) == set(unique_bet_norm)

False
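
When the two sets differ, it helps to see which names are responsible. The sketch below reuses the same normalisation steps on the two home-team lists printed above and takes set differences in both directions.

```python
def normalize_team(name):
    # Same normalisation steps as the notebook's helper
    name = name.lower()
    name = name.replace(" fc", "")
    name = name.replace(" afc", "")
    name = name.replace("&", "and")
    name = name.replace("afc ", "")
    return name.strip()

# Home teams as printed above: bookmaker data, then model fixtures
odds_home = [
    "Arsenal", "Brighton and Hove Albion", "Tottenham Hotspur", "Leeds United",
    "Fulham", "Newcastle United", "Sunderland", "Manchester United", "Liverpool",
    "Burnley", "Bournemouth", "Aston Villa", "Chelsea", "Wolverhampton Wanderers",
    "Nottingham Forest",
]
model_home = [
    "Brighton & Hove Albion FC", "Leeds United FC", "Wolverhampton Wanderers FC",
    "Chelsea FC", "Liverpool FC", "Aston Villa FC", "Manchester United FC",
    "Nottingham Forest FC", "Tottenham Hotspur FC", "Sunderland AFC",
    "AFC Bournemouth", "Arsenal FC", "Burnley FC", "Fulham FC",
    "Newcastle United FC", "Everton FC", "West Ham United FC",
    "Crystal Palace FC", "Manchester City FC", "Brentford FC",
]

odds_norm = {normalize_team(n) for n in odds_home}
model_norm = {normalize_team(n) for n in model_home}

only_in_model = sorted(model_norm - odds_norm)
only_in_odds = sorted(odds_norm - model_norm)

print("Only in model fixtures:", only_in_model)
print("Only in odds data:", only_in_odds)
```

For these lists nothing appears only in the odds data, and the five names that appear only in the model fixtures are Brentford, Crystal Palace, Everton, Manchester City and West Ham United. So the `False` above comes from the bookmaker data covering fewer home teams, not from a naming mismatch.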

In [28]:
df_compare = df_odds.merge(
    betting_odds_avg,
    left_on=["home_norm", "away_norm"],
    right_on=["home_norm", "away_norm"],
    how="inner"
)

print("Matched rows:", len(df_compare))
df_compare.head()

Matched rows: 20


|  | utcDate | homeTeam | awayTeam | p_home_win | p_draw | p_away_win | home_norm | away_norm | home_team | away_team | p_home_book | p_draw_book | p_away_book |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2026-01-31T15:00:00Z | Brighton & Hove Albion FC | Everton FC | 0.460272 | 0.255145 | 0.283408 | brighton and hove albion | everton | Brighton and Hove Albion | Everton | 0.518143 | 0.2587 | 0.223157 |
| 1 | 2026-01-31T15:00:00Z | Leeds United FC | Arsenal FC | 0.165923 | 0.203962 | 0.62462 | leeds united | arsenal | Leeds United | Arsenal | 0.155331 | 0.230877 | 0.613792 |
| 2 | 2026-01-31T15:00:00Z | Wolverhampton Wanderers FC | AFC Bournemouth | 0.264794 | 0.218404 | 0.512047 | wolverhampton wanderers | bournemouth | Wolverhampton Wanderers | Bournemouth | 0.299504 | 0.26492 | 0.435577 |
| 3 | 2026-01-31T17:30:00Z | Chelsea FC | West Ham United FC | 0.696241 | 0.168421 | 0.122479 | chelsea | west ham united | Chelsea | West Ham United | 0.630857 | 0.211025 | 0.158118 |
| 4 | 2026-01-31T20:00:00Z | Liverpool FC | Newcastle United FC | 0.511205 | 0.215414 | 0.26798 | liverpool | newcastle united | Liverpool | Newcastle United | 0.53469 | 0.236652 | 0.228658 |


In [29]:
df_compare["diff_home"] = df_compare["p_home_win"] - df_compare["p_home_book"]
df_compare["diff_draw"] = df_compare["p_draw"] - df_compare["p_draw_book"]
df_compare["diff_away"] = df_compare["p_away_win"] - df_compare["p_away_book"]

df_compare[["homeTeam", "awayTeam", "diff_home", "diff_draw", "diff_away"]].head()

|  | homeTeam | awayTeam | diff_home | diff_draw | diff_away |
|---|---|---|---|---|---|
| 0 | Brighton & Hove Albion FC | Everton FC | -0.057872 | -0.003554 | 0.060251 |
| 1 | Leeds United FC | Arsenal FC | 0.010592 | -0.026915 | 0.010828 |
| 2 | Wolverhampton Wanderers FC | AFC Bournemouth | -0.03471 | -0.046515 | 0.07647 |
| 3 | Chelsea FC | West Ham United FC | 0.065384 | -0.042604 | -0.035639 |
| 4 | Liverpool FC | Newcastle United FC | -0.023486 | -0.021237 | 0.039323 |


In [30]:
rmse_home = np.sqrt(np.mean((df_compare["p_home_win"] - df_compare["p_home_book"])**2))
rmse_draw = np.sqrt(np.mean((df_compare["p_draw"] - df_compare["p_draw_book"])**2))
rmse_away = np.sqrt(np.mean((df_compare["p_away_win"] - df_compare["p_away_book"])**2))

rmse_home, rmse_draw, rmse_away

(0.05925278573953581, 0.030526851783584653, 0.05932469044834738)

In [31]:
rmse_total = np.sqrt(np.mean(
    (df_compare["p_home_win"] - df_compare["p_home_book"])**2 +
    (df_compare["p_draw"] - df_compare["p_draw_book"])**2 +
    (df_compare["p_away_win"] - df_compare["p_away_book"])**2
))

rmse_total

0.0892311615664871

> Note: `rmse_total` changes from run to run as bookmaker odds move. So far it has stayed between 0.075 and 0.090.
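
As a quick sanity check on how the combined figure relates to the per-outcome ones: because every match contributes to all three outcomes, `rmse_total` is the square root of the sum of the three squared RMSEs. The snippet below verifies this with the values printed above.

```python
import numpy as np

# Per-outcome RMSEs printed above
rmse_home = 0.05925278573953581
rmse_draw = 0.030526851783584653
rmse_away = 0.05932469044834738

# sqrt(mean(a^2 + b^2 + c^2)) equals sqrt(mean(a^2) + mean(b^2) + mean(c^2))
rmse_total_check = np.sqrt(rmse_home**2 + rmse_draw**2 + rmse_away**2)

print(rmse_total_check)
```

This reproduces the 0.0892 reported above. The simple average of the three RMSEs is a different summary and will always come out smaller than the combined figure.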

In [32]:
np.mean([rmse_home, rmse_draw, rmse_away])

0.049701442657155946

In [33]:
df_compare["abs_diff"] = (
    abs(df_compare["diff_home"]) +
    abs(df_compare["diff_draw"]) +
    abs(df_compare["diff_away"])
)

df_compare.sort_values("abs_diff", ascending=False).head(10)[
    ["homeTeam", "awayTeam", "diff_home", "diff_draw", "diff_away"]
]


Unnamed: 0,homeTeam,awayTeam,diff_home,diff_draw,diff_away
6,Manchester United FC,Fulham FC,-0.16086,0.016143,0.142005
11,Manchester United FC,Tottenham Hotspur FC,-0.099628,-0.016132,0.110864
13,Arsenal FC,Sunderland AFC,-0.107756,0.058122,0.047464
2,Wolverhampton Wanderers FC,AFC Bournemouth,-0.03471,-0.046515,0.07647
3,Chelsea FC,West Ham United FC,0.065384,-0.042604,-0.035639
15,Fulham FC,Everton FC,-0.057264,-0.011225,0.067889
10,Leeds United FC,Nottingham Forest FC,-0.027085,-0.040441,0.065611
16,Wolverhampton Wanderers FC,Chelsea FC,-0.034194,-0.033325,0.062051
7,Nottingham Forest FC,Crystal Palace FC,-0.051625,-0.010138,0.060921
17,Newcastle United FC,Brentford FC,-0.031201,-0.032065,0.058541


## 7. Replace my estimated probabilities with the odds-implied ones where available, creating the final match probabilities

In [34]:
df_odds.head(2)

Unnamed: 0,utcDate,homeTeam,awayTeam,p_home_win,p_draw,p_away_win,home_norm,away_norm
0,2026-01-31T15:00:00Z,Brighton & Hove Albion FC,Everton FC,0.460272,0.255145,0.283408,brighton and hove albion,everton
1,2026-01-31T15:00:00Z,Leeds United FC,Arsenal FC,0.165923,0.203962,0.62462,leeds united,arsenal


In [35]:
betting_odds_avg.head(2)

Unnamed: 0,home_team,away_team,p_home_book,p_draw_book,p_away_book,home_norm,away_norm
0,Arsenal,Sunderland,0.757418,0.167565,0.075017,arsenal,sunderland
1,Brighton and Hove Albion,Everton,0.518143,0.2587,0.223157,brighton and hove albion,everton


In [36]:
df_final_probabilities = df_odds.merge(
    betting_odds_avg,
    left_on=["home_norm", "away_norm"],
    right_on=["home_norm", "away_norm"],
    how="left"
)
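
A left merge on normalised names silently leaves `NaN` wherever a fixture has no bookmaker row or a name fails to match. The sketch below shows one way to audit that, using the `validate` and `indicator` options of `DataFrame.merge`. The two small frames are hypothetical stand-ins for `df_odds` and `betting_odds_avg`, not the notebook's data.

```python
import pandas as pd

# Hypothetical miniature versions of df_odds and betting_odds_avg
model = pd.DataFrame({
    "home_norm": ["arsenal", "leeds united", "fulham"],
    "away_norm": ["sunderland", "arsenal", "everton"],
    "p_home_win": [0.70, 0.17, 0.40],
})
book = pd.DataFrame({
    "home_norm": ["arsenal", "leeds united"],
    "away_norm": ["sunderland", "arsenal"],
    "p_home_book": [0.76, 0.16],
})

merged = model.merge(
    book,
    on=["home_norm", "away_norm"],
    how="left",
    validate="one_to_one",  # raise if a fixture key is duplicated on either side
    indicator=True,         # adds a _merge column: "both" or "left_only"
)

# Fixtures without a bookmaker row keep NaN in the *_book columns
unmatched = merged.loc[merged["_merge"] == "left_only", ["home_norm", "away_norm"]]
print(merged["_merge"].value_counts())
print(unmatched)
```

Listing the `left_only` rows makes it easier to tell fixtures that genuinely have no odds yet from fixtures lost to a naming mismatch.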

In [37]:
df_final_probabilities = df_final_probabilities[[
    "utcDate",
    "homeTeam",
    "awayTeam",
    "p_home_win",
    "p_draw",
    "p_away_win",
    "p_home_book",
    "p_draw_book",
    "p_away_book",
]]

df_final_probabilities

Unnamed: 0,utcDate,homeTeam,awayTeam,p_home_win,p_draw,p_away_win,p_home_book,p_draw_book,p_away_book
0,2026-01-31T15:00:00Z,Brighton & Hove Albion FC,Everton FC,0.460272,0.255145,0.283408,0.518143,0.258700,0.223157
1,2026-01-31T15:00:00Z,Leeds United FC,Arsenal FC,0.165923,0.203962,0.624620,0.155331,0.230877,0.613792
2,2026-01-31T15:00:00Z,Wolverhampton Wanderers FC,AFC Bournemouth,0.264794,0.218404,0.512047,0.299504,0.264920,0.435577
3,2026-01-31T17:30:00Z,Chelsea FC,West Ham United FC,0.696241,0.168421,0.122479,0.630857,0.211025,0.158118
4,2026-01-31T20:00:00Z,Liverpool FC,Newcastle United FC,0.511205,0.215414,0.267980,0.534690,0.236652,0.228658
...,...,...,...,...,...,...,...,...,...
145,2026-05-24T15:00:00Z,Liverpool FC,Brentford FC,0.564003,0.196983,0.229043,,,
146,2026-05-24T15:00:00Z,Manchester City FC,Aston Villa FC,0.575021,0.215270,0.205210,,,
147,2026-05-24T15:00:00Z,Nottingham Forest FC,AFC Bournemouth,0.417729,0.236620,0.343237,,,
148,2026-05-24T15:00:00Z,Tottenham Hotspur FC,Everton FC,0.419258,0.258239,0.321481,,,


In [38]:
df_final_probabilities["p_home_final"] = np.where(
    df_final_probabilities["p_home_book"].notna(),
    df_final_probabilities["p_home_book"],
    df_final_probabilities["p_home_win"]
)

df_final_probabilities["p_draw_final"] = np.where(
    df_final_probabilities["p_draw_book"].notna(),
    df_final_probabilities["p_draw_book"],
    df_final_probabilities["p_draw"]
)

df_final_probabilities["p_away_final"] = np.where(
    df_final_probabilities["p_away_book"].notna(),
    df_final_probabilities["p_away_book"],
    df_final_probabilities["p_away_win"]
)

In [39]:
print("Used betting odds:", df_final_probabilities["p_home_book"].notna().sum())
print("Used model:", df_final_probabilities["p_home_book"].isna().sum())


Used betting odds: 20
Used model: 130
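
The three `np.where` calls implement a simple rule: take the bookmaker probability when it exists, otherwise fall back to the model. As a compact alternative, the same rule can be sketched with `Series.fillna`. The toy frame and the `pairs` dictionary below are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical two-fixture example: the second fixture has no bookmaker odds
toy = pd.DataFrame({
    "p_home_win": [0.46, 0.56], "p_home_book": [0.52, np.nan],
    "p_draw": [0.26, 0.20],     "p_draw_book": [0.26, np.nan],
    "p_away_win": [0.28, 0.24], "p_away_book": [0.22, np.nan],
})

# final column -> (bookmaker column, model column)
pairs = {
    "p_home_final": ("p_home_book", "p_home_win"),
    "p_draw_final": ("p_draw_book", "p_draw"),
    "p_away_final": ("p_away_book", "p_away_win"),
}

for final_col, (book_col, model_col) in pairs.items():
    # Keep the bookmaker value; fill gaps with the model value from the same row
    toy[final_col] = toy[book_col].fillna(toy[model_col])

print(toy[list(pairs)])
```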


In [40]:
df_final_probabilities = df_final_probabilities[[
    "utcDate",
    "homeTeam",
    "awayTeam",
    "p_home_final",
    "p_draw_final",
    "p_away_final"
]]

In [41]:
df_final_probabilities

Unnamed: 0,utcDate,homeTeam,awayTeam,p_home_final,p_draw_final,p_away_final
0,2026-01-31T15:00:00Z,Brighton & Hove Albion FC,Everton FC,0.518143,0.258700,0.223157
1,2026-01-31T15:00:00Z,Leeds United FC,Arsenal FC,0.155331,0.230877,0.613792
2,2026-01-31T15:00:00Z,Wolverhampton Wanderers FC,AFC Bournemouth,0.299504,0.264920,0.435577
3,2026-01-31T17:30:00Z,Chelsea FC,West Ham United FC,0.630857,0.211025,0.158118
4,2026-01-31T20:00:00Z,Liverpool FC,Newcastle United FC,0.534690,0.236652,0.228658
...,...,...,...,...,...,...
145,2026-05-24T15:00:00Z,Liverpool FC,Brentford FC,0.564003,0.196983,0.229043
146,2026-05-24T15:00:00Z,Manchester City FC,Aston Villa FC,0.575021,0.215270,0.205210
147,2026-05-24T15:00:00Z,Nottingham Forest FC,AFC Bournemouth,0.417729,0.236620,0.343237
148,2026-05-24T15:00:00Z,Tottenham Hotspur FC,Everton FC,0.419258,0.258239,0.321481


In [42]:
df_final_probabilities["homeTeam"].unique()

array(['Brighton & Hove Albion FC', 'Leeds United FC',
       'Wolverhampton Wanderers FC', 'Chelsea FC', 'Liverpool FC',
       'Aston Villa FC', 'Manchester United FC', 'Nottingham Forest FC',
       'Tottenham Hotspur FC', 'Sunderland AFC', 'AFC Bournemouth',
       'Arsenal FC', 'Burnley FC', 'Fulham FC', 'Newcastle United FC',
       'Everton FC', 'West Ham United FC', 'Crystal Palace FC',
       'Manchester City FC', 'Brentford FC'], dtype=object)

In [43]:
name_map = {
    "Aston Villa FC": "Aston Villa",
    "Brighton & Hove Albion FC": "Brighton & Hove Albion",
    "AFC Bournemouth": "AFC Bournemouth",   # keep as is
    "Bournemouth": "AFC Bournemouth",
    "Sunderland AFC": "Sunderland",
    "Newcastle United FC": "Newcastle United",
    "Manchester City FC": "Manchester City",
    "Manchester United FC": "Manchester United",
    "West Ham United FC": "West Ham United",
    "Wolverhampton Wanderers FC": "Wolverhampton Wanderers",
    "Tottenham Hotspur FC": "Tottenham Hotspur",
    "Crystal Palace FC": "Crystal Palace",
    "Brentford FC": "Brentford",
    "Everton FC": "Everton",
    "Leeds United FC": "Leeds United",
    "Chelsea FC": "Chelsea",
    "Liverpool FC": "Liverpool",
    "Nottingham Forest FC": "Nottingham Forest",
    "Burnley FC": "Burnley",
    "Fulham FC": "Fulham",
    "Arsenal FC": "Arsenal"
}

# Work on an explicit copy: the column selection above returns a slice, and
# assigning new columns to it raises pandas' SettingWithCopyWarning
df_final_probabilities = df_final_probabilities.copy()
df_final_probabilities["home_team_norm"] = df_final_probabilities["homeTeam"].replace(name_map)
df_final_probabilities["away_team_norm"] = df_final_probabilities["awayTeam"].replace(name_map)

# These two names map to themselves, so team_norm is a copy of team
premierleague_england["team_norm"] = premierleague_england["team"].replace({
    "Brighton & Hove Albion": "Brighton & Hove Albion",
    "AFC Bournemouth": "AFC Bournemouth"
})


In [44]:
df_simulation = df_final_probabilities.copy()

In [45]:
# Normalize probabilities so they sum to 1
prob_cols = ["p_home_final", "p_draw_final", "p_away_final"]
df_simulation[prob_cols] = df_simulation[prob_cols].div(df_simulation[prob_cols].sum(axis=1), axis=0)

In [46]:
df_simulation.head()

Unnamed: 0,utcDate,homeTeam,awayTeam,p_home_final,p_draw_final,p_away_final,home_team_norm,away_team_norm
0,2026-01-31T15:00:00Z,Brighton & Hove Albion FC,Everton FC,0.518143,0.2587,0.223157,Brighton & Hove Albion,Everton
1,2026-01-31T15:00:00Z,Leeds United FC,Arsenal FC,0.155331,0.230877,0.613792,Leeds United,Arsenal
2,2026-01-31T15:00:00Z,Wolverhampton Wanderers FC,AFC Bournemouth,0.299504,0.26492,0.435577,Wolverhampton Wanderers,AFC Bournemouth
3,2026-01-31T17:30:00Z,Chelsea FC,West Ham United FC,0.630857,0.211025,0.158118,Chelsea,West Ham United
4,2026-01-31T20:00:00Z,Liverpool FC,Newcastle United FC,0.53469,0.236652,0.228658,Liverpool,Newcastle United


## 8. Run simulations to build the Premier League table probabilities

In [47]:
def simulate_once(fixtures, table):
    table_sim = table.copy()

    # Use normalized team name column
    points = dict(zip(table_sim["team_norm"], table_sim["pts"]))

    for _, row in fixtures.iterrows():
        home = row["home_team_norm"]
        away = row["away_team_norm"]

        # choose outcome
        probs = [row["p_home_final"], row["p_draw_final"], row["p_away_final"]]
        outcome = np.random.choice(["H", "D", "A"], p=probs)

        if outcome == "H":
            points[home] += 3
        elif outcome == "D":
            points[home] += 1
            points[away] += 1
        else:
            points[away] += 3

    result_df = table_sim.copy()
    result_df["pts"] = result_df["team_norm"].map(points)

    # sort by simulated points, then by current goal difference (gd is not simulated)
    result_df = result_df.sort_values(["pts", "gd"], ascending=[False, False])
    result_df["position"] = np.arange(1, len(result_df)+1)

    return result_df


def run_simulations(fixtures, table, n_sim=10000):
    position_counts = {team: np.zeros(len(table)) for team in table["team_norm"]}

    for _ in range(n_sim):
        final_table = simulate_once(fixtures, table)

        for _, row in final_table.iterrows():
            position_counts[row["team_norm"]][row["position"]-1] += 1

    pos_df = pd.DataFrame(position_counts, index=np.arange(1, len(table)+1))
    pos_df.index.name = "position"
    return pos_df
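
`simulate_once` walks the fixtures row by row, so 20,000 simulations repeat that Python loop 20,000 times. A possible speed-up, sketched here under assumptions rather than taken from the notebook, is to draw every outcome for every simulation in one NumPy call and add the points up afterwards. The function name, the integer team indices, and the `seed` argument are all hypothetical. Ranking, including the goal-difference tie-break, would still have to be applied to the returned points.

```python
import numpy as np

def simulate_points_vectorized(probs, home_idx, away_idx, start_pts, n_sim=1000, seed=0):
    """Return an (n_sim, n_teams) array of simulated final points.

    probs     : (n_fixtures, 3) rows of [home win, draw, away win] probabilities
    home_idx  : integer team index of the home side for each fixture
    away_idx  : integer team index of the away side for each fixture
    start_pts : current points per team, ordered by team index
    """
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)
    n_fix = probs.shape[0]

    # Inverse-CDF sampling: one uniform draw per (simulation, fixture), compared
    # with the cumulative home and home+draw probabilities
    cum = np.cumsum(probs, axis=1)[:, :2]
    u = rng.random((n_sim, n_fix))
    outcome = (u[:, :, None] >= cum[None, :, :]).sum(axis=2)  # 0=H, 1=D, 2=A

    home_pts = np.select([outcome == 0, outcome == 1], [3, 1], default=0)
    away_pts = np.select([outcome == 2, outcome == 1], [3, 1], default=0)

    # Start every simulation from the current table, then add fixture points
    points = np.tile(np.asarray(start_pts, dtype=np.int64), (n_sim, 1))
    for f in range(n_fix):
        points[:, home_idx[f]] += home_pts[:, f]
        points[:, away_idx[f]] += away_pts[:, f]
    return points

# Deterministic toy check: a certain home win, a certain draw, a certain away win
toy_points = simulate_points_vectorized(
    probs=[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    home_idx=[0, 1, 2],
    away_idx=[1, 2, 0],
    start_pts=[10, 5, 0],
    n_sim=5,
)
print(toy_points[0])
```

The remaining loop runs once per fixture instead of once per fixture per simulation, which is where most of the saving would come from.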

In [48]:
# RUN
position_distribution = run_simulations(df_simulation, premierleague_england, n_sim=20000)
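
The percentages that come out of the simulation are themselves estimates, and their sampling noise shrinks with the square root of `n_sim`. Treating each run as independent, a simulated proportion `p` has a standard error of roughly `sqrt(p * (1 - p) / n_sim)`. The helper below is a hypothetical illustration of that formula, applied to a probability of 0.7699 (a value taken from the results table further down) with the 20,000 runs used here.

```python
import math

def mc_standard_error(p, n_sim):
    """Standard error of a proportion p estimated from n_sim independent runs."""
    return math.sqrt(p * (1 - p) / n_sim)

se = mc_standard_error(0.7699, 20_000)
print(f"{100 * se:.2f} percentage points")
```

This measures simulation noise only. It says nothing about how accurate the input match probabilities are.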

In [49]:
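# The index holds positions; naming it "TEAM" puts that label in the header
# of the transposed table, where rows are teams and columns are positions
position_distribution.index.name = "TEAM"
position_distribution_t = position_distribution.T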

In [50]:
position_distribution_pct = position_distribution_t.div(
    position_distribution_t.sum(axis=1),
    axis=0
) * 100
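
Headline outcomes such as the title, a top-four finish, or relegation are sums over blocks of position columns in this table. The sketch below shows the idea on a hypothetical four-team league. `outcome_summary`, `toy_pct`, and its numbers are invented for illustration, and the function assumes the columns are ordered from first place to last.

```python
import pandas as pd

# Hypothetical 4-team league: rows are teams, columns are positions, values in %
toy_pct = pd.DataFrame(
    [[70.0, 20.0, 8.0, 2.0],
     [25.0, 50.0, 20.0, 5.0],
     [4.0, 25.0, 50.0, 21.0],
     [1.0, 5.0, 22.0, 72.0]],
    index=["A", "B", "C", "D"],
    columns=[1, 2, 3, 4],
)

def outcome_summary(pct, top_n, bottom_n):
    """Sum position probabilities into title, top-n and bottom-n chances."""
    positions = list(pct.columns)  # assumed ordered from 1st to last
    return pd.DataFrame({
        "title": pct[positions[0]],
        f"top_{top_n}": pct[positions[:top_n]].sum(axis=1),
        f"bottom_{bottom_n}": pct[positions[-bottom_n:]].sum(axis=1),
    })

summary = outcome_summary(toy_pct, top_n=2, bottom_n=1)
print(summary)
```

For the real 20-team table the equivalent call would use `top_n=4` and `bottom_n=3`.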


## 9. Preview and present the results graphically

In [51]:
# Build one label per team: current position, non-breaking-space padding
# (wider for single-digit positions so names line up), then the team name
team_labels = (
    premierleague_england[["team", "position"]]
    .assign(
        label=lambda df: df.apply(
            lambda r: f"{r['position']}{'&nbsp;&nbsp;&nbsp;&nbsp;' if r['position'] < 10 else '&nbsp;&nbsp;'}{r['team']}",
            axis=1
        )
    )
    .set_index("team")["label"]
)


# Apply labels to your table index
position_distribution_pct.index = position_distribution_pct.index.map(team_labels)

# Drop position column if present
position_distribution_pct = position_distribution_pct.drop(columns=["position"], errors="ignore")

# Remove index name
position_distribution_pct.index.name = None


In [52]:
greens = plt.cm.Greens
green_cmap = LinearSegmentedColormap.from_list(
    "Greens_soft",
    greens(np.linspace(0.03, 0.65, 256))
)

vmax = 25

def zero_style(val):
    if val < 0.005:
        return "background-color: white !important;"
    return ""

# ---- transform ONLY for colouring ----
color_data = position_distribution_pct.copy()
color_data = (color_data / vmax).pow(0.65) * vmax

position_distribution_pct.style \
    .background_gradient(
        cmap=green_cmap,
        vmin=0,
        vmax=vmax,
        gmap=color_data,
        axis=None          # 🔑 axis=None is required when gmap is a DataFrame
    ) \
    .map(zero_style) \
    .format("{:.2f}%") \
    .set_table_styles([
        {"selector": "th", "props": [
            ("background-color", "#e6edf4"),
            ("color", "#333"),
            ("text-align", "center"),
            ("font-family", "Inter, Roboto, Arial, sans-serif"),
            ("font-size", "13px"),
            ("font-weight", "600")
        ]},

        {"selector": "th.col_heading", "props": [
            ("text-align", "center")
        ]},

        {"selector": "th.row_heading", "props": [
            ("text-align", "left"),
            ("font-size", "13px"),
            ("font-weight", "600"),
            ("white-space", "nowrap"),
            ("max-width", "250px"),
            ("overflow", "hidden"),
            ("text-overflow", "ellipsis")
        ]},

        {"selector": "tr:nth-child(odd) th.row_heading", "props": [
            ("background-color", "#fbfcfe")
        ]},
        {"selector": "tr:nth-child(even) th.row_heading", "props": [
            ("background-color", "#e6edf4")
        ]},

        {"selector": "td", "props": [
            ("text-align", "center"),
            ("font-family", "Inter, Roboto, Arial, sans-serif"),
            ("font-size", "12px"),
            ("font-weight", "500"),
            ("color", "#000")
        ]}
    ])



TEAM,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1 Arsenal,76.99%,19.56%,3.16%,0.25%,0.03%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%
2 Manchester City,19.07%,54.12%,21.11%,4.41%,0.98%,0.25%,0.05%,0.02%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%
3 Aston Villa,3.77%,22.24%,49.38%,16.16%,5.75%,1.71%,0.76%,0.15%,0.06%,0.01%,0.00%,0.01%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%
4 Manchester United,0.01%,0.27%,2.77%,11.64%,17.87%,20.77%,15.53%,11.03%,7.41%,4.71%,3.31%,2.02%,1.39%,0.78%,0.34%,0.12%,0.03%,0.00%,0.00%,0.00%
5 Chelsea,0.08%,1.81%,11.42%,28.79%,25.29%,14.49%,7.97%,4.61%,2.67%,1.43%,0.73%,0.39%,0.19%,0.08%,0.05%,0.01%,0.00%,0.00%,0.00%,0.00%
6 Liverpool,0.07%,1.90%,10.73%,29.07%,24.82%,15.07%,8.13%,4.54%,2.56%,1.40%,0.84%,0.50%,0.21%,0.12%,0.03%,0.01%,0.01%,0.00%,0.00%,0.00%
7 Fulham,0.00%,0.01%,0.18%,1.38%,3.74%,7.52%,10.12%,12.39%,12.92%,12.67%,11.38%,9.31%,7.21%,5.18%,3.52%,1.75%,0.67%,0.03%,0.00%,0.00%
8 Brentford,0.00%,0.04%,0.39%,2.53%,6.22%,10.56%,14.11%,14.27%,12.83%,11.03%,8.88%,6.93%,5.17%,3.42%,2.17%,1.05%,0.37%,0.02%,0.00%,0.00%
9 Newcastle United,0.00%,0.04%,0.61%,3.37%,7.89%,12.62%,15.53%,14.41%,12.33%,10.04%,7.71%,5.71%,4.38%,2.62%,1.66%,0.80%,0.25%,0.01%,0.00%,0.00%
10 Everton,0.00%,0.00%,0.03%,0.58%,1.75%,3.77%,6.04%,8.56%,10.13%,11.21%,12.15%,12.20%,11.23%,9.38%,7.12%,4.03%,1.69%,0.14%,0.00%,0.00%
