<a href="https://colab.research.google.com/github/mikeogunmakin/river-medway-trading/blob/main/research/202511_automate_brier_score_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Automate Brier Score Analysis**

**Aim:** Automate Brier score analysis to quantify how predictable a football league is.
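
For reference, the Brier score of a single market is the mean squared difference between the forecast probability and the binary outcome. The symbols below ($p_i$, $o_i$, $N$) are notation introduced here for illustration and do not appear elsewhere in the notebook: $p_i$ is the implied probability for match $i$, $o_i$ is 1 if the outcome occurred and 0 otherwise, and $N$ is the number of matches.

$$
\mathrm{BS} = \frac{1}{N}\sum_{i=1}^{N} (p_i - o_i)^2
$$

Lower is better: 0 is a perfect forecast, and a constant forecast of 0.5 always scores 0.25.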

Pseudocode logic:

The function takes a dataset as input and returns the Brier score for the Home, Draw, and Away markets.

1. Create binary outcome columns for each market (Home, Draw, Away).
2. Calculate the implied probabilities from the market odds:
   - Use the Betfair Exchange odds where available; otherwise fall back to the average odds.
   - Normalise the inverse odds so that the three probabilities sum to 1.
3. For each market (Home, Draw, Away), calculate the squared error between the implied probabilities and actual outcomes.
4. Compute the mean of the squared errors for each market to obtain the Brier score.
5. Return the Brier score for each market.
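
The steps above can be sketched on a toy dataset. This is a minimal illustration, not the notebook's function: the four matches and their odds are made-up values, and the names `matches`, `probs`, `outcomes` and `brier` are hypothetical. It assumes the `FTR`, `AvgH`, `AvgD` and `AvgA` column names used later in the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical matches: full-time result and decimal odds (illustrative values only).
matches = pd.DataFrame({
    'FTR':  ['H', 'D', 'A', 'H'],
    'AvgH': [2.0, 2.5, 4.0, 1.5],
    'AvgD': [4.0, 3.2, 4.0, 4.5],
    'AvgA': [4.0, 3.0, 2.0, 7.0],
})

# Step 1: one binary outcome column per market.
outcomes = pd.DataFrame(
    {market: (matches['FTR'] == market).astype(int) for market in ['H', 'D', 'A']}
)

# Step 2: implied probabilities, normalised so each row sums to 1.
inverse_odds = 1 / matches[['AvgH', 'AvgD', 'AvgA']]
probs = inverse_odds.div(inverse_odds.sum(axis=1), axis=0)
probs.columns = ['H', 'D', 'A']

# Steps 3 and 4: squared error per match, then the mean per market.
brier = ((probs - outcomes) ** 2).mean()

# Step 5: one Brier score per market.
print(brier.round(3))
```

Normalising by the row sum removes the bookmaker margin, so the three probabilities for each match form a proper distribution before they are compared with the outcomes.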

In [28]:
import numpy as np
import pandas as pd

In [56]:
def brier_score(df):
  """Print and return the Brier score for the home, draw and away markets."""

  # Prefer Betfair Exchange odds; fall back to the market average odds.
  if 'BFEH' in df.columns:
    odds_cols = ['BFEH', 'BFED', 'BFEA']
  else:
    odds_cols = ['AvgH', 'AvgD', 'AvgA']

  # Copy only the required columns so the caller's DataFrame is not modified.
  cols_to_keep = ['Date', 'Div', 'HomeTeam', 'AwayTeam', 'FTR'] + odds_cols
  df = df[cols_to_keep].copy()

  # Implied probabilities: inverse odds, normalised so each row sums to 1.
  home_odds, draw_odds, away_odds = odds_cols
  inverse_total = (1/df[home_odds]) + (1/df[draw_odds]) + (1/df[away_odds])
  df['Home_prob_normalised'] = (1/df[home_odds]) / inverse_total
  df['Draw_prob_normalised'] = (1/df[draw_odds]) / inverse_total
  df['Away_prob_normalised'] = (1/df[away_odds]) / inverse_total

  # Binary outcome columns for each market.
  df['HomeWinOutcome'] = np.where(df['FTR'] == 'H', 1, 0)
  df['DrawOutcome'] = np.where(df['FTR'] == 'D', 1, 0)
  df['AwayWinOutcome'] = np.where(df['FTR'] == 'A', 1, 0)

  # Squared error between implied probability and actual outcome.
  df['HomeWin_SqrdError'] = (df['Home_prob_normalised'] - df['HomeWinOutcome'])**2
  df['Draw_SqrdError'] = (df['Draw_prob_normalised'] - df['DrawOutcome'])**2
  df['AwayWin_SqrdError'] = (df['Away_prob_normalised'] - df['AwayWinOutcome'])**2

  # Brier score: mean squared error per market (rows with missing odds are ignored).
  home_win_bs = df['HomeWin_SqrdError'].mean()
  draw_bs = df['Draw_SqrdError'].mean()
  away_win_bs = df['AwayWin_SqrdError'].mean()

  print(f'''home win brier score:{np.round(home_win_bs,3)}
            \naway win brier score:{np.round(away_win_bs,3)}
            \ndraw brier score:{np.round(draw_bs,3)}''')

  return {'home': home_win_bs, 'draw': draw_bs, 'away': away_win_bs}




In [26]:
import pandas as pd
import numpy as np

df = pd.read_csv('/content/drive/MyDrive/RMT/Data/data/2024 football season/E0.csv')


In [57]:
brier_score(df)

home win brier score:0.204 
            
away win brier score:0.191
            
draw brier score:0.184


0.5555555555555556