# Strategy testing - "Beating the bookies"

The paper
[Beating the bookies with their own numbers - and how the online sports betting market is rigged](https://arxiv.org/abs/1710.02824)
describes a strategy for consistently profiting from mispriced odds in the betting market. Here we test it on Premier League results from the 2016/17 season.

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns

In [2]:
df = pd.read_csv('../data/season_16_17.csv')

In [3]:
df

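| Unnamed: 0 | Div | Date | HomeTeam | AwayTeam | FTHG | FTAG | FTR | HTHG | HTAG | HTR | ... | BbAv<2.5 | BbAH | BbAHh | BbMxAHH | BbAvAHH | BbMxAHA | BbAvAHA | PSCH | PSCD | PSCA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | E0 | 13/08/16 | Burnley | Swansea | 0 | 1 | A | 0 | 0 | D | ... | 1.61 | 32 | -0.25 | 2.13 | 2.06 | 1.86 | 1.81 | 2.79 | 3.16 | 2.89 |
| 1 | E0 | 13/08/16 | Crystal Palace | West Brom | 0 | 1 | A | 0 | 0 | D | ... | 1.52 | 33 | -0.50 | 2.07 | 2.00 | 1.90 | 1.85 | 2.25 | 3.15 | 3.86 |
| 2 | E0 | 13/08/16 | Everton | Tottenham | 1 | 1 | D | 1 | 0 | H | ... | 1.77 | 32 | 0.25 | 1.91 | 1.85 | 2.09 | 2.00 | 3.64 | 3.54 | 2.16 |
| 3 | E0 | 13/08/16 | Hull | Leicester | 2 | 1 | H | 1 | 0 | H | ... | 1.67 | 31 | 0.25 | 2.35 | 2.26 | 2.03 | 1.67 | 4.68 | 3.50 | 1.92 |
| 4 | E0 | 13/08/16 | Man City | Sunderland | 2 | 1 | H | 1 | 0 | H | ... | 2.48 | 34 | -1.50 | 1.81 | 1.73 | 2.20 | 2.14 | 1.25 | 6.50 | 14.50 |
| 5 | E0 | 13/08/16 | Middlesbrough | Stoke | 1 | 1 | D | 1 | 0 | H | ... | 1.53 | 32 | -0.25 | 1.99 | 1.93 | 1.97 | 1.92 | 2.20 | 3.38 | 3.70 |
| 6 | E0 | 13/08/16 | Southampton | Watford | 1 | 1 | D | 0 | 1 | A | ... | 1.75 | 33 | -0.75 | 2.16 | 2.07 | 1.89 | 1.80 | 1.80 | 3.83 | 4.91 |
| 7 | E0 | 14/08/16 | Arsenal | Liverpool | 3 | 4 | A | 1 | 1 | D | ... | 1.99 | 31 | -0.50 | 2.41 | 2.31 | 1.81 | 1.64 | 2.80 | 3.44 | 2.68 |
| 8 | E0 | 14/08/16 | Bournemouth | Man United | 1 | 3 | A | 0 | 1 | A | ... | 1.76 | 33 | 0.75 | 1.80 | 1.76 | 2.17 | 2.11 | 5.40 | 3.65 | 1.78 |
| 9 | E0 | 15/08/16 | Chelsea | West Ham | 2 | 1 | H | 0 | 0 | D | ... | 2.01 | 33 | -1.00 | 2.20 | 2.10 | 1.80 | 1.76 | 1.52 | 4.38 | 7.45 |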


## The strategy

If we take the set $\Omega$ of odds offered across bookmakers for a given outcome of a match, the consensus probability $p_{cons}$ of that outcome is

$$p_{cons} = \frac{1}{\operatorname{mean}(\Omega)}$$

A bet should be placed whenever a bookmaker's odds $\omega$ exceed $1 / (p_{cons} - \alpha)$, that is, whenever they imply a probability lower than $p_{cons} - \alpha$. Here $\alpha$ is the bookmaker's profit margin adjustment (estimated at 0.05 in the paper). A higher $\alpha$ makes the rule stricter: fewer bets qualify, but returns should be more consistent.

So when the best available odds for a given result satisfy the inequality

$$\max(\Omega) > \frac{1}{p_{cons} - 0.05}$$

we should place a bet.
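As a quick sanity check of the rule, the sketch below applies it to a single outcome. The odds are made-up illustrative numbers (not taken from the dataset), and the variable names are only for this example.

```python
# Worked example of the betting rule for one outcome of one match.
# The odds below are hypothetical, chosen so that one bookmaker is an outlier.
alpha = 0.05
odds = [2.00, 2.05, 1.95, 2.00, 2.50]

mean_odds = sum(odds) / len(odds)   # mean(Omega)
p_cons = 1 / mean_odds              # consensus probability
threshold = 1 / (p_cons - alpha)    # minimum odds worth betting at
best_odds = max(odds)               # max(Omega)
place_bet = best_odds > threshold

print(f"p_cons={p_cons:.3f}, threshold={threshold:.3f}, "
      f"best={best_odds:.2f}, bet={place_bet}")
```

With these numbers the mean odds are 2.1, so the threshold sits at roughly 2.35 and only the outlier quote of 2.50 clears it. Without the outlier, the best quote of 2.05 would fall well short of the threshold.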

In [39]:
# bookies odds for 1X2 in 2016/17 season
df_home_win = df[['B365H', 'BWH', 'IWH', 'LBH', 'PSH', 'VCH', 'WHH']]
df_draw = df[['B365D', 'BWD', 'IWD', 'LBD', 'PSD', 'VCD', 'WHD']]
df_away_win = df[['B365A', 'BWA', 'IWA', 'LBA', 'PSA', 'VCA', 'WHA']]

In [54]:
# Do the maximum odds satisfy the inequality max(odds) > 1 / (p_cons - alpha)?
# p_cons is 1 / (mean odds across bookmakers); alpha is the margin adjustment.
alpha = 0.05
home_win = 1/(1/df_home_win.mean(axis=1) - alpha) < df_home_win.max(axis=1)
draw = 1/(1/df_draw.mean(axis=1) - alpha) < df_draw.max(axis=1)
away_win = 1/(1/df_away_win.mean(axis=1) - alpha) < df_away_win.max(axis=1)
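One edge case is worth keeping in mind with this check: when the mean odds exceed $1/\alpha$ (20 for $\alpha = 0.05$), $p_{cons} - \alpha$ is negative, the threshold becomes negative, and the inequality holds trivially for any odds. The sketch below shows one possible guard. `value_bet_mask` is a hypothetical helper, not something from the paper, and treating such long shots as "no bet" is an assumption; the toy odds are made up.

```python
import pandas as pd

def value_bet_mask(odds: pd.DataFrame, alpha: float = 0.05) -> pd.Series:
    """Flag rows whose best odds beat 1 / (p_cons - alpha).

    Rows where p_cons - alpha <= 0 are never flagged (assumption: no bet
    when the adjusted probability is not positive).
    """
    p_adj = 1 / odds.mean(axis=1) - alpha
    # Keep only positive adjusted probabilities; the rest become NaN.
    threshold = 1 / p_adj.where(p_adj > 0)
    # Any comparison against NaN is False, so guarded rows are not flagged.
    return odds.max(axis=1) > threshold

# Hypothetical odds from three bookmakers for three matches.
toy = pd.DataFrame({
    "A": [2.00, 2.00, 21.0],
    "B": [2.05, 2.50, 26.0],
    "C": [1.95, 1.95, 23.0],
})

unguarded = toy.max(axis=1) > 1 / (1 / toy.mean(axis=1) - 0.05)
guarded = value_bet_mask(toy)
print(pd.DataFrame({"unguarded": unguarded, "guarded": guarded}))
```

In the toy data the third row is a long shot with mean odds above 20: the unguarded expression flags it, while the guarded version does not. Applied to the notebook's data, the call would look like `value_bet_mask(df_home_win)`.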

In [55]:
print('Percentage of games to bet on home_win: ', (home_win.sum() / home_win.shape[0]) * 100)
print('Percentage of games to bet on draw: ', (draw.sum() / draw.shape[0]) * 100)
print('Percentage of games to bet on away_win: ', (away_win.sum() / away_win.shape[0]) * 100)

Percentage of games to bet on home_win:  0.2631578947368421
Percentage of games to bet on draw:  0.0
Percentage of games to bet on away_win:  0.7894736842105263
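These percentages only say how often the rule fires; if the file holds all 380 fixtures of the season, they correspond to 1 home bet and 3 away bets. To see whether flagged bets would have paid, one could settle each of them against the full-time result column `FTR` (`'H'`, `'D'` or `'A'`). The following is a minimal sketch assuming a flat one-unit stake placed at the best available odds. `flat_stake_profit` is a hypothetical helper and the toy data is made up.

```python
import pandas as pd

def flat_stake_profit(best_odds: pd.Series, signal: pd.Series,
                      results: pd.Series, outcome: str) -> float:
    """Total profit from staking one unit on every flagged match.

    best_odds: best available decimal odds for `outcome`, per match
    signal:    boolean Series, True where the rule says to bet
    results:   full-time results ('H', 'D' or 'A'), per match
    """
    won = results[signal] == outcome
    # A winning bet earns (odds - 1) units; a losing bet loses the stake.
    profit = (best_odds[signal] - 1).where(won, -1.0)
    return float(profit.sum())

# Hypothetical data for four matches, betting on the home win.
toy_odds = pd.Series([2.5, 3.0, 4.0, 1.8])
toy_signal = pd.Series([True, False, True, True])
toy_results = pd.Series(["H", "D", "A", "H"])

total = flat_stake_profit(toy_odds, toy_signal, toy_results, "H")
print(f"Profit over {toy_signal.sum()} bets: {total:.2f} units")
```

For the notebook's data, the home-win bets could be settled with `flat_stake_profit(df_home_win.max(axis=1), home_win, df['FTR'], 'H')`, and similarly for draws and away wins. With only a handful of qualifying bets in one season, any such figure would be too noisy to draw conclusions from.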
