# World Cup 2022 Value Bets Checker With OddsAPI Data

Welcome! I am very excited about this year's World Cup and will be watching a lot of the matches. I thought it would be fun to place some small bets while following along; however, if possible, I would like to be a bit smart about it.

Spoiler alert: I find that following a positive expected value betting system, with risk management according to the Kelly criterion, would be too cumbersome relative to the money made. I will still share my thoughts here, since feedback may lead to adjustments that open up opportunities.

For now, however, it seems like it will be mindless gambling for me again at the 2022 World Cup.

<h3>Setting the scene</h3>

My idea is to pull data from around 10 European bookmakers and compare the odds from my bookmaker (i.e. Unibet) with the average odds of the remaining bookmakers. Under the assumption that the average odds reflect the true odds, and thereby the true probability, one has found a "value bet" whenever Unibet's odds are meaningfully higher than that average.

<h3>Mathematics</h3>

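For a given game, the odds $O_i$ set by bookmaker $i \in \{1, 2, \dots, n\}$ and the implied probability $p_i$ of the outcome can roughly be described by the relation $p_i = \frac{1}{O_i}$. The relation is only rough because real bookmaker odds include a margin.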
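
To see how rough the relation is, one can add up the inverse odds of a single bookmaker over the three outcomes: the sum typically ends up slightly above 1, and the excess is the bookmaker's margin. Below is a minimal sketch using the Pinnacle odds for England vs. Iran from the data set loaded further down. The helper name `implied_probabilities` is made up for this illustration, and rescaling the inverse odds so they sum to 1 is just one simple convention, not something the rest of this post relies on.

```python
def implied_probabilities(odds):
    """Return the raw inverse odds, their sum, and the inverse odds rescaled to sum to 1."""
    raw = [1 / o for o in odds]
    book = sum(raw)
    normalized = [r / book for r in raw]
    return raw, book, normalized


# Pinnacle odds for England_Iran: home win, away win, draw
pinnacle = [1.34, 13.12, 4.90]
raw, book, normalized = implied_probabilities(pinnacle)

print([round(r, 3) for r in raw])         # raw inverse odds
print(round(book, 3))                     # about 1.027, i.e. a margin of roughly 2.7%
print([round(p, 3) for p in normalized])  # rescaled so they sum to 1
```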

Let $O_u$ be the odds from Unibet and let $O_i$ now denote the average odds of the $n-1$ remaining bookmakers, with $p_i = \frac{1}{O_i}$ the corresponding implied probability. We have a value bet if the inequality $O_u \cdot p_i > 1$ holds, with the corresponding expected value

$$
E(V) = p_i(O_u-1)w - (1-p_i)w
$$

where $w$ denotes the amount staked. Very nice, we have now constructed a positive expected value betting strategy! The only problem is that in practice we will of course lose some of our bets, since the strategy is only expected to make money in the long run. We will also need a system for managing our bankroll, so that we know the optimal amount to bet given the EV.
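
As a small worked example of the value-bet check, here is a sketch of the expected value of a single bet that pays at Unibet's odds, with the win probability taken from the average odds. The odds are hypothetical and only chosen for illustration, and `expected_value` is a name made up here.

```python
def expected_value(p, odds, stake=1.0):
    """Expected profit of a bet that is won with probability p and pays at decimal odds."""
    return p * (odds - 1) * stake - (1 - p) * stake


# Hypothetical example: the other bookmakers average 2.00, Unibet offers 2.10
average_odds = 2.00
unibet_odds = 2.10
p = 1 / average_odds  # probability implied by the average odds

ev = expected_value(p, unibet_odds, stake=10)
is_value_bet = unibet_odds * p > 1

print(round(ev, 2), is_value_bet)  # 0.5 True
```

The expected value is positive exactly when the inequality holds: the formula simplifies to $w(O_u \cdot p_i - 1)$.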

Luckily John Larry Kelly, the chain-smoker who died at 41, knew all about managing risk: in 1956 he created the formula that would later be called the "Kelly criterion". The reader is encouraged to read more about this on their own; I would especially recommend looking at the proof. For now, let us just note that the optimal fraction $f$ of our bankroll to bet satisfies

$$
f = \frac{(O_u-1)p_i -(1-p_i)}{O_u-1}
$$
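
Here is a small sketch of the Kelly fraction, assuming the win probability `p` comes from the average odds and the bet pays at Unibet's decimal odds. `kelly_fraction` is a name made up for this illustration and the odds are hypothetical. Note that when a bet is priced exactly at the odds the probability was derived from, the fraction is zero.

```python
def kelly_fraction(p, odds):
    """Kelly fraction for a bet that is won with probability p and pays at decimal odds."""
    b = odds - 1  # net profit per unit staked
    return (b * p - (1 - p)) / b


p = 1 / 2.00  # probability implied by hypothetical average odds of 2.00

f_value = kelly_fraction(p, 2.10)  # Unibet offers more than the average
f_fair = kelly_fraction(p, 2.00)   # bet priced exactly at the average odds

print(round(f_value, 4))  # 0.0455
print(f_fair)             # 0.0
```

A negative fraction means the bet has negative expected value and should be skipped.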

Let's start coding it up!

In [17]:
# Importing our most important library for this

import pandas as pd

# Loading in dataframe

df = pd.read_csv("worldcup2022.csv")

# Taking a look at our dataframe

df

| | event_name | commence | status | bookmaker | last_update | odd_1 | odd_2 | odd_draw |
|---|---|---|---|---|---|---|---|---|
| 0 | England_Iran | 21/11/2022 | Pending | Marathon Bet | 20/11/2022 | 1.32 | 12.25 | 4.95 |
| 1 | England_Iran | 21/11/2022 | Pending | Pinnacle | 20/11/2022 | 1.34 | 13.12 | 4.90 |
| 2 | England_Iran | 21/11/2022 | Pending | Betclic | 20/11/2022 | 1.32 | 11.50 | 4.85 |
| 3 | England_Iran | 21/11/2022 | Pending | 888sport | 20/11/2022 | 1.33 | 12.00 | 4.90 |
| 4 | England_Iran | 21/11/2022 | Pending | William Hill | 20/11/2022 | 1.35 | 12.00 | 4.20 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 450 | Serbia_Switzerland | 02/12/2022 | Pending | William Hill | 20/11/2022 | 2.70 | 2.62 | 3.20 |
| 451 | Serbia_Switzerland | 02/12/2022 | Pending | Pinnacle | 20/11/2022 | 2.63 | 2.69 | 3.30 |
| 452 | Serbia_Switzerland | 02/12/2022 | Pending | Marathon Bet | 20/11/2022 | 2.66 | 2.72 | 3.35 |
| 453 | Serbia_Switzerland | 02/12/2022 | Pending | 888sport | 20/11/2022 | 2.85 | 2.65 | 3.30 |


In [18]:
# Clearing unnecessary columns

df = df.drop(["commence", "status", "last_update"], axis=1)

# Viewing updated dataframe

df

| | event_name | bookmaker | odd_1 | odd_2 | odd_draw |
|---|---|---|---|---|---|
| 0 | England_Iran | Marathon Bet | 1.32 | 12.25 | 4.95 |
| 1 | England_Iran | Pinnacle | 1.34 | 13.12 | 4.90 |
| 2 | England_Iran | Betclic | 1.32 | 11.50 | 4.85 |
| 3 | England_Iran | 888sport | 1.33 | 12.00 | 4.90 |
| 4 | England_Iran | William Hill | 1.35 | 12.00 | 4.20 |
| ... | ... | ... | ... | ... | ... |
| 450 | Serbia_Switzerland | William Hill | 2.70 | 2.62 | 3.20 |
| 451 | Serbia_Switzerland | Pinnacle | 2.63 | 2.69 | 3.30 |
| 452 | Serbia_Switzerland | Marathon Bet | 2.66 | 2.72 | 3.35 |
| 453 | Serbia_Switzerland | 888sport | 2.85 | 2.65 | 3.30 |


In [19]:
# Getting the unique games, in order of first appearance

games = df["event_name"].unique().tolist()

# Getting the average odds as the "fair odds"
# (note: the mean below is taken over all rows for a game, so Unibet's own odds are included)

odds_1 = []
odds_x = []
odds_2 = []

for i in games:
    odds_1.append(df.loc[df['event_name'] == i, 'odd_1'].mean())
    odds_x.append(df.loc[df['event_name'] == i, 'odd_draw'].mean())
    odds_2.append(df.loc[df['event_name'] == i, 'odd_2'].mean())

# Getting the Unibet odds for each outcome as lists
# (assumes one Unibet row per game, in the same order as `games`)

unibet_odds_1 = df.loc[df['bookmaker'] == "Unibet", 'odd_1'].values.tolist()
unibet_odds_x = df.loc[df['bookmaker'] == "Unibet", 'odd_draw'].values.tolist()
unibet_odds_2 = df.loc[df['bookmaker'] == "Unibet", 'odd_2'].values.tolist()

# Defining our implied probabilities from the average odds

imp_1 = [1/o for o in odds_1]
imp_x = [1/o for o in odds_x]
imp_2 = [1/o for o in odds_2]

# Difference between the Unibet odds and the average odds
# (a positive difference means Unibet pays more than the average)

odds_1_diff = [u - o for u, o in zip(unibet_odds_1, odds_1)]
odds_x_diff = [u - o for u, o in zip(unibet_odds_x, odds_x)]
odds_2_diff = [u - o for u, o in zip(unibet_odds_2, odds_2)]

# Calculating the expected value per unit staked for each outcome:
# EV = p*(O_u - 1) - (1 - p), with p the implied probability from the average odds
# and O_u the Unibet odds

ev_1 = [p*(u-1) - (1-p) for p, u in zip(imp_1, unibet_odds_1)]
ev_2 = [p*(u-1) - (1-p) for p, u in zip(imp_2, unibet_odds_2)]
ev_x = [p*(u-1) - (1-p) for p, u in zip(imp_x, unibet_odds_x)]

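# Finding the Kelly fraction for outcome 1
# Note: the payout term below uses the average odds (odds_1), not the Unibet odds.
# Since imp_1 = 1/odds_1, the expression is then identically zero, which is why
# the "Kelly 1" column only shows 0.0 / -0.0. Using unibet_odds_1[i] - 1 in the
# denominator would give the Kelly fraction for a bet placed at Unibet's odds.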

kelly_1 = list()
for i in range(len(unibet_odds_1)):
    item = imp_1[i]/1 - (1-imp_1[i])/(odds_1[i]-1)
    kelly_1.append(item)

# Forgetting old dataframe as we are creating a new one

del df

# Gathering our new data in a dataframe through a dictionary
# (named `data` to avoid shadowing the built-in `dict`)

data = {
    "Match" : games,
    "Odds 1" : odds_1,
    "Odds x" : odds_x,
    "Odds 2" : odds_2,
    "EV 1" : ev_1,
    "EV x" : ev_x,
    "EV 2" : ev_2,
    "Kelly 1" : kelly_1
}

df = pd.DataFrame(data).round(3)

# Saving our results as a CSV

df.to_csv("WorldcupValueBets.csv")

# Displaying our dataframe 

df

| | Match | Odds 1 | Odds x | Odds 2 | EV 1 | EV x | EV 2 | Kelly 1 |
|---|---|---|---|---|---|---|---|---|
| 0 | England_Iran | 1.335 | 4.874 | 12.437 | -0.004 | 0.005 | -0.035 | 0.0 |
| 1 | Netherlands_Senegal | 1.569 | 3.989 | 6.827 | -0.006 | -0.01 | -0.011 | -0.0 |
| 2 | USA_Wales | 2.39 | 3.097 | 3.427 | 0.004 | -0.015 | -0.008 | -0.0 |
| 3 | Argentina_Saudi Arabia | 1.164 | 7.774 | 21.4 | -0.012 | 0.029 | 0.075 | 0.0 |
| 4 | Denmark_Tunisia | 1.487 | 4.203 | 7.999 | 0.002 | -0.024 | 0.0 | -0.0 |
| 5 | Mexico_Poland | 2.833 | 3.06 | 2.87 | -0.012 | -0.02 | 0.01 | -0.0 |
| 6 | Australia_France | 11.503 | 5.683 | 1.295 | 0.043 | -0.05 | 0.004 | 0.0 |
| 7 | Croatia_Morocco | 2.089 | 3.217 | 4.084 | -0.009 | -0.005 | 0.004 | 0.0 |
| 8 | Germany_Japan | 1.508 | 4.572 | 6.806 | -0.005 | -0.038 | -0.008 | 0.0 |
| 9 | Costa Rica_Spain | 21.426 | 7.548 | 1.17 | -0.02 | -0.006 | 0.0 | 0.0 |

# Conclusion

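Unfortunately, there are no meaningful value bets following from this approach. We do find plenty of bets with a positive expected value; however, when we apply risk management, the Kelly column suggests that we should not be taking any of these bets at all. Note that this column is zero by construction, since the code evaluates the fraction at the same average odds the probability was derived from; evaluated at Unibet's odds, the fraction equals the EV per unit staked divided by $O_u - 1$, which stays small for EVs of this size. Improvements to this model could of course include all relevant parameters that are indicative of a team's relative performance, thereby implying the probability and thus the "fair" odds of that event.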