## **Expected Points**

xPts | AC Milan 4-2 Udinese, Serie A 2022/23

Expected Points = (3×𝑃win) + (1×𝑃draw) + (0×𝑃loss)
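As a quick illustration of the formula, the sketch below wraps it in a small helper. The function name `xpts_from_probs` and the sample probabilities are illustrative only; any three outcome probabilities would do.

```python
def xpts_from_probs(p_win: float, p_draw: float, p_loss: float) -> float:
    """Expected points: 3 for a win, 1 for a draw, 0 for a loss."""
    return 3 * p_win + 1 * p_draw + 0 * p_loss

# illustrative probabilities, not taken from any match
example_xpts = xpts_from_probs(0.5, 0.3, 0.2)
print(f"Expected Points: {example_xpts:.2f}")
```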

In [2]:
#imports
import pandas as pd
import numpy as np
from scipy.stats import poisson

##### **Data Retrieval**

- https://understat.readthedocs.io/en/latest/classes/understat.html#the-functions

Understat.get_match_shots(match_id, options=None, **kwargs) <br/>
_Returns a dictionary containing information about shots taken by the players in the given match._

Parameters:	
- match_id (int) – A match’s ID.
- options (dict, optional) – Options to filter the data by, defaults to None.

Returns: Dictionary containing information about the shots taken by the players who played in the match. | Return type: dict

In [2]:
import asyncio
import json
import aiohttp
from understat import Understat

- Inter 1-2 AC Milan, 22/09/2024 | https://understat.com/match/27406

In [None]:
async with aiohttp.ClientSession() as session:
    understat = Understat(session)
    shots = await understat.get_match_shots(27406)
    INTACM_shots = pd.DataFrame(shots)
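`pd.DataFrame(shots)` gives one row per shot only if the returned object is already table-shaped. The sketch below assumes, without confirmation from the text above, that the dictionary is keyed by side (for example `'h'` and `'a'`), each key holding a list of per-shot records. Under that assumption the lists can be stacked into a single frame. The helper name `shots_to_frame`, the `side` column and the sample records are all hypothetical.

```python
import pandas as pd

def shots_to_frame(shots: dict) -> pd.DataFrame:
    """Stack a {side: [shot records]} dictionary into one row-per-shot DataFrame."""
    frames = []
    for side, records in shots.items():
        frame = pd.DataFrame(records)
        frame["side"] = side  # remember which key the shot came from
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)

# made-up payload in the assumed shape, for illustration only
sample = {
    "h": [{"player": "Player A", "xG": "0.10"}],
    "a": [{"player": "Player B", "xG": "0.35"},
          {"player": "Player C", "xG": "0.05"}],
}
sample_df = shots_to_frame(sample)
print(sample_df)
```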

In [None]:
# save as csv
INTACM_shots.to_csv('../data/INTACM_shots.csv', index=False)

Alternatively, use data that has already been collected.

##### **Data Preparation**

In [6]:
df_ACMUDI_shots = pd.read_csv('../data/ACMUDI_22-23.csv')

In [7]:
print(df_ACMUDI_shots.shape)
print('')
print(df_ACMUDI_shots.columns)
print('')
print(df_ACMUDI_shots.dtypes)

(23, 14)

Index(['team_id', 'player_id', 'player_name', 'min', 'expected_goals',
       'event_type', 'team_color', 'match_id', 'is_own_goal', 'x', 'y',
       'shot_type', 'situation', 'team_name'],
      dtype='object')

team_id             int64
player_id           int64
player_name        object
min                 int64
expected_goals    float64
event_type         object
team_color         object
match_id            int64
is_own_goal          bool
x                 float64
y                 float64
shot_type          object
situation          object
team_name          object
dtype: object


In [8]:
df_ACMUDI_shots.head()

| | team_id | player_id | player_name | min | expected_goals | event_type | team_color | match_id | is_own_goal | x | y | shot_type | situation | team_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8600 | 844504 | Rodrigo Becao | 2 | 0.086003 | Goal | #907850 | 3919071 | False | 102.344825 | 39.545001 | Header | FromCorner | Udinese |
| 1 | 8564 | 750027 | Brahim Diaz | 7 | 0.05414 | AttemptSaved | #302028 | 3919071 | False | 95.350879 | 19.304968 | RightFoot | RegularPlay | Milan |
| 2 | 8564 | 724371 | Theo Hernández | 11 | 0.7884 | Goal | #302028 | 3919071 | False | 94.0 | 34.0 | LeftFoot | Penalty | Milan |
| 3 | 8564 | 265725 | Ante Rebic | 15 | 0.075055 | Goal | #302028 | 3919071 | False | 94.0 | 32.017501 | RightFoot | RegularPlay | Milan |
| 4 | 8564 | 848844 | Rafael Leao | 18 | 0.033549 | Miss | #302028 | 3919071 | False | 78.768443 | 43.300128 | LeftFoot | RegularPlay | Milan |


##### **Data Modeling**

In [11]:
# aggregate xG values for each team

# xG values for each shot taken by Team A (AC Milan) and Team B (Udinese)
ACM_df = df_ACMUDI_shots[df_ACMUDI_shots['team_name'] == 'Milan']
UDI_df = df_ACMUDI_shots[df_ACMUDI_shots['team_name'] == 'Udinese']

ACM_shots_xG = ACM_df['expected_goals']
UDI_shots_xG = UDI_df['expected_goals']

ACM_xG = sum(ACM_shots_xG)
UDI_xG = sum(UDI_shots_xG)

print(f"AC Milan Total xG: {ACM_xG}") 
print(f"Udinese Total xG: {UDI_xG}")

AC Milan Total xG: 2.325864648485184
Udinese Total xG: 0.4874548017978663
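The same per-team totals can be obtained in one step with a group-by. The sketch below uses a small made-up frame with the same `team_name` and `expected_goals` columns, so the numbers are illustrative and are not the match data.

```python
import pandas as pd

# made-up shots using the same column names as the match data
toy_shots = pd.DataFrame({
    "team_name": ["Milan", "Udinese", "Milan", "Milan"],
    "expected_goals": [0.25, 0.5, 0.75, 0.125],
})

# total xG per team in a single pass
xg_by_team = toy_shots.groupby("team_name")["expected_goals"].sum()
print(xg_by_team)
```

Applied to `df_ACMUDI_shots`, the same call should give the two totals printed above, indexed by team name.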


In [12]:
# model goal scoring probabilities using poisson distribution

# maximum number of goals to consider
max_goals = 5

# calculate goal probabilities for Team A and Team B
ACM_goal_probs = [poisson.pmf(i, ACM_xG) for i in range(max_goals + 1)]
UDI_goal_probs = [poisson.pmf(i, UDI_xG) for i in range(max_goals + 1)]
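Capping the scoreline at `max_goals` discards the probability of a team scoring more than that. One way to see how much of each distribution the cap keeps is the Poisson CDF evaluated at `max_goals`. The sketch below uses the two xG totals printed earlier, rounded to four decimals; the lowercase variable names are illustrative.

```python
from scipy.stats import poisson

max_goals = 5
acm_xg, udi_xg = 2.3259, 0.4875  # xG totals from the earlier output, rounded

# share of each team's goal distribution that lies in 0..max_goals
acm_coverage = poisson.cdf(max_goals, acm_xg)
udi_coverage = poisson.cdf(max_goals, udi_xg)

print(f"AC Milan coverage: {acm_coverage:.4f}")
print(f"Udinese coverage: {udi_coverage:.4f}")
```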

In [13]:
# determine match outcome probabilities

# probability matrix for all possible scorelines 
match_probs = np.outer(ACM_goal_probs, UDI_goal_probs)
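In this matrix, entry `[i, j]` is the probability that AC Milan score `i` goals and Udinese score `j`, because the AC Milan probabilities form the rows of the outer product. As an illustration, the sketch below rebuilds the matrix from the rounded xG totals and reads off the probability the model assigns to the actual 4-2 scoreline. The lowercase names are illustrative.

```python
import numpy as np
from scipy.stats import poisson

max_goals = 5
acm_xg, udi_xg = 2.3259, 0.4875  # rounded xG totals from the earlier output

goals = np.arange(max_goals + 1)
acm_probs = poisson.pmf(goals, acm_xg)
udi_probs = poisson.pmf(goals, udi_xg)

# rows: AC Milan goals, columns: Udinese goals
score_matrix = np.outer(acm_probs, udi_probs)

# probability of AC Milan 4, Udinese 2 under the model
p_4_2 = score_matrix[4, 2]
print(f"P(4-2) = {p_4_2:.4f}")
```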

In [14]:
# calculate the probabilities of Team A winning, drawing, and losing

# probability of Team A winning (Team A scores more than Team B);
# rows are Team A's goals, so wins lie strictly below the diagonal
P_win = np.sum(np.tril(match_probs, -1)) 

# probability of a draw (both teams score the same number of goals) 
P_draw = np.sum(np.diag(match_probs)) 

# probability of Team A losing (Team A scores fewer goals than Team B) 
P_loss = np.sum(np.triu(match_probs, 1)) 

print(f"Probability of Win: {P_win:.4f}") 
print(f"Probability of Draw: {P_draw:.4f}") 
print(f"Probability of Loss: {P_loss:.4f}")

Probability of Win: 0.7574
Probability of Draw: 0.1499
Probability of Loss: 0.0612
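The three probabilities printed above add up to about 0.9685 rather than 1, because scorelines with more than `max_goals` goals for either team are left out. A minimal sketch of one way to check the effect is to rerun the same calculation with a higher cap. The helper `outcome_probs` and the cap of 15 are illustrative choices, and the xG totals are the rounded values from earlier.

```python
import numpy as np
from scipy.stats import poisson

def outcome_probs(xg_a: float, xg_b: float, max_goals: int = 15):
    """Win/draw/loss probabilities for Team A from independent Poisson goal counts."""
    goals = np.arange(max_goals + 1)
    matrix = np.outer(poisson.pmf(goals, xg_a), poisson.pmf(goals, xg_b))
    p_win = np.tril(matrix, -1).sum()   # Team A goals > Team B goals
    p_draw = np.trace(matrix)           # equal goals
    p_loss = np.triu(matrix, 1).sum()   # Team A goals < Team B goals
    return p_win, p_draw, p_loss

for cap in (5, 15):
    p_win, p_draw, p_loss = outcome_probs(2.3259, 0.4875, max_goals=cap)
    print(f"max_goals={cap}: win={p_win:.4f} draw={p_draw:.4f} "
          f"loss={p_loss:.4f} total={p_win + p_draw + p_loss:.4f}")
```

Nearly all of the missing mass belongs to scorelines in which AC Milan score six or more, so the win probability is the figure that should move most when the cap is raised.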


In [15]:
# calculate expected points

expected_points = (3 * P_win) + (1 * P_draw) 
print(f"Expected Points: {expected_points:.2f}")

Expected Points: 2.42
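The same probabilities also give the opponent's expected points, since an AC Milan loss is a Udinese win. A short sketch using the rounded probabilities printed earlier (the variable names are illustrative):

```python
# outcome probabilities from AC Milan's point of view, as printed earlier
p_win, p_draw, p_loss = 0.7574, 0.1499, 0.0612

acm_xpts = 3 * p_win + 1 * p_draw
udi_xpts = 3 * p_loss + 1 * p_draw  # Udinese win when AC Milan lose

print(f"AC Milan xPts: {acm_xpts:.2f}")
print(f"Udinese xPts: {udi_xpts:.2f}")
```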
