# Pythagorean Expectation and the Indian Premier League

The Indian Premier League (IPL) is the biggest cricket competition in the world, bringing together many of the world's best players for a tournament of roughly eight weeks. In the 2023 season analysed here, ten teams played 70 league matches (14 per team), after which the top four teams contested the playoffs (Qualifier 1, Eliminator and Qualifier 2) and then a final.

Cricket, like baseball, is a bat-and-ball game in which teams score runs and the team that scores more runs wins. The Pythagorean expectation, originally devised for baseball, can therefore be applied to cricket in the same way:

Pythagorean Expectation of a cricket team = (Total Runs Scored)**2 / ((Total Runs Scored)**2 + (Total Runs Conceded)**2)
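As a quick illustration of the formula, here is a minimal sketch in plain Python. The function name `pythagorean_expectation` and the `exponent` parameter are hypothetical names introduced for this example, and the run totals are made up.

```python
def pythagorean_expectation(runs_scored, runs_conceded, exponent=2.0):
    """Return runs_scored**k / (runs_scored**k + runs_conceded**k).

    `exponent` (k) defaults to 2, as in the formula above. Exposing it
    as a parameter is an illustrative choice, not part of the notebook.
    """
    scored = runs_scored ** exponent
    conceded = runs_conceded ** exponent
    if scored + conceded == 0:
        raise ValueError("At least one of the run totals must be non-zero.")
    return scored / (scored + conceded)


# Hypothetical totals: equal runs scored and conceded give exactly 0.5
print(pythagorean_expectation(2400, 2400))  # 0.5

# Hypothetical totals: scoring twice as many runs as conceded gives 0.8
print(pythagorean_expectation(200, 100))  # 0.8
```

A team that scores exactly as many runs as it concedes has an expectation of 0.5, and the value moves towards 1 as runs scored grow relative to runs conceded.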

In [76]:
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
import seaborn as sns

In [77]:
IPL = pd.read_csv('Cricket_data.csv')
print(IPL.columns.tolist())

['season', 'id', 'name', 'short_name', 'description', 'home_team', 'away_team', 'toss_won', 'decision', '1st_inning_score', '2nd_inning_score', 'winner', 'result', 'start_date', 'end_date', 'venue_id', 'venue_name', 'home_captain', 'away_captain', 'pom', 'points', 'super_over', 'home_overs', 'home_runs', 'home_wickets', 'home_boundaries', 'away_overs', 'away_runs', 'away_wickets', 'away_boundaries', 'highlights', 'home_key_batsman', 'home_key_bowler', 'home_playx1', 'away_playx1', 'away_key_batsman', 'away_key_bowler', 'match_days', 'umpire1', 'umpire2', 'tv_umpire', 'referee', 'reserve_umpire']


In [78]:
IPL.head()

Unnamed: 0,season,id,name,short_name,description,home_team,away_team,toss_won,decision,1st_inning_score,...,home_playx1,away_playx1,away_key_batsman,away_key_bowler,match_days,umpire1,umpire2,tv_umpire,referee,reserve_umpire
0,,1370350,Chennai Super Kings v Gujarat Titans,CSK v GT,"Qualifier 1 (N), Indian Premier League at Chen...",CSK,GT,,,,...,,,,,,,,,,
1,,1370351,Lucknow Super Giants v Mumbai Indians,LSG v MI,"Eliminator (N), Indian Premier League at Chenn...",LSG,MI,,,,...,,,,,,,,,,
2,,1370352,TBC v TBC,TBC v TBC,"Qualifier 2 (N), Indian Premier League at Ahme...",TBA,TBA,,,,...,,,,,,,,,,
3,,1370353,TBC v TBC,TBC v TBC,"Final (N), Indian Premier League at Ahmedabad,...",TBA,TBA,,,,...,,,,,,,,,,
4,2023.0,1359544,Royal Challengers Bangalore v Gujarat Titans,RCB v GT,"70th Match (N), Indian Premier League at Benga...",RCB,GT,GT,BOWL FIRST,197/5,...,"Virat Kohli (UKN),Faf du Plessis (UKN),Glenn M...","Wriddhiman Saha (WK),Shubman Gill (UKN),Vijay ...","Shubman Gill,Vijay Shankar","Noor Ahmad,Rashid Khan",21 May 2023 - night match (20-over match),Nitin Menon,Virender Sharma,Tapan Sharma,Javagal Srinath,VM Dhokre


In [79]:
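# Keep only the matches from the 2023 season; rows with a missing season
# (such as the playoff fixtures that have not been played yet) are dropped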
IPL2023 = IPL[IPL['season'] == 2023.0]

In [80]:
IPL2023

Unnamed: 0,season,id,name,short_name,description,home_team,away_team,toss_won,decision,1st_inning_score,...,home_playx1,away_playx1,away_key_batsman,away_key_bowler,match_days,umpire1,umpire2,tv_umpire,referee,reserve_umpire
4,2023.0,1359544,Royal Challengers Bangalore v Gujarat Titans,RCB v GT,"70th Match (N), Indian Premier League at Benga...",RCB,GT,GT,BOWL FIRST,197/5,...,"Virat Kohli (UKN),Faf du Plessis (UKN),Glenn M...","Wriddhiman Saha (WK),Shubman Gill (UKN),Vijay ...","Shubman Gill,Vijay Shankar","Noor Ahmad,Rashid Khan",21 May 2023 - night match (20-over match),Nitin Menon,Virender Sharma,Tapan Sharma,Javagal Srinath,VM Dhokre
5,2023.0,1359543,Mumbai Indians v Sunrisers Hyderabad,MI v SRH,"69th Match (D/N), Indian Premier League at Mum...",MI,SRH,MI,BOWL FIRST,200/5,...,"Ishan Kishan (WK),Rohit Sharma (UKN),Cameron G...","Vivrant Sharma (AR),Mayank Agarwal (UKN),Heinr...","Mayank Agarwal,Vivrant Sharma","Bhuvneshwar Kumar,Mayank Dagar",21 May 2023 - day/night match (20-over match),KN Ananthapadmanabhan,Rod Tucker,Rohan Pandit,Pankaj Dharmani,Parashar Joshi
6,2023.0,1359542,Kolkata Knight Riders v Lucknow Super Giants,KKR v LSG,"68th Match (N), Indian Premier League at Kolka...",KKR,LSG,KKR,BOWL FIRST,176/8,...,"Jason Roy (UKN),Venkatesh Iyer (AR),Nitish Ran...","Karan Sharma (AR),Quinton de Kock (WK),Prerak ...","Nicholas Pooran,Quinton de Kock","Ravi Bishnoi,Yash Thakur",20 May 2023 - night match (20-over match),Ulhas Gandhe,Jayaraman Madanagopal,Yeshwant Barde,Manu Nayyar,Mohamed Rafi
7,2023.0,1359541,Delhi Capitals v Chennai Super Kings,DC v CSK,"67th Match (D/N), Indian Premier League at Del...",DC,CSK,CSK,BAT FIRST,223/3,...,"Prithvi Shaw (UKN),David Warner (UKN),Phil Sal...","Ruturaj Gaikwad (UKN),Devon Conway (UKN),Shiva...","Devon Conway,Ruturaj Gaikwad","Deepak Chahar,Matheesha Pathirana",20 May 2023 - day/night match (20-over match),Chris Gaffaney,Nikhil Patwardhan,Anil Chaudhary,Sanjay Verma,Mohit Krishnadas
8,2023.0,1359540,Punjab Kings v Rajasthan Royals,PBKS v RR,"66th Match (N), Indian Premier League at Dhara...",PBKS,RR,RR,BOWL FIRST,187/5,...,"Prabhsimran Singh (UKN),Shikhar Dhawan (UKN),A...","Yashasvi Jaiswal (UKN),Jos Buttler (UKN),Devdu...","Devdutt Padikkal,Yashasvi Jaiswal","Navdeep Saini,Adam Zampa",19 May 2023 - night match (20-over match),Nand Kishore,Rod Tucker,Navdeep Singh,Pankaj Dharmani,Parashar Joshi
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,2023.0,1359479,Royal Challengers Bangalore v Mumbai Indians,RCB v MI,"5th Match (N), Indian Premier League at Bengal...",RCB,MI,RCB,BOWL FIRST,171/7,...,"Virat Kohli (UKN),Faf du Plessis (UKN),Dinesh ...","Rohit Sharma (UKN),Ishan Kishan (WK),Cameron G...","Tilak Varma,Nehal Wadhera","Arshad Khan,Cameron Green",02 April 2023 - night match (20-over match),Nitin Menon,Tapan Sharma,Virender Sharma,Javagal Srinath,Abhijit Bengeri
70,2023.0,1359478,Sunrisers Hyderabad v Rajasthan Royals,SRH v RR,"4th Match (D/N), Indian Premier League at Hyde...",SRH,RR,SRH,BOWL FIRST,203/5,...,"Abhishek Sharma (AR),Mayank Agarwal (UKN),Rahu...","Yashasvi Jaiswal (UKN),Jos Buttler (UKN),Sanju...","Sanju Samson,Yashasvi Jaiswal","Yuzvendra Chahal,Trent Boult",02 April 2023 - day/night match (20-over match),KN Ananthapadmanabhan,Rohan Pandit,Navdeep Singh,Narayanan Kutty,Abhijit Bhattacharya
71,2023.0,1359477,Lucknow Super Giants v Delhi Capitals,LSG v DC,"3rd Match (N), Indian Premier League at Luckno...",LSG,DC,DC,BOWL FIRST,193/6,...,"KL Rahul (UKN),Kyle Mayers (AR),Deepak Hooda (...","Prithvi Shaw (UKN),David Warner (UKN),Mitchell...","David Warner,Rilee Rossouw","Khaleel Ahmed,Chetan Sakariya",01 April 2023 - night match (20-over match),Anil Chaudhary,Nikhil Patwardhan,Sadashiv Iyer,Daniel Manohar,Madanagopal Kuppuraj
72,2023.0,1359476,Punjab Kings v Kolkata Knight Riders,PBKS v KKR,"2nd Match (D/N), Indian Premier League at Chan...",PBKS,KKR,KKR,BOWL FIRST,191/5,...,"Prabhsimran Singh (UKN),Shikhar Dhawan (UKN),B...","Mandeep Singh (AR),Rahmanullah Gurbaz (WK),Anu...","Andre Russell,Venkatesh Iyer","Tim Southee,Varun Chakravarthy",01 April 2023 - day/night match (20-over match),Yeshwant Barde,Bruce Oxenford,Jayaraman Madanagopal,Manu Nayyar,Pranav Joshi


In [81]:
IPL2023.columns

Index(['season', 'id', 'name', 'short_name', 'description', 'home_team',
       'away_team', 'toss_won', 'decision', '1st_inning_score',
       '2nd_inning_score', 'winner', 'result', 'start_date', 'end_date',
       'venue_id', 'venue_name', 'home_captain', 'away_captain', 'pom',
       'points', 'super_over', 'home_overs', 'home_runs', 'home_wickets',
       'home_boundaries', 'away_overs', 'away_runs', 'away_wickets',
       'away_boundaries', 'highlights', 'home_key_batsman', 'home_key_bowler',
       'home_playx1', 'away_playx1', 'away_key_batsman', 'away_key_bowler',
       'match_days', 'umpire1', 'umpire2', 'tv_umpire', 'referee',
       'reserve_umpire'],
      dtype='object')

In [82]:
IPL2023.dtypes

season              float64
id                    int64
name                 object
short_name           object
description          object
home_team            object
away_team            object
toss_won             object
decision             object
1st_inning_score     object
2nd_inning_score     object
winner               object
result               object
start_date           object
end_date             object
venue_id              int64
venue_name           object
home_captain         object
away_captain         object
pom                  object
points               object
super_over           object
home_overs          float64
home_runs           float64
home_wickets        float64
home_boundaries     float64
away_overs          float64
away_runs           float64
away_wickets        float64
away_boundaries     float64
highlights           object
home_key_batsman     object
home_key_bowler      object
home_playx1          object
away_playx1          object
away_key_batsman    


In [85]:
# Aggregate matches played by home teams
home_matches = IPL2023.groupby('home_team').size().reset_index(name='matches_played_home')

# Aggregate matches played by away teams
away_matches = IPL2023.groupby('away_team').size().reset_index(name='matches_played_away')

# Merge the two DataFrames to get total matches played by each team
total_matches = pd.merge(home_matches, away_matches, how='outer', left_on='home_team', right_on='away_team')

# Fill missing values with 0 and sum matches played
total_matches['total_matches_played'] = total_matches['matches_played_home'].fillna(0) + total_matches['matches_played_away'].fillna(0)

# Drop unnecessary columns
total_matches.drop(['matches_played_home', 'matches_played_away', 'away_team'], axis=1, inplace=True)

# Rename the 'home_team' column to 'Team' for clarity
total_matches.rename(columns={'home_team': 'Team'}, inplace=True)

# Aggregate matches won by home teams
home_wins = IPL2023[IPL2023['winner'] == IPL2023['home_team']].groupby('home_team').size().reset_index(name='matches_won_home')

# Aggregate matches won by away teams
away_wins = IPL2023[IPL2023['winner'] == IPL2023['away_team']].groupby('away_team').size().reset_index(name='matches_won_away')

# Merge the two DataFrames to get total matches won by each team
total_wins = pd.merge(home_wins, away_wins, how='outer', left_on='home_team', right_on='away_team')

# Fill missing values with 0 and sum matches won
total_wins['total_matches_won'] = total_wins['matches_won_home'].fillna(0) + total_wins['matches_won_away'].fillna(0)

# Drop unnecessary columns
total_wins.drop(['matches_won_home', 'matches_won_away', 'away_team'], axis=1, inplace=True)

# Rename the 'home_team' column to 'Team' for clarity
total_wins.rename(columns={'home_team': 'Team'}, inplace=True)

# Calculate total runs scored by each team
home_runs = IPL2023.groupby('home_team')['home_runs'].sum().reset_index(name='total_runs_home')
away_runs = IPL2023.groupby('away_team')['away_runs'].sum().reset_index(name='total_runs_away')

# Merge the two DataFrames to get total runs scored by each team
total_runs = pd.merge(home_runs, away_runs, how='outer', left_on='home_team', right_on='away_team')

# Fill missing values with 0 and sum total runs
total_runs['total_runs_scored'] = total_runs['total_runs_home'].fillna(0) + total_runs['total_runs_away'].fillna(0)

# Drop unnecessary columns
total_runs.drop(['total_runs_home', 'total_runs_away', 'away_team'], axis=1, inplace=True)

# Rename the 'home_team' column to 'Team' for clarity
total_runs.rename(columns={'home_team': 'Team'}, inplace=True)

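# Calculate total runs allowed (conceded) by each team
# When a team plays away, the runs it allows are the home side's runs
home_runs_allowed = IPL2023.groupby('away_team')['home_runs'].sum().reset_index(name='total_runs_allowed_home')
# When a team plays at home, the runs it allows are the away side's runs
away_runs_allowed = IPL2023.groupby('home_team')['away_runs'].sum().reset_index(name='total_runs_allowed_away')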

# Merge the two DataFrames to get total runs allowed by each team
total_runs_allowed = pd.merge(home_runs_allowed, away_runs_allowed, how='outer', left_on='away_team', right_on='home_team')

# Fill missing values with 0 and sum total runs allowed
total_runs_allowed['total_runs_allowed'] = total_runs_allowed['total_runs_allowed_home'].fillna(0) + total_runs_allowed['total_runs_allowed_away'].fillna(0)

# Drop unnecessary columns
total_runs_allowed.drop(['total_runs_allowed_home', 'total_runs_allowed_away', 'home_team'], axis=1, inplace=True)

# Rename the 'away_team' column to 'Team' for clarity
total_runs_allowed.rename(columns={'away_team': 'Team'}, inplace=True)

# Merge all the calculated statistics into one DataFrame
team_stats = pd.merge(total_matches, total_wins, how='outer', on='Team')
team_stats = pd.merge(team_stats, total_runs, how='outer', on='Team')
team_stats = pd.merge(team_stats, total_runs_allowed, how='outer', on='Team')

# Fill missing values with 0
team_stats.fillna(0, inplace=True)

# Rename columns for clarity
team_stats.rename(columns={
    'total_matches_played': 'Number of Matches',
    'total_matches_won': 'Matches Won',
    'total_runs_scored': 'Total Runs Scored',
    'total_runs_allowed': 'Total Runs Allowed'
}, inplace=True)
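The aggregation above builds each statistic from separate home and away group-bys and then merges them. One alternative, sketched here under assumptions, is to stack the home and away views of every match into a single long table and aggregate once. The `matches` DataFrame is a made-up stand-in that reuses the notebook's column names, and `one_side`, `long_form` and `summary` are hypothetical names introduced for this example.

```python
import pandas as pd

# Toy stand-in for IPL2023 with the same column names (values are made up)
matches = pd.DataFrame({
    "home_team": ["AAA", "BBB", "CCC"],
    "away_team": ["BBB", "CCC", "AAA"],
    "winner":    ["AAA", "CCC", "AAA"],
    "home_runs": [180.0, 150.0, 160.0],
    "away_runs": [170.0, 155.0, 165.0],
})


def one_side(df, team_col, runs_for, runs_against):
    """Return one row per match, seen from a single team's point of view."""
    return pd.DataFrame({
        "Team": df[team_col],
        "won": (df["winner"] == df[team_col]).astype(int),
        "runs_scored": df[runs_for],
        "runs_allowed": df[runs_against],
    })


# Stack the home view and the away view: every match contributes two rows
long_form = pd.concat(
    [
        one_side(matches, "home_team", "home_runs", "away_runs"),
        one_side(matches, "away_team", "away_runs", "home_runs"),
    ],
    ignore_index=True,
)

# A single group-by now yields all four statistics per team
summary = (
    long_form.groupby("Team")
    .agg(
        matches_played=("won", "count"),
        matches_won=("won", "sum"),
        runs_scored=("runs_scored", "sum"),
        runs_allowed=("runs_allowed", "sum"),
    )
    .reset_index()
)
print(summary)
```

Because each team appears in the `Team` column whether it played at home or away, this layout needs no outer merges and no `fillna` step for teams that are missing from one side.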


In [87]:
team_stats

Unnamed: 0,Team,Number of Matches,Matches Won,Total Runs Scored,Total Runs Allowed
0,CSK,13,8,2369.0,2232.0
1,DC,14,5,2182.0,2424.0
2,GT,14,10,2450.0,2326.0
3,KKR,14,6,2463.0,2470.0
4,LSG,13,8,2253.0,2216.0
5,MI,14,8,2592.0,2620.0
6,PBKS,14,6,2556.0,2564.0
7,RCB,14,7,2502.0,2435.0
8,RR,14,7,2419.0,2389.0
9,SRH,14,4,2376.0,2486.0


In [88]:
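# Actual win percentage: matches won divided by matches played
team_stats['wp'] = team_stats['Matches Won'] / team_stats['Number of Matches']
# Pythagorean expectation from total runs scored and total runs allowed
team_stats['pyth'] = team_stats['Total Runs Scored']**2 / ((team_stats['Total Runs Scored']**2) + (team_stats['Total Runs Allowed']**2))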

In [93]:
team_stats

Unnamed: 0,Team,Number of Matches,Matches Won,Total Runs Scored,Total Runs Allowed,wp,pyth
0,CSK,13,8,2369.0,2232.0,0.615385,0.52975
1,DC,14,5,2182.0,2424.0,0.357143,0.447604
2,GT,14,10,2450.0,2326.0,0.714286,0.525946
3,KKR,14,6,2463.0,2470.0,0.428571,0.498581
4,LSG,13,8,2253.0,2216.0,0.615385,0.508279
5,MI,14,8,2592.0,2620.0,0.571429,0.494628
6,PBKS,14,6,2556.0,2564.0,0.428571,0.498438
7,RCB,14,7,2502.0,2435.0,0.5,0.513568
8,RR,14,7,2419.0,2389.0,0.5,0.506239
9,SRH,14,4,2376.0,2486.0,0.285714,0.477387
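One rough way to see how closely the two columns move together is their Pearson correlation. The sketch below retypes the `wp` and `pyth` values from the table above into a small DataFrame so that it runs on its own. The names `table` and `r` are hypothetical, and with only ten teams the result should be read as indicative at best.

```python
import pandas as pd

# wp and pyth values copied from the team_stats table above
table = pd.DataFrame({
    "Team": ["CSK", "DC", "GT", "KKR", "LSG", "MI", "PBKS", "RCB", "RR", "SRH"],
    "wp":   [0.615385, 0.357143, 0.714286, 0.428571, 0.615385,
             0.571429, 0.428571, 0.5, 0.5, 0.285714],
    "pyth": [0.52975, 0.447604, 0.525946, 0.498581, 0.508279,
             0.494628, 0.498438, 0.513568, 0.506239, 0.477387],
})

# Series.corr computes the Pearson correlation by default
r = table["wp"].corr(table["pyth"])
print(f"Correlation between wp and pyth: {r:.2f}")

# Compare the spread of the two columns
print(table[["wp", "pyth"]].agg(["min", "max"]))
```

The second printout shows that `pyth` spans a much narrower range than `wp`, which is why even a team with a modest run advantage can post a win percentage well above its expectation.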


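### Overachieving and Underachieving Teams of IPL 2023

A team overachieves when its actual win percentage (`wp`) is higher than its Pythagorean expectation (`pyth`), and underachieves when it is lower.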

#### Overachievers

In [96]:
OverAchievers = team_stats[team_stats['wp'] > team_stats['pyth']]
OverAchievers['Team']

0    CSK
2     GT
4    LSG
5     MI
Name: Team, dtype: object

#### Underachievers

In [95]:
UnderAchievers = team_stats[team_stats['wp'] < team_stats['pyth']]
UnderAchievers['Team']

1      DC
3     KKR
6    PBKS
7     RCB
8      RR
9     SRH
Name: Team, dtype: object
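The two filters above only split the teams into groups. To see how far each team sits from its expectation, one option is to compute the gap `wp - pyth` and sort by it. This is a minimal sketch that retypes the values from the `team_stats` table so it runs on its own. The names `table`, `gap` and `ranked` are hypothetical.

```python
import pandas as pd

# wp and pyth values copied from the team_stats table
table = pd.DataFrame({
    "Team": ["CSK", "DC", "GT", "KKR", "LSG", "MI", "PBKS", "RCB", "RR", "SRH"],
    "wp":   [0.615385, 0.357143, 0.714286, 0.428571, 0.615385,
             0.571429, 0.428571, 0.5, 0.5, 0.285714],
    "pyth": [0.52975, 0.447604, 0.525946, 0.498581, 0.508279,
             0.494628, 0.498438, 0.513568, 0.506239, 0.477387],
})

# Positive gap: the team won more often than its runs suggest (overachiever)
# Negative gap: the team won less often than its runs suggest (underachiever)
table["gap"] = table["wp"] - table["pyth"]

# Largest overachiever first, largest underachiever last
ranked = table.sort_values("gap", ascending=False).reset_index(drop=True)
print(ranked[["Team", "gap"]])
```

On these figures GT has the largest positive gap and SRH the largest negative one, and the four teams with a positive gap are the same four that the overachiever filter returns.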