# **IPL Batting Performance Analysis at Chepauk Against CSK, by Phase (PowerPlay, Middle Overs, Death Overs)**  

## **Objective**  
This notebook analyzes batting performances at **MA Chidambaram Stadium, Chepauk**, against **Chennai Super Kings (CSK)** across **three key phases of play**:  

- ⏳ **PowerPlay (Overs 1-6)** – Evaluating early acceleration and strike rotation.  
- ⚖️ **Middle Overs (Overs 7-15)** – Assessing stability, consistency, and boundary hitting.  
- 🔥 **Death Overs (Overs 16-20)** – Measuring finishing ability and explosive hitting.  

The goal is to assess **batting efficiency, boundary-hitting frequency, strike rates, and pressure-handling ability** across these phases.  

---

## **Dataset Overview**  
- **Source**: IPL ball-by-ball data (`deliveries.csv`) joined to match metadata (`matches.csv`)  
- **Filters Applied**:  
  - 📍 **Venue**: **MA Chidambaram Stadium, Chepauk**  
  - 🏏 **Opposition Team**: **Chennai Super Kings (CSK)**  
  - 🕒 **Phases of Play**:  
    - **PowerPlay (Overs 1-6)**  
    - **Middle Overs (Overs 7-15)**  
    - **Death Overs (Overs 16-20)**  

---

## **Methodology**  
1. Extract relevant matches and filter data based on the specified conditions.  
2. Compute key batting metrics like **strike rate, boundary percentage, and dot ball percentage** for each phase.  
3. Compare performances across **PowerPlay, Middle Overs, and Death Overs**.  
4. Identify **players who excel in each phase** and their batting approach.  
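
As a quick check on how these metrics are defined, the sketch below recomputes them from raw counts. The inputs are S Dhawan's PowerPlay figures from the results table further down; the function name `phase_metrics` is hypothetical and not part of the notebook.

```python
def phase_metrics(runs, balls, dismissals, fours, sixes, dots, innings):
    """Recompute the notebook's batting metrics from raw counts."""
    boundaries = fours + sixes
    return {
        "RPI": runs / innings,                    # runs per innings
        "SR": 100 * runs / balls,                 # strike rate
        "BPD": balls / max(dismissals, 1),        # balls per dismissal
        "BPB": balls / max(boundaries, 1),        # balls per boundary
        "dot_percentage": dots / balls,           # stored as a fraction
    }


dhawan = phase_metrics(runs=103, balls=89, dismissals=2,
                       fours=14, sixes=0, dots=34, innings=5)
for name, value in dhawan.items():
    print(f"{name}: {value:.3f}")
```

The printed values should agree with the corresponding row of the PowerPlay table (RPI 20.6, SR about 115.73, BPD 44.5, BPB about 6.357, dot fraction about 0.382).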

In [2]:
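import math
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Silence all warnings, including pandas' SettingWithCopyWarning.
warnings.filterwarnings('ignore')

# Show every column, without wrapping rows or truncating cell contents.
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.max_colwidth', None)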

In [3]:
deliveries = pd.read_csv('deliveries.csv')
matches = pd.read_csv('matches.csv')

In [4]:
def ByInnings(df, current_innings):
    """Per-batsman totals for the given innings number."""
    # .copy() so the flag columns below are added to an independent frame.
    df = df[df.innings == current_innings].copy()

    # Flag boundaries scored off the bat.
    df['isFour'] = (df['runs_of_bat'] == 4).astype(int)
    df['isSix'] = (df['runs_of_bat'] == 6).astype(int)

    df = df.groupby('batsman').agg(
        innings=('match_id', 'nunique'),            # distinct matches batted in
        runs=('runs_of_bat', 'sum'),
        balls=('match_id', 'count'),                # counts every delivery row
        dismissals=('player_dismissed', 'count'),   # non-null on balls this batsman faced
        fours=('isFour', 'sum'),
        sixes=('isSix', 'sum'),
    ).reset_index()

    df['RPI'] = df['runs'] / df['innings']          # runs per innings
    return df


In [5]:
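def balls_per_dismissal(balls, dismissals):
    """Balls faced per dismissal; a batsman never dismissed is treated as dismissed once."""
    return balls / max(dismissals, 1)


def balls_per_boundary(balls, boundaries):
    """Balls faced per four or six; with no boundaries the divisor falls back to 1."""
    return balls / max(boundaries, 1)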

In [6]:
def ByCustom(df, current_venue, current_phase, current_opposition):
    """Per-batsman metrics at one venue, in one phase, against one bowling team."""
    # Keep only the matching deliveries; .copy() so the flag columns
    # below are added to an independent frame rather than a slice.
    df = df[(df.venue == current_venue)
            & (df.phase == current_phase)
            & (df.bowling_team == current_opposition)].copy()

    # Per-ball flags for the outcomes aggregated below.
    df['isDot'] = (df['runs_of_bat'] == 0).astype(int)
    df['isFour'] = (df['runs_of_bat'] == 4).astype(int)
    df['isSix'] = (df['runs_of_bat'] == 6).astype(int)

    df = df.groupby('batsman').agg(
        innings=('match_id', 'nunique'),            # distinct matches batted in
        runs=('runs_of_bat', 'sum'),
        balls=('match_id', 'count'),                # counts every delivery row
        dismissals=('player_dismissed', 'count'),   # non-null on balls this batsman faced
        fours=('isFour', 'sum'),
        sixes=('isSix', 'sum'),
        dots=('isDot', 'sum'),
    ).reset_index()

    df['RPI'] = df['runs'] / df['innings']          # runs per innings
    df['SR'] = 100 * (df['runs'] / df['balls'])     # strike rate
    df['BPD'] = df.apply(lambda x: balls_per_dismissal(x['balls'], x['dismissals']), axis = 1)
    df['BPB'] = df.apply(lambda x: balls_per_boundary(x['balls'], (x['fours'] + x['sixes'])), axis = 1)

    # Stored as a fraction between 0 and 1, not as a percentage.
    df['dot_percentage'] = df['dots'] / df['balls']

    return df
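
The `balls` column above counts every delivery row for a batsman, so any wides in the data would inflate balls faced and pull strike rate down. One possible refinement, sketched here on toy data, drops wides before counting. The `extras_type` column and its `'wides'` label are assumptions about the dataset schema; neither is shown in this notebook.

```python
import pandas as pd

# Toy ball-by-ball rows; `extras_type` and the 'wides' label are assumed names.
toy = pd.DataFrame({
    "batsman": ["A", "A", "A", "B", "B"],
    "runs_of_bat": [4, 0, 1, 6, 0],
    "extras_type": [None, "wides", None, None, "wides"],
})

# A wide is not a ball faced, so exclude it before counting.
legal = toy[toy["extras_type"].fillna("") != "wides"]

faced = legal.groupby("batsman").agg(
    runs=("runs_of_bat", "sum"),
    balls=("runs_of_bat", "count"),
).reset_index()
faced["SR"] = 100 * faced["runs"] / faced["balls"]
print(faced)
```

Whether this matters for the tables below depends on how the source data records extras, which would need checking against the real columns.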

In [7]:
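# Work on a copy so the raw deliveries frame stays untouched.
df = deliveries.copy()

In [8]:
# Superseded by In [10], which copies again after the rename below.
mdf = matches.copy()

In [9]:
# Rename `id` so both frames share the `match_id` join key.
matches.rename(columns = {'id':'match_id'}, inplace = True)

In [10]:
mdf = matches.copy()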

In [11]:
mdf.head(1)

| | match_id | season | city | date | match_type | player_of_match | venue | team1 | team2 | toss_winner | toss_decision | winner | result | result_margin | target_runs | target_overs | super_over | method | umpire1 | umpire2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 335982 | 2007/08 | Bangalore | 2008-04-18 | League | BB McCullum | M Chinnaswamy Stadium | Royal Challengers Bangalore | Kolkata Knight Riders | Royal Challengers Bangalore | field | Kolkata Knight Riders | runs | 140.0 | 223.0 | 20.0 | N | | Asad Rauf | RE Koertzen |


In [12]:
comb=pd.merge(df,mdf,on = 'match_id',how ='left')
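
A left join like this one silently duplicates delivery rows if `match_id` is not unique in the matches frame. As a hedged safeguard, `pd.merge` accepts a `validate` argument that raises on a violated key relationship. The sketch below uses toy frames with made-up venue names, not the real data.

```python
import pandas as pd

# Toy stand-ins for the deliveries and matches frames.
deliveries_toy = pd.DataFrame({"match_id": [1, 1, 2], "batsman_runs": [4, 0, 6]})
matches_toy = pd.DataFrame({"match_id": [1, 2], "venue": ["Ground X", "Ground Y"]})

# validate="many_to_one" raises pandas.errors.MergeError
# if match_id repeats in the right-hand frame.
comb_toy = pd.merge(deliveries_toy, matches_toy, on="match_id", how="left",
                    validate="many_to_one")

# A left join on a unique key must neither add nor drop delivery rows.
assert len(comb_toy) == len(deliveries_toy)
print(comb_toy)
```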

In [13]:
# Rename to the column names the helper functions expect.
comb = comb.rename(columns={'batter':'batsman'})

In [14]:
comb = comb.rename(columns={'batsman_runs':'runs_of_bat'})

In [15]:
#comb.tail()

In [16]:
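def get_phase(over_no):
    """Map an over number to its phase.

    The thresholds match the phase definitions above only if `over`
    is 0-indexed (0-19): 0-5 are Overs 1-6, 6-14 are Overs 7-15,
    and 15-19 are Overs 16-20.
    """
    if over_no < 6:
        return 'PowerPlay'
    elif over_no < 15:
        return 'Middle Over'
    else:
        return 'Death Over'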

In [17]:
comb['phase'] = comb['over'].apply(get_phase)
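
The same mapping can be expressed as a binning step with `pd.cut`, which avoids a Python-level function call per row. This is a sketch under the assumption that overs are 0-indexed (0 to 19); it is an alternative to `get_phase`, not the notebook's method.

```python
import pandas as pd

overs = pd.Series(range(20))  # assumes 0-indexed overs, 0..19

# Bins are right-inclusive: (-1, 5], (5, 14], (14, 19].
phase = pd.cut(overs, bins=[-1, 5, 14, 19],
               labels=["PowerPlay", "Middle Over", "Death Over"])

# Count how many overs fall into each phase.
print(phase.value_counts(sort=False))
```

Unlike `get_phase`, `pd.cut` returns a categorical column and yields `NaN` for any over outside the bin edges, so unexpected values surface instead of being labelled as death overs.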

**Filtering data for matches played at MA Chidambaram Stadium, Chepauk**  
**Considering only the _PowerPlay_ phase**  
**Analyzing performance against _Chennai Super Kings (CSK)_**


In [19]:
df = ByCustom(comb ,"MA Chidambaram Stadium, Chepauk",'PowerPlay','Chennai Super Kings')

In [20]:
df.sort_values(['runs'], ascending = False).reset_index(drop = True).head()

| | batsman | innings | runs | balls | dismissals | fours | sixes | dots | RPI | SR | BPD | BPB | dot_percentage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | S Dhawan | 5 | 103 | 89 | 2 | 14 | 0 | 34 | 20.6 | 115.730337 | 44.5 | 6.357143 | 0.382022 |
| 1 | RV Uthappa | 6 | 98 | 71 | 3 | 13 | 4 | 35 | 16.333333 | 138.028169 | 23.666667 | 4.176471 | 0.492958 |
| 2 | MA Agarwal | 5 | 90 | 73 | 4 | 12 | 4 | 40 | 18.0 | 123.287671 | 18.25 | 4.5625 | 0.547945 |
| 3 | MS Bisla | 3 | 89 | 67 | 0 | 14 | 2 | 32 | 29.666667 | 132.835821 | 67.0 | 4.1875 | 0.477612 |
| 4 | SR Watson | 3 | 80 | 62 | 0 | 9 | 2 | 26 | 26.666667 | 129.032258 | 62.0 | 5.636364 | 0.419355 |


**Filtering data for matches played at MA Chidambaram Stadium, Chepauk**  
**Considering only the _Middle Over_ phase**  
**Analyzing performance against _Chennai Super Kings (CSK)_**


In [22]:
df = ByCustom(comb ,'MA Chidambaram Stadium, Chepauk','Middle Over','Chennai Super Kings')

In [23]:
df.sort_values(['runs'], ascending = False).reset_index(drop = True).head()

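| | batsman | innings | runs | balls | dismissals | fours | sixes | dots | RPI | SR | BPD | BPB | dot_percentage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | V Kohli | 7 | 175 | 150 | 4 | 11 | 5 | 50 | 25.0 | 116.666667 | 37.5 | 9.375 | 0.333333 |
| 1 | JH Kallis | 5 | 124 | 124 | 3 | 10 | 1 | 39 | 24.8 | 100.0 | 41.333333 | 11.272727 | 0.314516 |
| 2 | SR Watson | 4 | 121 | 70 | 2 | 9 | 7 | 16 | 30.25 | 172.857143 | 35.0 | 4.375 | 0.228571 |
| 3 | SE Marsh | 3 | 104 | 54 | 1 | 8 | 7 | 10 | 34.666667 | 192.592593 | 54.0 | 3.6 | 0.185185 |
| 4 | MS Bisla | 3 | 89 | 66 | 2 | 6 | 4 | 20 | 29.666667 | 134.848485 | 33.0 | 6.6 | 0.30303 |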


**Filtering data for matches played at MA Chidambaram Stadium, Chepauk**  
**Considering only the _Death Over_ phase**  
**Analyzing performance against _Chennai Super Kings (CSK)_**


In [25]:
df = ByCustom(comb ,"MA Chidambaram Stadium, Chepauk",'Death Over','Chennai Super Kings')

In [26]:
df.sort_values(['runs'], ascending = False).reset_index(drop = True).head()

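| | batsman | innings | runs | balls | dismissals | fours | sixes | dots | RPI | SR | BPD | BPB | dot_percentage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Harbhajan Singh | 3 | 78 | 48 | 2 | 4 | 7 | 19 | 26.0 | 162.5 | 24.0 | 4.363636 | 0.395833 |
| 1 | IK Pathan | 4 | 77 | 46 | 2 | 4 | 5 | 13 | 19.25 | 167.391304 | 23.0 | 5.111111 | 0.282609 |
| 2 | AB de Villiers | 4 | 70 | 40 | 3 | 6 | 2 | 10 | 17.5 | 175.0 | 13.333333 | 5.0 | 0.25 |
| 3 | LR Shukla | 4 | 60 | 46 | 4 | 5 | 3 | 18 | 15.0 | 130.434783 | 11.5 | 5.75 | 0.391304 |
| 4 | CL White | 3 | 53 | 36 | 2 | 4 | 3 | 10 | 17.666667 | 147.222222 | 18.0 | 5.142857 | 0.277778 |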


In [27]:
#comb['team1'].unique()

In [28]:
#comb['venue'].unique()

In [29]:
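# Weights for SR, RPI, BPD and dot percentage in the composite score.
# As written they sum to 1.01 rather than 1.
wt_sr, wt_rpi, wt_bpd, wt_dot_percentage = 0.13, 0.27, 0.16, 0.45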

In [30]:
# `df` still holds the Death Over table from In [25], so the score covers that phase only.
# Step 1: square each metric.
df['calc_SR'] = df['SR'] ** 2
df['calc_RPI'] = df['RPI'] ** 2
df['calc_BPD'] = df['BPD'] ** 2
df['calc_dot_percentage'] = df['dot_percentage'] ** 2

# Step 2: per-metric norm, the square root of the column's sum of squares.
sq_sr, sq_rpi, sq_bpd, sq_dot_percentage = np.sqrt(df[['calc_SR','calc_RPI', 'calc_BPD', 'calc_dot_percentage']].sum(axis = 0))

# Step 3: scale by the norm. Note that this divides the squared value by the norm;
# textbook vector normalisation would divide the raw value instead.
df['calc_SR'] = df['calc_SR'] / sq_sr
df['calc_RPI'] = df['calc_RPI'] / sq_rpi
df['calc_BPD'] = df['calc_BPD'] / sq_bpd
df['calc_dot_percentage'] = df['calc_dot_percentage'] / sq_dot_percentage

# Step 4: apply the weights.
df['calc_SR'] = df['calc_SR'] * wt_sr
df['calc_RPI'] = df['calc_RPI'] * wt_rpi
df['calc_BPD'] = df['calc_BPD'] * wt_bpd
df['calc_dot_percentage'] = df['calc_dot_percentage'] * wt_dot_percentage

# Step 5: ideal best and worst per metric. Higher is better for SR, RPI and BPD;
# lower is better for dot percentage, so its min and max are swapped.
best_sr, worst_sr = df['calc_SR'].max(), df['calc_SR'].min()
best_rpi, worst_rpi = df['calc_RPI'].max(), df['calc_RPI'].min()
best_bpd, worst_bpd = df['calc_BPD'].max(), df['calc_BPD'].min()
best_dot_percentage, worst_dot_percentage = df['calc_dot_percentage'].min(), df['calc_dot_percentage'].max()

In [31]:
# Step 6: squared deviation of each weighted metric from the ideal best.
df['dev_best_SR'] = (df['calc_SR'] - best_sr) ** 2
df['dev_best_RPI'] = (df['calc_RPI'] - best_rpi) ** 2
df['dev_best_BPD'] = (df['calc_BPD'] - best_bpd) ** 2
df['dev_best_dot_percentage'] = (df['calc_dot_percentage'] - best_dot_percentage) ** 2

# Despite the `_sqrt` suffix, no square root is taken: this is the sum of squared deviations.
df['dev_best_sqrt'] = df['dev_best_SR'] + df['dev_best_RPI'] + df['dev_best_BPD'] + df['dev_best_dot_percentage']

# Step 7: the same, measured from the ideal worst.
df['dev_worst_SR'] = (df['calc_SR'] - worst_sr) ** 2
df['dev_worst_RPI'] = (df['calc_RPI'] - worst_rpi) ** 2
df['dev_worst_BPD'] = (df['calc_BPD'] - worst_bpd) ** 2
df['dev_worst_dot_percentage'] = (df['calc_dot_percentage'] - worst_dot_percentage) ** 2

df['dev_worst_sqrt'] = df['dev_worst_SR'] + df['dev_worst_RPI'] + df['dev_worst_BPD'] + df['dev_worst_dot_percentage']

In [32]:
# Step 8: relative closeness to the ideal best, between 0 (worst) and 1 (best).
df['score'] = df['dev_worst_sqrt'] / (df['dev_worst_sqrt'] + df['dev_best_sqrt'])
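
The scoring cells above follow the general shape of TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), with two differences from the textbook form: the squared value rather than the raw value is divided by the column norm, and the distances are left as sums of squares without a final square root. For comparison, here is a minimal sketch of the textbook variant on toy data. The function name `topsis_score`, the three toy batters and the weights are all hypothetical, and this is not the notebook's method.

```python
import numpy as np
import pandas as pd


def topsis_score(frame, weights, lower_is_better=()):
    """Textbook TOPSIS closeness score for each row of `frame`."""
    cols = list(weights)
    values = frame[cols].astype(float)

    # Vector-normalise each criterion: x / sqrt(sum of x^2).
    normalised = values / np.sqrt((values ** 2).sum(axis=0))
    weighted = normalised * pd.Series(weights)

    # Ideal best/worst per criterion, swapped where lower is better.
    best = weighted.max(axis=0)
    worst = weighted.min(axis=0)
    for col in lower_is_better:
        best[col], worst[col] = weighted[col].min(), weighted[col].max()

    # Euclidean distances to the two ideals, then relative closeness.
    dist_best = np.sqrt(((weighted - best) ** 2).sum(axis=1))
    dist_worst = np.sqrt(((weighted - worst) ** 2).sum(axis=1))
    return dist_worst / (dist_worst + dist_best)


# Toy batters: A is best on every criterion, C is worst on every criterion.
toy = pd.DataFrame(
    {"SR": [150.0, 125.0, 100.0],
     "RPI": [30.0, 20.0, 10.0],
     "BPD": [30.0, 20.0, 10.0],
     "dot_percentage": [0.2, 0.3, 0.5]},
    index=["A", "B", "C"],
)
toy_weights = {"SR": 0.15, "RPI": 0.25, "BPD": 0.15, "dot_percentage": 0.45}

scores = topsis_score(toy, toy_weights, lower_is_better=["dot_percentage"])
print(scores.sort_values(ascending=False))
```

Because A coincides with the ideal best and C with the ideal worst, they score 1 and 0 respectively, with B in between. Switching the notebook to this form would change the scores shown in the table below.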

In [33]:
# Show the ten batters with the most innings; rows are ordered by innings, not by score.
df[['batsman', 'innings', 'runs', 'balls', 'dismissals', 'dot_percentage', 'score']].sort_values(['innings'], ascending = False).reset_index(drop = True).head(10)

| | batsman | innings | runs | balls | dismissals | dot_percentage | score |
|---|---|---|---|---|---|---|---|
| 0 | YK Pathan | 5 | 21 | 20 | 4 | 0.25 | 0.011107 |
| 1 | LR Shukla | 4 | 60 | 46 | 4 | 0.391304 | 0.033085 |
| 2 | IK Pathan | 4 | 77 | 46 | 2 | 0.282609 | 0.114416 |
| 3 | AB de Villiers | 4 | 70 | 40 | 3 | 0.25 | 0.125231 |
| 4 | PP Chawla | 4 | 25 | 24 | 3 | 0.625 | 0.010877 |
| 5 | BJ Hodge | 3 | 42 | 28 | 1 | 0.25 | 0.07646 |
| 6 | RG Sharma | 3 | 28 | 21 | 3 | 0.285714 | 0.032706 |
| 7 | UT Yadav | 3 | 13 | 14 | 2 | 0.428571 | 0.006604 |
| 8 | V Kohli | 3 | 38 | 24 | 3 | 0.333333 | 0.073633 |
| 9 | JH Kallis | 3 | 27 | 15 | 3 | 0.266667 | 0.130598 |
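
The table above is ordered by innings, and the highest score shown (JH Kallis, 0.130598) rests on only 15 balls. One way to read the ranking more cautiously is to apply a minimum-sample filter and then sort by score. The sketch below reuses the ten rows shown above; the 20-ball cut-off is a hypothetical choice, not something the notebook specifies.

```python
import pandas as pd

# Rows copied from the output table above.
scores = pd.DataFrame({
    "batsman": ["YK Pathan", "LR Shukla", "IK Pathan", "AB de Villiers",
                "PP Chawla", "BJ Hodge", "RG Sharma", "UT Yadav",
                "V Kohli", "JH Kallis"],
    "balls": [20, 46, 46, 40, 24, 28, 21, 14, 24, 15],
    "score": [0.011107, 0.033085, 0.114416, 0.125231, 0.010877,
              0.07646, 0.032706, 0.006604, 0.073633, 0.130598],
})

MIN_BALLS = 20  # hypothetical cut-off for a usable sample

# Drop small samples, then rank the remaining batters by score.
ranked = (scores[scores["balls"] >= MIN_BALLS]
          .sort_values("score", ascending=False)
          .reset_index(drop=True))
print(ranked)
```

With this cut-off, AB de Villiers and IK Pathan lead the filtered list, while JH Kallis and UT Yadav drop out for lack of balls faced.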
