<a href="https://colab.research.google.com/github/harshdhamecha/IPL-Data-Analysis/blob/main/Feature-Engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Importing Modules

In [None]:
import pandas as pd
import numpy as np

# Matches Dataset

## Loading the Datasets

In [None]:
matches = pd.read_csv('/content/drive/MyDrive/Study/Projects/IPL-Data-Analysis/Datasets/Raw/IPL-Matches.csv', index_col='id')
matches.head()

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
335982,Bangalore,2008-04-18,BB McCullum,M Chinnaswamy Stadium,0,Royal Challengers Bangalore,Kolkata Knight Riders,Royal Challengers Bangalore,field,Kolkata Knight Riders,runs,140.0,N,,Asad Rauf,RE Koertzen
335983,Chandigarh,2008-04-19,MEK Hussey,"Punjab Cricket Association Stadium, Mohali",0,Kings XI Punjab,Chennai Super Kings,Chennai Super Kings,bat,Chennai Super Kings,runs,33.0,N,,MR Benson,SL Shastri
335984,Delhi,2008-04-19,MF Maharoof,Feroz Shah Kotla,0,Delhi Daredevils,Rajasthan Royals,Rajasthan Royals,bat,Delhi Daredevils,wickets,9.0,N,,Aleem Dar,GA Pratapkumar
335985,Mumbai,2008-04-20,MV Boucher,Wankhede Stadium,0,Mumbai Indians,Royal Challengers Bangalore,Mumbai Indians,bat,Royal Challengers Bangalore,wickets,5.0,N,,SJ Davis,DJ Harper
335986,Kolkata,2008-04-20,DJ Hussey,Eden Gardens,0,Kolkata Knight Riders,Deccan Chargers,Deccan Chargers,bat,Kolkata Knight Riders,wickets,5.0,N,,BF Bowden,K Hariharan


## Data Preprocessing

In [None]:
matches.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 816 entries, 335982 to 1237181
Data columns (total 16 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   city             803 non-null    object 
 1   date             816 non-null    object 
 2   player_of_match  812 non-null    object 
 3   venue            816 non-null    object 
 4   neutral_venue    816 non-null    int64  
 5   team1            816 non-null    object 
 6   team2            816 non-null    object 
 7   toss_winner      816 non-null    object 
 8   toss_decision    816 non-null    object 
 9   winner           812 non-null    object 
 10  result           812 non-null    object 
 11  result_margin    799 non-null    float64
 12  eliminator       812 non-null    object 
 13  method           19 non-null     object 
 14  umpire1          816 non-null    object 
 15  umpire2          816 non-null    object 
dtypes: float64(1), int64(1), object(14)
memory usage: 108

Inference
- As expected, most columns, such as `venue`, `team1`, and `team2`, have no null values.
- Let us go through the columns one by one.
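
As a quick complement to `info()`, the per-column null counts can be listed directly. The sketch below uses a small made-up frame (`toy`) rather than the real `matches` data, so the numbers are purely illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for `matches`; the values are made up for illustration.
toy = pd.DataFrame({
    'city': ['Mumbai', np.nan, 'Delhi'],
    'venue': ['Wankhede Stadium', 'Sharjah Cricket Stadium', 'Feroz Shah Kotla'],
    'method': [np.nan, np.nan, 'D/L'],
})

# Count missing values per column and show only the columns that have any.
null_counts = toy.isna().sum()
print(null_counts[null_counts > 0].sort_values(ascending=False))
```

On the real data the same two lines would be applied to `matches` instead of `toy`.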


### `city`

In [None]:
matches.loc[matches.city.isnull()]

Inference
- If you look closely at the `city` and `venue` columns, you will see that whenever `city` is missing, the venue is either `Dubai International Cricket Stadium` or `Sharjah Cricket Stadium`.

We can easily fix this by setting `city` to `Dubai` where the `venue` is `Dubai International Cricket Stadium` and to `Sharjah` where it is `Sharjah Cricket Stadium`.  
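
One way to express this idea is a venue-to-city lookup combined with `fillna`. This is only a sketch on made-up rows; `toy` and `venue_to_city` are hypothetical names introduced for illustration and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy rows: one match with a known city, two with the city missing.
toy = pd.DataFrame({
    'city': ['Abu Dhabi', np.nan, np.nan],
    'venue': ['Sheikh Zayed Stadium',
              'Dubai International Cricket Stadium',
              'Sharjah Cricket Stadium'],
})

# Hypothetical lookup table covering only the two affected venues.
venue_to_city = {
    'Dubai International Cricket Stadium': 'Dubai',
    'Sharjah Cricket Stadium': 'Sharjah',
}

# fillna with a Series aligns on the index, so existing cities stay untouched.
toy['city'] = toy['city'].fillna(toy['venue'].map(venue_to_city))
print(toy)
```

A lookup like this fails visibly (the city stays null) if an unexpected venue shows up, which can be easier to audit than a default value.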

We will do that, but let us first look at all the unique values of the `city` column to check data integrity.

In [None]:
matches.city.unique()

array(['Bangalore', 'Chandigarh', 'Delhi', 'Mumbai', 'Kolkata', 'Jaipur',
       'Hyderabad', 'Chennai', 'Cape Town', 'Port Elizabeth', 'Durban',
       'Centurion', 'East London', 'Johannesburg', 'Kimberley',
       'Bloemfontein', 'Ahmedabad', 'Cuttack', 'Nagpur', 'Dharamsala',
       'Kochi', 'Indore', 'Visakhapatnam', 'Pune', 'Raipur', 'Ranchi',
       'Abu Dhabi', nan, 'Rajkot', 'Kanpur', 'Bengaluru', 'Dubai',
       'Sharjah'], dtype=object)

Inference
- Two different values found for the same city: `Bangalore` and `Bengaluru`.  

Let us fix this together with the null values discussed above. The following code does both.

In [None]:
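def manage_city(matches):
    # Use a single spelling for Bangalore (assign back instead of inplace=True on a column selection)
    matches['city'] = matches['city'].replace('Bengaluru', 'Bangalore')
    # Every match with a missing city was played in either Sharjah or Dubai
    missing = matches['city'].isna()
    is_sharjah = matches['venue'] == 'Sharjah Cricket Stadium'
    matches.loc[missing & is_sharjah, 'city'] = 'Sharjah'
    matches.loc[missing & ~is_sharjah, 'city'] = 'Dubai'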

In [None]:
manage_city(matches)
matches.loc[matches.city.isnull()]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2


In [None]:
matches.city.unique()

We have handled the null values of `city` and replaced `Bengaluru` with `Bangalore`.  

We can verify the newly inserted `city` values by executing the following code:

In [None]:
matches.loc[[729287, 729311]]     # ids where city values were missing.

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
729287,Dubai,2014-04-19,PA Patel,Dubai International Cricket Stadium,1,Royal Challengers Bangalore,Mumbai Indians,Royal Challengers Bangalore,field,Royal Challengers Bangalore,wickets,7.0,N,,Aleem Dar,AK Chaudhary
729311,Sharjah,2014-04-27,DR Smith,Sharjah Cricket Stadium,1,Sunrisers Hyderabad,Chennai Super Kings,Sunrisers Hyderabad,bat,Chennai Super Kings,wickets,5.0,N,,AK Chaudhary,VA Kulkarni


### `venue`

There are no missing values in this column, but it requires a few modifications.

In [None]:
matches.venue.unique()

array(['M Chinnaswamy Stadium',
       'Punjab Cricket Association Stadium, Mohali', 'Feroz Shah Kotla',
       'Wankhede Stadium', 'Eden Gardens', 'Sawai Mansingh Stadium',
       'Rajiv Gandhi International Stadium, Uppal',
       'MA Chidambaram Stadium, Chepauk', 'Dr DY Patil Sports Academy',
       'Newlands', "St George's Park", 'Kingsmead', 'SuperSport Park',
       'Buffalo Park', 'New Wanderers Stadium', 'De Beers Diamond Oval',
       'OUTsurance Oval', 'Brabourne Stadium',
       'Sardar Patel Stadium, Motera', 'Barabati Stadium',
       'Vidarbha Cricket Association Stadium, Jamtha',
       'Himachal Pradesh Cricket Association Stadium', 'Nehru Stadium',
       'Holkar Cricket Stadium',
       'Dr. Y.S. Rajasekhara Reddy ACA-VDCA Cricket Stadium',
       'Subrata Roy Sahara Stadium',
       'Shaheed Veer Narayan Singh International Stadium',
       'JSCA International Stadium Complex', 'Sheikh Zayed Stadium',
       'Sharjah Cricket Stadium', 'Dubai International Cricket St

- Some stadiums have since been renamed, so we need to map the old names to the current ones.
- Two spellings exist for the same stadium: `M.Chinnaswamy Stadium` and `M Chinnaswamy Stadium`.
- Some stadium names also include the city or locality name. We need to remove it.
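
Near-duplicate spellings like these can be surfaced automatically by comparing a normalized form of each name. The sketch below is one possible approach on a handful of names from the output above; `normalize` is a hypothetical helper, and its rules (drop everything after a comma, remove dots and whitespace, lowercase) are assumptions that may be too aggressive for other datasets.

```python
import re

import pandas as pd

venues_seen = pd.Series([
    'M Chinnaswamy Stadium',
    'M.Chinnaswamy Stadium',
    'Punjab Cricket Association Stadium, Mohali',
    'Wankhede Stadium',
])

def normalize(name):
    """Hypothetical helper: drop text after a comma, strip dots and whitespace, lowercase."""
    base = name.split(',')[0]
    return re.sub(r'[.\s]', '', base).lower()

keys = venues_seen.map(normalize)

# Spellings whose normalized key occurs more than once are likely the same ground.
duplicates = venues_seen[keys.duplicated(keep=False)]
print(duplicates.tolist())
```

This only flags candidates. Renamed stadiums (for example `Feroz Shah Kotla`) still need a manual mapping.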

In [None]:
def change_venue(matches, venues):
    # Assign the result back instead of calling replace(..., inplace=True) on a column selection
    matches['venue'] = matches['venue'].replace(venues)

In [None]:
# Venues that need to be updated

venues = {'M Chinnaswamy Stadium': 'M.Chinnaswamy Stadium', 'Punjab Cricket Association Stadium, Mohali': 'Punjab Cricket Association IS Bindra Stadium',
        'Punjab Cricket Association IS Bindra Stadium, Mohali': 'Punjab Cricket Association IS Bindra Stadium', 'New Wanderers Stadium': 'The Wanderers Stadium', 
        'Rajiv Gandhi International Stadium, Uppal': 'Rajiv Gandhi International Stadium', 'MA Chidambaram Stadium, Chepauk': 'MA Chidambaram Stadium', 
        'Newlands': 'Newlands Cricket Ground', 'Kingsmead': 'Hollywoodbets Kingsmead Stadium', 'OUTsurance Oval': 'Mangaung Oval', 'Sardar Patel Stadium, Motera': 'Narendra Modi Stadium', 
        'Feroz Shah Kotla': 'Arun Jaitley Stadium', 'Vidarbha Cricket Association Stadium, Jamtha': 'Vidarbha Cricket Association Stadium', 
        'Subrata Roy Sahara Stadium': 'Maharashtra Cricket Association Stadium', 'Green Park': 'Green Park Stadium', 'Nehru Stadium': 'Jawaharlal Nehru Stadium'}

In [None]:
change_venue(matches, venues)
matches.venue.unique()

array(['M.Chinnaswamy Stadium',
       'Punjab Cricket Association IS Bindra Stadium',
       'Arun Jaitley Stadium', 'Wankhede Stadium', 'Eden Gardens',
       'Sawai Mansingh Stadium', 'Rajiv Gandhi International Stadium',
       'MA Chidambaram Stadium', 'Dr DY Patil Sports Academy',
       'Newlands Cricket Ground', "St George's Park",
       'Hollywoodbets Kingsmead Stadium', 'SuperSport Park',
       'Buffalo Park', 'The Wanderers Stadium', 'De Beers Diamond Oval',
       'Mangaung Oval', 'Brabourne Stadium', 'Narendra Modi Stadium',
       'Barabati Stadium', 'Vidarbha Cricket Association Stadium',
       'Himachal Pradesh Cricket Association Stadium',
       'Jawaharlal Nehru Stadium', 'Holkar Cricket Stadium',
       'Dr. Y.S. Rajasekhara Reddy ACA-VDCA Cricket Stadium',
       'Maharashtra Cricket Association Stadium',
       'Shaheed Veer Narayan Singh International Stadium',
       'JSCA International Stadium Complex', 'Sheikh Zayed Stadium',
       'Sharjah Cricket Stadium

It's done now!

### `team1`, `team2`, `toss_winner`, `winner`

In [None]:
matches.team1.unique()

array(['Royal Challengers Bangalore', 'Kings XI Punjab',
       'Delhi Daredevils', 'Mumbai Indians', 'Kolkata Knight Riders',
       'Rajasthan Royals', 'Deccan Chargers', 'Chennai Super Kings',
       'Kochi Tuskers Kerala', 'Pune Warriors', 'Sunrisers Hyderabad',
       'Gujarat Lions', 'Rising Pune Supergiants',
       'Rising Pune Supergiant', 'Delhi Capitals'], dtype=object)

- The official name was `Rising Pune Supergiant`, so we'll replace `Rising Pune Supergiants` with `Rising Pune Supergiant`.
- The team formerly known as `Delhi Daredevils` is now `Delhi Capitals`. We'll replace `Delhi Daredevils` with `Delhi Capitals`.
- The official name was `Pune Warriors India`, not `Pune Warriors`. We'll fix this too.

In [None]:
def replace_team_names(matches, columns_to_modify):
    matches[columns_to_modify] = matches[columns_to_modify].replace({'Rising Pune Supergiants': 'Rising Pune Supergiant', 'Pune Warriors': 'Pune Warriors India', 
                                                               'Delhi Daredevils': 'Delhi Capitals'})

In [None]:
columns_to_modify = ['team1', 'team2', 'toss_winner', 'winner']
replace_team_names(matches, columns_to_modify)
matches.team1.unique()

array(['Royal Challengers Bangalore', 'Kings XI Punjab', 'Delhi Capitals',
       'Mumbai Indians', 'Kolkata Knight Riders', 'Rajasthan Royals',
       'Deccan Chargers', 'Chennai Super Kings', 'Kochi Tuskers Kerala',
       'Pune Warriors India', 'Sunrisers Hyderabad', 'Gujarat Lions',
       'Rising Pune Supergiant'], dtype=object)

So, we have handled the team names.
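
Since the same renames are applied to four columns, it can be worth confirming that no old spelling survives in any of them. The following is a minimal sketch on made-up rows; `toy`, `renames`, and `leftover` are names introduced here for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'team1': ['Delhi Daredevils', 'Pune Warriors'],
    'team2': ['Rising Pune Supergiants', 'Delhi Capitals'],
    'toss_winner': ['Delhi Daredevils', 'Delhi Capitals'],
    'winner': ['Rising Pune Supergiants', 'Pune Warriors'],
})

renames = {
    'Rising Pune Supergiants': 'Rising Pune Supergiant',
    'Pune Warriors': 'Pune Warriors India',
    'Delhi Daredevils': 'Delhi Capitals',
}
cols = ['team1', 'team2', 'toss_winner', 'winner']
toy[cols] = toy[cols].replace(renames)

# Collect every distinct name across the four columns.
all_names = set(toy[cols].to_numpy().ravel())

# Any old spelling still present would show up in this intersection.
leftover = all_names & set(renames)
print(sorted(all_names), leftover)
```

On the real data, `winner` also contains missing values at this point, so they would need to be dropped before building the set.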

### `player_of_match`, `winner`, `result` and `eliminator`

In [None]:
matches.loc[matches.player_of_match.isnull()]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
501265,Delhi,2011-05-21,,Feroz Shah Kotla,0,Delhi Daredevils,Pune Warriors,Delhi Daredevils,bat,,,,,,SS Hazare,RJ Tucker
829763,Bangalore,2015-04-29,,M Chinnaswamy Stadium,0,Royal Challengers Bangalore,Rajasthan Royals,Rajasthan Royals,field,,,,,,JD Cloete,PG Pathak
829813,Bangalore,2015-05-17,,M Chinnaswamy Stadium,0,Royal Challengers Bangalore,Delhi Daredevils,Royal Challengers Bangalore,field,,,,,,HDPK Dharmasena,K Srinivasan
1178424,Bangalore,2019-04-30,,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Rajasthan Royals,Rajasthan Royals,field,,,,,,UV Gandhe,NJ Llong


Each of these columns has 4 missing values, and they occur in the same 4 records (matches). All 4 matches were cancelled due to rain.  

Thus, we will keep the missing values of `player_of_match` **as they are** and replace the null values with `no result` in `winner`, `result` and `result_margin`.
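
The claim that the nulls fall in the same records can be checked rather than eyeballed. The sketch below does so on a made-up frame (`toy`); a row counts as consistent when the four columns are either all null or all present.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'player_of_match': ['SR Watson', np.nan, 'CH Gayle'],
    'winner': ['Rajasthan Royals', np.nan, 'Royal Challengers Bangalore'],
    'result': ['wickets', np.nan, 'runs'],
    'eliminator': ['N', np.nan, 'N'],
}, index=pd.Index([1, 2, 3], name='id'))

cols = ['player_of_match', 'winner', 'result', 'eliminator']
null_flags = toy[cols].isna()

# If "any column null" and "all columns null" agree on every row,
# the missing values always occur together.
all_null = null_flags.all(axis=1)
any_null = null_flags.any(axis=1)
same_rows = bool((all_null == any_null).all())

print(same_rows, toy.index[all_null].tolist())
```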

In [None]:
def manage_no_result(matches):
    no_result = matches['result'].isna()
    # result_margin is numeric, so cast it to object before storing the text label
    matches['result_margin'] = matches['result_margin'].astype(object)
    matches.loc[no_result, ['winner', 'result', 'result_margin']] = 'no result'

In [None]:
manage_no_result(matches)
matches.loc[matches.player_of_match.isnull()]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
501265,Delhi,2011-05-21,,Arun Jaitley Stadium,0,Delhi Capitals,Pune Warriors India,Delhi Capitals,bat,no result,no result,no result,,,SS Hazare,RJ Tucker
829763,Bangalore,2015-04-29,,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Rajasthan Royals,Rajasthan Royals,field,no result,no result,no result,,,JD Cloete,PG Pathak
829813,Bangalore,2015-05-17,,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Delhi Capitals,Royal Challengers Bangalore,field,no result,no result,no result,,,HDPK Dharmasena,K Srinivasan
1178424,Bangalore,2019-04-30,,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Rajasthan Royals,Rajasthan Royals,field,no result,no result,no result,,,UV Gandhe,NJ Llong


### `method`

In [None]:
matches.loc[matches.method.notnull()]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
336022,Delhi,2008-05-17,DPMD Jayawardene,Arun Jaitley Stadium,0,Delhi Capitals,Kings XI Punjab,Delhi Capitals,bat,Kings XI Punjab,runs,6,N,D/L,AV Jayaprakash,RE Koertzen
336025,Kolkata,2008-05-18,M Ntini,Eden Gardens,0,Kolkata Knight Riders,Chennai Super Kings,Kolkata Knight Riders,bat,Chennai Super Kings,runs,3,N,D/L,Asad Rauf,K Hariharan
392183,Cape Town,2009-04-19,DL Vettori,Newlands Cricket Ground,1,Delhi Capitals,Kings XI Punjab,Delhi Capitals,field,Delhi Capitals,wickets,10,N,D/L,MR Benson,SD Ranade
392186,Durban,2009-04-21,CH Gayle,Hollywoodbets Kingsmead Stadium,1,Kings XI Punjab,Kolkata Knight Riders,Kolkata Knight Riders,field,Kolkata Knight Riders,runs,11,N,D/L,DJ Harper,SD Ranade
392214,Centurion,2009-05-07,ML Hayden,SuperSport Park,1,Chennai Super Kings,Kings XI Punjab,Chennai Super Kings,bat,Chennai Super Kings,runs,12,N,D/L,DJ Harper,TH Wijewardene
501215,Kochi,2011-04-18,BB McCullum,Jawaharlal Nehru Stadium,0,Kochi Tuskers Kerala,Chennai Super Kings,Kochi Tuskers Kerala,field,Kochi Tuskers Kerala,wickets,7,N,D/L,K Hariharan,AL Hill
501245,Kolkata,2011-05-07,Iqbal Abdulla,Eden Gardens,0,Kolkata Knight Riders,Chennai Super Kings,Chennai Super Kings,bat,Kolkata Knight Riders,runs,10,N,D/L,Asad Rauf,PR Reiffel
501255,Bangalore,2011-05-14,CH Gayle,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Kolkata Knight Riders,Royal Challengers Bangalore,field,Royal Challengers Bangalore,wickets,4,N,D/L,RE Koertzen,RB Tiffin
733993,Delhi,2014-05-10,DW Steyn,Arun Jaitley Stadium,0,Delhi Capitals,Sunrisers Hyderabad,Sunrisers Hyderabad,field,Sunrisers Hyderabad,wickets,8,N,D/L,RM Deshpande,BNJ Oxenford
829743,Visakhapatnam,2015-04-22,DA Warner,Dr. Y.S. Rajasekhara Reddy ACA-VDCA Cricket St...,0,Sunrisers Hyderabad,Kolkata Knight Riders,Kolkata Knight Riders,field,Sunrisers Hyderabad,runs,16,N,D/L,RK Illingworth,VA Kulkarni


Inference
- The column `method` has only 19 non-null values, all of them `D/L`. It marks the matches in which the Duckworth-Lewis (D/L) method was applied.

This column does not need any preprocessing.

### `neutral_venue` and Feature Engineering

Now comes the **most important** part of this notebook.

In [None]:
matches.loc[matches.date.str.startswith('2020')]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
1216492,Abu Dhabi,2020-09-19,AT Rayudu,Sheikh Zayed Stadium,0,Mumbai Indians,Chennai Super Kings,Chennai Super Kings,field,Chennai Super Kings,wickets,5.0,N,,CB Gaffaney,VK Sharma
1216493,Dubai,2020-09-20,MP Stoinis,Dubai International Cricket Stadium,0,Delhi Capitals,Kings XI Punjab,Kings XI Punjab,field,Delhi Capitals,tie,,Y,,AK Chaudhary,Nitin Menon
1216494,Abu Dhabi,2020-10-21,Mohammed Siraj,Sheikh Zayed Stadium,0,Kolkata Knight Riders,Royal Challengers Bangalore,Kolkata Knight Riders,bat,Royal Challengers Bangalore,wickets,8.0,N,,VK Sharma,S Ravi
1216495,Sharjah,2020-11-03,S Nadeem,Sharjah Cricket Stadium,0,Mumbai Indians,Sunrisers Hyderabad,Sunrisers Hyderabad,field,Sunrisers Hyderabad,wickets,10.0,N,,C Shamshuddin,RK Illingworth
1216496,Sharjah,2020-09-22,SV Samson,Sharjah Cricket Stadium,0,Rajasthan Royals,Chennai Super Kings,Chennai Super Kings,field,Rajasthan Royals,runs,16.0,N,,C Shamshuddin,VA Kulkarni
1216497,Abu Dhabi,2020-10-24,CV Varun,Sheikh Zayed Stadium,0,Kolkata Knight Riders,Delhi Capitals,Delhi Capitals,field,Kolkata Knight Riders,runs,59.0,N,,CB Gaffaney,PG Pathak
1216498,Dubai,2020-10-24,CJ Jordan,Dubai International Cricket Stadium,0,Kings XI Punjab,Sunrisers Hyderabad,Sunrisers Hyderabad,field,Kings XI Punjab,runs,12.0,N,,AY Dandekar,PR Reiffel
1216499,Abu Dhabi,2020-10-28,SA Yadav,Sheikh Zayed Stadium,0,Royal Challengers Bangalore,Mumbai Indians,Mumbai Indians,field,Mumbai Indians,wickets,5.0,N,,UV Gandhe,CB Gaffaney
1216500,Sharjah,2020-10-09,R Ashwin,Sharjah Cricket Stadium,0,Delhi Capitals,Rajasthan Royals,Rajasthan Royals,field,Delhi Capitals,runs,46.0,N,,KN Ananthapadmanabhan,C Shamshuddin
1216501,Abu Dhabi,2020-10-07,RA Tripathi,Sheikh Zayed Stadium,0,Kolkata Knight Riders,Chennai Super Kings,Kolkata Knight Riders,bat,Kolkata Knight Riders,runs,10.0,N,,KN Ananthapadmanabhan,RK Illingworth


For all the matches of the 2020 season, `neutral_venue == 0`, which is incorrect because the whole season was played in the UAE. We will change this to 1.
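
A per-season summary makes this kind of problem easy to spot. The sketch below derives the season from the calendar year of `date` (an assumption that holds only when a season does not span two years) and works on made-up rows in `toy`.

```python
import pandas as pd

toy = pd.DataFrame({
    'date': ['2019-05-12', '2020-09-19', '2020-11-10'],
    'neutral_venue': [0, 0, 0],
})

# Parse the ISO date strings and take the calendar year as the season.
season = pd.to_datetime(toy['date']).dt.year

# Share of matches flagged as neutral per season.
print(toy.groupby(season)['neutral_venue'].mean())

# Flag every 2020 match as played at a neutral venue.
toy.loc[season == 2020, 'neutral_venue'] = 1
print(toy)
```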

In [None]:
# The following ids were identified as playoff matches by manually going through each IPL season.

playoff_ids = [336038, 336039, 336040, 392237, 392238, 392239, 419162, 419163, 419165, 501268, 501269, 501270, 501271, 548378, 548379, 548380, 548381, 598070, 598071, 598072, 
               598073, 734043, 734045, 734047, 734049, 829817, 829819, 829821, 829823, 981013, 981015, 981017, 981019, 1082647, 1082648, 1082649, 1082650, 1136617, 
               1136618, 1136619, 1136620, 1181764, 1181766, 1181767, 1181768, 1237177, 1237178, 1237180, 1237181]

In [None]:
matches.loc[playoff_ids]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
336038,Mumbai,2008-05-30,SR Watson,Wankhede Stadium,0,Delhi Capitals,Rajasthan Royals,Delhi Capitals,field,Rajasthan Royals,runs,105,N,,BF Bowden,RE Koertzen
336039,Mumbai,2008-05-31,M Ntini,Wankhede Stadium,0,Chennai Super Kings,Kings XI Punjab,Kings XI Punjab,bat,Chennai Super Kings,wickets,9,N,,Asad Rauf,DJ Harper
336040,Mumbai,2008-06-01,YK Pathan,Dr DY Patil Sports Academy,0,Chennai Super Kings,Rajasthan Royals,Rajasthan Royals,field,Rajasthan Royals,wickets,3,N,,BF Bowden,RE Koertzen
392237,Centurion,2009-05-22,AC Gilchrist,SuperSport Park,1,Delhi Capitals,Deccan Chargers,Deccan Chargers,field,Deccan Chargers,wickets,6,N,,BR Doctrove,DJ Harper
392238,Johannesburg,2009-05-23,MK Pandey,The Wanderers Stadium,1,Royal Challengers Bangalore,Chennai Super Kings,Royal Challengers Bangalore,field,Royal Challengers Bangalore,wickets,6,N,,RE Koertzen,SJA Taufel
392239,Johannesburg,2009-05-24,A Kumble,The Wanderers Stadium,1,Royal Challengers Bangalore,Deccan Chargers,Royal Challengers Bangalore,field,Deccan Chargers,runs,6,N,,RE Koertzen,SJA Taufel
419162,Mumbai,2010-04-21,KA Pollard,Dr DY Patil Sports Academy,0,Royal Challengers Bangalore,Mumbai Indians,Mumbai Indians,bat,Mumbai Indians,runs,35,N,,BR Doctrove,RB Tiffin
419163,Mumbai,2010-04-22,DE Bollinger,Dr DY Patil Sports Academy,0,Chennai Super Kings,Deccan Chargers,Chennai Super Kings,bat,Chennai Super Kings,runs,38,N,,BR Doctrove,RB Tiffin
419165,Mumbai,2010-04-25,SK Raina,Dr DY Patil Sports Academy,0,Chennai Super Kings,Mumbai Indians,Chennai Super Kings,bat,Chennai Super Kings,runs,22,N,,RE Koertzen,SJA Taufel
501268,Mumbai,2011-05-24,SK Raina,Wankhede Stadium,0,Royal Challengers Bangalore,Chennai Super Kings,Chennai Super Kings,field,Chennai Super Kings,wickets,6,N,,Asad Rauf,SJA Taufel


**For all the playoff matches except those of the 2nd season (2009), `neutral_venue == 0` in our dataset, even though many of them were played at neutral venues. This is the biggest error in the dataset.** We'll handle this soon.   

Now, refer to [this Wikipedia page](https://en.wikipedia.org/wiki/List_of_Indian_Premier_League_venues) to get a list of home venues for each year. The following variables will help us with preprocessing and feature engineering.

In [None]:
# All the following variables are derived manually.

# ids of all qualifier matches
qualifier_ids = [336038, 336039, 392237, 392238, 419162, 419163, 501268, 501270, 548378, 548380, 598070, 598072, 734043, 734047, 
                 829817, 829821, 981013, 981017, 1082647, 1082649, 1136617, 1136619, 1181764, 1181767, 1237177, 1237180]

# ids of all eliminator matches
eliminator_ids = [501269, 548379, 598071, 734045, 829819, 981015, 1082648, 1136618, 1181766, 1237178]

# ids of all final matches
final_ids = [336040, 392239, 419165, 501271, 548381, 598073, 734049, 829823, 981019, 1082650, 1136620, 1181768, 1237181]

# ids of all playoff matches played at team2's home ground
team2_home_ids = [419163, 548380, 548381, 734043, 829817, 981013, 1181764]

# ids of all playoff matches played at neutral venues
change_neutral_venue_ids = [336038, 336039, 336040, 419162, 419165, 501268, 501270, 548378, 548379, 598070, 598071, 598072, 598073, 734045, 
                    734047, 734049, 829819, 829821, 829823, 981015, 981017, 1082648, 1082649, 1082650, 1136617, 1136620, 1181766, 1181767, 1181768]
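
Because these lists are derived by hand, a quick consistency check can catch a mistyped or duplicated id. The sketch below uses only the 2011 ids copied from the lists above (the other seasons are omitted for brevity); `check_partition` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
# 2011 playoff ids copied from the lists above; other seasons omitted for brevity.
playoff_2011 = [501268, 501269, 501270, 501271]
qualifier_2011 = [501268, 501270]
eliminator_2011 = [501269]
final_2011 = [501271]

def check_partition(all_ids, *groups):
    """Hypothetical helper: True when the groups are disjoint and together equal all_ids."""
    combined = [match_id for group in groups for match_id in group]
    no_overlap = len(combined) == len(set(combined))
    return no_overlap and set(combined) == set(all_ids)

print(check_partition(playoff_2011, qualifier_2011, eliminator_2011, final_2011))
```

The same call with the full `playoff_ids`, `qualifier_ids`, `eliminator_ids` and `final_ids` lists would check all seasons at once.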

The value of `neutral_venue` is 0 for all playoff matches outside 2009, so we need to change it to 1 for the ones that were actually played at neutral venues. We will use the `change_neutral_venue_ids` list to update the values.  
Let's do this and also set `neutral_venue` to 1 for the 2020 matches, as discussed above.

In [None]:
def change_neutral_venue(matches, change_neutral_venue_ids):
    # Playoff matches that were played at neutral venues
    matches.loc[matches.index.isin(change_neutral_venue_ids), 'neutral_venue'] = 1
    # The whole 2020 season was played in the UAE
    matches.loc[matches['date'].str.startswith('2020'), 'neutral_venue'] = 1

In [None]:
change_neutral_venue(matches, change_neutral_venue_ids)
matches.loc[playoff_ids]

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2
336038,Mumbai,2008-05-30,SR Watson,Wankhede Stadium,1,Delhi Capitals,Rajasthan Royals,Delhi Capitals,field,Rajasthan Royals,runs,105,N,,BF Bowden,RE Koertzen
336039,Mumbai,2008-05-31,M Ntini,Wankhede Stadium,1,Chennai Super Kings,Kings XI Punjab,Kings XI Punjab,bat,Chennai Super Kings,wickets,9,N,,Asad Rauf,DJ Harper
336040,Mumbai,2008-06-01,YK Pathan,Dr DY Patil Sports Academy,1,Chennai Super Kings,Rajasthan Royals,Rajasthan Royals,field,Rajasthan Royals,wickets,3,N,,BF Bowden,RE Koertzen
392237,Centurion,2009-05-22,AC Gilchrist,SuperSport Park,1,Delhi Capitals,Deccan Chargers,Deccan Chargers,field,Deccan Chargers,wickets,6,N,,BR Doctrove,DJ Harper
392238,Johannesburg,2009-05-23,MK Pandey,The Wanderers Stadium,1,Royal Challengers Bangalore,Chennai Super Kings,Royal Challengers Bangalore,field,Royal Challengers Bangalore,wickets,6,N,,RE Koertzen,SJA Taufel
392239,Johannesburg,2009-05-24,A Kumble,The Wanderers Stadium,1,Royal Challengers Bangalore,Deccan Chargers,Royal Challengers Bangalore,field,Deccan Chargers,runs,6,N,,RE Koertzen,SJA Taufel
419162,Mumbai,2010-04-21,KA Pollard,Dr DY Patil Sports Academy,1,Royal Challengers Bangalore,Mumbai Indians,Mumbai Indians,bat,Mumbai Indians,runs,35,N,,BR Doctrove,RB Tiffin
419163,Mumbai,2010-04-22,DE Bollinger,Dr DY Patil Sports Academy,0,Chennai Super Kings,Deccan Chargers,Chennai Super Kings,bat,Chennai Super Kings,runs,38,N,,BR Doctrove,RB Tiffin
419165,Mumbai,2010-04-25,SK Raina,Dr DY Patil Sports Academy,1,Chennai Super Kings,Mumbai Indians,Chennai Super Kings,bat,Chennai Super Kings,runs,22,N,,RE Koertzen,SJA Taufel
501268,Mumbai,2011-05-24,SK Raina,Wankhede Stadium,1,Royal Challengers Bangalore,Chennai Super Kings,Chennai Super Kings,field,Chennai Super Kings,wickets,6,N,,Asad Rauf,SJA Taufel


Now you can see that the `neutral_venue` values are consistent with the teams involved. For example, match 419163, which is listed in `team2_home_ids`, keeps `neutral_venue == 0`, while the other playoff matches shown are set to 1. We have solved the biggest problem of the dataset. 


Players and teams often perform differently in playoff matches than in league-stage matches, so it is worth analyzing the two separately. We are going to define some new features that will help us do so.

In [None]:
def add_few_columns(eliminator_ids, qualifier_ids, final_ids, playoff_ids, team2_home_ids):
    # Note: this function updates the notebook-level `matches` DataFrame.
    # Stage indicators: 1 if the match id is in the corresponding list, else 0.
    matches['is_playoff'] = matches.index.isin(playoff_ids).astype(int)
    matches['is_eliminator'] = matches.index.isin(eliminator_ids).astype(int)
    matches['is_qualifier'] = matches.index.isin(qualifier_ids).astype(int)
    matches['is_final'] = matches.index.isin(final_ids).astype(int)

    # In league matches the home side is listed in team1 (for example, MI in team1 and CSK in team2
    # when the match is played at MI's home ground). Playoff matches do not follow this ordering,
    # so team2_home_ids (defined earlier) marks the playoff matches played at team2's home ground.
    is_team2_home = matches.index.isin(team2_home_ids)
    home_team = matches['team1'].where(~is_team2_home, matches['team2'])

    # home_of is set only for non-neutral matches and stays NaN for neutral venues.
    matches['home_of'] = home_team.where(matches['neutral_venue'] == 0)

In [None]:
add_few_columns(eliminator_ids, qualifier_ids, final_ids, playoff_ids, team2_home_ids)
matches.head(10)

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,eliminator,method,umpire1,umpire2,is_playoff,is_eliminator,is_qualifier,is_final,home_of
335982,Bangalore,2008-04-18,BB McCullum,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Kolkata Knight Riders,Royal Challengers Bangalore,field,Kolkata Knight Riders,runs,140,N,,Asad Rauf,RE Koertzen,0,0,0,0,Royal Challengers Bangalore
335983,Chandigarh,2008-04-19,MEK Hussey,Punjab Cricket Association IS Bindra Stadium,0,Kings XI Punjab,Chennai Super Kings,Chennai Super Kings,bat,Chennai Super Kings,runs,33,N,,MR Benson,SL Shastri,0,0,0,0,Kings XI Punjab
335984,Delhi,2008-04-19,MF Maharoof,Arun Jaitley Stadium,0,Delhi Capitals,Rajasthan Royals,Rajasthan Royals,bat,Delhi Capitals,wickets,9,N,,Aleem Dar,GA Pratapkumar,0,0,0,0,Delhi Capitals
335985,Mumbai,2008-04-20,MV Boucher,Wankhede Stadium,0,Mumbai Indians,Royal Challengers Bangalore,Mumbai Indians,bat,Royal Challengers Bangalore,wickets,5,N,,SJ Davis,DJ Harper,0,0,0,0,Mumbai Indians
335986,Kolkata,2008-04-20,DJ Hussey,Eden Gardens,0,Kolkata Knight Riders,Deccan Chargers,Deccan Chargers,bat,Kolkata Knight Riders,wickets,5,N,,BF Bowden,K Hariharan,0,0,0,0,Kolkata Knight Riders
335987,Jaipur,2008-04-21,SR Watson,Sawai Mansingh Stadium,0,Rajasthan Royals,Kings XI Punjab,Kings XI Punjab,bat,Rajasthan Royals,wickets,6,N,,Aleem Dar,RB Tiffin,0,0,0,0,Rajasthan Royals
335988,Hyderabad,2008-04-22,V Sehwag,Rajiv Gandhi International Stadium,0,Deccan Chargers,Delhi Capitals,Deccan Chargers,bat,Delhi Capitals,wickets,9,N,,IL Howell,AM Saheba,0,0,0,0,Deccan Chargers
335989,Chennai,2008-04-23,ML Hayden,MA Chidambaram Stadium,0,Chennai Super Kings,Mumbai Indians,Mumbai Indians,field,Chennai Super Kings,runs,6,N,,DJ Harper,GA Pratapkumar,0,0,0,0,Chennai Super Kings
335990,Hyderabad,2008-04-24,YK Pathan,Rajiv Gandhi International Stadium,0,Deccan Chargers,Rajasthan Royals,Rajasthan Royals,field,Rajasthan Royals,wickets,3,N,,Asad Rauf,MR Benson,0,0,0,0,Deccan Chargers
335991,Chandigarh,2008-04-25,KC Sangakkara,Punjab Cricket Association IS Bindra Stadium,0,Kings XI Punjab,Mumbai Indians,Mumbai Indians,field,Kings XI Punjab,runs,66,N,,Aleem Dar,AM Saheba,0,0,0,0,Kings XI Punjab
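
Once indicator columns like `is_playoff` exist, any statistic can be split by stage with a `groupby`. The sketch below compares how often the toss winner also wins the match in league and playoff games. The rows, the team labels `A`, `B`, `C`, and the `toss_win_match` column are made up for illustration and say nothing about the real IPL data.

```python
import pandas as pd

toy = pd.DataFrame({
    'toss_winner': ['A', 'B', 'A', 'A', 'C'],
    'winner':      ['A', 'A', 'A', 'B', 'C'],
    'is_playoff':  [0, 0, 0, 1, 1],
})

# 1 when the side that won the toss also won the match.
toy['toss_win_match'] = (toy['toss_winner'] == toy['winner']).astype(int)

# The mean of a 0/1 column is a rate; group it by stage.
rate_by_stage = toy.groupby('is_playoff')['toss_win_match'].mean()
print(rate_by_stage)
```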


Now we no longer need the `eliminator` column, so let us drop it.

In [None]:
matches.drop('eliminator', axis=1, inplace=True)
matches.columns

Index(['city', 'date', 'player_of_match', 'venue', 'neutral_venue', 'team1',
       'team2', 'toss_winner', 'toss_decision', 'winner', 'result',
       'result_margin', 'method', 'umpire1', 'umpire2', 'is_playoff',
       'is_eliminator', 'is_qualifier', 'is_final', 'home_of'],
      dtype='object')

The players' and umpires' names are stored as initials plus surname, which makes them hard to read. Names like `YK Pathan`, `KC Sangakkara`, and `SR Watson` are hard to interpret. 
Thus, let us expand them into full names. 

We need to define some variables for the same. I have used [this gist](https://gist.github.com/harshdhamecha/19100ffedaf096b8d779e4daa5e8c0e7) to define the following variables.

In [None]:
umpire_names = {'Asad Rauf': 'Asad Rauf', 'MR Benson': 'Mark Benson', 'Aleem Dar': 'Aleem Dar', 'SJ Davis': 'Steve Davis', 'BF Bowden': 'Billy Bowden', 
                         'IL Howell': 'Ian Howell', 'DJ Harper': 'Daryl Harper', 'RE Koertzen': 'Rudi Koertzen', 'BR Doctrove': 'Billy Doctrove', 'AV Jayaprakash': 'Arani Jayaprakash', 
                         'BG Jerling': 'Brian Jerling', 'M Erasmus': 'Marais Erasmus', 'HDPK Dharmasena': 'Kumar Dharmasena', 'S Asnani': 'Sudhir Asnani', 
                         'GAV Baxter': 'Gary Baxter', 'SS Hazare': 'Sanjay Hazare', 'K Hariharan': 'Krishna Hariharan', 'SL Shastri': 'Suresh Shastri', 'SK Tarapore': 'Shavir Tarapore', 
                         'S Ravi': 'Sundaram Ravi', 'SJA Taufel': 'Simon Taufel', 'S Das': 'Subrat Das', 'PR Reiffel': 'Paul Reiffel', 'JD Cloete': 'Johan Cloete', 'AK Chaudhary': 'Anil Chaudhary', 
                         'VA Kulkarni': 'Vineet Kulkarni', 'BNJ Oxenford': 'Bruce Oxenford', 'CK Nandan': 'C. K. Nandan', 'C Shamshuddin': 'Chettithody Shamshuddin', 'NJ Llong': 'Nigel Llong', 
                         'RK Illingworth': 'Richard Illingworth', 'K Srinath': 'Krishnaraj Srinath', 'SD Fry': 'Simon Fry', 'CB Gaffaney': 'Chris Gaffaney', 'PG Pathak': 'Pashchim Pathak', 'Nitin Menon': 'Nitin Menon', 
                         'AY Dandekar': 'Anil Dandekar', 'KN Ananthapadmanabhan': 'K. N. Ananthapadmanabhan', 'A Nand Kishore': 'Nand Kishore', 'A Deshmukh': 'Abhijit Deshmukh', 'RJ Tucker': 'Rod Tucker', 'VK Sharma': 'Virender Sharma', 
                         'IJ Gould': 'Ian Gould', 'GA Pratapkumar': 'G. A. Pratapkumar', 'RB Tiffin': 'Russell Tiffin', 'I Shivram': 'Ivaturi Shivram', 'TH Wijewardene': 'Tyron Wijewardene', 'AL Hill': 'Tony Hill', 'Subroto Das': 'Subroto Das', 
                         'K Srinivasan': 'Krishnamachari Srinivasan', 'AM Saheba': 'Amiesh Saheba', 'RM Deshpande': 'Rajesh Deshpande', 'K Bharatan': 'Krishnamachari Bharatan', 'YC Barde': 'Yeshwant Barde', 
                         'UV Gandhe': 'Ulhas Gandhe', 'SD Ranade': 'Shashank Ranade'}

In [None]:
player_names = {'TM Head': 'Travis Head', 'DT Christian': 'Daniel Christian', 'DS Kulkarni': 'Dhawal Kulkarni', 'CJ Anderson': 'Corey Anderson', 'AR Bawne': 'Ankit Bawne', 'SP Jackson': 'Sheldon Jackson', 'MJ Guptill': 'Martin Guptill', 'Anureet Singh': 'Anureet Singh', 'DJ Thornely': 'Dominic Thornely', 'PR Shah': 'Pinal Shah', 'AM Nayar': 'Abhishek Nayar', 'SM Pollock': 'Shaun Pollock', 'AS Yadav': 'Arjun Yadav', 'MA Khote': 'Musavir Khote', 'DA Warner': 'David Warner', 'S Dhawan': 'Shikhar Dhawan', 'MC Henriques': 'Moises Henriques', 'Yuvraj Singh': 'Yuvraj Singh', 'DJ Hooda': 'Deepak Hooda', 'BCJ Cutting': 'Ben Cutting', 'CH Gayle': 'Chris Gayle', 'Mandeep Singh': 'Mandeep Singh', 'KM Jadhav': 'Kedar Jadhav', 'SR Watson': 'Shane Watson', 'Sachin Baby': 'Sachin Baby', 'STR Binny': 'Stuart Binny', 'S Aravind': 'Sreenath Aravind', 'YS Chahal': 'Yuzvendra Chahal', 'TS Mills': 'Tymal Mills', 'A Choudhary': 'Aniket Choudhary', 'PA Patel': 'Parthiv Patel', 'JC Buttler': 'Jos Buttler', 'RG Sharma': 'Rohit Sharma', 'N Rana': 'Nitish Rana', 'AT Rayudu': 'Ambati Rayudu', 'KH Pandya': 'Krunal Pandya', 'KA Pollard': 'Kieron Pollard', 'HH Pandya': 'Hardik Pandya', 'TG Southee': 'Tim Southee', 'AM Rahane': 'Ajinkya Rahane', 'MA Agarwal': 'Mayank Agarwal', 'SPD Smith': 'Steve Smith', 'BA Stokes': 'Ben Stokes', 
               'MS Dhoni': 'Mahendra Singh Dhoni', 'JJ Roy': 'Jason Roy', 'BB McCullum': 'Brendon McCullum', 'SK Raina': 'Suresh Raina', 'AJ Finch': 'Aaron Finch', 'KD Karthik': 'Dinesh Karthik', 'G Gambhir': 'Gautam Gambhir', 'CA Lynn': 'Chris Lynn', 'MK Tiwary': 'Manoj Tiwary', 'HM Amla': 'Hashim Amla', 'M Vohra': 'Manan Vohra', 'WP Saha': 'Wriddhiman Saha', 'AR Patel': 'Axar Patel', 'GJ Maxwell': 'Glenn Maxwell', 'DA Miller': 'David Miller', 'Vishnu Vinod': 'Vishnu Vinod', 'Iqbal Abdulla': 'Iqbal Abdulla', 'P Negi': 'Pawan Negi', 'AP Tare': 'Aditya Tare', 'SW Billings': 'Sam Billings', 'KK Nair': 'Karun Nair', 'SV Samson': 'Sanju Samson', 'RR Pant': 'Rishabh Pant', 'CH Morris': 'Chris Morris', 'CR Brathwaite': 'Carlos Brathwaite', 'PJ Cummins': 'Pat Cummins', 'A Mishra': 'Amit Mishra', 'S Nadeem': 'Shahbaz Nadeem', 'Z Khan': 'Zaheer Khan', 'DR Smith': 'Dwayne Smith', 'P Kumar': 'Praveen Kumar', 'Basil Thampi': 'Basil Thampi', 'RV Uthappa': 'Robin Uthappa', 'MK Pandey': 'Manish Pandey', 'YK Pathan': 'Yusuf Pathan', 'SA Yadav': 'Suryakumar Yadav', 'CR Woakes': 'Chris Woakes', 'SP Narine': 'Sunil Narine', 'Harbhajan Singh': 'Harbhajan Singh', 'AB de Villiers': 'AB de Villiers', 'F du Plessis': 'Faf du Plessis', 'RA Tripathi': 'Rahul Tripathi', 'R Bhatia': 'Rajat Bhatia', 'DL Chahar': 'Deepak Chahar', 'A Zampa': 'Adam Zampa', 
               'AB Dinda': 'Ashoke Dinda', 'Imran Tahir': 'Imran Tahir', 'NV Ojha': 'Naman Ojha', 'V Shankar': 'Vijay Shankar', 'Rashid Khan': 'Rashid Khan', 'B Kumar': 'Bhuvneshwar Kumar', 'MP Stoinis': 'Marcus Stoinis', 'MM Sharma': 'Mohit Sharma', 'VR Aaron': 'Varun Aaron', 'V Kohli': 'Virat Kohli', 'MJ McClenaghan': 'Mitchell McClenaghan', 'Ankit Sharma': 'Ankit Sharma', 'SN Thakur': 'Shardul Thakur', 'RD Chahar': 'Rahul Chahar', 'LH Ferguson': 'Lockie Ferguson', 'C de Grandhomme': 'Colin de Grandhomme', 'Bipul Sharma': 'Bipul Sharma', 'SS Iyer': 'Shreyas Iyer', 'EJG Morgan': 'Eoin Morgan', 'KC Cariappa': 'KC Cariappa', 'Sandeep Sharma': 'Sandeep Sharma', 'Ishan Kishan': 'Ishan Kishan', 'JD Unadkat': 'Jaydev Unadkat', 'AF Milne': 'Adam Milne', 'S Badree': 'Samuel Badree', 'AD Mathews': 'Angelo Mathews', 'Mohammed Shami': 'Mohammed Shami', 'Mohammad Nabi': 'Mohammad Nabi', 'I Sharma': 'Ishant Sharma', 'RA Jadeja': 'Ravindra Jadeja', 'AJ Tye': 'Andrew Tye', 'KS Williamson': 'Kane Williamson', 'SE Marsh': 'Shaun Marsh', 'Shakib Al Hasan': 'Shakib Al Hasan', 'JP Faulkner': 'James Faulkner', 'MG Johnson': 'Mitchell Johnson', 'K Rabada': 'Kagiso Rabada', 'AD Nath': 'Akshdeep Nath', 'NM Coulter-Nile': 'Nathan Coulter-Nile', 'Kuldeep Yadav': 'Kuldeep Yadav', 'UT Yadav': 'Umesh Yadav', 'Washington Sundar': 'Washington Sundar', 
               'KV Sharma': 'Karn Sharma', 'DM Bravo': 'Darren Bravo', 'IK Pathan': 'Irfan Pathan', 'Ankit Soni': 'Ankit Soni', 'JJ Bumrah': 'Jasprit Bumrah', 'SL Malinga': 'Lasith Malinga', 'PJ Sangwan': 'Pradeep Sangwan', 'S Kaul': 'Siddarth Kaul', 'LMP Simmons': 'Lendl Simmons', 'MN Samuels': 'Marlon Samuels', 'Swapnil Singh': 'Swapnil Singh', 'R Tewatia': 'Rahul Tewatia', 'MM Patel': 'Munaf Patel', 'SS Tiwary': 'Saurabh Tiwary', 'TA Boult': 'Trent Boult', 'CJ Jordan': 'Chris Jordan', 'IR Jaggi': 'Ishank Jaggi', 'PP Chawla': 'Piyush Chawla', 'AS Rajpoot': 'Ankit Rajpoot', 'SC Ganguly': 'Sourav Ganguly', 'RT Ponting': 'Ricky Ponting', 'DJ Hussey': 'David Hussey', 'Mohammad Hafeez': 'Mohammad Hafeez', 'R Dravid': 'Rahul Dravid', 'W Jaffer': 'Wasim Jaffer', 'JH Kallis': 'Jacques Kallis', 'CL White': 'Cameron White', 'MV Boucher': 'Mark Boucher', 'B Akhil': 'Balachandra Akhil', 'AA Noffke': 'Ashley Noffke', 'SB Joshi': 'Sunil Joshi', 'ML Hayden': 'Matthew Hayden', 'MEK Hussey': 'Michael Hussey', 'JDP Oram': 'Jacob Oram', 'S Badrinath': 'Subramaniam Badrinath', 'K Goel': 'Karan Goel', 'JR Hopes': 'James Hopes', 'KC Sangakkara': 'Kumar Sangakkara', 'SM Katich': 'Simon Katich', 'T Kohli': 'Taruwar Kohli', 'M Kaif': 'Mohammad Kaif', 'DS Lehmann': 'Darren Lehmann', 'M Rawat': 'Mahesh Rawat', 'D Salunkhe': 'Dinesh Salunkhe', 'SK Warne': 'Shane Warne', 
               'SK Trivedi': 'Siddharth Trivedi', 'V Sehwag': 'Virender Sehwag', 'L Ronchi': 'Luke Ronchi', 'ST Jayasuriya': 'Sanath Jayasuriya', 'S Chanderpaul': 'Shivnarine Chanderpaul', 'LRPL Taylor': 'Ross Taylor', 'AC Gilchrist': 'Adam Gilchrist', 'Y Venugopal Rao': 'Venugopal Rao', 'VVS Laxman': 'VVS Laxman', 'A Symonds': 'Andrew Symonds', 'SB Styris': 'Scott Styris', 'SB Bangar': 'Sanjay Bangar', 'WPUJC Vaas': 'Chaminda Vaas', 'RP Singh': 'RP Singh', 'LR Shukla': 'Laxmi Shukla', 'DPMD Jayawardene': 'Mahela Jayawardene', 'S Sohal': 'Sunny Sohal', 'B Lee': 'Brett Lee', 'WA Mota': 'Wilkin Mota', 'Kamran Akmal': 'Kamran Akmal', 'Shahid Afridi': 'Shahid Afridi', 'DJ Bravo': 'Dwayne Bravo', 'A Nehra': 'Ashish Nehra', 'GC Smith': 'Graeme Smith', 'Pankaj Singh': 'Pankaj Singh', 'RR Sarwan': 'Ramnaresh Sarwan', 'S Sreesanth': 'S Sreesanth', 'VRV Singh': 'Vikram Singh', 'R Vinay Kumar': 'Vinay Kumar', 'AB Agarkar': 'Ajit Agarkar', 'M Kartik': 'Murali Kartik', 'Shoaib Malik': 'Shoaib Malik', 'MF Maharoof': 'Farveez Maharoof', 'VY Mahesh': 'Yo Mahesh', 'TM Srivastava': 'Tanmay Srivastava', 'B Chipli': 'Bharat Chipli', 'DW Steyn': 'Dale Steyn', 'DB Das': 'Debabrata Das', 'HH Gibbs': 'Herschelle Gibbs', 'DNT Zoysa': 'Nuwan Zoysa', 'D Kalyankrishna': 'Doddapaneni Kalyankrishna', 'SA Asnodkar': 'Swapnil Asnodkar', 'Sohail Tanvir': 'Sohail Tanvir', 
               'Salman Butt': 'Salman Butt', 'BJ Hodge': 'Brad Hodge', 'Umar Gul': 'Umar Gul', 'SP Fleming': 'Stephen Fleming', 'S Vidyut': 'Vidyut Sivaramakrishnan', 'JA Morkel': 'Albie Morkel', 'LPC Silva': 'Chamara Silva', 'DB Ravi Teja': 'Dwaraka Ravi Teja', 'Misbah-ul-Haq': 'Misbah-ul-Haq', 'YV Takawale': 'Yogesh Takawale', 'RR Raje': 'Rohan Raje', 'Mohammad Asif': 'Mohammad Asif', 'GD McGrath': 'Glenn McGrath', 'Joginder Sharma': 'Joginder Sharma', 'MS Gony': 'Manpreet Gony', 'M Muralitharan': 'Muttiah Muralitharan', 'M Ntini': 'Makhaya Ntini', 'DT Patil': 'Devraj Patil', 'A Kumble': 'Anil Kumble', 'S Anirudha': 'Anirudha Srikkanth', 'CK Kapugedera': 'Chamara Kapugedera', 'A Chopra': 'Aakash Chopra', 'T Taibu': 'Tatenda Taibu', 'J Arunkumar': 'Jagadeesh Arunkumar', 'PP Ojha': 'Pragyan Ojha', 'SP Goswami': 'Shreevats Goswami', 'SR Tendulkar': 'Sachin Tendulkar', 'LA Pomersbach': 'Luke Pomersbach', 'Younis Khan': 'Younis Khan', 'PM Sarvesh Kumar': 'Sarvesh Kumar', 'DP Vijaykumar': 'Paidikalva Vijaykumar', 'Shoaib Akhtar': 'Shoaib Akhtar', 'Abdur Razzak': 'Abdur Razzak', 'H Das': 'Halhadar Das', 'SD Chitnis': 'Siddharth Chitnis', 'CRD Fernando': 'Dilhara Fernando', 'L Balaji': 'Lakshmipathy Balaji', 'A Mukund': 'Abhinav Mukund', 'RR Powar': 'Ramesh Powar', 'JP Duminy': 'JP Duminy', 'A Flintoff': 'Andrew Flintoff', 'T Thushara': 'Thilan Thushara', 'JD Ryder': 'Jesse Ryder', 
               'KP Pietersen': 'Kevin Pietersen', 'T Henderson': 'Tyron Henderson', 'Kamran Khan': 'Kamran Khan', 'RS Bopara': 'Ravi Bopara', 'R Bishnoi': 'Rajesh Bishnoi', 'FH Edwards': 'Fidel Edwards', 'PC Valthaty': 'Paul Valthaty', 'RJ Quiney': 'Rob Quiney', 'AS Raut': 'Abhishek Raut', 'Yashpal Singh': 'Yashpal Singh', 'M Manhas': 'Mithun Manhas', 'AN Ghosh': 'Arindam Ghosh', 'BAW Mendis': 'Ajantha Mendis', 'DL Vettori': 'Daniel Vettori', 'MN van Wyk': 'Morné van Wyk', 'RE van der Merwe': 'Roelof van der Merwe', 'TL Suman': 'Tirumalasetti Suman', 'Shoaib Ahmed': 'Shoaib Ahmed', 'GR Napier': 'Graham Napier', 'KP Appanna': 'K P Appanna', 'LA Carseldine': 'Lee Carseldine', 'SM Harwood': 'Shane Harwood', 'M Vijay': 'Murali Vijay', 'SB Jakati': 'Shadab Jakati', 'RJ Harris': 'Ryan Harris', 'D du Preez': 'Dillon du Preez', 'M Morkel': 'Morne Morkel', 'J Botha': 'Johan Botha', 'C Nanda': 'Chetanya Nanda', 'Mashrafe Mortaza': 'Mashrafe Mortaza', 'A Singh': 'Anureet Singh', 'GJ Bailey': 'George Bailey', 'AB McDonald': 'Andrew McDonald', 'Y Nagar': 'Yogesh Nagar', 'SS Shaikh': 'Shoaib Shaikh', 'R Ashwin': 'Ravichandran Ashwin', 'Mohammad Ashraful': 'Mohammad Ashraful', 'CA Pujara': 'Cheteshwar Pujara', 'OA Shah': 'Owais Shah', 'Anirudh Singh': 'Anirudh Singh', 'Jaskaran Singh': 'Jaskaran Singh', 'R Sathish': 'Rajagopal Sathish', 'R McLaren': 'Ryan McLaren', 'AA Jhunjhunwala': 'Abhishek Jhunjhunwala', 
               'P Dogra': 'Paras Dogra', 'A Uniyal': 'Amit Uniyal', 'MS Bisla': 'Manvinder Bisla', 'YA Abdulla': 'Yusuf Abdulla', 'JM Kemp': 'Justin Kemp', 'S Tyagi': 'Sudeep Tyagi', 'RS Gavaskar': 'Rohan Gavaskar', 'SE Bond': 'Shane Bond', 'S Ladda': 'Sarabjit Ladda', 'DP Nannes': 'Dirk Nannes', 'MJ Lumb': 'Michael Lumb', 'DR Martyn': 'Damien Martyn', 'S Narwal': 'Sumit Narwal', 'AB Barath': 'Adrian Barath', 'FY Fazal': 'Faiz Fazal', 'AC Voges': 'Adam Voges', 'MD Mishra': 'Mohnish Mishra', 'KB Arun Karthik': 'Arun Karthik', 'KAJ Roach': 'Kemar Roach', 'PD Collingwood': 'Paul Collingwood', 'CK Langeveldt': 'Charl Langeveldt', 'VS Malik': 'Vikramjeet Malik', 'A Mithun': 'Abhimanyu Mithun', 'AN Ahmed': 'Abu Nechim', 'RS Sodhi': 'Reetinder Singh Sodhi', 'DE Bollinger': 'Doug Bollinger', 'S Sriram': 'Sridharan Sriram', 'B Sumanth': 'Bodapati Sumanth', 'C Madan': 'Chandan Madan', 'AG Paunikar': 'Amit Paunikar', 'MR Marsh': 'Mitchell Marsh', 'Harmeet Singh': 'Harmeet Singh', 'RV Gomez': 'Raiphi Gomez', 'AUK Pathan': 'Asad Pathan', 'UBT Chand': 'Unmukt Chand', 'DJ Jacobs': 'Davy Jacobs', 'Sunny Singh': 'Sunny Singh', 'NJ Rimmington': 'Nathan Rimmington', 'AL Menaria': 'Ashok Menaria', 'WD Parnell': 'Wayne Parnell', 'JJ van der Wath': 'Johan van der Wath', 'R Ninan': 'Ryan Ninan', 'MS Wade': 'Matthew Wade', 'TD Paine': 'Tim Paine', 'SB Wagh': 'Shrikant Wagh', 'AC Thomas': 'Alfonso Thomas', 
               'JEC Franklin': 'James Franklin', 'DH Yagnik': 'Dishant Yagnik', 'S Randiv': 'Suraj Randiv', 'BJ Haddin': 'Brad Haddin', 'NLTC Perera': 'Thisara Perera', 'NL McCullum': 'Nathan McCullum', 'JE Taylor': 'Jerome Taylor', 'J Syed Mohammad': 'Syed Mohammad', 'RN ten Doeschate': 'Ryan ten Doeschate', 'TR Birt': 'Travis Birt', 'Harpreet Singh': 'Harpreet Singh Bhatia', 'M Klinger': 'Michael Klinger', 'AC Blizzard': 'Aiden Blizzard', 'I Malhotra': 'Ishan Malhotra', 'L Ablish': 'Love Ablish', 'CJ Ferguson': 'Callum Ferguson', 'AA Chavan': 'Ankeet Chavan', 'ND Doshi': 'Nayan Doshi', 'Y Gnaneswara Rao': 'Gnaneshwara Rao', 'S Rana': 'Sachin Rana', 'BA Bhatt': 'Bhargav Bhatt', 'RE Levi': 'Richard Levi', 'KK Cooper': 'Kevon Cooper', 'HV Patel': 'Harshal Patel', 'DAJ Bracewell': 'Doug Bracewell', 'GB Hogg': 'Brad Hogg', 'RR Bhatkal': 'Raju Bhatkal', 'CJ McKay': 'Clint McKay', 'N Saini': 'Navdeep Saini', 'Azhar Mahmood': 'Azhar Mahmood', 'RJ Peterson': 'Robin Peterson', 'KMDN Kulasekara': 'Nuwan Kulasekara', 'A Ashish Reddy': 'Ashish Reddy', 'V Pratap Singh': 'Veer Pratap Singh', 'BB Samantray': 'Biplab Samantray', 'MJ Clarke': 'Michael Clarke', 'Gurkeerat Singh': 'Gurkeerat Singh', 'AP Majumdar': 'Anustup Majumdar', 'PA Reddy': 'Akshath Reddy', 'K Upadhyay': 'Krishnakant Upadhyay', 'P Awana': 'Parvinder Awana', 'AD Russell': 'Andre Russell', 'A Chandila': 'Ajit Chandila', 'Sunny Gupta': 'Sunny Gupta', 
               'MC Juneja': 'Manpreet Juneja', 'GH Vihari': 'Hanuma Vihari', 'MDKJ Perera': 'Kusal Perera', 'R Shukla': 'Rahul Shukla', 'B Laughlin': 'Ben Laughlin', 'BMAJ Mendis': 'Jeevan Mendis', 'R Rampaul': 'Ravi Rampaul', 'BJ Rohrer': 'Ben Rohrer', 'KL Rahul': 'KL Rahul', 'Q de Kock': 'Quinton de Kock', 'R Dhawan': 'Rishi Dhawan', 'LJ Wright': 'Luke Wright', 'IC Pandey': 'Ishwar Pandey', 'CM Gautam': 'CM Gautam', 'DJG Sammy': 'Daren Sammy', 'KW Richardson': 'Kane Richardson', 'UA Birla': 'Udit Birla', 'Parvez Rasool': 'Parvez Rasool', 'PV Tambe': 'Pravin Tambe', 'NJ Maddinson': 'Nic Maddinson', 'JDS Neesham': 'James Neesham', 'MA Starc': 'Mitchell Starc', 'BR Dunk': 'Ben Dunk', 'RR Rossouw': 'Rilee Rossouw', 'Shivam Sharma': 'Shivam Sharma', 'VH Zol': 'Vijay Zol', 'BE Hendricks': 'Beuran Hendricks', 'S Gopal': 'Shreyas Gopal', 'M de Lange': 'Marchant de Lange', 'JO Holder': 'Jason Holder', 'Karanveer Singh': 'Karanveer Singh', 'SA Abbott': 'Sean Abbott', 'J Suchith': 'Jagadeesha Suchith', 'RG More': 'Ronit More', 'D Wiese': 'David Wiese', 'SN Khan': 'Sarfaraz Khan', 'DJ Muthuswami': 'Domnic Joseph', 'C Munro': 'Colin Munro', 'P Sahu': 'Pardeep Sahu', 'KJ Abbott': 'Kyle Abbott', 'M Ashwin': 'Murugan Ashwin', 'NS Naik': 'Nikhil Naik', 'PSP Handscomb': 'Peter Handscomb', 'J Yadav': 'Jayant Yadav', 'UT Khawaja': 'Usman Khawaja', 'F Behardien': 'Farhaan Behardien', 'BB Sran': 'Barinder Sran', 'S Kaushik': 'Shivil Kaushik', 
               'ER Dwivedi': 'Eklavya Dwivedi', 'E Lewis': 'Evin Lewis', 'M Wood': 'Mark Wood','K Gowtham': 'Krishnappa Gowtham', 'T Curran': 'Tom Curran', 'M Markande': 'Mayank Markande', 'B Stanlake': 'Billy Stanlake', 'M Ur Rahman': 'Mujeeb Ur Rahman', 'A Dananjaya': 'Akila Dananjaya', 'L Plunkett': 'Liam Plunkett', 'Mustafizur Rahman': 'Mustafizur Rahman', 'A Hales': 'Alex Hales', 'M Lomror': 'Mahipal Lomror', 'D Shorey': 'Dhruv Shorey', 'P Krishna': 'Prasidh Krishna', 'P Chopra': 'Prashant Chopra', 'S Dube': 'Shivam Dube', 'R Salam': 'Rasikh Salam', 'N Pooran': 'Nicholas Pooran', 'N Naik': 'Nikhil Naik', 'H Vihari': 'Hanuma Vihari', 'P R Barman': 'Prayas Ray Barman', 'H Viljoen': 'Hardus Viljoen', 'Avesh Khan': 'Avesh Khan', 'S Lamichhane': 'Sandeep Lamichhane', 'S Sharma': 'Sandeep Sharma', 'H Gurney': 'Harry Gurney', 'SD Lad': 'Siddhesh Lad', 'A Joseph': 'Alzarri Joseph', 'R Parag': 'Riyan Parag', 'M Santner': 'Mitchell Santner', 'J Denly': 'Joe Denly', 'L Livingstone': 'Liam Livingstone', 'K Ahmed': 'Khaleel Ahmed', 'A Turner': 'Ashton Turner', 'H Brar': 'Harpreet Brar', 'S Rutherford': 'Sherfane Rutherford', 
               'P Raj': 'Prithvi Raj', 'AR Bawne': 'Ankit Bawne', 'SP Jackson': 'Sheldon Jackson', 'MJ Guptill': 'Martin Guptill', 'Anureet Singh': 'Anureet Singh', 'DJ Thornely': 'Dominic Thornely', 'AM Nayar': 'Abhishek Nayar', 'SM Pollock': 'Shaun Pollock', 'TM Dilshan': 'Tillakaratne Dilshan', 'AD Mascarenhas': 'Dimitri Mascarenhas', 'Niraj Patel': 'Niraj Patel', 'A Nel': 'André Nel', 'J Theron': 'Rusty Theron', 'SJ Srivastava': 'Shalabh Srivastava', 'SW Tait': 'Shaun Tait', 'C Ganapathy': 'Chandrasekar Ganapathy', 'P Parameswaran': 'Prasanth Parameswaran', 'CA Ingram': 'Colin Ingram', 'TP Sudhindra': 'TP Sudhindra', 'BW Hilfenhaus': 'Benjamin Hilfenhaus', 'Mohammed Siraj': 'Mohammed Siraj', 'H Klaasen': 'Heinrich Klaasen', 'J Archer': 'Jofra Archer', 'R Bhui': 'Ricky Bhui', 'Tejas Baroka': 'Tejas Baroka', 'SS Agarwal': 'Shubham Agarwal', 'MJ Henry': 'Matt Henry', 'P Amarnath': 'Palani Amarnath', 'B Geeves': 'Brett Geeves', 'Gagandeep Singh': 'Gagandeep Singh', 'AM Salvi': 'Aavishkar Salvi', 'RR Bose': 'Ranadeb Bose', 'SS Sarkar': 'Soumya Sarkar', 'AA Kazi': 'Abrar Kazi', 'Anand Rajan': 'Anand Rajan', 'P Prasanth': 'Padmanabhan Prasanth', 'SS Mundhe': 'Shrikant Mundhe', 'RW Price': 'Ray Price', 'Harmeet Singh': 'Harmeet Singh', 'P Suyal': 'Pawan Suyal', 'MG Neser': 'Michael Neser', 'K Santokie': 'Krishmar Santokie', 'JW Hastings': 'John Hastings', 'GS Sandhu': 'Gurinder Sandhu', 
               'T Shamsi': 'Tabraiz Shamsi', 'SM Boland': 'Scott Boland', 'K Khejroliya': 'Kulwant Khejroliya', 'L Ngidi': 'Lungi Ngidi', 'KM Asif': 'KM Asif', 'D Willey': 'David Willey', 'L Ferguson': 'Lockie Ferguson', 'V Chakravarthy': 'Varun Chakravarthy', 'J Behrendorff': 'Jason Behrendorff', 'S Kuggeleijn': 'Scott Kuggeleijn', 'S Midhun': 'Sudhesan Midhun', 'O Thomas': 'Oshane Thomas', 'A Roy': 'Anukul Roy', 'S Warrier': 'Sandeep Warrier', 'U Kaul': 'Uday Kaul', 'TM Dilshan': 'Tillakaratne Dilshan', 'AD Mascarenhas': 'Dimitri Mascarenhas', 'Niraj Patel': 'Niraj Patel', 'VS Yeligati': 'Vikrant Yeligati', 'AA Bilakhia': 'Azhar Bilakhia', 'J Theron': 'Juan Theron', 'SJ Srivastava': 'Shalabh Srivastava', 'R Sharma': 'Rahul Sharma', 'SW Tait': 'Shaun Tait', 'AP Dole': 'Aditya Dole', 'AG Murtaza': 'Ali Murtuza', 'CA Ingram': 'Colin Ingram', 'P Parameswaran': 'Prashanth Parameswaran', 'DJ Harris': 'Daniel Harris', 'MSM Senanayake': 'Sachithra Senanayake', 'X Thalaivan Sargunam': 'Thalaivan Sargunam', 'Mohammed Siraj': 'Mohammed Siraj', 'H Klaasen': 'Heinrich Klaasen', 'R Bhui': 'Ricky Bhui', 'NB Singh': 'Nathu Singh', 'RA Shaikh': 'Rahil Shaikh', 'MB Parmar': 'Mohnsinh Parmar', 'AT Carey': 'Alex Carey', 'Y Prithvi Raj': 'Prithvi Raj', 'CJ Dala': 'Junior Dala', 'SO Hetmyer': 'Shimron Hetmyer', 'Mujeeb Ur Rahman': 'Mujeeb Ur Rahman', 'MK Lomror': 'Mahipal Lomror', 'I Udana': 'Isuru Udana', 'YBK Jaiswal': 'Yashasvi Jaiswal', 
               'TK Curran': 'Tom Curran', 'Shubman Gill': 'Shubman Gill', 'AD Hales': 'Alex Hales', 'Rasikh Salam': 'Rasikh Salam', 'M Prasidh Krishna': 'Prasidh Krishna', 'CV Varun': 'Varun Chakravarthy', 'P Simran Singh': 'Prabhsimran Singh', 'RK Bhui': 'Ricky Bhui', 'Abdul Samad': 'Abdul Samad', 'DR Sams': 'Daniel Sams', 'JP Behrendorff': 'Jason Behrendorff', 'AS Roy': 'Anukul Roy', 'JM Bairstow': 'Jonny Bairstow', 'N Jagadeesan': 'Narayan Jagadeesan', 'SS Cottrell': 'Sheldon Cottrell', 'MM Ali': 'Moeen Ali', 'Ravi Bishnoi': 'Ravi Bishnoi', 'LE Plunkett': 'Liam Plunkett', 'IS Sodhi': 'Ish Sodhi', 'Kartik Tyagi': 'Kartik Tyagi', 'AS Joseph': 'Alzarri Joseph', 'Shivam Mavi': 'Shivam Mavi', 'JR Philippe': 'Josh Philippe', 'SE Rutherford': 'Sherfane Rutherford', 'KL Nagarkoti': 'Kamlesh Nagarkoti', 'Monu Kumar': 'Monu Kumar', 'SC Kuggeleijn': 'Scott Kuggeleijn', 'KMA Paul': 'Keemo Paul', 'LS Livingstone': 'Liam Livingstone', 'MA Wood': 'Mark Wood', 'SM Curran': 'Sam Curran', 'P Dubey': 'Praveen Dubey', 'GC Viljoen': 'Hardus Viljoen', 'MJ Santner': 'Mitchell Santner', 'RD Gaikwad': 'Ruturaj Gaikwad', 'CJ Green': 'Chris Green', 'DR Shorey': 'Dhruv Shorey', 'JL Denly': 'Joe Denly', 'JL Pattinson': 'James Pattinson', 'JC Archer': 'Jofra Archer', 'T Banton': 'Tom Banton', 'S Sandeep Warrier': 'Sandeep Warrier', 'JR Hazlewood': 'Josh Hazlewood', 'Shahbaz Ahmed': 'Shahbaz Ahmed', 'AJ Turner': 'Ashton Turner', 'Arshdeep Singh': 'Arshdeep Singh', 'KK Ahmed': 'Khaleel Ahmed', 
               'JPR Scantlebury-Searles': 'Javon Searles', 'HF Gurney': 'Harry Gurney', 'DJ Willey': 'David Willey', 'Abhishek Sharma': 'Abhishek Sharma', 'A Nortje': 'Anrich Nortje', 'Harpreet Brar': 'Harpreet Brar', 'TU Deshpande': 'Tushar Deshpande', 'D Padikkal': 'Devdutt Padikkal', 'PP Shaw': 'Prithvi Shaw', 'SMSM Senanayake': 'Sachithra Senanayake', 'RK Singh': 'Rinku Singh', 'NA Saini': 'Navdeep Saini', 'DJM Short': "D'Arcy Short", 'PK Garg': 'Priyam Garg', 'P Ray Barman': 'Prayas Barman'}
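
With hand-built dictionaries this large, a quick sanity check can help before applying them. The sketch below is an assumption-laden illustration, not part of the original workflow: the helper names `unmapped_names` and `padded_values` and the toy data are hypothetical. It lists names that appear in the data but have no mapping, and full names that carry stray whitespace.

```python
import pandas as pd


def unmapped_names(df, columns, mapping):
    """Return the sorted values in `columns` that have no key in `mapping`."""
    seen = set()
    for col in columns:
        seen.update(df[col].dropna().unique())
    return sorted(seen - set(mapping))


def padded_values(mapping):
    """Return full names that carry leading or trailing whitespace."""
    return [name for name in mapping.values() if name != name.strip()]


# Toy data standing in for `matches` and `player_names` (illustrative only).
demo = pd.DataFrame({'player_of_match': ['BB McCullum', 'MEK Hussey', 'XY Example']})
demo_names = {'BB McCullum': 'Brendon McCullum', 'MEK Hussey': 'Michael Hussey '}

missing = unmapped_names(demo, ['player_of_match'], demo_names)
padded = padded_values(demo_names)
print(missing)
print(padded)
```

On the real data, the same helpers could be called with `matches`, the name columns, and `player_names` or `umpire_names`.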

Let us write our final function :)

In [None]:
def change_name_format(df, player_names, umpire_names):
    # Replace abbreviated names with full names, in place, in every column of df.
    df.replace(player_names, inplace=True)
    df.replace(umpire_names, inplace=True)
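
The function above scans every column of the frame. A possible variant, sketched here under the assumption that only a known set of columns holds names, restricts the replacement to those columns and returns a new frame. The function name `replace_names_in_columns` and the toy data are hypothetical.

```python
import pandas as pd


def replace_names_in_columns(df, columns, names):
    """Return a copy of `df` with `names` applied only to `columns`."""
    out = df.copy()
    for col in columns:
        # Series.replace swaps exact matches and leaves other values untouched.
        out[col] = out[col].replace(names)
    return out


# Toy data standing in for `matches` (illustrative only).
demo = pd.DataFrame({
    'player_of_match': ['BB McCullum', 'MEK Hussey'],
    'umpire1': ['Asad Rauf', 'MR Benson'],
    'city': ['Bangalore', 'Chandigarh'],
})
demo_names = {'BB McCullum': 'Brendon McCullum', 'MEK Hussey': 'Michael Hussey',
              'MR Benson': 'Mark Benson'}

result = replace_names_in_columns(demo, ['player_of_match', 'umpire1'], demo_names)
print(result)
```

Limiting the columns avoids accidental replacements in unrelated fields and reduces the work on a large ball-by-ball frame.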

In [None]:
change_name_format(matches, player_names, umpire_names)
matches.head()

id,city,date,player_of_match,venue,neutral_venue,team1,team2,toss_winner,toss_decision,winner,result,result_margin,method,umpire1,umpire2,is_playoff,is_eliminator,is_qualifier,is_final,home_of
335982,Bangalore,2008-04-18,Brendon McCullum,M.Chinnaswamy Stadium,0,Royal Challengers Bangalore,Kolkata Knight Riders,Royal Challengers Bangalore,field,Kolkata Knight Riders,runs,140,,Asad Rauf,Rudi Koertzen,0,0,0,0,Royal Challengers Bangalore
335983,Chandigarh,2008-04-19,Michael Hussey,Punjab Cricket Association IS Bindra Stadium,0,Kings XI Punjab,Chennai Super Kings,Chennai Super Kings,bat,Chennai Super Kings,runs,33,,Mark Benson,Suresh Shastri,0,0,0,0,Kings XI Punjab
335984,Delhi,2008-04-19,Farveez Maharoof,Arun Jaitley Stadium,0,Delhi Capitals,Rajasthan Royals,Rajasthan Royals,bat,Delhi Capitals,wickets,9,,Aleem Dar,G. A. Pratapkumar,0,0,0,0,Delhi Capitals
335985,Mumbai,2008-04-20,Mark Boucher,Wankhede Stadium,0,Mumbai Indians,Royal Challengers Bangalore,Mumbai Indians,bat,Royal Challengers Bangalore,wickets,5,,Steve Davis,Daryl Harper,0,0,0,0,Mumbai Indians
335986,Kolkata,2008-04-20,David Hussey,Eden Gardens,0,Kolkata Knight Riders,Deccan Chargers,Deccan Chargers,bat,Kolkata Knight Riders,wickets,5,,Billy Bowden,Krishna Hariharan,0,0,0,0,Kolkata Knight Riders


Now it's more readable and easier to interpret, right?

That completes the feature engineering of the `ipl-matches` dataset. Let us save it as a CSV file.

In [None]:
matches.to_csv('/content/drive/MyDrive/Study/Projects/IPL-Data-Analysis/Datasets/Processed/matches.csv')
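
As a quick illustration of what the export keeps, `to_csv` writes the `id` index as the first column, so the file can be read back with `index_col='id'`. The sketch below uses a toy frame and a temporary directory (both assumptions for the example) rather than the Drive path above.

```python
import os
import tempfile

import pandas as pd

# Toy frame standing in for `matches`, indexed by `id` like the real data.
demo = pd.DataFrame(
    {'city': ['Bangalore', 'Chandigarh'], 'result_margin': [140, 33]},
    index=pd.Index([335982, 335983], name='id'),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'matches.csv')
    demo.to_csv(path)                              # index is written as the `id` column
    restored = pd.read_csv(path, index_col='id')   # and restored as the index here

print(restored)
```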

# Deliveries Dataset

Let us quickly reuse the functions defined earlier in this notebook to preprocess the `deliveries` dataset.

## Loading the Datasets

In [None]:
deliveries = pd.read_csv('/content/drive/MyDrive/Study/Projects/IPL-Data-Analysis/Datasets/Raw/IPL-ball-by-ball.csv', index_col='id')
deliveries.head()

id,inning,over,ball,batsman,non_striker,bowler,batsman_runs,extra_runs,total_runs,non_boundary,is_wicket,dismissal_kind,player_dismissed,fielder,extras_type,batting_team,bowling_team
335982,1,6,5,RT Ponting,BB McCullum,AA Noffke,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,6,6,BB McCullum,RT Ponting,AA Noffke,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,1,BB McCullum,RT Ponting,Z Khan,0,0,0,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,2,BB McCullum,RT Ponting,Z Khan,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,3,RT Ponting,BB McCullum,Z Khan,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore


## Data Preprocessing

We only need to replace a few team names, as we did earlier, and change the format of the players' names. Let us call the functions defined earlier to do both.

In [None]:
columns_to_change = ['batting_team', 'bowling_team']
replace_team_names(deliveries, columns_to_change)
change_name_format(deliveries, player_names, umpire_names)
deliveries.head()

id,inning,over,ball,batsman,non_striker,bowler,batsman_runs,extra_runs,total_runs,non_boundary,is_wicket,dismissal_kind,player_dismissed,fielder,extras_type,batting_team,bowling_team
335982,1,6,5,Ricky Ponting,Brendon McCullum,Ashley Noffke,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,6,6,Brendon McCullum,Ricky Ponting,Ashley Noffke,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,1,Brendon McCullum,Ricky Ponting,Zaheer Khan,0,0,0,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,2,Brendon McCullum,Ricky Ponting,Zaheer Khan,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore
335982,1,7,3,Ricky Ponting,Brendon McCullum,Zaheer Khan,1,0,1,0,0,,,,,Kolkata Knight Riders,Royal Challengers Bangalore


It's ready now. Let us export it as a CSV file.

In [None]:
deliveries.to_csv('/content/drive/MyDrive/Study/Projects/IPL-Data-Analysis/Datasets/Processed/deliveries.csv')