<a href="https://www.kaggle.com/code/aniketpawar22/ipl-auction-eda?scriptVersionId=125028689" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# IPL 2023 Auction EDA

**The BCCI recently concluded the IPL 2023 auction, and several players fetched very high prices. In this notebook I use Pandas to clean the auction data and then Matplotlib and Plotly to visualize the key insights.**

# **Importing required libraries**

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

# **Import the required datasets**

In [2]:
df=pd.read_csv('/kaggle/input/ipl-auction-2022/2023 Auction/IPL_2023_Auction_Sold.csv')
df.head()

Unnamed: 0,Set No.,2023 Set,First Name,Surname,Country,Association,DOB,Age,Specialism,Batting_Style,...,ODI caps,T20 caps,IPL,Previous IPLTeam(s),2022 Team,2022 IPL,C/U/A,Reserve_Price,TEAM,Auction_Price
0,1,BA1,Mayank,Agarwal,India,KSCA,16/02/1991,32,BATSMAN,RHB,...,5,0,113,"RCB, DD, RPSG, PBKS",PBKS,13,Capped,100,SRH,825.0
1,1,BA1,Harry,Brook,England,F,22/02/1999,24,BATSMAN,RHB,...,0,20,0,Do No Play,Do No Play,0,Capped,150,SRH,1325.0
2,1,BA1,Ajinkya,Rahane,India,MCA,06/06/1988,34,BATSMAN,RHB,...,90,20,158,"MI, RPSG, RR, DC, KKR",KKR,7,Capped,50,CSK,50.0
3,1,BA1,Joe,Root,England,F,30/12/1990,32,BATSMAN,RHB,...,158,32,0,Do No Play,Do No Play,0,Capped,100,RR,100.0
4,1,BA1,Rilee,Rossouw,South Africa,F,09/10/1989,33,BATSMAN,LHB,...,36,26,5,RCB,Do No Play,0,Capped,200,DC,460.0


**From the dataset above, we remove the columns that are not needed for our analysis**

In [3]:
df.drop(['Set No.','2023 Set','Association','DOB','Batting_Style','ODI caps','T20 caps','Previous IPLTeam(s)'],axis=1,inplace=True)

In [4]:
df.head()

Unnamed: 0,First Name,Surname,Country,Age,Specialism,Bowling_Style,Test caps,IPL,2022 Team,2022 IPL,C/U/A,Reserve_Price,TEAM,Auction_Price
0,Mayank,Agarwal,India,32,BATSMAN,RIGHT ARM Off Spin,21,113,PBKS,13,Capped,100,SRH,825.0
1,Harry,Brook,England,24,BATSMAN,RIGHT ARM Medium,1,0,Do No Play,0,Capped,150,SRH,1325.0
2,Ajinkya,Rahane,India,34,BATSMAN,-,82,158,KKR,7,Capped,50,CSK,50.0
3,Joe,Root,England,32,BATSMAN,RIGHT ARM Off Spin,124,0,Do No Play,0,Capped,100,RR,100.0
4,Rilee,Rossouw,South Africa,33,BATSMAN,-,0,5,Do No Play,0,Capped,200,DC,460.0
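The `Unnamed: 0` column in the output above is usually the row index that was saved into the CSV. A minimal sketch of one way to avoid it, assuming that is the case here: the tiny in-memory CSV below is a stand-in built from three rows of the output (it is not the real file), and `DOB` is listed only to show that `errors='ignore'` tolerates a column that is absent.

```python
import io
import pandas as pd

# Stand-in for the auction CSV: three rows copied from the output above,
# keeping only a handful of columns for illustration.
csv_text = """,Set No.,First Name,Surname,TEAM,Auction_Price
0,1,Mayank,Agarwal,SRH,825.0
1,1,Harry,Brook,SRH,1325.0
2,1,Ajinkya,Rahane,CSK,50.0
"""

# index_col=0 reads the unnamed first column as the row index,
# so no 'Unnamed: 0' column appears in the frame.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

# drop(columns=...) returns a new frame instead of mutating in place;
# errors='ignore' skips names that are not present ('DOB' here).
trimmed = sample.drop(columns=['Set No.', 'DOB'], errors='ignore')
print(trimmed.columns.tolist())
```

Returning a new frame rather than using `inplace=True` makes it easier to re-run a cell without hitting a "column not found" error.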


# Data Preparation & Cleaning

**I combine the first name and surname columns into a single player name column, which is easier to read and leaves one fewer column to carry around**

In [5]:
df['Player Name']=df[['First Name','Surname']].apply(lambda x:' '.join(x),axis=1)
df.drop(['First Name','Surname'],axis=1,inplace=True)
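The same combination can be written without a row-wise `apply`. The sketch below uses pandas' vectorized `Series.str.cat` on a two-row frame built from names in the output above; the frame name `names` is only for illustration.

```python
import pandas as pd

# Two rows taken from the df.head() output above.
names = pd.DataFrame({
    'First Name': ['Mayank', 'Harry'],
    'Surname': ['Agarwal', 'Brook'],
})

# str.cat joins the two columns element-wise with a space in between.
names['Player Name'] = names['First Name'].str.cat(names['Surname'], sep=' ')

# Drop the source columns once the combined column exists.
names = names.drop(columns=['First Name', 'Surname'])
print(names['Player Name'].tolist())
```

One difference to keep in mind: a missing first name or surname yields a missing `Player Name` with `str.cat`, whereas `' '.join` inside `apply` raises an error on a missing value.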

**Next, we look at the unique values in the categorical columns**

In [6]:
country=df['Country'].unique()
print("From following countries the players participated -",country)

From following countries the players participated - ['India' 'England' 'South Africa' 'New Zealand' 'Australia' 'Bangladesh'
 'West Indies' 'Zimbabwe' 'Sri Lanka' 'Afghanistan' 'Ireland' 'Netherland'
 'UAE' 'Namibia']


In [7]:
player_speciality=df['Specialism'].unique()
print("We have following specialisation from each player -",player_speciality)

We have following specialisation from each player - ['BATSMAN' 'ALL-ROUNDER' 'WICKETKEEPER' 'BOWLER']


In [8]:
bowling_style=df['Bowling_Style'].unique()
bowling_style

array(['RIGHT ARM Off Spin', 'RIGHT ARM Medium', '-',
       'LEFT ARM Fast Medium', 'RIGHT ARM Fast', 'LEFT ARM Slow Orthodox',
       'RIGHT ARM Fast Medium', 'LEFT ARM Fast', 'RIGHT ARM Leg Spin',
       'LEFT ARM Slow Unorthodox', 'LEFT ARM Medium'], dtype=object)

**Teams participating in IPL Auction 2023**

In [9]:
df['TEAM'].unique()

array(['SRH', 'CSK', 'RR', 'DC', 'GT', 'PBKS', 'MI', 'KKR', 'UnSold',
       'LSG', 'RCB'], dtype=object)

**UnSold is not a team; it marks players who were not bought, so we drop those rows**

In [10]:
df.drop(df[df['TEAM']=='UnSold'].index,inplace=True)

In [11]:
t=df['TEAM'].unique()
print('Following teams were part of IPL Auction 2023 -',t)

Following teams were part of IPL Auction 2023 - ['SRH' 'CSK' 'RR' 'DC' 'GT' 'PBKS' 'MI' 'KKR' 'LSG' 'RCB']
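Filtering with a boolean mask is a common alternative to dropping rows by index. A minimal sketch, assuming a small frame where the `UnSold` row and its price are hypothetical (the other two rows come from the output earlier in the notebook):

```python
import pandas as pd

teams = pd.DataFrame({
    'Player Name': ['Mayank Agarwal', 'Hypothetical Player', 'Ajinkya Rahane'],
    'TEAM': ['SRH', 'UnSold', 'CSK'],
    'Auction_Price': [825.0, 0.0, 50.0],  # the 0.0 is a made-up placeholder
})

# Keep only rows whose TEAM is an actual franchise, then renumber the index.
sold = teams[teams['TEAM'] != 'UnSold'].reset_index(drop=True)
print(sold['TEAM'].unique())
```

Because the result is a new frame, the original `teams` is left untouched and the filter can be re-run safely.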


# Data Visualization & Question Answering

**1. Which players earned the most money, and which countries do they come from?**

In [12]:
money=df.groupby('Player Name')['Auction_Price'].mean()
top10_players=money.sort_values(ascending=False).head(10)
top10_players

Player Name
Sam Curran          1850.0
Cameron Green       1750.0
Ben Stokes          1625.0
Nicholas Pooran     1600.0
Harry Brook         1325.0
Mayank Agarwal       825.0
Shivam Mavi          600.0
Jason Holder         575.0
Mukesh Kumar         550.0
Heinrich Klaasen     525.0
Name: Auction_Price, dtype: float64

In [13]:
fig=px.bar(x=top10_players.index,y=top10_players.values,color=top10_players.index,text=top10_players.values,title='Top10 Paid Players')
fig.update_layout(xaxis_title='Players',yaxis_title='Amount in Lakhs')

**The top 3 paid players are Sam Curran, Cameron Green and Ben Stokes, all of whom are overseas players and all-rounders. In terms of money, this year's IPL auction was dominated by overseas players rather than Indian players, and teams were ready to pay heavily for an all-rounder.**
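Prices in this dataset are in lakhs, and 1 crore equals 100 lakh, so the headline figures are easier to read after dividing by 100. The sketch below rebuilds part of the top-10 table shown above as a Series (the name `prices` is only for illustration) and uses `nlargest` as a shortcut for sorting and taking the head.

```python
import pandas as pd

# Six values copied from the top-10 output above (amounts in lakhs).
prices = pd.Series(
    {
        'Mayank Agarwal': 825.0,
        'Sam Curran': 1850.0,
        'Harry Brook': 1325.0,
        'Cameron Green': 1750.0,
        'Nicholas Pooran': 1600.0,
        'Ben Stokes': 1625.0,
    },
    name='Auction_Price',
)

# nlargest(3) is equivalent to sort_values(ascending=False).head(3).
top3 = prices.nlargest(3)

# Convert lakhs to crore (1 crore = 100 lakh).
top3_crore = top3 / 100
print(top3_crore)
```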

**2. How many players are capped and uncapped? Here, capped means the player has played an international match, and uncapped means the player has yet to play one.**

In [14]:
a=df['C/U/A'].value_counts()
fig1=px.pie(a,names=a.index,values=a.values,title='Capped/Uncapped Players')
fig1.update_traces(textposition='inside',textinfo='percent+label')

**Nearly 70% of the players in the IPL 2023 auction data were uncapped, meaning they had yet to make an international appearance**
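The percentages on the pie chart can also be read off as numbers with `value_counts(normalize=True)`. A minimal sketch on made-up labels (the five values below are illustrative and do not reflect the real capped/uncapped split):

```python
import pandas as pd

# Hypothetical C/U/A labels, for illustration only.
status = pd.Series(
    ['Capped', 'Uncapped', 'Uncapped', 'Capped', 'Uncapped'],
    name='C/U/A',
)

# normalize=True returns fractions instead of counts; scale to percent.
share = status.value_counts(normalize=True).mul(100).round(1)
print(share)
```

Applied to `df['C/U/A']`, the same call would give the exact figure behind the chart.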

**3. What percentage of players fall under each speciality?**

In [15]:
b=df['Specialism'].value_counts()
fig2=px.pie(b,names=b.index,values=b.values,title='Speciality of Players')
fig2.update_traces(textposition='inside',textinfo='percent+label')

**All-rounders and bowlers make up the largest shares of players, followed by batsmen and wicketkeepers**

**4. Money paid according to player type**

In [16]:
c=df.groupby('Specialism')['Auction_Price'].mean()
fig3=px.bar(c,x=c.values,y=c.index,title='Average Money spent according to Player Type')
fig3.update_layout(xaxis_title='Avg. price in Lakhs')

**On average, batsmen were paid the most, followed by all-rounders and wicketkeepers**
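A mean can be pulled upward by a few very large bids, so it may help to look at the median and the group size next to it. The sketch below uses only eight rows that appear earlier in the notebook (the five `df.head()` batsmen and the three top-paid all-rounders), so its numbers are illustrative and do not reproduce the full-data averages in the chart.

```python
import pandas as pd

# Partial sample taken from outputs shown earlier; not the full dataset.
sample = pd.DataFrame({
    'Specialism': ['BATSMAN'] * 5 + ['ALL-ROUNDER'] * 3,
    'Auction_Price': [825.0, 1325.0, 50.0, 100.0, 460.0,
                      1850.0, 1750.0, 1625.0],
})

# Several statistics per group in one pass.
summary = sample.groupby('Specialism')['Auction_Price'].agg(
    ['mean', 'median', 'count']
)
print(summary)
```

In this sample the batsmen's mean sits above their median, which is the pattern to expect when one or two bids are much larger than the rest.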

**5. How much did each team spend per player on average?**

In [17]:
# mean() gives the average price per player bought, not the team's total outlay
d=df.groupby('TEAM')['Auction_Price'].mean()
fig4=px.bar(d,x=d.index,y=d.values,color=d.values,text=d.index,title='Average money spent per player by teams in Auction')
fig4.update_layout(xaxis_title='Teams',yaxis_title='Avg money in Lakhs')

**PBKS (Punjab) had the highest average spend per player in this year's IPL auction**
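The chart above is built from `mean()`, which is the average price per player; a team's total outlay would need `sum()`. A minimal sketch that computes both, using only the five rows shown by `df.head()` earlier (so the figures are not the teams' real totals):

```python
import pandas as pd

# The five rows from the df.head() output; amounts in lakhs.
sample = pd.DataFrame({
    'TEAM': ['SRH', 'SRH', 'CSK', 'RR', 'DC'],
    'Auction_Price': [825.0, 1325.0, 50.0, 100.0, 460.0],
})

# Named aggregation: total spend, average per player, and players bought.
spend = sample.groupby('TEAM')['Auction_Price'].agg(
    total='sum', average='mean', players='count'
)
print(spend.sort_values('total', ascending=False))
```

A team that buys a few expensive players can rank first on `average` while ranking lower on `total`, so the two columns answer different questions.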

**6. Average price paid to players by country**

In [18]:
df.drop(df[df['Auction_Price']< 0.0].index,inplace=True)

In [19]:
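# Average auction price per country, sorted so that head() returns the five highest averages
country=df.groupby('Country')['Auction_Price'].mean().sort_values(ascending=False)
z=country.head()
fig5=px.bar(x=z.index,y=z.values,title='Top 5 countries by average auction price',text=z.index)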
fig5.update_layout(xaxis_title='Country',yaxis_title='Price in Lakhs')

**On average, England players attracted the highest bids, followed by Australia**
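A grouped result is ordered by its group key (alphabetically for country names), so `head()` on its own returns the first keys rather than the largest values. A minimal sketch of the difference, using the five rows shown by `df.head()` earlier:

```python
import pandas as pd

# The five rows from the df.head() output; amounts in lakhs.
sample = pd.DataFrame({
    'Country': ['India', 'England', 'India', 'England', 'South Africa'],
    'Auction_Price': [825.0, 1325.0, 50.0, 100.0, 460.0],
})

by_country = sample.groupby('Country')['Auction_Price'].mean()

# head(2) follows the alphabetical key order of the grouped result.
first_two = by_country.head(2)

# nlargest(2) picks the two highest averages regardless of key order.
top_two = by_country.nlargest(2)

print(first_two.index.tolist())
print(top_two.index.tolist())
```

For a "top 5" chart, `nlargest(5)` or `sort_values(ascending=False).head(5)` selects by value rather than by name.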