# IPL 2022 Auction Analysis

### Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

### Reading Data

In [2]:
df = pd.read_csv('../input/2022-ipl-auction-dataset/ipl_2022_dataset.csv')

In [3]:
df.shape

In [4]:
df.columns

**How our Data Looks ...**

In [5]:
df.sample()

In [6]:
df.drop('Unnamed: 0', axis =1, inplace = True)

**Information of Data**

In [7]:
df.info()

In [8]:
df.isnull().sum()

**Treating Null Values in 'COST IN ₹ (CR.)' and 'Cost IN $ (000)'**

In [9]:
df[df['Cost IN $ (000)'].isnull()]

*These are the players who went unsold in the 2022 auction, so we can replace their cost with zero.*

In [10]:
df['COST IN ₹ (CR.)'] = df['COST IN ₹ (CR.)'].fillna(0)
df['Cost IN $ (000)'] = df['Cost IN $ (000)'].fillna(0)
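
Before filling with zero, it can be worth confirming that every missing cost really belongs to an unsold player. The sketch below runs that check on a small toy frame; the player names and values are made up for illustration, and it assumes the `Team` column uses the label `'Unsold'` for such players, as the later filters in this notebook suggest.

```python
import pandas as pd

# Toy stand-in for the auction data (illustrative values, not from the dataset)
toy = pd.DataFrame({
    'Player': ['A', 'B', 'C'],
    'Team': ['Chennai Super Kings', 'Unsold', 'Unsold'],
    'COST IN ₹ (CR.)': [16.0, None, None],
})

# Rows with a missing cost
missing_cost = toy['COST IN ₹ (CR.)'].isnull()
# Rows whose team is the 'Unsold' placeholder (assumed label)
unsold = toy['Team'] == 'Unsold'

# True only if the two masks agree on every row
nulls_match_unsold = bool((missing_cost == unsold).all())
print(nulls_match_unsold)
```

If the check comes back `False` on the real data, the rows where the masks disagree deserve a closer look before any zero-filling.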

**Treating Null Values in 2021 Squad**

In [11]:
df[df['2021 Squad'].isnull()]

**These are the players who either went unsold in IPL 2021 or are participating in the IPL for the first time.**

In [12]:
df['2021 Squad'] = df['2021 Squad'].fillna('Not Participated in IPL 2021')

In [13]:
df.isnull().sum()

**All null values have now been replaced.**

**Now we will add and adjust a few columns for further analysis.**

In [14]:
teams = df[df['COST IN ₹ (CR.)']>0]['Team'].unique()
teams

In [15]:
df['status'] = df['Team'].replace(teams,'sold')
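
An alternative way to derive the same flag, shown here only as a sketch on toy data, is to test directly for the `'Unsold'` label instead of first collecting the team names. The frame below is hypothetical; only the column names follow the notebook.

```python
import pandas as pd

# Toy stand-in for the auction data; names and values are illustrative only
toy = pd.DataFrame({
    'Player': ['A', 'B', 'C'],
    'Team': ['Mumbai Indians', 'Unsold', 'Punjab Kings'],
})

# Keep 'Unsold' where it already appears, label every other team as 'sold'
toy['status'] = toy['Team'].where(toy['Team'] == 'Unsold', 'sold')
print(toy['status'].tolist())
```

This version does not depend on the cost column, so it still works if a signed player happens to have a zero or missing cost.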

In [16]:
df['Base Price'].unique()

In [17]:
df['retention'] = df['Base Price']

In [18]:
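# Assign the result back instead of using inplace=True on a column selection,
# which may not update df in recent pandas versions
df['retention'] = df['retention'].replace(['2 Cr', '40 Lakh', '20 Lakh', '1 Cr', '75 Lakh',
       '50 Lakh', '30 Lakh','1.5 Cr'],'In Auction')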

*Treating Base Price Column*

In [19]:
df['Base Price'] = df['Base Price'].replace('Draft Pick', 0)

In [20]:
df['base_price_unit'] = df['Base Price'].apply(lambda x: str(x).split(' ')[-1])
df['base_price'] = df['Base Price'].apply(lambda x: str(x).split(' ')[0])

In [21]:
df['base_price'] = df['base_price'].replace('Retained', 0)

In [22]:
df['base_price_unit'].unique()

In [23]:
df['base_price_unit'] = df['base_price_unit'].replace({'Cr':100,'Lakh':1,'Retained':0})

In [24]:
df['base_price'] = df['base_price'].astype(float)
df['base_price_unit'] = df['base_price_unit'].astype(int)

In [25]:
df['base_price'] = df['base_price']*df['base_price_unit']
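
The split, replace and multiply steps above can also be folded into one small helper. This is a minimal sketch, not the notebook's own method: the function name `base_price_to_lakh` is hypothetical, and it assumes that anything other than an `'<amount> Cr'` or `'<amount> Lakh'` string (for example `'Retained'` or `'Draft Pick'`) should count as a base price of 0.

```python
def base_price_to_lakh(raw):
    """Convert strings such as '2 Cr' or '40 Lakh' to a number of lakh."""
    units = {'Cr': 100, 'Lakh': 1}  # 1 crore = 100 lakh
    parts = str(raw).split()
    if len(parts) == 2 and parts[1] in units:
        return float(parts[0]) * units[parts[1]]
    # 'Retained', 'Draft Pick', or anything unrecognised (assumed to be 0)
    return 0.0


for value in ['2 Cr', '1.5 Cr', '40 Lakh', 'Retained', 'Draft Pick']:
    print(value, '->', base_price_to_lakh(value))
```

Applied to the data it would be used as `df['Base Price'].apply(base_price_to_lakh)`, replacing the intermediate `base_price_unit` column.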

In [26]:
df.head()

In [27]:
df.drop(['Base Price','base_price_unit'], axis =1, inplace = True)

In [28]:
df

In [29]:
df['COST IN ₹ (CR.)'] = df['COST IN ₹ (CR.)']*100

In [30]:
df = df.rename(columns={'TYPE':'Type','COST IN ₹ (CR.)':'Sold_for_lakh','Cost IN $ (000)':'Cost_in_dollars','2021 Squad':'Prev_team','Team':'Curr_team'})

In [31]:
df.head()

**We will Check Duplicate Players**

In [32]:
df[df['Player'].duplicated(keep=False)]

> *Yes, some names appear more than once, but these rows belong to different players who happen to share a name.*
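
One way to back up that reading, sketched here on toy data, is to compare duplicates on the name alone with duplicates on the name plus other columns. The rows below are invented for illustration; only the column names follow the renamed frame.

```python
import pandas as pd

# Toy frame: two different players sharing a name (illustrative values)
toy = pd.DataFrame({
    'Player': ['Same Name', 'Same Name', 'Other'],
    'Type': ['BOWLER', 'BATTER', 'BOWLER'],
    'Curr_team': ['Unsold', 'Unsold', 'Unsold'],
})

# Rows that share a name with another row
name_dupes = toy[toy['Player'].duplicated(keep=False)]
# Rows that repeat name, role and team together
full_dupes = toy[toy.duplicated(subset=['Player', 'Type', 'Curr_team'], keep=False)]

print(len(name_dupes), len(full_dupes))
```

If `full_dupes` is empty while `name_dupes` is not, the repeated names most likely refer to different players rather than to rows entered twice.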

### Our Data is Ready for Analysis

#### A friend gave me a list of questions, and I will try to answer them below.

****

**1. How many players participated in the 2022 Auction?**

In [33]:
df.shape[0]

> *There were 633 players in the TATA IPL 2022 auction.*

**2. Participation based on the Role(Batsman, Bowlers, Allrounders and WK)**

In [34]:
types = df['Type'].value_counts()
types.reset_index()

In [35]:
plt.pie(types.values, labels=types.index,labeldistance=1.2,autopct='%1.2f%%')
plt.title('Role of Players Participated', fontsize = 15)
plt.show()

> *The largest group in the auction was All-Rounders, followed by Bowlers, Batters and Wicket-Keepers.*

**3. How many Players Were Sold in IPL 2022 Auctions ?**

In [36]:
plt.figure(figsize=(5,5))
# Pass the column as x explicitly; recent seaborn versions no longer accept it positionally
fig = sns.countplot(x=df['status'], palette=['Green','Red'])
plt.xlabel('Sold or Unsold')
plt.ylabel('Number of Players')
plt.title('Sold vs Unsold', fontsize=15)

# Label each bar with its count
for p in fig.patches:
    fig.annotate(format(p.get_height(), '.0f'), (p.get_x() + p.get_width()/2., p.get_height()), ha = 'center', va = 'center', xytext = (0, 4), textcoords = 'offset points')

plt.show()

> *In the auction, 237 players were sold and 396 went unsold.*

**4. How many Players Were Bought by Each Team?**

In [37]:
df.sample()

In [38]:
plt.figure(figsize=(20,10))
# Pass the column as x explicitly; recent seaborn versions no longer accept it positionally
fig = sns.countplot(x=df[df['Curr_team']!='Unsold']['Curr_team'])
plt.xlabel('Name of Team')
plt.ylabel('Number of Players')
plt.title('Players Bought by Each Team', fontsize=15)
plt.xticks(rotation=90)

# Label each bar with its count
for p in fig.patches:
    fig.annotate(format(p.get_height(), '.0f'), (p.get_x() + p.get_width()/2., p.get_height()), ha = 'center', va = 'center', xytext = (0, 4), textcoords = 'offset points')

plt.show()

**5. How many players were Retained or Draft-Picked by each team?**

In [39]:
df.groupby(['Curr_team','retention'])['retention'].count()[:-1]
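
The same counts can be laid out as a team-by-category table with `pd.crosstab`, which fills in zeros for combinations that do not occur. The sketch below uses a toy frame with placeholder team names; only the column names and the category labels `'Retained'`, `'Draft Pick'` and `'In Auction'` come from the notebook.

```python
import pandas as pd

# Toy frame with placeholder teams (illustrative values only)
toy = pd.DataFrame({
    'Curr_team': ['Team X', 'Team X', 'Team X', 'Team Y', 'Team Y', 'Unsold'],
    'retention': ['Retained', 'Retained', 'In Auction',
                  'Draft Pick', 'In Auction', 'In Auction'],
})

# Drop unsold players by label, then count each retention category per team
signed = toy[toy['Curr_team'] != 'Unsold']
table = pd.crosstab(signed['Curr_team'], signed['retention'])
print(table)
```

Filtering on the `'Unsold'` label avoids relying on that group being the last entry of the grouped result.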

**5. How many Players were Bought for each Role?**

In [40]:
df.groupby(['Type','status'])['Player'].count().reset_index()

**6. Which players took part in IPL 2021 and will play for the same team in IPL 2022?**

In [41]:
df.replace({'SRH':'Sunrisers Hyderabad','CSK':'Chennai Super Kings','MI':'Mumbai Indians','KKR':'Kolkata Knight Riders','RR':'Rajasthan Royals','PBKS':'Punjab Kings','DC':'Delhi Capitals','RCB':'Royal Challengers Bangalore'},inplace =True)

In [42]:
same_team = df[(df['Curr_team']==df['Prev_team']) & (df['retention']=='In Auction')]
same_team

In [43]:
same_team[same_team.Curr_team=='Royal Challengers Bangalore']

**Let's Visualize this team wise**

In [44]:
plt.figure(figsize=(10,8))
# Pass the column as x explicitly; recent seaborn versions no longer accept it positionally
sns.countplot(x=same_team['Curr_team'])
plt.title('Players Bought Back by Their 2021 Teams in the Auction')
plt.xlabel('Name of Team')
plt.ylabel('Number of Players')
plt.xticks(rotation = 90)
plt.grid(axis='y')
plt.show()

**7. Number of players in each team based on their roles**

In [45]:
plt.figure(figsize=(20,10))
# Filter once so that x and hue come from the same rows
sold = df[df['Curr_team'] != 'Unsold']
fig = sns.countplot(x='Curr_team', hue='Type', data=sold)
plt.title('Players in Each Team')
plt.xlabel('Name of Team')
plt.ylabel('Number of Player')


plt.xticks(rotation = 60)

for p in fig.patches:
    fig.annotate(format(p.get_height(), '.0f'), (p.get_x() + p.get_width()/2., p.get_height()), ha = 'center', va = 'center', xytext = (0, 4), textcoords = 'offset points')

> **Observation:**

*- This year Sunrisers Hyderabad and Rajasthan Royals are bowler-dominated teams.*

**8. Highest successful bid by each team**

In [58]:
df[df['retention']=='In Auction'].groupby(['Curr_team'])['Sold_for_lakh'].max()[:-1].sort_values(ascending = False)

**9. Top five Batters picked in the Auction**

In [65]:
df[(df['retention']=='In Auction') & (df['Type']=='BATTER')].sort_values(by='Sold_for_lakh', ascending = False).head(5)

**10. Highest-paid Retained player**

In [70]:
df[df['retention']=='Retained'].sort_values(by = 'Sold_for_lakh', ascending = False).head(1)

> *It is Sir Jadeja (Ravindra Jadeja).*

**11. Amount Spent by each team in Auction**

In [82]:
amount_spent = df.groupby('Curr_team')['Sold_for_lakh'].sum()[:-1]
amount_spent
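
The `[:-1]` slice above works only while `'Unsold'` happens to sort last among the group labels. A more explicit variant, sketched here on toy data with placeholder team names and made-up prices, drops that group by label and sorts the totals.

```python
import pandas as pd

# Toy frame with placeholder teams (illustrative values, in lakh)
toy = pd.DataFrame({
    'Curr_team': ['Team X', 'Team X', 'Team Y', 'Unsold'],
    'Sold_for_lakh': [1600.0, 50.0, 200.0, 0.0],
})

totals = (
    toy.groupby('Curr_team')['Sold_for_lakh']
       .sum()                             # total spend per team
       .drop('Unsold', errors='ignore')   # remove the unsold group by label
       .sort_values(ascending=False)      # biggest spender first
)
print(totals)
```

`errors='ignore'` keeps the expression from failing if the frame has already been filtered and no `'Unsold'` group exists.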

In [96]:
plt.figure(figsize=(15,5))
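# barplot shows the mean per team by default, so sum the spend first
team_totals = df[df['Curr_team']!='Unsold'].groupby('Curr_team')['Sold_for_lakh'].sum().reset_index()
sns.barplot(x='Curr_team', y='Sold_for_lakh', data=team_totals)
plt.xticks(rotation=60)
plt.ylabel('Amount Spent (lakh)')
plt.show()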

**12. List of players who played in IPL 2021 but went unsold this time**

In [105]:
unsold_stars = df[(df.Prev_team != 'Not Participated in IPL 2021') & (df.Curr_team == 'Unsold')][['Player','Prev_team']]

In [106]:
unsold_stars

In [107]:
unsold_stars.groupby('Prev_team')['Player'].count()

## Thank you for making it to the end of this notebook!

**Post your questions in the Discussion section and I will try to answer them.**