#**T20 World Cup 2022 Analysis**

##**Importing Libraries and Loading Dataset**

In [1]:
# Mount Google Drive so the dataset can be read from it
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [2]:
# importing libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio

In [3]:
#reading dataset
path = '/content/gdrive/MyDrive/ML/'
file = path + 't20-world-cup-22.csv'
t20 = pd.read_csv(file)
t20.head()

Unnamed: 0,venue,team1,team2,stage,toss winner,toss decision,first innings score,first innings wickets,second innings score,second innings wickets,winner,won by,player of the match,top scorer,highest score,best bowler,best bowling figure
0,SCG,New Zealand,Australia,Super 12,Australia,Field,200.0,3.0,111.0,10.0,New Zealand,Runs,Devon Conway,Devon Conway,92.0,Tim Southee,3-6
1,Optus Stadium,Afghanistan,England,Super 12,England,Field,112.0,10.0,113.0,5.0,England,Wickets,Sam Curran,Ibrahim Zadran,32.0,Sam Curran,5-10
2,Blundstone Arena,Ireland,Sri lanka,Super 12,Ireland,Bat,128.0,8.0,133.0,1.0,Sri lanka,Wickets,Kusal Mendis,Kusal Mendis,68.0,Maheesh Theekshana,2-19
3,MCG,Pakistan,India,Super 12,India,Field,159.0,8.0,160.0,6.0,India,Wickets,Virat Kohli,Virat Kohli,82.0,Hardik Pandya,3-30
4,Blundstone Arena,Bangladesh,Netherlands,Super 12,Netherlands,Field,144.0,8.0,135.0,10.0,Bangladesh,Runs,Taskin Ahmed,Colin Ackermann,62.0,Taskin Ahmed,4-25


In [4]:
t20.columns

Index(['venue', 'team1', 'team2', 'stage', 'toss winner', 'toss decision',
       'first innings score', 'first innings wickets', 'second innings score',
       'second innings wickets', 'winner', 'won by', 'player of the match',
       'top scorer', 'highest score', 'best bowler', 'best bowling figure'],
      dtype='object')

The dataset covers every match from the Super 12 stage through the final of the ICC Men’s T20 World Cup 2022.


> Below are all the features in the dataset:





1. venue: The venue where the match was played
2. team1: The team that batted first
3. team2: The team that batted second
4. stage: Stage of the match (super 12, semi-final, or final)
5. toss winner: The team that won the toss
6. toss decision: The decision of the captain after winning the toss (Bat or Field)
7. first innings score: Runs scored in the first innings
8. first innings wickets: The number of wickets lost in the first innings
9. second innings score: Runs scored in the second innings
10. second innings wickets: The number of wickets lost in the second innings
11. winner: The team that won the match
12. won by: Whether the match was won by runs (the team batting first won) or by wickets (the team batting second won)
13. player of the match: The player of the match
14. top scorer: The player who scored the most runs in the match
15. highest score: The runs scored by the top scorer in the match
16. best bowler: The player who took the most wickets in the match
17. best bowling figure: The number of wickets taken and runs given by the best bowler in the match
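
The `best bowling figure` column stores wickets and runs together as one string such as `3-6`. A minimal sketch of splitting it into two numeric columns, using the five values visible in the preview above; the names `wickets` and `runs_conceded` are made up for illustration and are not dataset columns.

```python
import pandas as pd

# The five 'best bowling figure' values from the t20.head() preview.
figures = pd.Series(['3-6', '5-10', '2-19', '3-30', '4-25'],
                    name='best bowling figure')

# Split "wickets-runs" into two integer columns.
# 'wickets' and 'runs_conceded' are illustrative names only.
parsed = figures.str.split('-', expand=True).astype(int)
parsed.columns = ['wickets', 'runs_conceded']

print(parsed)
```

On the full dataset the rows with a missing figure would need to be dropped first (for example with `dropna()`), since `astype(int)` fails on missing values.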

In [5]:
# check for NaN values
t20.isnull().sum()

venue                     0
team1                     0
team2                     0
stage                     0
toss winner               3
toss decision             3
first innings score       3
first innings wickets     3
second innings score      3
second innings wickets    3
winner                    4
won by                    4
player of the match       4
top scorer                3
highest score             3
best bowler               3
best bowling figure       3
dtype: int64

In [6]:
missing_values = t20['toss winner'].isna()
t20[missing_values]

Unnamed: 0,venue,team1,team2,stage,toss winner,toss decision,first innings score,first innings wickets,second innings score,second innings wickets,winner,won by,player of the match,top scorer,highest score,best bowler,best bowling figure
8,MCG,New Zealand,Afghanistan,Super 12,,,,,,,,,,,,,
12,MCG,Afghanistan,Ireland,Super 12,,,,,,,,,,,,,
13,MCG,Australia,England,Super 12,,,,,,,,,,,,,
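
The null counts show three rows with no toss data but four rows with no winner, so one match has a toss and scores yet no result. A small sketch of separating the two cases, run here on a hypothetical four-match sample rather than the real file; the team names and scores are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical four-match sample (not the real dataset), same column names.
sample = pd.DataFrame({
    'team1': ['A', 'B', 'C', 'D'],
    'toss winner': ['A', np.nan, 'C', 'D'],
    'first innings score': [150.0, np.nan, 80.0, 140.0],
    'winner': ['A', np.nan, np.nan, 'D'],
})

# No toss recorded: the match never started.
no_toss = sample['toss winner'].isna()

# Toss recorded but no winner: the match started and ended without a result.
no_result = sample['toss winner'].notna() & sample['winner'].isna()

print(sample[no_toss])
print(sample[no_result])
```

The same two masks should work on `t20` unchanged, since they rely only on the `toss winner` and `winner` columns.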


###**Three matches were cancelled, which explains their missing values; a fourth has no recorded winner.**
###**Let us move ahead with the EDA, starting with the number of matches won by each team.**

#**Exploratory Data Analysis**

##**1| Number of Matches Won by each team**

In [7]:
figure = px.bar(t20, x=t20['winner'], title='Number of Matches won by teams in T20 world cup 2022')
figure.show()

### England, who won the T20 World Cup 2022 final, won the most matches (five), while India and Pakistan won four matches each.
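
Reading exact totals off bar heights is awkward, so it can help to print the counts behind the chart. A minimal sketch with `value_counts()`, shown on a hypothetical list of winners (the values below are invented, not the tournament results).

```python
import pandas as pd

# Hypothetical 'winner' column (made-up values); None marks a match with no winner.
winners = pd.Series(
    ['England', 'India', 'England', None, 'Pakistan', 'India', 'England'],
    name='winner',
)

# value_counts() ignores missing values by default and sorts by count, descending.
wins = winners.value_counts()
print(wins)
```

On the real data the equivalent call would be `t20['winner'].value_counts()`.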

### Now, let's look at the toss decisions, i.e. whether the team that won the toss chose to bat or field.


##**2| Toss Decisions**

In [8]:
toss = t20["toss decision"].value_counts()
label = toss.index
counts = toss.values
colors = ['green','blue']

In [9]:
fig = px.pie(toss, values=counts, names=label, title='Toss Decisions by team')
fig.update_traces(hoverinfo='label+percent',
                  textposition='inside',
                  textinfo='value+label',
                  textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3))
                 )
fig.show()

### Hence, in the 30 matches where a toss took place, the toss winner chose to bat first 17 times and to field first 13 times.
### Now, let's look at the number of matches won by teams batting first and by teams fielding first.

##**3| Number of Matches won by team batting first and fielding first.**

In [10]:
won_by = t20['won by'].value_counts()
label = won_by.index
counts = won_by.values
colors = ['yellow','red']

In [11]:
fig = px.pie(won_by, values=counts, names=label, title='Number of matches won by Runs or Wickets')
fig.update_traces(hoverinfo='label+percent',
                  textposition='inside',
                  textinfo='value+label',
                  textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3))
                 )
fig.show()

### Hence, in the T20 World Cup, 16 matches were won by the team batting first (by runs) and 13 by the team batting second (by wickets).
### Let us now look at the results of matches in which the toss winner elected to bat first.

##**4| Number of matches won by teams after they elected to bat first**

In [12]:
new_df = t20.loc[(t20['toss decision']=='Bat')]
new_df = new_df['won by'].value_counts()
label = new_df.index
counts = new_df.values
colors = ['purple','gold']

In [13]:
fig = px.pie(new_df, values=counts, names=label, title='Number of matches won by teams after they elected to bat first')
fig.update_traces(hoverinfo='label+percent',
                  textposition='inside',
                  textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3))
                 )
fig.show()

### Hence, of the 16 completed matches in which the toss winner elected to bat first, the total was defended only 8 times.

##**5| Number of matches won by teams after they elected to Field first**

In [14]:
new_df = t20.loc[(t20['toss decision']=='Field')]
new_df = new_df['won by'].value_counts()
label = new_df.index
counts = new_df.values
colors = ['orange','blue']

In [15]:
fig = px.pie(new_df, values=counts, names=label, title='Number of matches won by teams after they elected to Field first')
fig.update_traces(hoverinfo='label+percent',
                  textposition='inside',
                  textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3))
                 )
fig.show()

### Hence, of the 13 matches in which the toss winner elected to field first, the target was chased successfully only 5 times.
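
The two filtered pie charts for batting first and fielding first can also be summarised in one table. A sketch using `pd.crosstab`, run on a hypothetical six-match sample (the rows are invented); only the two column names come from the dataset.

```python
import pandas as pd

# Hypothetical six-match sample (made-up rows) with the two relevant columns.
sample = pd.DataFrame({
    'toss decision': ['Bat', 'Bat', 'Field', 'Field', 'Bat', 'Field'],
    'won by': ['Runs', 'Wickets', 'Wickets', 'Runs', 'Runs', 'Runs'],
})

# Rows: what the toss winner chose.
# Columns: how the match was won ('Runs' = side batting first won,
# 'Wickets' = chasing side won).
table = pd.crosstab(sample['toss decision'], sample['won by'])
print(table)
```

In this layout the toss winner's own wins sit in the `Bat`/`Runs` and `Field`/`Wickets` cells. `crosstab` counts only rows where both columns are present, so matches without a result drop out.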

###Next, let's look at the top scorers of the tournament.

##**6| Top Scorers**

In [16]:
figure = px.bar(t20,
                x=t20['top scorer'],
                y=t20['highest score'],
                color=t20['highest score'],
                title='Top scorers of the T20 world cup 2022'
               )
figure.show()


###1.  Hence, Virat Kohli was the top scorer in three matches, the most by any batsman in the T20 World Cup 2022.

###2.   Also, Rilee Rossouw made the highest individual score in a single match.
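
The two observations above answer different questions: how often a player top-scored, and who played the single biggest innings. A sketch of computing each directly, on a hypothetical sample with invented players and scores.

```python
import pandas as pd

# Hypothetical sample (made-up players and scores), same column names as the dataset.
sample = pd.DataFrame({
    'top scorer': ['Player A', 'Player B', 'Player A', 'Player C', 'Player A'],
    'highest score': [50.0, 101.0, 62.0, 45.0, 71.0],
})

# How many matches each player top-scored in.
times_top = sample['top scorer'].value_counts()

# The single best innings: the row holding the maximum 'highest score'.
best_row = sample.loc[sample['highest score'].idxmax()]

print(times_top)
print(best_row['top scorer'], best_row['highest score'])
```

`idxmax()` skips missing values by default, so the rows for cancelled matches should not interfere when the same lines are applied to `t20`.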




###Now, let us have a look at the player of the match awards in the world cup.

##**7| Player of the match award for most number of times**

In [17]:
figure = px.bar(t20,
                x=t20['player of the match'],
                title='Player of the Match Awards in T20 World Cup 2022'
               )
figure.show()

Therefore, Virat Kohli, Sam Curran, Taskin Ahmed, Suryakumar Yadav and Shadab Khan each won the player of the match award twice.

##**8| Best Bowler award for most number of times**

In [18]:
figure = px.bar(t20,
                x= t20['best bowler'],
                title = 'Best Bowler for most number of times')
figure.show()

### Hence, Sam Curran was the best bowler in three matches, the most by any bowler.
### No wonder Sam Curran was named Player of the Tournament.

##**9| Total Runs scored in First and Second innings in each Stadium.**

In [19]:
fig = go.Figure()
fig.add_trace(go.Bar(x=t20['venue'],
                     y=t20['first innings score'],
                     name='First innings runs',
                     marker_color='blue'
                    ))
fig.add_trace(go.Bar(x=t20['venue'],
                     y=t20['second innings score'],
                     name='Second innings runs',
                     marker_color='red'
                    ))
fig.update_layout(barmode='group',
                  xaxis_tickangle=-45,
                  title="Best Stadiums to Bat First or Chase"
                 )
fig.show()



### Hence, we can say that the SCG was the stadium that most favoured batting first.
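
A venue comparison like this is easier to check with per-venue averages than by eye. A sketch that averages each innings by venue and ranks venues by the gap, on a hypothetical sample (venue names and scores are invented); the `gap` column is an illustrative name.

```python
import pandas as pd

# Hypothetical sample (made-up venues and scores), same column names as the dataset.
sample = pd.DataFrame({
    'venue': ['Ground X', 'Ground X', 'Ground Y', 'Ground Y'],
    'first innings score': [180.0, 160.0, 130.0, 150.0],
    'second innings score': [140.0, 150.0, 131.0, 151.0],
})

# Average score per innings at each venue; mean() skips missing values.
cols = ['first innings score', 'second innings score']
by_venue = sample.groupby('venue')[cols].mean()

# Positive gap: first-innings totals were higher on average at that venue.
by_venue['gap'] = by_venue['first innings score'] - by_venue['second innings score']
print(by_venue.sort_values('gap', ascending=False))
```

A large positive gap is only suggestive, because a chasing side stops batting once it reaches the target, and venues with few matches give noisy averages.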

#**Key Takeaways**


1. England won the most matches (five), followed by India and Pakistan with four wins each.
2. In the 30 matches where a toss took place, the toss winner elected to bat first 17 times.
3. Virat Kohli was the top scorer in the most matches (three).
4. Sam Curran was the best bowler in the most matches (three).
5. More matches were won by the team batting first (16) than by the team batting second (13).
6. Virat Kohli, Sam Curran, Taskin Ahmed, Suryakumar Yadav and Shadab Khan were key players for their teams, each winning the player of the match award twice.
7. Rilee Rossouw made the highest individual score in a single match.
8. Of the 13 matches in which the toss winner elected to field first, the target was chased successfully only 5 times.
9. Of the 16 completed matches in which the toss winner elected to bat first, the total was defended only 8 times.
10. The SCG was the stadium that most favoured batting first.
11. Optus Stadium was the stadium that most favoured bowling first.

