# IPL Dataset Analysis

## Problem Statement
An IPL match raises several questions, especially for anyone with limited knowledge of cricket, the game on which it is based. This analysis explores the data from a single match to find out which factors helped one team win and how much they mattered.

## About the Dataset
The Indian Premier League (IPL) is a professional T20 cricket league in India, contested during April-May of every year by teams representing Indian cities. It is the most-attended cricket league in the world and ranks sixth among all sports leagues. Its teams field players from around the world, and the league is highly competitive and entertaining, with many close matches.

The IPL and other cricket-related datasets are available at [cricsheet.org](https://cricsheet.org/). Feel free to visit the website and explore the data yourself; exploring new sources of data is one of the more interesting things a data scientist gets to do.

### Analysing data with advanced Python operations

In [2]:
import json

In [3]:
with open('./data/ipl_match.json') as f:
    data = json.load(f)

In [4]:
data

{'meta': {'data_version': 0.9, 'created': '2011-05-06', 'revision': 2},
 'info': {'city': 'Bangalore',
  'competition': 'IPL',
  'dates': ['2008-04-18'],
  'gender': 'male',
  'match_type': 'T20',
  'outcome': {'by': {'runs': 140}, 'winner': 'Kolkata Knight Riders'},
  'overs': 20,
  'player_of_match': ['BB McCullum'],
  'teams': ['Royal Challengers Bangalore', 'Kolkata Knight Riders'],
  'toss': {'decision': 'field', 'winner': 'Royal Challengers Bangalore'},
  'umpires': ['Asad Rauf', 'RE Koertzen'],
  'venue': 'M Chinnaswamy Stadium'},
 'innings': [{'1st innings': {'team': 'Kolkata Knight Riders',
    'deliveries': [{'0.1': {'batsman': 'SC Ganguly',
       'bowler': 'P Kumar',
       'extras': {'legbyes': 1},
       'non_striker': 'BB McCullum',
       'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
     {'0.2': {'batsman': 'BB McCullum',
       'bowler': 'P Kumar',
       'non_striker': 'SC Ganguly',
       'runs': {'batsman': 0, 'extras': 0, 'total': 0}}},
     {'0.3': {'batsman

We can dig deeper into this data to find out more about the batsmen and bowlers.
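
Every entry in `deliveries` is a one-key dictionary that maps a ball label (such as `'0.1'`) to the details of that ball, which is why the cells below use a nested loop. A minimal sketch of a helper that flattens this shape follows. The function name `iter_deliveries` and the two-ball `sample` list are hypothetical, introduced only for illustration; the sample copies the first two balls shown in the output above.

```python
def iter_deliveries(deliveries):
    """Yield (ball_label, delivery_info) pairs from a list of one-key dicts."""
    for delivery in deliveries:
        # Each element looks like {'0.1': {...}}, so items() yields one pair
        for ball_label, delivery_info in delivery.items():
            yield ball_label, delivery_info


# Hypothetical sample: the first two balls from the output above
sample = [
    {'0.1': {'batsman': 'SC Ganguly', 'bowler': 'P Kumar',
             'extras': {'legbyes': 1}, 'non_striker': 'BB McCullum',
             'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
    {'0.2': {'batsman': 'BB McCullum', 'bowler': 'P Kumar',
             'non_striker': 'SC Ganguly',
             'runs': {'batsman': 0, 'extras': 0, 'total': 0}}},
]

labels = [label for label, _ in iter_deliveries(sample)]
strikers = [info['batsman'] for _, info in iter_deliveries(sample)]
print(labels, strikers)
```

With the real data, the same helper could be called as `iter_deliveries(data['innings'][0]['1st innings']['deliveries'])`.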

### How many deliveries were faced by batsman `SC Ganguly`?

In [5]:
first_innings_deliveries=data['innings'][0]['1st innings']['deliveries']
print(first_innings_deliveries)

[{'0.1': {'batsman': 'SC Ganguly', 'bowler': 'P Kumar', 'extras': {'legbyes': 1}, 'non_striker': 'BB McCullum', 'runs': {'batsman': 0, 'extras': 1, 'total': 1}}}, {'0.2': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.3': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'extras': {'wides': 1}, 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 1, 'total': 1}}}, {'0.4': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.5': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.6': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.7': {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'extras': {'legbyes': 1}, 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0,

In [None]:
count_deliveries=0
for faced in first_innings_deliveries:
    for deliveries_number, delivery_info in faced.items():
        if delivery_info['batsman']=="SC Ganguly":
            count_deliveries+=1
print("Deliveries faced by SC Ganguly:", count_deliveries)
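
The loop above counts every delivery bowled while the batsman was on strike. In conventional scorekeeping a wide is not counted as a ball faced, so a stricter count may be wanted. The sketch below is one possible way to do that, assuming wides are marked by a `'wides'` key inside `extras` as in the output shown earlier. The function `balls_faced` and the three-ball `sample` are hypothetical.

```python
def balls_faced(deliveries, batsman, count_wides=False):
    """Count deliveries with `batsman` on strike, optionally skipping wides."""
    total = 0
    for delivery in deliveries:
        for info in delivery.values():
            if info['batsman'] != batsman:
                continue
            # 'extras' is absent on most balls, so default to an empty dict
            is_wide = 'wides' in info.get('extras', {})
            if is_wide and not count_wides:
                continue
            total += 1
    return total


# Hypothetical three-ball sample in the same shape as the match data
sample = [
    {'0.1': {'batsman': 'SC Ganguly', 'extras': {'legbyes': 1},
             'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
    {'0.2': {'batsman': 'BB McCullum',
             'runs': {'batsman': 0, 'extras': 0, 'total': 0}}},
    {'0.3': {'batsman': 'BB McCullum', 'extras': {'wides': 1},
             'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
]

faced_strict = balls_faced(sample, 'BB McCullum')
faced_all = balls_faced(sample, 'BB McCullum', count_wides=True)
print(faced_strict, faced_all)
```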

### Who was man of the match and how many runs did he score?

In [6]:
mom=data['info']['player_of_match']
print("man of the match is:",mom)

man of the match is: ['BB McCullum']


In [10]:
runs_scored=0
for runs in first_innings_deliveries:
    for deliveries_number, delivery_info in runs.items():
        if delivery_info['batsman']=='BB McCullum':
            runs_scored+=delivery_info['runs']['batsman']
print("total runs scored by BB McCullum:", runs_scored)

total runs scored by BB McCullum: 158
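
The same accumulation can be done for every batsman in a single pass. The sketch below uses `collections.Counter` for this; the function `runs_by_batsman`, the player names, and the run values in `sample` are all hypothetical and are not taken from the match.

```python
from collections import Counter


def runs_by_batsman(deliveries):
    """Return a Counter mapping each striker to the runs scored off the bat."""
    totals = Counter()
    for delivery in deliveries:
        for info in delivery.values():
            # Only runs credited to the batsman, not extras
            totals[info['batsman']] += info['runs']['batsman']
    return totals


# Hypothetical sample with made-up players and scores
sample = [
    {'0.1': {'batsman': 'Player A',
             'runs': {'batsman': 4, 'extras': 0, 'total': 4}}},
    {'0.2': {'batsman': 'Player B',
             'runs': {'batsman': 1, 'extras': 0, 'total': 1}}},
    {'0.3': {'batsman': 'Player A',
             'runs': {'batsman': 6, 'extras': 0, 'total': 6}}},
]

totals = runs_by_batsman(sample)
print(totals.most_common())
```

Looking up the player of the match in the resulting counter would then give his score without a dedicated loop.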


### Which batsmen played in the first innings?

In [11]:
batsman=[]
for all_batsman in first_innings_deliveries:
    for delivery_number, delivery_info in all_batsman.items():
        batsman.append(delivery_info['batsman'])

In [12]:
print("Batsmen who played in the 1st innings:", set(batsman))

Batsmen who played in the 1st innings: {'RT Ponting', 'Mohammad Hafeez', 'DJ Hussey', 'SC Ganguly', 'BB McCullum'}
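
A `set` removes duplicates but does not keep the order in which the batsmen appeared. If that order matters, `dict.fromkeys` is one way to deduplicate while preserving first appearance. The sketch below assumes this is the order wanted; note that it reflects when each player first took strike, which may differ from the official batting order. The function `batsmen_in_order` and the `sample` data are hypothetical.

```python
def batsmen_in_order(deliveries):
    """Return unique strikers in the order they first faced a delivery."""
    strikers = (
        info['batsman']
        for delivery in deliveries
        for info in delivery.values()
    )
    # dict keys are unique and keep insertion order
    return list(dict.fromkeys(strikers))


# Hypothetical sample with made-up players
sample = [
    {'0.1': {'batsman': 'Player A'}},
    {'0.2': {'batsman': 'Player B'}},
    {'0.3': {'batsman': 'Player A'}},
    {'0.4': {'batsman': 'Player C'}},
]

order = batsmen_in_order(sample)
print(order)
```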


### Which batsman hit the most sixes in the first innings?

In [14]:
most_sixes=[]
for sixes in first_innings_deliveries:
    for deliveries_number, delivery_info in sixes.items():
        if 'runs' in delivery_info and delivery_info['runs']['batsman']==6:
            most_sixes.append(delivery_info['batsman'])

In [22]:
from collections import Counter
batsman_sixes=Counter(most_sixes)
# most_common(1) returns [(name, count)] for the batsman with the most sixes
name=batsman_sixes.most_common(1)[0][0]
print(name)

BB McCullum
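
A `Counter` keeps its keys in insertion order, so ranking by count needs an explicit comparison, and two batsmen could share the top spot. The sketch below is one way to return every name tied for the highest count. The function `top_hitters` and the names in `six_hitters` are hypothetical.

```python
from collections import Counter


def top_hitters(names):
    """Return (list of names with the highest count, that count)."""
    counts = Counter(names)
    if not counts:
        # No sixes were recorded at all
        return [], 0
    best = max(counts.values())
    leaders = [name for name, count in counts.items() if count == best]
    return leaders, best


# Hypothetical list: one entry per six, naming the batsman who hit it
six_hitters = ['Player A', 'Player B', 'Player A', 'Player B', 'Player C']

leaders, best = top_hitters(six_hitters)
print(leaders, best)
```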


### Find the names of all players that got bowled out in the second innings.

In [16]:
second_innings_data=data['innings'][1]['2nd innings']['deliveries']
print(second_innings_data)

[{'0.1': {'batsman': 'R Dravid', 'bowler': 'AB Dinda', 'non_striker': 'W Jaffer', 'runs': {'batsman': 1, 'extras': 0, 'total': 1}}}, {'0.2': {'batsman': 'W Jaffer', 'bowler': 'AB Dinda', 'extras': {'wides': 1}, 'non_striker': 'R Dravid', 'runs': {'batsman': 0, 'extras': 1, 'total': 1}}}, {'0.3': {'batsman': 'W Jaffer', 'bowler': 'AB Dinda', 'non_striker': 'R Dravid', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.4': {'batsman': 'W Jaffer', 'bowler': 'AB Dinda', 'non_striker': 'R Dravid', 'runs': {'batsman': 1, 'extras': 0, 'total': 1}}}, {'0.5': {'batsman': 'R Dravid', 'bowler': 'AB Dinda', 'non_striker': 'W Jaffer', 'runs': {'batsman': 1, 'extras': 0, 'total': 1}}}, {'0.6': {'batsman': 'W Jaffer', 'bowler': 'AB Dinda', 'non_striker': 'R Dravid', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'0.7': {'batsman': 'W Jaffer', 'bowler': 'AB Dinda', 'non_striker': 'R Dravid', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {'1.1': {'batsman': 'R Dravid', 'bowler': 'I Sharma

In [39]:
bowled=[]
for out in second_innings_data:
    for delivery_number, delivery_info in out.items():
        if 'wicket' in delivery_info and delivery_info['wicket']['kind']=='bowled':
            bowled.append(delivery_info['wicket']['player_out'])

print(bowled)

['R Dravid', 'V Kohli', 'Z Khan']
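
The filter above keeps only one kind of dismissal. To see every kind at once, the dismissals can be grouped by the `kind` field. The sketch below assumes the `wicket` entry has the `kind` and `player_out` keys used in the cell above; the function `dismissals_by_kind`, the player names, and the `sample` data are hypothetical.

```python
from collections import defaultdict


def dismissals_by_kind(deliveries):
    """Group dismissed players by how they got out."""
    grouped = defaultdict(list)
    for delivery in deliveries:
        for info in delivery.values():
            # Most deliveries have no 'wicket' key at all
            wicket = info.get('wicket')
            if wicket:
                grouped[wicket['kind']].append(wicket['player_out'])
    return dict(grouped)


# Hypothetical sample with made-up players
sample = [
    {'0.1': {'batsman': 'Player A',
             'wicket': {'kind': 'bowled', 'player_out': 'Player A'}}},
    {'0.2': {'batsman': 'Player B'}},
    {'0.3': {'batsman': 'Player B',
             'wicket': {'kind': 'caught', 'player_out': 'Player B'}}},
]

grouped = dismissals_by_kind(sample)
print(grouped)
```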


### How many more "extras" (wides, legbyes, etc.) were conceded in the second innings compared to the first innings?

In [47]:
fst_extras, snd_extras=0, 0
for extras in first_innings_deliveries:
    for deliveries_number, delivery_info in extras.items():
        fst_extras+=delivery_info['runs']['extras']

for extras2 in second_innings_data:
    for deliveries_number, delivery_info in extras2.items():
        snd_extras+=delivery_info['runs']['extras']

In [46]:
print(snd_extras-fst_extras)

2
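
The difference above is a single number. A breakdown by type of extra can show where it comes from. The sketch below sums the optional `extras` dictionary of each delivery, assuming its keys name the type (such as `'wides'` or `'legbyes'`, as seen in the output earlier) and its values are runs. The function `extras_by_type` and the `sample` data are hypothetical.

```python
from collections import Counter


def extras_by_type(deliveries):
    """Return a Counter of extra runs conceded, keyed by type of extra."""
    totals = Counter()
    for delivery in deliveries:
        for info in delivery.values():
            # Counter.update adds the values of a mapping to existing counts
            totals.update(info.get('extras', {}))
    return totals


# Hypothetical sample in the same shape as the match data
sample = [
    {'0.1': {'extras': {'legbyes': 1},
             'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
    {'0.2': {'runs': {'batsman': 0, 'extras': 0, 'total': 0}}},
    {'0.3': {'extras': {'wides': 1},
             'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
    {'0.4': {'extras': {'wides': 2},
             'runs': {'batsman': 0, 'extras': 2, 'total': 2}}},
]

breakdown = extras_by_type(sample)
print(dict(breakdown), sum(breakdown.values()))
```

Running it once per innings and comparing the two counters would show which type of extra accounts for the gap.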
