# IPL Dataset Analysis

## Problem Statement
An IPL match raises many questions for anyone with limited knowledge of cricket, the game on which the league is based. This analysis explores the ball-by-ball data of a match to understand which factors helped one team win and how much they mattered.

## About the Dataset
The Indian Premier League (IPL) is a professional T20 cricket league in India, contested during April-May of every year by teams representing Indian cities. It is the most-attended cricket league in the world and ranks sixth among all sports leagues by average attendance. Its teams include players from around the world, and the league is very competitive and entertaining, with many close matches.

The IPL and other cricket-related datasets are available at [cricsheet.org](https://cricsheet.org/). Feel free to visit the website and explore the data yourself; exploring new sources of data is one of the more interesting things a data scientist gets to do.

### Analysing data with advanced Python operations

In [3]:
import json

In [4]:
with open('./data/ipl_match.json') as f:
    data = json.load(f)

In [5]:
data

{'meta': {'data_version': 0.9, 'created': '2011-05-06', 'revision': 2},
 'info': {'city': 'Bangalore',
  'competition': 'IPL',
  'dates': ['2008-04-18'],
  'gender': 'male',
  'match_type': 'T20',
  'outcome': {'by': {'runs': 140}, 'winner': 'Kolkata Knight Riders'},
  'overs': 20,
  'player_of_match': ['BB McCullum'],
  'teams': ['Royal Challengers Bangalore', 'Kolkata Knight Riders'],
  'toss': {'decision': 'field', 'winner': 'Royal Challengers Bangalore'},
  'umpires': ['Asad Rauf', 'RE Koertzen'],
  'venue': 'M Chinnaswamy Stadium'},
 'innings': [{'1st innings': {'team': 'Kolkata Knight Riders',
    'deliveries': [{'0.1': {'batsman': 'SC Ganguly',
       'bowler': 'P Kumar',
       'extras': {'legbyes': 1},
       'non_striker': 'BB McCullum',
       'runs': {'batsman': 0, 'extras': 1, 'total': 1}}},
     {'0.2': {'batsman': 'BB McCullum',
       'bowler': 'P Kumar',
       'non_striker': 'SC Ganguly',
       'runs': {'batsman': 0, 'extras': 0, 'total': 0}}},
     {'0.3': {'batsman

We can dig deeper into this data to learn more about the batsmen and bowlers.
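Before answering specific questions, it may help to see how the nesting is walked. The sketch below uses a hypothetical, trimmed-down copy of the structure printed above (`sample_match`) and a hypothetical helper, `iter_deliveries`; neither is part of the original notebook. It relies only on what the output shows: each innings and each delivery is a dictionary with a single key.

```python
# Hypothetical miniature of the match structure shown above (values trimmed).
sample_match = {
    "info": {
        "outcome": {"by": {"runs": 140}, "winner": "Kolkata Knight Riders"},
    },
    "innings": [
        {"1st innings": {
            "team": "Kolkata Knight Riders",
            "deliveries": [
                {"0.1": {"batsman": "SC Ganguly", "bowler": "P Kumar",
                         "runs": {"batsman": 0, "extras": 1, "total": 1}}},
                {"0.2": {"batsman": "BB McCullum", "bowler": "P Kumar",
                         "runs": {"batsman": 0, "extras": 0, "total": 0}}},
            ],
        }},
    ],
}


def iter_deliveries(match, innings_index):
    """Yield (ball_label, details) pairs for one innings."""
    # Each innings is a one-key dict such as {"1st innings": {...}}.
    (innings_body,) = match["innings"][innings_index].values()
    for delivery in innings_body["deliveries"]:
        # Each delivery is also a one-key dict such as {"0.1": {...}}.
        ((label, details),) = delivery.items()
        yield label, details


winner = sample_match["info"]["outcome"]["winner"]
balls = list(iter_deliveries(sample_match, 0))

print(winner)                                          # Kolkata Knight Riders
print([(label, d["batsman"]) for label, d in balls])   # [('0.1', 'SC Ganguly'), ('0.2', 'BB McCullum')]
```

Unpacking the single key this way avoids hard-coding the `'1st innings'` / `'2nd innings'` labels.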

### How many deliveries did the batsman `SC Ganguly` face?

In [6]:
def get_no_of_deliveries(innings,innings_type):
    return data.get('innings')[innings].get(innings_type).get('deliveries')

def total_balls_played(deliveries,name_of_batsman):
    del_val = [list(delivery.values()) for delivery in deliveries]
    return len(list(filter(lambda x: x[0].get('batsman')==name_of_batsman,del_val)))

def no_of_balls_faced_batsman(name_of_batsman,innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    return total_balls_played(get_no_of_deliveries(innings,innings_type),name_of_batsman)

In [7]:
no_of_balls_faced_batsman('Mohammad Hafeez',0)  # pass 'SC Ganguly' instead to count his deliveries

3
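The count above includes every delivery bowled to the batsman. Cricket scorecards normally do not count wides as balls faced, so a variant with a switch for that may be useful. This is a minimal sketch on hypothetical sample data; `sample_deliveries`, `balls_faced` and the `count_wides` flag are illustrative names, and it assumes wides appear under an `extras` key named `wides`, in the same way `legbyes` appears in the output above.

```python
# Hypothetical deliveries in the same shape as the notebook's data.
sample_deliveries = [
    {"0.1": {"batsman": "SC Ganguly", "extras": {"legbyes": 1},
             "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {"0.2": {"batsman": "BB McCullum",
             "runs": {"batsman": 0, "extras": 0, "total": 0}}},
    {"0.3": {"batsman": "BB McCullum", "extras": {"wides": 1},
             "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {"0.4": {"batsman": "SC Ganguly",
             "runs": {"batsman": 2, "extras": 0, "total": 2}}},
]


def balls_faced(deliveries, batsman, count_wides=True):
    """Count deliveries bowled to `batsman`, optionally skipping wides."""
    total = 0
    for delivery in deliveries:
        for details in delivery.values():
            if details.get("batsman") != batsman:
                continue
            # Assumption: a wide is recorded as extras = {"wides": n}.
            is_wide = "wides" in details.get("extras", {})
            if count_wides or not is_wide:
                total += 1
    return total


print(balls_faced(sample_deliveries, "BB McCullum"))                     # 2
print(balls_faced(sample_deliveries, "BB McCullum", count_wides=False))  # 1
```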

### Who was the player of the match, and how many runs did he score?

In [14]:
from functools import reduce

def get_runs_info(delivery_info,name_of_batsman):
    # Credit only runs scored off the bat; 'total' would also include extras.
    if delivery_info[0].get('batsman')==name_of_batsman:
        return delivery_info[0].get('runs').get('batsman')
    else:
        return 0

def get_total_no_runs_by_batsman(deliveries,name_of_batsman):
    del_val = [list(delivery.values()) for delivery in deliveries]
    return list(map(lambda delivery_info:get_runs_info(delivery_info,name_of_batsman),del_val))
    
def runs_scored_by_batsman(name_of_batsman,innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    list_runs = get_total_no_runs_by_batsman(get_no_of_deliveries(innings,innings_type),name_of_batsman)
    # The initial value 0 keeps reduce from failing on an empty list of deliveries.
    return reduce(lambda current_score,next_ball_score: current_score + next_ball_score,list_runs,0)


In [15]:
player_of_match = data.get('info').get('player_of_match')
run_scored_by_player = runs_scored_by_batsman(player_of_match[0],0)

print("Player of the match is {} and total runs scored {}".format(player_of_match[0],run_scored_by_player))

Player of the match is BB McCullum and total runs scored 158
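A batsman's own score is carried by the `batsman` entry inside `runs`, while `total` also includes extras. The sketch below sums that field with a generator expression on hypothetical deliveries, including a no-ball off which the batsman still scores; `sample_deliveries` and `runs_off_the_bat` are illustrative names, not part of the notebook.

```python
# Hypothetical deliveries; the no-ball at 0.3 still yields 4 runs off the bat.
sample_deliveries = [
    {"0.1": {"batsman": "SC Ganguly", "extras": {"legbyes": 1},
             "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {"0.2": {"batsman": "BB McCullum",
             "runs": {"batsman": 4, "extras": 0, "total": 4}}},
    {"0.3": {"batsman": "BB McCullum", "extras": {"noballs": 1},
             "runs": {"batsman": 4, "extras": 1, "total": 5}}},
]


def runs_off_the_bat(deliveries, batsman):
    """Sum the runs credited to `batsman`, ignoring extras."""
    return sum(
        details["runs"]["batsman"]
        for delivery in deliveries
        for details in delivery.values()
        if details["batsman"] == batsman
    )


print(runs_off_the_bat(sample_deliveries, "BB McCullum"))  # 8
print(runs_off_the_bat(sample_deliveries, "SC Ganguly"))   # 0
```

`sum` returns 0 for a batsman who never faced a ball, so no special case is needed.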


### Which batsmen batted in the first innings?

In [41]:
def list_of_batsman_in_first_inning(deliveries):
    del_val = [list(delivery.values()) for delivery in deliveries]
    return set(list(map(lambda dinfo: dinfo[0].get('batsman'),del_val)))

def all_batsman_from_inning(innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    return list_of_batsman_in_first_inning(get_no_of_deliveries(innings,innings_type))

In [42]:
all_batsman_from_inning(0)

{'BB McCullum', 'DJ Hussey', 'Mohammad Hafeez', 'RT Ponting', 'SC Ganguly'}
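A set loses the batting order, and looking only at the `batsman` field misses anyone who stood at the non-striker's end without facing a ball. A possible variant, sketched on hypothetical deliveries, keeps names in order of first appearance and checks both fields; `sample_deliveries` and `batsmen_in_order` are illustrative names.

```python
# Hypothetical deliveries; DJ Hussey appears only as non-striker here.
sample_deliveries = [
    {"0.1": {"batsman": "SC Ganguly", "non_striker": "BB McCullum"}},
    {"0.2": {"batsman": "BB McCullum", "non_striker": "SC Ganguly"}},
    {"5.3": {"batsman": "RT Ponting", "non_striker": "BB McCullum"}},
    {"12.1": {"batsman": "BB McCullum", "non_striker": "DJ Hussey"}},
]


def batsmen_in_order(deliveries):
    """Return each player seen at either end, in order of first appearance."""
    seen = {}  # dicts preserve insertion order, so this acts as an ordered set
    for delivery in deliveries:
        for details in delivery.values():
            for role in ("batsman", "non_striker"):
                name = details.get(role)
                if name:
                    seen.setdefault(name, None)
    return list(seen)


print(batsmen_in_order(sample_deliveries))
# ['SC Ganguly', 'BB McCullum', 'RT Ponting', 'DJ Hussey']
```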

### Which batsman hit the most sixes in the first innings?

In [78]:
def batsman_with_six(dinfo):
    # A six is 6 runs off the bat; 'total' could also reach 6 through extras.
    if dinfo[0].get('runs').get('batsman') == 6:
        return dinfo[0].get('batsman')
    
def get_batsman_with_sixes(deliveries):
    del_val = [list(delivery.values()) for delivery in deliveries]
    batsman_with_sixes = list(map(lambda dinfo:batsman_with_six(dinfo),del_val))
    return list(filter(lambda tup_value:tup_value!=None,batsman_with_sixes))
    
def get_six_count_for_each_batsman_in_innings(list_of_batsman_with_sixes,innings):
    batsman_six_count = []
    for batsman in list(all_batsman_from_inning(innings)):
        batsman_six_count.append({batsman:len(list(filter(lambda b:b==batsman,list_of_batsman_with_sixes)))})
    return batsman_six_count

def all_sixes_from_inning(innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    list_of_batsman_with_sixes = get_batsman_with_sixes(get_no_of_deliveries(innings,innings_type))
    dict_of_sixes = get_six_count_for_each_batsman_in_innings(list_of_batsman_with_sixes,innings)
    return dict_of_sixes

In [80]:
all_sixes_from_inning(0)

[{'SC Ganguly': 0},
 {'BB McCullum': 13},
 {'Mohammad Hafeez': 0},
 {'DJ Hussey': 0},
 {'RT Ponting': 1}]
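The list above gives a count per batsman but leaves picking the highest to the reader. One way to get the answer directly is `collections.Counter` and its `most_common` method. This is a sketch on hypothetical deliveries, treating a six as 6 runs in the `batsman` field; `sample_deliveries` and `six_counts` are illustrative names.

```python
from collections import Counter

# Hypothetical deliveries in the same shape as the notebook's data.
sample_deliveries = [
    {"1.1": {"batsman": "BB McCullum",
             "runs": {"batsman": 6, "extras": 0, "total": 6}}},
    {"1.2": {"batsman": "SC Ganguly",
             "runs": {"batsman": 4, "extras": 0, "total": 4}}},
    {"1.3": {"batsman": "BB McCullum",
             "runs": {"batsman": 6, "extras": 0, "total": 6}}},
    {"1.4": {"batsman": "RT Ponting",
             "runs": {"batsman": 6, "extras": 0, "total": 6}}},
]


def six_counts(deliveries):
    """Tally sixes per batsman (6 runs off the bat on one delivery)."""
    return Counter(
        details["batsman"]
        for delivery in deliveries
        for details in delivery.values()
        if details["runs"]["batsman"] == 6
    )


counts = six_counts(sample_deliveries)
top = counts.most_common(1)  # empty list if nobody hit a six

print(dict(counts))  # {'BB McCullum': 2, 'RT Ponting': 1}
print(top)           # [('BB McCullum', 2)]
```

Batsmen with no sixes simply do not appear in the counter; `counts["SC Ganguly"]` still returns 0.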

### Find the names of all players who were bowled in the second innings.

In [100]:
def get_all_wicket_deliveries(dinfo_ls):
    return list(filter(lambda dinfo: dinfo[0].get('wicket')!=None,dinfo_ls))

def get_player_name_based_on_dissmisal(dinfo_obj,dismissal_type):
    if dinfo_obj.get('wicket').get('kind')==dismissal_type:
        return dinfo_obj.get('batsman')
    
def filter_dissmisal_from_innings(dismissal_type,deliveries):
    del_val = [list(delivery.values()) for delivery in deliveries]
    list_of_wicket_deliveries = get_all_wicket_deliveries(del_val)
    plyr_ls = map(lambda dinfo:get_player_name_based_on_dissmisal(dinfo[0],dismissal_type),list_of_wicket_deliveries)
    return list(filter(lambda player: player != None,list(plyr_ls)))
    
def list_players_by_thier_dismissal_type(dismissal_type,innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    return filter_dissmisal_from_innings(dismissal_type,get_no_of_deliveries(innings,innings_type))

In [101]:
list_players_by_thier_dismissal_type('bowled',1)

['R Dravid', 'V Kohli', 'Z Khan']
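Returning the striker's name works for `bowled`, where the striker is always the one dismissed. For dismissals such as run outs, the player who is out can be the non-striker. The sketch below assumes each `wicket` entry also carries a `player_out` field, as Cricsheet's older data format does; that field is not visible in the truncated output above, so treat it as an assumption. `sample_deliveries` and `players_out` are illustrative names.

```python
# Hypothetical deliveries; 'player_out' is an assumed field of the wicket entry.
sample_deliveries = [
    {"1.2": {"batsman": "R Dravid", "non_striker": "W Jaffer",
             "wicket": {"kind": "bowled", "player_out": "R Dravid"}}},
    {"2.4": {"batsman": "V Kohli", "non_striker": "W Jaffer",
             "wicket": {"kind": "run out", "player_out": "W Jaffer"}}},
    {"2.5": {"batsman": "V Kohli", "non_striker": "JH Kallis"}},
]


def players_out(deliveries, kind):
    """List players dismissed in the given way, in the order they fell."""
    names = []
    for delivery in deliveries:
        for details in delivery.values():
            wicket = details.get("wicket")
            if wicket and wicket.get("kind") == kind:
                # Fall back to the striker if 'player_out' is missing.
                names.append(wicket.get("player_out", details.get("batsman")))
    return names


print(players_out(sample_deliveries, "bowled"))   # ['R Dravid']
print(players_out(sample_deliveries, "run out"))  # ['W Jaffer']
```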

### How many more deliveries with "extras" (wides, leg byes, etc.) were bowled in the second innings than in the first?

In [102]:
def get_extras_from_deliveries(dinfo_obj):
    if dinfo_obj.get('extras') != None:
        return dinfo_obj
    
def filter_from_deliveries(deliveries):
    del_val = [list(delivery.values()) for delivery in deliveries]
    get_extras = filter(lambda dinfo: get_extras_from_deliveries(dinfo[0]),del_val)
    return len(list(get_extras))
    
def get_count_of_extras_from_innnings(innings):
    if innings == 0:
        innings_type = '1st innings'
    else:
        innings_type = '2nd innings'
    return filter_from_deliveries(get_no_of_deliveries(innings,innings_type))

In [103]:
count_of_extras_from_first_innings = get_count_of_extras_from_innnings(0)
count_of_extras_from_second_innings = get_count_of_extras_from_innnings(1)

print("Total Extras bowled in First Innings : ",count_of_extras_from_first_innings)
print("Total Extras bowled in Second Innings : ",count_of_extras_from_second_innings)

print(count_of_extras_from_second_innings - count_of_extras_from_first_innings)

Total Extras bowled in First Innings :  9
Total Extras bowled in Second Innings :  15
6
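The figures above count deliveries on which an extra was conceded, not the runs those extras cost. If the run totals or a breakdown by type are also of interest, `collections.Counter` can accumulate both in one pass. This is a sketch on hypothetical deliveries; `sample_deliveries` and `extras_summary` are illustrative names.

```python
from collections import Counter

# Hypothetical deliveries; 'extras' maps each extra type to the runs it cost.
sample_deliveries = [
    {"0.1": {"batsman": "SC Ganguly", "extras": {"legbyes": 1}}},
    {"0.2": {"batsman": "BB McCullum"}},
    {"0.3": {"batsman": "BB McCullum", "extras": {"wides": 1}}},
    {"0.4": {"batsman": "SC Ganguly", "extras": {"wides": 5}}},
]


def extras_summary(deliveries):
    """Return (deliveries with extras, Counter of extra runs by type)."""
    balls_with_extras = 0
    runs_by_type = Counter()
    for delivery in deliveries:
        for details in delivery.values():
            extras = details.get("extras")
            if extras:
                balls_with_extras += 1
                runs_by_type.update(extras)  # adds the runs for each type
    return balls_with_extras, runs_by_type


balls, by_type = extras_summary(sample_deliveries)

print(balls)                  # 3
print(dict(by_type))          # {'legbyes': 1, 'wides': 6}
print(sum(by_type.values()))  # 7
```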
