# IPL Dataset Analysis

## Problem Statement
An IPL match raises many questions for anyone with limited knowledge of cricket, the game on which it is based. This analysis explores what happens during a match, which factors led one team to win, and how much those factors mattered.

## About the Dataset
The Indian Premier League (IPL) is a professional T20 cricket league in India contested during April-May of every year by teams representing Indian cities. It is the most-attended cricket league in the world and ranks sixth among all the sports leagues. It has teams with players from around the world and is very competitive and entertaining with a lot of close matches between teams.

The IPL and other cricket-related datasets are available at [cricsheet.org](https://cricsheet.org/). Feel free to visit the website and explore the data yourself; exploring new sources of data is one of the more interesting activities a data scientist gets to do.

### Analysing data with basic Python operations

## Read the data from a .yaml file

# **For Reference**
- The acronym YAML stands for “YAML Ain’t Markup Language”.

- YAML is a human-readable data serialization standard that can be used in conjunction with all programming languages and is often used to write configuration files.

- YAML vs. JSON
YAML 1.2 is a superset of JavaScript Object Notation (JSON) but has some built-in advantages. For example, YAML supports self-references (anchors and aliases), complex data types, block literals, and comments. Overall, YAML also tends to be more readable than JSON.
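A few of these differences can be seen in a short sketch. It assumes the PyYAML package (the `yaml` module imported in the cells below) is installed; the document text and its key names are made up for illustration and are not taken from the IPL file.

```python
import json
import yaml  # PyYAML; assumed to be installed

# A made-up YAML document using features JSON lacks:
# comments, an anchor (&) with an alias (*), and a block literal (|).
yaml_text = """
# comments are allowed in YAML
defaults: &defaults
  overs: 20
  match_type: T20
match:
  settings: *defaults   # alias: reuses the anchored mapping
  notes: |
    First line of a block literal.
    Second line keeps its newline.
"""

doc = yaml.safe_load(yaml_text)
print(doc["match"]["settings"])  # the mapping reused through the alias
print(doc["match"]["notes"])     # the block literal, line breaks preserved

# Simple JSON text is also valid YAML, so the same parser can read it.
json_text = '{"city": "Bangalore", "overs": 20}'
print(yaml.safe_load(json_text) == json.loads(json_text))
```

The last line compares the YAML parser's result with the standard `json` module's result for the same text.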

In [None]:
import yaml

In [None]:
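# open the file and parse it; safe_load builds only plain Python objects
# (yaml.load without a Loader argument fails on recent PyYAML versions)
with open('ipl_match.yaml') as f:
    data = yaml.safe_load(f)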

In [None]:
# print the parsed data
print(data)

{'meta': {'data_version': 0.9, 'created': datetime.date(2011, 5, 6), 'revision': 2}, 'info': {'city': 'Bangalore', 'competition': 'IPL', 'dates': [datetime.date(2008, 4, 18)], 'gender': 'male', 'match_type': 'T20', 'outcome': {'by': {'runs': 140}, 'winner': 'Kolkata Knight Riders'}, 'overs': 20, 'player_of_match': ['BB McCullum'], 'teams': ['Royal Challengers Bangalore', 'Kolkata Knight Riders'], 'toss': {'decision': 'field', 'winner': 'Royal Challengers Bangalore'}, 'umpires': ['Asad Rauf', 'RE Koertzen'], 'venue': 'M Chinnaswamy Stadium'}, 'innings': [{'1st innings': {'team': 'Kolkata Knight Riders', 'deliveries': [{0.1: {'batsman': 'SC Ganguly', 'bowler': 'P Kumar', 'extras': {'legbyes': 1}, 'non_striker': 'BB McCullum', 'runs': {'batsman': 0, 'extras': 1, 'total': 1}}}, {0.2: {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'non_striker': 'SC Ganguly', 'runs': {'batsman': 0, 'extras': 0, 'total': 0}}}, {0.3: {'batsman': 'BB McCullum', 'bowler': 'P Kumar', 'extras': {'wides': 1}, 'no

Now let's find answers to some preliminary questions.

### Can you guess the data type you are working with?

We are working with a nested dictionary.
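One way to confirm this is to check the type at each level of nesting. The sketch below uses a small hand-written dictionary, `sample`, that mirrors part of the printed output above rather than the full file; `describe` is a hypothetical helper written for this illustration, not a library function.

```python
# Hand-written excerpt mirroring the structure printed above (not the full file).
sample = {
    "meta": {"data_version": 0.9, "revision": 2},
    "info": {
        "city": "Bangalore",
        "teams": ["Royal Challengers Bangalore", "Kolkata Knight Riders"],
        "toss": {"decision": "field", "winner": "Royal Challengers Bangalore"},
    },
    "innings": [{"1st innings": {"team": "Kolkata Knight Riders", "deliveries": []}}],
}


def describe(obj, depth=0, max_depth=2):
    """Return lines naming each key and its value's type, down to max_depth levels."""
    lines = []
    if isinstance(obj, dict) and depth < max_depth:
        for key, value in obj.items():
            lines.append("  " * depth + f"{key}: {type(value).__name__}")
            lines.extend(describe(value, depth + 1, max_depth))
    return lines


print(type(sample).__name__)
print("\n".join(describe(sample)))
```

The same helper could be called on `data` after loading the real file; dictionaries hold lists, and those lists hold further dictionaries.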

### In which city and at which venue was the match played?

In [None]:
c=data['info']['city']
print("The City Where The Match Is Being Played:",c)


The City Where The Match Is Being Played: Bangalore


In [None]:
ven=data['info']['venue']
print("The Venue Where The Match Is Being Played:",ven)

The Venue Where The Match Is Being Played: M Chinnaswamy Stadium
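Chained indexing such as `data['info']['city']` raises a `KeyError` if any key along the path is absent. A minimal sketch of a more forgiving lookup follows; `dig` is a hypothetical helper, `sample` is a hand-written excerpt, and the `attendance` key is invented purely to show the fallback.

```python
def dig(mapping, *keys, default=None):
    """Follow a chain of keys through nested dicts; return default if any is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hand-written excerpt of the 'info' block shown above.
sample = {"info": {"city": "Bangalore", "venue": "M Chinnaswamy Stadium"}}

print(dig(sample, "info", "city"))
print(dig(sample, "info", "venue"))
# 'attendance' is a made-up key that does not exist in the data.
print(dig(sample, "info", "attendance", default="not recorded"))
```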


### Which teams played in the match? How many teams participated in total?

In [None]:
t=data['info']['teams']
print("The Teams Participating: {} , {}".format(t[0],t[1]))
print("The Number Of Teams Participating:",len(t))

The Teams Participating: Royal Challengers Bangalore , Kolkata Knight Riders
The Number Of Teams Participating: 2


### Which team won the toss, and what did the toss winner decide?

In [None]:
tw=data['info']['toss']['winner']
td=data['info']['toss']['decision']
print("The Team That Won The Toss:",tw)
print("The Decision Of The Toss Winner:",td)

The Team That Won The Toss: Royal Challengers Bangalore
The Decision Of The Toss Winner: field


### Who bowled the first ball of the first innings, and which batsman faced it?

In [None]:
firstballer=data['innings'][0]['1st innings']['deliveries'][0][0.1]['bowler']
print("The Bowler, Bowling The First Ball Of The First Inning:",firstballer)
firstbatsman=data['innings'][0]['1st innings']['deliveries'][0][0.1]['batsman']
print("The Batsman Facing The First Delivery:",firstbatsman)

The Bowler, Bowling The First Ball Of The First Inning: P Kumar
The Batsman Facing The First Delivery: SC Ganguly
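Each entry in `deliveries` is a dictionary with a single key, the ball label (for example `0.1`), so the lookup above depends on knowing that label in advance. The sketch below, using a hand-written copy of the first two deliveries from the printed output, unpacks the single key instead; `unpack` is a hypothetical helper name.

```python
# Hand-written copy of the first two deliveries shown in the printed output.
deliveries = [
    {0.1: {"batsman": "SC Ganguly", "bowler": "P Kumar", "non_striker": "BB McCullum",
           "extras": {"legbyes": 1}, "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {0.2: {"batsman": "BB McCullum", "bowler": "P Kumar", "non_striker": "SC Ganguly",
           "runs": {"batsman": 0, "extras": 0, "total": 0}}},
]


def unpack(delivery):
    """Split a one-key delivery dict into (ball_label, details)."""
    ball, details = next(iter(delivery.items()))
    return ball, details


ball, first = unpack(deliveries[0])
print(ball, first["bowler"], first["batsman"])  # 0.1 P Kumar SC Ganguly
```

This avoids indexing with a floating-point key and works for any position in the list.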


### How many deliveries were bowled in the first innings?

In [None]:
# Does this come out to more than 120? If so, why?
nb=len(data['innings'][0]['1st innings']['deliveries'])
print('The Number Of Deliveries Delivered In The First Inning:',nb)
# Yes. A 20-over innings has 120 legal balls; the count is higher because the
# bowling team conceded 4 extra deliveries in the form of wides or no-balls.

The Number Of Deliveries Delivered In The First Inning: 124
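The gap between the delivery count and 120 can be checked by counting the deliveries that had to be bowled again. The sketch below runs on made-up deliveries in the same shape as the file, not on the match itself. The `wides` key appears in the printed output above; `noballs` is an assumed key name, and `is_rebowled` is a hypothetical helper.

```python
# Made-up deliveries in the same shape as the file; not taken from the match.
deliveries = [
    {0.1: {"runs": {"batsman": 1, "extras": 0, "total": 1}}},
    {0.2: {"extras": {"wides": 1}, "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {0.3: {"runs": {"batsman": 4, "extras": 0, "total": 4}}},
    {0.4: {"extras": {"legbyes": 1}, "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {0.5: {"extras": {"noballs": 1}, "runs": {"batsman": 0, "extras": 1, "total": 1}}},
]

# Extras that do not count as a legal ball. "noballs" is an assumed key name.
REBOWLED = {"wides", "noballs"}


def is_rebowled(delivery):
    """True if the delivery was a wide or no-ball and had to be bowled again."""
    details = next(iter(delivery.values()))
    return any(kind in REBOWLED for kind in details.get("extras", {}))


extra_balls = sum(is_rebowled(d) for d in deliveries)
legal_balls = len(deliveries) - extra_balls
print(len(deliveries), legal_balls, extra_balls)  # 5 3 2
```

Leg byes add to the extras total but still count as a legal ball, which is why only two of the three deliveries with extras are excluded.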


### How many deliveries were bowled in the second innings?

In [None]:
# Is this fewer or more than 120? What is the reasoning behind it?
nb2=len(data['innings'][1]['2nd innings']['deliveries'])
print("The Number Of Deliveries Bowled In The Second Inning",nb2)
# This is fewer than 120 because RCB lost all of their wickets after 101
# deliveries, which ended the innings early.

The Number Of Deliveries Bowled In The Second Inning 101
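Whether an innings ended early because the batting side was bowled out can be checked by counting dismissals. The sketch below is an assumption-heavy illustration: no dismissal appears in the truncated output above, so the `wicket` key and its fields are assumed, the deliveries and player names are made up, and `count_wickets` is a hypothetical helper.

```python
# Made-up deliveries. The "wicket" key and its fields are assumptions about
# the file layout; no dismissal is visible in the truncated output above.
deliveries = [
    {0.1: {"batsman": "Player A", "runs": {"total": 0},
           "wicket": {"kind": "bowled", "player_out": "Player A"}}},
    {0.2: {"batsman": "Player B", "runs": {"total": 1}}},
    {0.3: {"batsman": "Player C", "runs": {"total": 0},
           "wicket": {"kind": "caught", "player_out": "Player C"}}},
]


def count_wickets(deliveries):
    """Count the deliveries that carry a (assumed) 'wicket' entry."""
    return sum("wicket" in next(iter(d.values())) for d in deliveries)


wickets = count_wickets(deliveries)
all_out = wickets >= 10  # ten dismissals end a side's innings
print(wickets, all_out)  # 2 False
```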


### Which team won, and how?


In [None]:
# Check whether your own guess about the result was correct
winner=data['info']['outcome']['winner']
wonby=data['info']['outcome']['by']['runs']
print('{} won by {} runs'.format(winner,wonby))
# KKR won because they bowled RCB out before RCB could reach the target KKR had set

Kolkata Knight Riders won by 140 runs
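When the side batting first wins, the margin in runs is the difference between the two innings totals, so the value stored under `outcome` can be cross-checked from the deliveries. The sketch below shows the idea on made-up, heavily shortened innings; the team names and scores are placeholders and `innings_total` is a hypothetical helper.

```python
# Made-up, heavily shortened innings in the same shape as the file.
innings = [
    {"1st innings": {"team": "Team A", "deliveries": [
        {0.1: {"runs": {"batsman": 4, "extras": 0, "total": 4}}},
        {0.2: {"runs": {"batsman": 0, "extras": 1, "total": 1}}},
        {0.3: {"runs": {"batsman": 6, "extras": 0, "total": 6}}},
    ]}},
    {"2nd innings": {"team": "Team B", "deliveries": [
        {0.1: {"runs": {"batsman": 1, "extras": 0, "total": 1}}},
        {0.2: {"runs": {"batsman": 2, "extras": 0, "total": 2}}},
    ]}},
]


def innings_total(innings_entry):
    """Return (team, runs) for a one-key innings dict by summing runs.total."""
    details = next(iter(innings_entry.values()))
    runs = sum(next(iter(d.values()))["runs"]["total"] for d in details["deliveries"])
    return details["team"], runs


(first_team, first_runs), (second_team, second_runs) = map(innings_total, innings)
margin = first_runs - second_runs
print(first_team, first_runs, second_team, second_runs, margin)  # Team A 11 Team B 3 8
```

Applied to the real file, the same function could be called as `innings_total(data['innings'][0])` and `innings_total(data['innings'][1])` and the difference compared with `data['info']['outcome']['by']['runs']`.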
