IPL data analysis with PySpark, using an IPL dataset covering 2008 to 2019.

puneetpushkar/IPL-Data-Analysis-with-PySpark

IPL-Data-Analysis-with-PySpark

Questions that drove the whole analysis

A. Find the top 3 venues that hosted the most eliminator matches.
B. Return the most catches taken by a player in IPL history.
C. Write a query to return a report for the highest wicket-taker in matches affected by the Duckworth-Lewis (D/L) method.
D. Write a query to return a report for the highest strike rate by a batsman in non-powerplay overs (overs 7-20).
E. Write a query to return a report for the highest extra runs at a venue (stadium, city).
F. Write a query to return a report for the cricketers with the most Player of the Match awards at neutral venues.
G. Write a query to get a list of the top 10 players with the highest batting average. Note: batting average is the total number of runs scored divided by the number of times the player has been out (make sure to count run-outs at the non-striker's end as valid dismissals when calculating the average).
H. Write a query to find out who has officiated (as an umpire) the most matches in IPL.
I. Find the venue details of the match in which V Kohli made his highest individual score in IPL.
J. How can winning or losing the toss impact a match and its result?
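
The batting-average rule in question G is easy to get wrong, because a batsman run out at the non-striker's end is dismissed on a ball he did not face. The sketch below illustrates only the arithmetic, in plain Python rather than PySpark, on a handful of made-up deliveries. The field names (`batsman`, `batsman_runs`, `player_dismissed`) and the helper names are assumptions for illustration and are not taken from the repository or its dataset. The key point is that runs are credited by striker, while dismissals are counted by the dismissed player, whoever was on strike.

```python
from collections import defaultdict

# Hypothetical ball-by-ball records (made-up data, assumed field names).
deliveries = [
    {"batsman": "B", "batsman_runs": 2, "player_dismissed": None},
    {"batsman": "A", "batsman_runs": 4, "player_dismissed": None},
    # A is on strike, but B is run out at the non-striker's end.
    {"batsman": "A", "batsman_runs": 1, "player_dismissed": "B"},
    {"batsman": "C", "batsman_runs": 6, "player_dismissed": None},
    {"batsman": "C", "batsman_runs": 0, "player_dismissed": "C"},
    {"batsman": "A", "batsman_runs": 3, "player_dismissed": None},
]


def batting_averages(rows):
    """Return {player: runs / dismissals} for players dismissed at least once."""
    runs = defaultdict(int)
    outs = defaultdict(int)
    for row in rows:
        # Runs go to the striker.
        runs[row["batsman"]] += row["batsman_runs"]
        # Dismissals go to whoever was out, striker or non-striker.
        if row["player_dismissed"] is not None:
            outs[row["player_dismissed"]] += 1
    # Players who were never out have no defined average and are left out.
    return {player: runs[player] / n for player, n in outs.items()}


def top_n(averages, n=10):
    """Return the n highest averages as (player, average) pairs."""
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:n]


averages = batting_averages(deliveries)
print(top_n(averages))
```

In this toy data, player A is never dismissed and so has no average. Whether such players should be excluded or ranked some other way is a design choice the question leaves open.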
