Yrzxiong/data-analysis-for-2016-baseball-games

data-analysis-for-baseball-games

This is pretty much my first project. I learned a lot while doing it, and I hope to find better ways to finish it.

Target:

  1. Load in the appropriate csv file as a pandas dataframe (batting.csv)
  2. Print out the dimensions and info about the dataframe you just created
  3. How many players have hit 40 or more HRs in one single season? (Number only)
  4. How many players have hit more than 600 HRs for their career? (Dataframe)
  5. How many players have hit at least 40 2Bs, 10 3Bs, 200 hits, and 30 HRs (all thresholds inclusive) in one season? (Number only)
  6. How many players have had 100 or more SBs in a season? (Dataframe)
  7. How many players in the 1960s have hit more than 200 HRs? (Dataframe)
  8. Who has hit the most HRs in history? (Dataframe)
  9. Who had the most hits in the 1970s? (Dataframe)
  10. Top 5 highest OBP (on base percentage) with at least 500 PAs in 1977? (Dataframe)
  11. Top 8 highest averages in 2013 with at least 300 PAs? (Dataframe)
  12. Leaders in hits from 1940 up to and including 1949. (Dataframe)
  13. Who led MLB with the most hits the most times? And how many times? (Dataframe, Number)
  14. Which players have played the most games for their careers? Top 5, descending by games played presented as a dataframe
  15. How many players have had more than 3000 hits for their careers while also hitting 500 or more HRs? Just a number is okay here
  16. How many HRs were hit during the entire 1988 season? Just a number is okay here
  17. Please filter out and show me the top 3 average seasons by Wade Boggs during his career in seasons in which he had at least 500 ABs. I would like a dataframe sorted by average.
  18. Please filter out the top OBPs for the 1995 season with at least 400 PAs, sorted by OBP. I would like a dataframe for this
  19. Who had the most 3Bs (in total) in 1922, 1925, 1926, and 1928? I would like a dataframe with just the leader
  20. How many players have hit 30 or more HRs in a season while also stealing 30 or more bases (SB)? A number is okay here
  21. Who had the highest OBP in 1986 with at least 400 PAs? (Dataframe)
  22. Same question but for 1997 and only in the NL (check league ID)? (Dataframe)
  23. Who had more than the league average HRs in 2012 (filter out all players with fewer than 500 PAs)? (Dataframe)
  24. Who is the youngest player to hit 50 or more HRs in a single season? (Dataframe)
  25. Who are the five youngest players to hit 300 or more HRs for their career? (Dataframe)

BONUS:

  1. Graph total HRs per season using a bar graph.
  2. Using a line graph, graph the average HRs per AB (think about this) per season.
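Most of the questions above reduce to a handful of pandas patterns. The sketch below is not the repository's solution; it illustrates those patterns on a tiny made-up table so that it runs without the real file. The column names (`playerID`, `yearID`, `AB`, `H`, `HR`, `BB`, `HBP`, `SF`) are assumed to match the Lahman batting table, and the thresholds are scaled down to fit the toy data.

```python
import pandas as pd

# Toy rows standing in for batting.csv; all values are made up for illustration.
# Column names are assumed to match the Lahman batting table.
batting = pd.DataFrame(
    {
        "playerID": ["a01", "a01", "b02", "b02", "c03"],
        "yearID": [2000, 2001, 2000, 2001, 2001],
        "AB": [500, 550, 400, 450, 300],
        "H": [150, 180, 100, 120, 90],
        "HR": [45, 38, 10, 41, 5],
        "BB": [60, 70, 30, 40, 20],
        "HBP": [5, 3, 2, 4, 1],
        "SF": [5, 7, 3, 6, 2],
    }
)

# Pattern 1 (single-season questions, e.g. task 3):
# filter season rows, then count distinct players.
forty_hr_players = batting.loc[batting["HR"] >= 40, "playerID"].nunique()

# Pattern 2 (career questions, e.g. task 4):
# sum each player's seasons first, then filter the totals.
career = batting.groupby("playerID", as_index=False)[["H", "HR"]].sum()
career_over_80_hr = career[career["HR"] > 80]  # toy threshold, not 600

# Pattern 3 (rate stats, e.g. task 10): derive the stat, then rank.
# OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
times_on_base = batting["H"] + batting["BB"] + batting["HBP"]
chances = batting["AB"] + batting["BB"] + batting["HBP"] + batting["SF"]
batting["OBP"] = times_on_base / chances
top_obp_2001 = batting[batting["yearID"] == 2001].nlargest(2, "OBP")

# Bonus (HRs per AB per season): divide season totals,
# rather than averaging each player's own HR/AB ratio.
season_totals = batting.groupby("yearID")[["HR", "AB"]].sum()
hr_per_ab = season_totals["HR"] / season_totals["AB"]

print(forty_hr_players)
print(career_over_80_hr)
print(top_obp_2001[["playerID", "OBP"]])
print(hr_per_ab)
```

Dividing season totals weights every at-bat equally, which is one reasonable reading of the "think about this" hint; averaging per-player ratios would let players with a handful of ABs distort the line. In the real file, older seasons may have missing `HBP` or `SF` values, so a `fillna(0)` on those columns before the OBP arithmetic may be needed.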

Resources:

Dataset:

The Lahman Baseball Database, 2016 version. We will use the batting.csv file.

You can find it online at http://www.seanlahman.com/baseball-archive/statistics/ (2016, comma-delimited version, updated February 26, 2017), or download it directly: http://seanlahman.com/files/database/baseballdatabank-2017.1.zip
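Once the file is available locally, loading and inspecting it (tasks 1 and 2) might look like the minimal sketch below. `load_batting` is a hypothetical helper, not something from this repository, and the in-memory sample with made-up rows only stands in for the real file so the snippet runs without a download.

```python
import io

import pandas as pd


def load_batting(source):
    """Read a batting table from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# Stand-in for the real file; the rows are made up and only a few of the
# real columns are shown.
sample_csv = io.StringIO(
    "playerID,yearID,teamID,lgID,AB,H,HR\n"
    "a01,2000,NYA,AL,500,150,45\n"
    "b02,2000,BOS,AL,400,100,10\n"
)
batting = load_batting(sample_csv)

print(batting.shape)  # dimensions as (rows, columns)
batting.info()        # column dtypes and non-null counts
```

With the real data, the call would be something like `load_batting("batting.csv")`, with the path adjusted to wherever the archive was extracted.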
