NBA_scraping-R

by Nicholas Archambault

For a class project, I was tasked with scraping NBA results data for the 2016 season from Goldsheet.com. The assignment was meant to demonstrate how tricky messy, real-world data can be, along with the formatting headaches and other idiosyncrasies that come with it.

This script builds on that initial project by automating the scraping of 24 NBA seasons, from 1993 through 2016, from Goldsheet.com. The main challenge in writing it was handling structural changes and omissions in the source data, which varied from year to year.

The final result is roughly 100 lines of code that correct for the anomalies and edge cases in the data and provide game and gambling results from over 59,000 NBA games. This trove of data can be used to analyze statistically the accuracy of odds and point totals and the value of home-court advantage.

Comments explaining each step of the process can be found within the script.