Paris-Roubaix Data Sets
These data files were gathered and compiled in several steps:
First, all of the relevant HTML pages are downloaded once from their respective sources, so that later steps work from local copies and avoid overwhelming servers with requests.
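The download-once idea can be sketched as a small caching helper. This is a minimal sketch, not the project's actual downloader: `loadPage`, `cacheFileFor`, the cache directory layout, and the injected `download` function are all hypothetical names, and the default downloader assumes Node 18 or later, where `fetch` is a global.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: map a URL to a safe local file name.
function cacheFileFor(url, cacheDir) {
  return path.join(cacheDir, encodeURIComponent(url) + '.html');
}

// Return the page's HTML, touching the network only on the first call.
// `download` is injected so the caching logic can be exercised offline;
// the default assumes Node 18+, where fetch is available globally.
async function loadPage(
  url,
  cacheDir,
  download = async (u) => (await fetch(u)).text()
) {
  const file = cacheFileFor(url, cacheDir);
  if (fs.existsSync(file)) {
    // Already saved locally: no request is made.
    return fs.readFileSync(file, 'utf8');
  }
  const html = await download(url);
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(file, html, 'utf8');
  return html;
}
```

With a helper like this, the scrapers can be rerun as often as needed while each source page is requested at most once.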
The HTML files are scraped by
paris-roubaix-scraper-pcs, which output
These two files are combined by race-data-mash.js, which performs name lookups to improve formatting, matches countries to racers when available, calculates speed, and compiles the results into parisRoubaix-fullv2.json. In a few cases, speed and time are estimated from rank because the records are incomplete; those entries are marked with the attribute est.
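The speed calculation and the est marker can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the code in race-data-mash.js: the field names `timeSeconds` and `speedKmh`, the function names, and the value `true` for `est` are all hypothetical. The estimation rule is also a stand-in, since the method used for the real data is not described here: a rider with no recorded time simply borrows the time of the nearest better-ranked rider who has one.

```javascript
// Average speed in km/h from distance (km) and elapsed time (seconds).
function speedKmh(distanceKm, timeSeconds) {
  return distanceKm / (timeSeconds / 3600);
}

// Add a speed to each finisher of one race. `results` is assumed to be
// sorted by rank. When a time is missing, the most recent known time from
// a better-ranked rider is reused (an assumed stand-in for the real
// rank-based estimate) and the entry is flagged with est.
function addSpeeds(results, distanceKm) {
  let lastKnownTime = null;
  return results.map((r) => {
    if (r.timeSeconds != null) {
      lastKnownTime = r.timeSeconds;
      return { ...r, speedKmh: speedKmh(distanceKm, r.timeSeconds) };
    }
    if (lastKnownTime == null) {
      return { ...r }; // nothing to estimate from, leave the entry as-is
    }
    return {
      ...r,
      timeSeconds: lastKnownTime,
      speedKmh: speedKmh(distanceKm, lastKnownTime),
      est: true,
    };
  });
}
```

Keeping the flag on the record itself lets anyone using the data filter out or highlight estimated values.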
byRider.json is an associative array of every racer known to have participated in Paris-Roubaix, listing the years they participated, their rank, and the points they would have earned under the current UCI ranking system (http://www.uci.ch/mm/Document/News/Rulesandregulation/17/73/59/2-ROA-20161108-E_English.PDF, page 60), tracked individually and cumulatively. The top 10% of riders under this scoring system are output to a separate file.
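A rider-keyed structure with per-race and cumulative points, plus the top 10% cut, might be built along these lines. This is a hedged sketch rather than the project's code: the input row shape, the `results`, `total`, and `cumulative` fields, and the function names are hypothetical, and `POINTS_BY_RANK` holds placeholder values only, not the points from the UCI document.

```javascript
// Placeholder points per rank, for illustration only. The real values
// come from the UCI regulations cited in the text.
const POINTS_BY_RANK = { 1: 10, 2: 6, 3: 4 };

// Build a rider-keyed object from flat { year, rider, rank } rows.
function buildByRider(rows) {
  const byRider = {};
  // Process races in chronological order so the running total is meaningful.
  const sorted = [...rows].sort((a, b) => a.year - b.year);
  for (const { year, rider, rank } of sorted) {
    const entry = byRider[rider] || (byRider[rider] = { results: [], total: 0 });
    const points = POINTS_BY_RANK[rank] || 0;
    entry.total += points;
    entry.results.push({ year, rank, points, cumulative: entry.total });
  }
  return byRider;
}

// Names of the top `percent` percent of riders by total points, best first.
// Integer arithmetic avoids floating-point surprises in the cut-off.
function topRiders(byRider, percent = 10) {
  const names = Object.keys(byRider).sort(
    (a, b) => byRider[b].total - byRider[a].total
  );
  return names.slice(0, Math.ceil((names.length * percent) / 100));
}
```

Storing the running total next to each result makes it straightforward to chart a rider's career points over time without recomputing them.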
None of these data sets would exist without the incredible and complete resources on which they are based: ProCyclingStats.com, BikeRaceInfo.com, and www.letour.com/paris-roubaix/2016/us/.