Almost complete data dumps from the Boston Marathon, 2001-2014

Boston Marathon Raw Data

This repository contains, as close as I can manage, all of the data on the Boston Marathon available from baa.org. It also contains an IPython notebook (analysis.ipynb) for exploring that data.

The Data

Look in the results/{year}/results.csv files for the data. Do something interesting with it, and make sure you tell me about it!
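
As a starting point, here is a minimal sketch for reading one year's file with the standard library. It assumes each results.csv begins with a header row and is UTF-8 encoded; neither is confirmed here, so check a file before relying on it. The load_results helper is a hypothetical name, not something in this repository.

```python
import csv
from pathlib import Path


def load_results(year, root="results"):
    """Return the rows of {root}/{year}/results.csv as a list of dicts.

    Assumption: the first line of the file is a header row, so each
    row comes back as a dict keyed by column name.
    """
    path = Path(root) / str(year) / "results.csv"
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Calling load_results(2014) from the repository root would give one dict per finisher, with every value still a string.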

Format

There are (unfortunately) two different data formats. The 2013 and 2014 files have more detailed timing data, with splits at 10k, 20k, half, 25k, 30k, 35k, and 40k.

Pre-2013, the data has only the finishing time, but adds the person's standing in their division, gender, and overall.
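
To see exactly how the two formats differ, one option is to compare the header rows of a file from each era. This is a rough sketch that assumes both files start with a header row; read_header and compare_headers are hypothetical helpers, and no column names are assumed.

```python
import csv


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def compare_headers(old_path, new_path):
    """Report which columns appear in only one of the two files."""
    old, new = read_header(old_path), read_header(new_path)
    return {
        "only_old": [c for c in old if c not in new],
        "only_new": [c for c in new if c not in old],
    }
```

For example, compare_headers("results/2012/results.csv", "results/2014/results.csv") should list the standings columns under "only_old" and the split columns under "only_new", if the files match the description above.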

Caveats

  • The data includes wheelchair racers but not hand cyclists or other special groups... if you're interested in that data please submit a pull request!
  • The data does not include runners who did not finish. There's nothing I can do about that; as far as I can tell, that data is unavailable from baa.org.
  • The data is certainly missing a few people, but it ought to contain the large majority of runners who finished from each year.
  • The code is ugly. This is just about grinding the results out!

Visualizations

License

MIT License. Use it however you want, and don't feel obligated to give me credit. It's the BAA's data anyway. (Thanks for organizing, BAA.)

Downloading The Data

I... already did that for you. Why do you want to do that?

Anyway, if you do, you'll want to run python multidl.py {year}
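
If you want every year, a small wrapper can run that command once per year. This is only a sketch: it assumes multidl.py is in the current directory and takes a single year as its only argument, as in the command above. download_command and download_years are hypothetical names.

```python
import subprocess
import sys


def download_command(year):
    """Build the argument list for `python multidl.py {year}`."""
    return [sys.executable, "multidl.py", str(year)]


def download_years(years, run=subprocess.run):
    """Run the downloader once per year, stopping on the first failure.

    `run` is injectable so the loop can be exercised without
    actually hitting the network.
    """
    for year in years:
        run(download_command(year), check=True)
```

Calling download_years(range(2001, 2015)) would cover the 2001-2014 range in this repository, one year at a time.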

Viewing the notebook

  1. Install the prerequisites: pip install -r requirements.txt
  2. Start the notebook: make notebook
  3. Play!