Point-by-point data for Grand Slams, 2011-current
Type Name Latest commit message Commit time
.gitattributes 🎉 Added .gitattributes & .gitignore files Mar 8, 2015
.gitignore 🎉 Added .gitattributes & .gitignore files Mar 8, 2015
2011-ausopen-matches.csv
2011-ausopen-points.csv Initial data load Mar 10, 2015
2011-frenchopen-matches.csv
2011-frenchopen-points.csv
2011-usopen-matches.csv
2011-usopen-points.csv
2011-wimbledon-matches.csv Initial data load Mar 10, 2015
2011-wimbledon-points.csv
2012-ausopen-matches.csv
2012-ausopen-points.csv Initial data load Mar 10, 2015
2012-frenchopen-matches.csv
2012-frenchopen-points.csv
2012-usopen-matches.csv Initial data load Mar 10, 2015
2012-usopen-points.csv
2012-wimbledon-matches.csv Initial data load Mar 10, 2015
2012-wimbledon-points.csv Initial data load Mar 10, 2015
2013-ausopen-matches.csv
2013-ausopen-points.csv Initial data load Mar 10, 2015
2013-frenchopen-matches.csv
2013-frenchopen-points.csv Initial data load Mar 10, 2015
2013-usopen-matches.csv Initial data load Mar 10, 2015
2013-usopen-points.csv
2013-wimbledon-matches.csv Initial data load Mar 10, 2015
2013-wimbledon-points.csv
2014-ausopen-matches.csv
2014-ausopen-points.csv Initial data load Mar 10, 2015
2014-frenchopen-matches.csv
2014-frenchopen-points.csv Initial data load Mar 10, 2015
2014-usopen-matches.csv
2014-usopen-points.csv Initial data load Mar 10, 2015
2014-wimbledon-matches.csv Initial data load Mar 10, 2015
2014-wimbledon-points.csv
2015-ausopen-matches.csv
2015-ausopen-points.csv
2015-frenchopen-matches.csv
2015-frenchopen-points.csv 2015 slams Jan 11, 2016
2015-usopen-matches.csv 2015 slams Jan 11, 2016
2015-usopen-points.csv 2015 slams Jan 11, 2016
2015-wimbledon-matches.csv 2015 slams Jan 11, 2016
2015-wimbledon-points.csv 2015 slams Jan 11, 2016
2016-ausopen-matches.csv
2016-ausopen-points.csv 2016 AO Feb 2, 2016
2016-frenchopen-matches.csv
2016-frenchopen-points.csv
2016-usopen-matches.csv
2016-usopen-points.csv 2016 USO + 2017 AO Feb 8, 2017
2016-wimbledon-matches.csv
2016-wimbledon-points.csv
2017-ausopen-matches.csv
2017-ausopen-points.csv
2017-frenchopen-matches.csv
2017-frenchopen-points.csv 2017 RG + bugfix Jun 13, 2017
2017-usopen-matches.csv
2017-usopen-points.csv
2017-wimbledon-matches.csv 2017 Wimb + USO Sep 14, 2017
2017-wimbledon-points.csv 2017 Wimb + USO Sep 14, 2017
2018-usopen-matches.csv '18 us open Sep 10, 2018
2018-usopen-points.csv
2018-wimbledon-matches.csv 2018 wimbledon Aug 3, 2018
2018-wimbledon-points.csv
README.md note re: 2018 AO and RG Aug 3, 2018
data_dictionary.txt


Grand Slam Point-by-Point Data, 2011-18

This repo contains point-by-point data for most[1] main-draw singles Grand Slam matches since 2011. It was scraped from the four Grand Slam websites shortly after each event.

There are two files for each tournament: "-matches.csv" contains metadata for all the matches included from the tournament, and "-points.csv" contains all the available data for each point.
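
As a rough illustration of how the two files fit together, here is a minimal Python sketch that builds both file names for one tournament and groups the point rows under their match. The helper names (`slam_paths`, `read_rows`, `points_by_match`) are made up for this example, and the shared key column `match_id` and the UTF-8 encoding are assumptions; check the header rows and data_dictionary.txt before relying on them.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Slam names as they appear in this repo's file names.
SLAMS = ("ausopen", "frenchopen", "wimbledon", "usopen")


def slam_paths(year, slam, data_dir="."):
    """Return (matches_path, points_path) for one tournament."""
    if slam not in SLAMS:
        raise ValueError(f"unknown slam: {slam!r}")
    base = Path(data_dir)
    return (base / f"{year}-{slam}-matches.csv",
            base / f"{year}-{slam}-points.csv")


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    # ASSUMPTION: the files decode as UTF-8; adjust if a file does not.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def points_by_match(year, slam, data_dir=".", key="match_id"):
    """Map each match key to (match_row, list_of_point_rows).

    ASSUMPTION: both files share a column named `key`. Pass a different
    `key` if the header rows use another name.
    """
    matches_path, points_path = slam_paths(year, slam, data_dir)
    grouped = defaultdict(list)
    for row in read_rows(points_path):
        grouped[row[key]].append(row)
    return {m[key]: (m, grouped.get(m[key], []))
            for m in read_rows(matches_path)}
```

For example, `points_by_match(2011, "wimbledon")` would read the two 2011 Wimbledon files from the current directory. Matches with no point rows come back with an empty list rather than being dropped.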

Unfortunately, much of the most useful data isn't available for every tournament. (For instance, there is no first/second serve indicator for many events, and rally length isn't included after the first few.) Much of the metadata isn't available for the last few years of tournaments, and some point-level data (such as winner type) isn't represented the same way throughout the whole dataset.
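
Because coverage varies so much by event, it can help to check which columns are actually populated before analyzing a given file. The sketch below is one way to do that, not part of the repo: `column_coverage` is a hypothetical helper, and treating empty strings and "NA" as missing is an assumption (a file may mark missing values differently, for example with 0).

```python
import csv


def column_coverage(rows, missing=("", "NA")):
    """Return {column: fraction of rows with a non-missing value}.

    `rows` is any iterable of dicts, e.g. from csv.DictReader.
    ASSUMPTION: values listed in `missing` mean "no data".
    """
    rows = list(rows)
    if not rows:
        return {}
    counts = {}
    for row in rows:
        for col, val in row.items():
            if col is None:
                # DictReader stores surplus fields under the key None.
                continue
            filled = val is not None and val.strip() not in missing
            counts[col] = counts.get(col, 0) + (1 if filled else 0)
    return {col: n / len(rows) for col, n in counts.items()}


def file_coverage(path):
    """Compute column coverage for one CSV file on disk."""
    # ASSUMPTION: the file decodes as UTF-8.
    with open(path, newline="", encoding="utf-8") as f:
        return column_coverage(csv.DictReader(f))
```

Calling `file_coverage("2014-usopen-points.csv")` returns a dict with one entry per column; a value near 0.0 suggests that field was not recorded for that event.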

[Update, Feb 2017: Rally length came back with the 2016 French Open. Also new in 2016 was the gradual introduction of distance-run stats.]

Still, there's a lot that can be done with this[2], especially since point-by-point tennis data is not readily available.

I'll try to keep this updated after each tournament, but I can't make any promises as to punctuality.

Note: This data is not available for the 2018 Australian Open or 2018 French Open. Some similar data is available for the 2018 AO, and at some point I may assemble that into a format as close to the other majors as possible.

License

Tennis databases, files, and algorithms by Jeff Sackmann / Tennis Abstract is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/JeffSackmann.

In other words: Attribution is required. Non-commercial use only.


[1] In general, this data is available for matches on courts with the Hawkeye system installed. The vast majority of missing matches are first-rounders.

[2] For instance:
http://heavytopspin.com/2011/09/16/win-probability-graphs-and-stats/
http://heavytopspin.com/2011/08/07/do-points-get-shorter-as-the-match-progresses/
http://heavytopspin.com/2011/06/06/fun-with-french-open-rally-length/