A simple notation and stat-tracking system for tennis.
This script reads a point-by-point input file from a set or match and outputs the following for each player:
- winners
- unforced errors
- double faults
- aces
- first serve percentage
- percentage of points won after good first serve
More stats to come. The data file syntax is most quickly explained with an annotated example:
    DATE: 2009-07-08     # date of the match (may be more than one per file)
    PLAYER: R, Roddick   # one-letter abbreviation (unique for match) and full player name
    PLAYER: G, Gasquet
    SET: 1               # beginning of a set (colon and number are optional)
    GAME: R              # beginning of a game and abbreviation of player serving (here: Roddick)
    1R                   # point #1: 1 fault, Roddick wins point
    1R                   # point #2: 1 fault, Roddick wins point
    0RA                  # point #3: no faults, Roddick wins point on an ace ("A")
    2G                   # point #4: 2 faults, Gasquet wins point
    0G
    0GU                  # point #6: no faults, Gasquet wins point on unforced error ("U")
    2G
    1GU
    GAME: G              # beginning of new game: Gasquet serving
    0RW                  # point #1: no faults, Roddick wins point on a winner ("W")
    0RU
    0G
    0GW
    0G
    0GU
    ...
The unlabeled lines of two or three characters are the points. The first character is 0, 1, or 2, the number of service faults on that point. The second character is the abbreviation of the player who won the point. The optional third character describes how the point was won:
    A : ace
    U : unforced error
    W : winner
If the third character is omitted, the point is recorded as a “forced error”, which is how most points are decided. Enter a tiebreaker just like a regular game; the server is the player who serves first.
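As a rough illustration of the notation above, here is a minimal sketch of how a point line could be parsed and how the serve statistics could be tallied for one server. The names `Point`, `parse_point`, `serve_stats`, and `percent` are hypothetical and are not taken from `report.rb`. The sketch also assumes that player abbreviations are single ASCII letters and that the stats are computed over one player's service points only.

```ruby
# Hypothetical helpers for illustration; not the code in report.rb.
Point = Struct.new(:faults, :winner, :how)

# faults (0-2), winner abbreviation (assumed: one ASCII letter), optional A/U/W
POINT_RE = /\A([012])([A-Za-z])([AUW])?\z/
HOW = { "A" => :ace, "U" => :unforced_error, "W" => :winner }.freeze

# Returns a Point, or nil if the line is not a point (e.g. "GAME: R").
def parse_point(line)
  code = line.sub(/#.*/, "").strip            # drop trailing comment and whitespace
  match = POINT_RE.match(code)
  return nil unless match
  # A missing third character falls back to the default, a forced error.
  Point.new(match[1].to_i, match[2], HOW.fetch(match[3], :forced_error))
end

def percent(part, whole)
  whole.zero? ? 0.0 : (100.0 * part / whole).round(1)
end

# points: all points of the games served by `server`.
def serve_stats(server, points)
  first_in = points.select { |point| point.faults.zero? }
  won_on_first = first_in.count { |point| point.winner == server }
  {
    first_serve_pct: percent(first_in.size, points.size),
    first_serve_won_pct: percent(won_on_first, first_in.size),
    aces: points.count { |point| point.how == :ace && point.winner == server },
    double_faults: points.count { |point| point.faults == 2 }
  }
end

# Roddick's service game from the annotated example.
game = %w[1R 1R 0RA 2G 0G 0GU 2G 1GU].map { |code| parse_point(code) }
stats = serve_stats("R", game)
puts "first serve: #{stats[:first_serve_pct]}%"   # prints "first serve: 37.5%"
```

In this game, three of Roddick's eight service points have no faults (37.5%), and he wins one of those three (33.3%). Winners and unforced errors are left out of the sketch because crediting them to a player depends on how the report attributes the third character.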
This is a stand-alone script with no formal installation procedure. Just clone the repository and run:
    ruby report.rb [datafile]
The unit tests use Ruby’s included Test::Unit and should be run like so:
    rake test
To do:

- handle tiebreakers properly
- score checking
  - does opponent win point on double fault?
  - do game final scores make sense?
- report output should use column headers (don’t repeat player names)
- support for multiple matches/files
- syntax checking (error messages)
- more stats
- command-line real-time entry mode (generates data file)
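
The two score-checking questions in the list could be approached along the following lines. This is only a sketch under stated assumptions, not the script's implementation: `double_fault_errors` and `game_score_errors` are hypothetical names, each game is assumed to involve exactly two players, and a regular game is assumed to be won at four points with a two-point margin.

```ruby
# Hypothetical checks for illustration; not part of report.rb.

# A point with two faults is a double fault, so the server cannot win it.
def double_fault_errors(server, codes)
  errors = []
  codes.each_with_index do |code, index|
    faults = code[0].to_i
    winner = code[1]
    next unless faults == 2 && winner == server
    errors << "point ##{index + 1} (#{code}): two faults, but server #{server} is credited with the point"
  end
  errors
end

# A game should end exactly when one player has at least points_to_win
# points and leads by two (assumption: 4 for a regular game).
def game_score_errors(codes, points_to_win = 4)
  tally = Hash.new(0)
  codes.each_with_index do |code, index|
    tally[code[1]] += 1
    lead = tally.values.max
    trail = tally.size > 1 ? tally.values.min : 0
    next unless lead >= points_to_win && lead - trail >= 2
    return [] if index == codes.size - 1
    return ["game is already decided at point ##{index + 1}, but more points follow"]
  end
  ["game ends without a winner"]
end

roddick_game = %w[1R 1R 0RA 2G 0G 0GU 2G 1GU]
puts double_fault_errors("R", roddick_game).empty?   # prints "true"
puts game_score_errors(roddick_game).empty?          # prints "true"
```

Passing 7 as `points_to_win` would approximate a standard tiebreak, although a tiebreak also alternates servers, which this sketch does not model.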