mhamel12/RetrosheetUmpires

RetrosheetUmpires

Extract umpire information from Retrosheet Event and Box Score Event files

Tested with Python 3.10.8 on Windows.

These files are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license: https://creativecommons.org/licenses/by-nc/4.0/

References:

https://www.retrosheet.org/game.htm (to download Event Files and Box Score Event Files, ballparks.zip, and teams.zip)

https://www.retrosheet.org/eventfile.htm (explanation of the Event File format)

https://www.retrosheet.org/biofile.htm (to download biofile.zip)
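As a rough illustration of what "extracting umpire information" involves, the sketch below reads the comma-separated `id` and `info` records described in the Event File format explanation and collects the umpire fields for each game. It is not the code in umpires.py. The function name `extract_umpires`, the sample game ID, and the sample umpire IDs are all hypothetical, and only the four base-umpire fields are handled.

```python
import csv

# Umpire field names that appear on "info" records in Event Files.
# Only the home plate and base umpires are covered in this sketch.
UMPIRE_FIELDS = {"umphome", "ump1b", "ump2b", "ump3b"}


def extract_umpires(lines):
    """Return {game_id: {field: umpire_id}} from an iterable of text lines.

    Hypothetical helper for illustration; not taken from umpires.py.
    """
    games = {}
    current = None
    for row in csv.reader(lines):
        if not row:
            continue
        if row[0] == "id" and len(row) >= 2:
            # An "id" record starts a new game.
            current = games.setdefault(row[1], {})
        elif (
            row[0] == "info"
            and len(row) >= 3
            and row[1] in UMPIRE_FIELDS
            and current is not None
        ):
            current[row[1]] = row[2]
    return games


# Made-up sample records; real files contain many more record types.
sample = [
    "id,GAME0001",
    "info,visteam,AAA",
    "info,umphome,ump0001",
    "info,ump1b,ump0002",
]
print(extract_umpires(sample))
```

The umpire IDs collected this way would still need to be matched against the ID files from biofile.zip to get names.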

Requirements:

  1. Download ballparks.zip, biofile.zip, and teams.zip from retrosheet.org; unzip all of these into a subfolder named "ids".

  2. Download and unzip one or more Event Files (.evx) into a subfolder named "evx".

    AND/OR

  3. Download and unzip one or more Box Score Event Files (.ebx) into a subfolder named "ebx".

At least one .evx or .ebx file is required.
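The folder layout above can be checked before running anything. The following is a minimal sketch, assuming the three subfolders sit under a single root folder. The function name `check_layout` is hypothetical and not part of umpires.py. The sketch counts every regular file in "evx" and "ebx" rather than assuming a particular file extension.

```python
from pathlib import Path


def count_files(folder):
    """Count regular files directly inside `folder` (0 if it does not exist)."""
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.is_file())


def check_layout(root):
    """Return a list of problems with the expected folder layout under `root`.

    Hypothetical helper for illustration; an empty list means the layout
    looks usable.
    """
    root = Path(root)
    problems = []
    if count_files(root / "ids") == 0:
        problems.append('the "ids" subfolder is missing or empty')
    if count_files(root / "evx") + count_files(root / "ebx") == 0:
        problems.append('no files found in "evx" or "ebx"; at least one is required')
    return problems


if __name__ == "__main__":
    for problem in check_layout("."):
        print("Problem:", problem)
```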

Note that some games are missing from the Event Files because too little information is available. All games prior to 1950 are included in the Box Score Event Files, even when the game also appears in an Event File. This script assumes that the information in the Event Files is more complete and accurate, and uses it as the primary data source for each game.
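The "Event File first" rule can be pictured as a merge of two per-game mappings in which an Event File entry replaces the Box Score entry for the same game. This is a sketch of the idea only. The function name `merge_games`, the game IDs, and the umpire IDs are hypothetical, and umpires.py may implement the rule differently.

```python
def merge_games(event_games, box_games):
    """Combine two {game_id: umpires} mappings.

    When a game appears in both, the Event File entry wins outright;
    Box Score entries are used only for games the Event Files lack.
    """
    merged = dict(box_games)      # start from the Box Score data
    merged.update(event_games)    # Event File data overrides per game
    return merged


# Made-up data: G1 is in both sources, G2 only in the Box Score files.
event_games = {"G1": {"umphome": "ump0001", "ump1b": "ump0002"}}
box_games = {
    "G1": {"umphome": "ump0001"},
    "G2": {"umphome": "ump0003"},
}

merged = merge_games(event_games, box_games)
print(sorted(merged))  # both games are present after the merge
```

Replacing the whole game entry, rather than merging field by field, matches the statement that the Event File is the primary source "for each game".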

Example output files for the regular season from 1900-1979 are included, based on data files downloaded from Retrosheet on December 20, 2023. Retrosheet released these data files as part of its "Fall 2023 Release" on December 6, 2023: https://www.retrosheet.org/fall2023release.html

Note that this release remapped the "CLE" team abbreviation in teams.csv from Cleveland Indians to Cleveland Guardians. I did not update umpires.py to compensate for this, so the Guardians name now appears in the generated .csv files for all years from 1901 to the present. Similarly, the Boston Red Sox name is used for the early 1900s. I may fix these problems in a future release.

Additional examples for All-Star Games and Postseason games are also included. These include Negro League games.

  • The .csv files were generated by umpires.py
  • The .xlsx files were manually generated using Microsoft Excel, using the corresponding .csv files as a starting point.

Copyright notice for the Retrosheet data used to generate the .csv files:

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at 20 Sunset Rd., Newark, DE 19711.
