Skip to content
Scripts to download and analyze MLB game data
Python HTML
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
templates
test
.gitignore
LICENSE
README.md
app.py
config.py
db.py
gameloader.py
load.py
models.py
requirements.txt
utils.py

README.md

breakingball

Scripts to download and analyze MLB game data.

Setting Up the Database

Create the PostgreSQL database:

createdb breakingball

Create the database tables:

./db.py init

To drop all application tables and recreate them from scratch (this will destory all previously loaded data):

./db.py reset

Downloading Game Data

Start celery workers (with optional concurrency)

celery -A gameloader worker --loglevel=info --concurrency=4

load.py will first scrape game listings from a given date (or range of dates) and then delegate the download and extraction process to celery workers to be performed asynchronously. To load game data from a single date:

./load.py --start-date 2015-05-13

To load game data from a date range:

./load.py --start-date 2015-05-13 --end-date 2015-05-20

By default, the loader will skip games that exist in the database with a status of 'final'. To force a download and refresh of all game data, including those marked as final:

./load.py --start-date 2015-05-13 --refresh
You can’t perform that action at this time.