A Rails API to serve as the backend for this dashboard: Taxi and Ridehailing Usage in Chicago
The app queries and stores data from two datasets on Chicago's open data portal: taxi trips and Transportation Network Provider (TNP) trips.
Taxi data is available since 1/1/2013, TNP data since 11/1/2018. As of 2019 there are three licensed TNPs in Chicago: Uber, Lyft, and Via. The TNP dataset does not identify which company provided each trip.
The datasets are both made up of individual trips. The app executes Socrata SoQL queries against the raw datasets to produce monthly summaries, then stores those monthly summaries in the chicago_monthly_reports table. In theory, the dashboard could connect directly to the Socrata open data portal and run the queries on every page load, but that would be wildly impractical, as it takes a few hours of query time to populate the full historical summaries.
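As a rough illustration of the kind of SoQL query described above, here is a minimal sketch that builds a request URL counting the trips in a single month. It is not the code from app/models/chicago_monthly_report.rb: the helper name `monthly_summary_url`, the placeholder dataset ID, and the `trip_start_timestamp` column name are assumptions made for the example, and the real queries likely compute more than a trip count.

```ruby
require "date"
require "uri"

# Hypothetical helper (not from the app): builds a SoQL query URL that
# counts the trips starting in the calendar month containing `month`.
# The trip_start_timestamp column name is an assumption.
def monthly_summary_url(dataset_id, month)
  first_day = Date.new(month.year, month.month, 1)
  next_month = first_day.next_month

  params = {
    # date_trunc_ym truncates a timestamp to the first day of its month
    "$select" => "date_trunc_ym(trip_start_timestamp) AS month, count(*) AS trips",
    "$where" => "trip_start_timestamp >= '#{first_day.iso8601}' " \
                "AND trip_start_timestamp < '#{next_month.iso8601}'",
    "$group" => "month"
  }

  URI::HTTPS.build(
    host: "data.cityofchicago.org",
    path: "/resource/#{dataset_id}.json",
    query: URI.encode_www_form(params)
  )
end

# "xxxx-xxxx" is a placeholder, not a real dataset ID
puts monthly_summary_url("xxxx-xxxx", Date.new(2019, 1, 1))
```

Restricting each request to one month with a half-open date range keeps the individual queries small, which fits the approach of storing one summary row per month rather than querying the raw trips on every page load.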
Most of the relevant code lives in app/models/chicago_monthly_report.rb
Prerequisites: Ruby and PostgreSQL
Run the following commands to create the database and populate it with all months of available data:
bundle exec rake db:setup
bundle exec rake jobs:work
Note that the initial database backfill will take several hours. You can tinker with db/seeds.rb before running the setup commands to, e.g., populate fewer months historically.
clock.rb is configured to check the portal once per day and, if there are new months available, run the relevant queries to populate the database. You'll need to run both the clock and worker processes, e.g. on Heroku that would be one clock dyno and one worker dyno.
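The daily "are there new months available?" check can be sketched in plain Ruby as follows. This is a hedged illustration, not the logic in clock.rb: the helper name `months_to_populate` and its arguments are hypothetical, and it assumes the latest month available on the portal has already been determined by some other query.

```ruby
require "date"

# Hypothetical helper (not from the app): returns the first day of every
# month from start_month through latest_month that is not already present
# in stored_months (an array of first-of-month dates).
def months_to_populate(start_month, latest_month, stored_months)
  months = []
  current = Date.new(start_month.year, start_month.month, 1)
  last = Date.new(latest_month.year, latest_month.month, 1)

  while current <= last
    months << current unless stored_months.include?(current)
    current = current.next_month
  end

  months
end

already_stored = [Date.new(2018, 11, 1), Date.new(2018, 12, 1)]
missing = months_to_populate(Date.new(2018, 11, 1), Date.new(2019, 2, 1), already_stored)
puts missing.map(&:iso8601).inspect
# prints ["2019-01-01", "2019-02-01"]
```

A check along these lines is safe to run repeatedly, because months that are already stored are skipped and a daily run only queues work when the portal has published something new.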
- chicago-taxi-data repo: similar to this dashboard repo, but instead of populating a table of monthly summaries, populates a local PostgreSQL database with all individual taxi and TNP trip records
- nyc-taxi-data repo: download and import all of the publicly available NYC taxi and for-hire vehicle trip records
- Taxi and Ridehailing Usage in New York City dashboard
todd@toddwschneider.com, or open a GitHub issue