We're working with a largeish dataset. It comprises all the Toronto bike share trips from 2017 and 2018, from here. This repo contains the data dump itself, some scripts that load it into a Postgres database (using Docker Compose), and a set of exercises around it.
You will need Docker / Docker Compose for this to work:

- For macOS, you can install Docker Desktop for Mac, or if you use Homebrew:

  ```shell
  brew install --cask docker
  ```

- For Windows, you can install Docker Desktop for Windows
- For Linux, you can install Docker via your package manager
The first thing you're going to want to do is initialize the database:

```shell
./bin/init
```

This takes about 10 minutes to run, so do it first!
We can access the database in one of two ways: through [pgweb](http://sosedoff.github.io/pgweb/), or via a console.
For pgweb, try

```shell
./bin/pgweb
```

Note: if that doesn't work, you can just run

```shell
docker-compose up -d
```

and open http://localhost:8081 in your browser!
For a console, try

```shell
./bin/psql
```
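Once you're at the psql prompt, a quick sanity check might look like the following (this assumes the init script loaded the data into the `trips` table used elsewhere in this README):

```sql
-- Confirm the import worked by counting rows in the trips table
SELECT count(*) FROM trips;
```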
- Take good notes on the queries you're running and what you're seeing!
- Don't forget to add `LIMIT 10` to your exploratory queries so you don't overload the database. Take it off when you're ready to go!
- If you want to make a backup, try

  ```sql
  CREATE TABLE trips_backup AS
  TABLE trips;
  ```
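For example, an exploratory query with the safety limit in place might look like this (`SELECT *` is just a placeholder for whatever columns you're actually investigating):

```sql
-- Peek at a handful of rows while exploring;
-- drop the LIMIT once the query is ready to run for real.
SELECT *
FROM trips
LIMIT 10;
```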
If you want to start over, you can re-run the `./bin/init` script at any point (remember that it will take a while).
If you really mess up and want to completely get rid of everything and start over, you can run

```shell
docker-compose down
docker-compose rm
docker volume rm toronto-bikeshare-data_db-data
```