How it works
To place the images on a map ("geocode" them), we use a list of Toronto street names and a collection of regular expressions which look for addresses and cross-streets. We send these through the Google Maps Geocoding API to get latitudes and longitudes for the images. We also incorporate a set of points of interest for popular locations like the CN Tower or City Hall.
- Live site: https://oldtoronto.sidewalklabs.com
Setup dependencies (on a Mac):
brew install coreutils csvkit
OldTO requires Python 3.6. Once you have this set up, you can install the Python dependencies (possibly in a virtual environment) via:
pip install -r requirements.txt
Building the site
The OldTO site lives in
oldto-site. In order to build it, you'll need the
yarn package manager. Instructions on setting that up at https://yarnpkg.com/.
You'll also need to get a Google Maps API key. Once you've done this,
set the enviroment variable
GMAPS_API_KEY to your own api key:
Webpack needs this to build the site when you run
yarn webpack. You can
spin it up by running it locally using
http-server (install with
npm install -g http-server).
Then visit http://localhost:8081/ to browse the site.
To iterate on the site, use
cd oldto-site yarn watch & cd dist http-server --proxy=https://api.sidewalklabs.com
Generating new geocodes
First, add your Google Maps API key to the file
Next, you'll first want to download cached geocodes from here.
Unzip this file into
cache/maps.googleapis.com. This will make the geocoding
pipeline run faster and more consistently than geocoding from scratch.
With this in place, you can update
images.geojson by running:
Note, to run the makefile on an OSX machine you will probably want to install md5sum, which can be done by running:
brew update && brew install md5sha1sum
Serving new geocodes
If you've generated new geocodes, you'll need to run your own API server to serve them. You can do this by running:
oldtoronto/devserver.py data/images.geojson & cd oldto-site/dist http-server --proxy=http://localhost:8081
If you've generated geocodes in a different location, change
data/images.geojson to that.
Analyzing results and changes
Before sending out a PR with geocoding changes, you'll want to run a diff to evaluate the change.
For a quick check, you can operate on a 5% sample and diff that against
oldtoronto/geocode.py --sample 0.05 --output /tmp/geocode_results.new.5pct.json oldtoronto/diff_geocodes.py --sample 0.05 /tmp/geocode_results.new.5pct.json
To calculate metrics using truth data (must have jq installed):
grep -E "$(jq '.features | .id' data/truth.gtjson | sed s/\"//g | paste -s -d '|' )" data/images.ndjson > data/test.images.ndjson oldtoronto/geocode.py --input data/test.images.ndjson oldtoronto/generate_geojson.py --geocode_results data/test.images.ndjson --output data/test.images.geojson oldtoronto/calculate_metrics.py --truth_data data/truth.gtjson --computed_data data/test.images.geojson
To debug a specific image ID, run something like:
oldtoronto/geocode.py --ids 520805 --output /tmp/geocode.json && \ cat oldtoronto/geocode.py.log | grep -v regex
If you want to understand the differences between two
images.geojson files, you can
diff_geojson.py script. This file will create a series of
showing differences between an A and B GeoJSON. This is useful for using with the
data collected to the corrections google forms. Use those along with the
Once you're ready to send the PR, run a diff on the full geocodes.
Update street names
To update the list of street names, run:
oldtoronto/extract_noun_phrases.py streets 1 > /tmp/streets+examples.txt && \ cut -f2 /tmp/streets+examples.txt | sed 1d | sort > data/streets.txt