Represent API: Data
Boundary files are under boundaries/. Most are stored in a directory tree matching Open Civic Data Division Identifiers (OCD-ID) starting at boundaries/ocd-division/. Federal, provincial and territorial boundary files are further scoped by redistribution year.
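As a rough illustration of this layout, the sketch below maps an OCD-ID to a directory under boundaries/. The helper name `boundary_dir` and its `year` parameter are hypothetical and not part of this repository's tooling; the example path is the one used in the commands later in this README.

```python
from pathlib import PurePosixPath


def boundary_dir(ocd_id, year=None):
    """Return the directory for an OCD-ID (hypothetical helper, for illustration only)."""
    if not ocd_id.startswith('ocd-division/'):
        raise ValueError('expected an OCD division identifier: %r' % ocd_id)
    path = PurePosixPath('boundaries') / ocd_id
    # Federal, provincial and territorial files are further scoped by redistribution year.
    if year is not None:
        path = path / str(year)
    return str(path)


print(boundary_dir('ocd-division/country:ca/province:qc', 2011))
```

Boundary files that live outside the OCD-ID tree, such as ca_csd, would not fit this mapping.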
A few boundary files exist outside the OCD-ID tree. Some, like
ca_csd, are Census geography files whose OCD-ID would clash with Canada's. Others are the sources of multiple boundary sets in the API, each with a different OCD-ID.
Open North has permission to redistribute all shapefiles in this repository. Please read the overall license and the
LICENSE.txt file in each directory to know your rights. In some cases, you will not have permission to redistribute the shapefile.
Open North lacks permission to redistribute the shapefiles of some boundary sets in the API. Refer to the
data_url of those boundary sets to get copies of those shapefiles.
All datasets are from government sources, with one exception: the Postal Code(OM) dataset in the
postcodes/fed directory is from Geocoder.ca. The
definition.py files have more details on sources and any modifications made to the files. Postal Code(OM) is an official mark of Canada Post Corporation.
# Invoke must not be installed globally.
pip uninstall invoke
# Create a virtual environment.
mkvirtualenv representdata
# Install the requirements.
pip install -r requirements.txt flake8
npm install -g esri-dump
For all the following commands, add
--base=path/to/private/data to run them on the private repository.
Load the virtual environment:
List the available maintenance tasks:
Maintain definition files
Make the code style consistent:
Check that all
definition.py files are valid:
Check that all data directories contain a
LICENSE.txt (don't run on the private repository):
Check that the source, data and license URLs work:
Find and correct the URLs in
definition.py files. If you update a
licence_url, you may need to update other occurrences in
tasks.py and this master spreadsheet. Once all corrections are made, re-run
If you update a
data_url, update its shapefile and
id_func following the instructions below.
Check for old boundaries that may require manual updates:
Update a specific out-of-date shapefile. This task updates the
last_updated date in the
invoke shapefiles --base=boundaries/ocd-division/country:ca/province:qc/2011
Or, update all out-of-date shapefiles. The output may contain additional instructions:
Some shapefiles are online but require exceptional processing. Remember to update
esri-dump http://geonb.snb.ca/ArcGIS/rest/services/GeoNB_ENB_MunicipalWards/MapServer/0 > boundaries/ca_nb_wards/wards.geojson
ogr2ogr -f "ESRI Shapefile" boundaries/ca_nb_wards boundaries/ca_nb_wards/wards.geojson
After receiving a new boundary file for all municipalities in Quebec, you need to update the
definition.py file in
- Copy the output into the appropriate section of
- Comment out jurisdictions for which other sources have more complete data (Dorval, Kirkland)
- Separately define the boundaries of jurisdictions whose names duplicate others' (Plessisville (32045))
- After loading the boundaries into Represent, check La Tuque and Sept-Îles in particular
Get information about the shapefile:
ogrinfo -al -geom=NO boundaries/ocd-division/country:ca/province:qc/2011
Determine the attribute for the feature's name and, if it exists, the attribute for the feature's public identifier.
For features that are numbered like "Ward 1", if there is no attribute for the numeric identifier, we can extract it from the name, like
id_func=lambda f: re.sub(r'\D', '', f.get('WARD')). Similarly, if there is no attribute for the name, we can build it from the numeric identifier, like
name_func=lambda f: 'Ward %s' % f.get('WARD').
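The two lambdas above can be tried out on stand-in data. In this sketch, plain dicts stand in for shapefile features (an assumption, made only because dicts also offer a `.get()` method), and the attribute values are made up for illustration.

```python
import re

# Extract the numeric identifier from a name like "Ward 12".
id_func = lambda f: re.sub(r'\D', '', f.get('WARD'))

# Build a name from a numeric identifier.
name_func = lambda f: 'Ward %s' % f.get('WARD')

# Hypothetical feature whose WARD attribute holds the full name.
named_feature = {'WARD': 'Ward 12'}
# Hypothetical feature whose WARD attribute holds only the number.
numbered_feature = {'WARD': 12}

print(id_func(named_feature))
print(name_func(numbered_feature))
```

Note that `re.sub(r'\D', '', ...)` removes every non-digit character, so it only suits names that contain a single number.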
For features that aren't numbered like "Ward 1", determining the public identifier may be tricky: the ID should be discoverable online; no two features should have the same ID; and
OBJECTID is never the ID.
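One of these conditions, that no two features share an ID, is easy to check mechanically. The sketch below assumes features can be iterated and passed to an `id_func`; the helper name `duplicate_ids` and the sample data are hypothetical.

```python
from collections import Counter


def duplicate_ids(features, id_func):
    """Return the IDs that id_func assigns to more than one feature, sorted."""
    counts = Counter(id_func(f) for f in features)
    return sorted(i for i, n in counts.items() if n > 1)


# Plain dicts stand in for shapefile features (assumption).
features = [{'WARD': '1'}, {'WARD': '2'}, {'WARD': '2'}]
print(duplicate_ids(features, lambda f: f.get('WARD')))
```

An empty result only shows that the IDs are unique; whether they are discoverable online still needs a manual check.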
Read this section of the example
definition.py file for help writing a
If you're updating many shapefiles, running
ogrinfo on each may take a long time. Instead, run
../represent-canada/manage.py analyzeshapefiles -d . > manifest and then review the changes with
git diff manifest.
Fix file permissions:
Check if the data request process spreadsheet is out-of-date:
Or less verbose:
invoke spreadsheet --base=. --private-base=../represent-canada-private-data > /dev/null
Each data directory under concordances/ has a README explaining how to source and update its concordances. If the concordances are more than a year old and can't be sourced, they should be removed. To do so, substitute the corresponding values from the above READMEs for <slug> and <source> in:
fab ohoh update_concordances:args="<slug> <source> data/shapefiles/public/concordances/empty.csv"
Each data directory under postcodes/ has a README explaining how to source and update its postcodes.
We would like to express our gratitude to Kent Mewhort at the Canadian Internet Policy and Public Interest Clinic (CIPPIC), whose legal research (PDF) made it possible for this repository to be made public.