
Tasks: Represent CSV Schema

James McKinney edited this page Nov 11, 2017 · 17 revisions

The Represent CSV Schema is a standardized CSV format for elected officials’ contact information.

The most expensive part of operating the Represent Civic Information API is maintaining the scrapers that collect this information from government websites.

By publishing this information in a standardized format at a stable URL, governments decrease the cost of operating this service.

Orientation

We track government adoption in the CSV Schema sheet of a private spreadsheet. Note: The Sheet1 sheet is used to generate open_data_canada, so don't change its columns.

The Name and Code columns should have the same values in the same order as Sheet1. Actions this year tracks request activity in the current year, and Actions prior years stores activity from earlier years.

At the start of each year, we move the values from Actions this year to Actions prior years, unless the status is Done. Note: Cut-and-paste removes conditional formatting. Instead, copy-and-paste and then delete the origin cell's contents.
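
The yearly rollover can be sketched in Python for anyone working from an export of the sheet. This is only an illustration under several assumptions: rows are dictionaries keyed by column name, a finished row has an Actions this year value that starts with Done, and moved entries are appended to the prior entries with a "; " separator. The function name roll_over is hypothetical.

```python
def roll_over(rows):
    """Return copies of the rows with this year's actions moved to prior years.

    Assumptions (not from the spreadsheet itself): a row is "Done" when its
    "Actions this year" cell starts with "Done", and entries are joined
    with "; ".
    """
    rolled = []
    for row in rows:
        this_year = row.get("Actions this year", "").strip()
        prior = row.get("Actions prior years", "").strip()
        updated = dict(row)  # copy, so the input rows are left untouched
        if this_year and not this_year.startswith("Done"):
            # Append this year's entry after any existing prior entries.
            updated["Actions prior years"] = "; ".join(
                part for part in (prior, this_year) if part
            )
            # Clear the cell rather than cutting it, mirroring the
            # copy-and-paste-then-delete advice above.
            updated["Actions this year"] = ""
        rolled.append(updated)
    return rolled
```

Rows marked Done keep both cells unchanged, so completed adoptions stay visible in the current year's column.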

1. Sending a request to a government

If a government has an open data catalog according to Sheet1, we can send a request to the contact email in Sheet1. Note that, if it is a federal, provincial or territorial government, the request must go to the legislature, not the government. No legislature in Canada has open data yet.

If Actions prior years is empty, send this template email with an info sheet (English, French) attached:

Hello,

Open North - a Canadian nonprofit working on open data and open government - is working with municipalities across Canada to publish their elected officials’ contact information in a standardized CSV format. Today, over 30 open data catalogs - including <nearby jurisdictions> - publish this dataset according to the format described at http://represent.opennorth.ca/government/ Could the <jurisdiction> adopt the format and publish it in its open data catalog? I’ve attached an info sheet with more information on the initiative and its benefits.

Thanks, and let me know if you have any questions.

<name>

Otherwise, find your prior correspondence and reply. You can use this template email:

Hi <name>,

In <month> last year, we discussed adding a new dataset for elected officials’ contact information, using a common CSV format (used by over 30 other data catalogs across Canada) described at http://represent.opennorth.ca/government/

Can you give me an update on the possibility of releasing this dataset?

Best,

<name>

Log your activity in Actions this year with a message like 2017: Sent Jan 15.

2. Receiving a response from a government

Governments may respond saying that they forwarded your request, will look into it, will prepare the dataset, etc. In these cases, log the activity with a message like 2017: Responded Jan 20.

If a government explicitly declines to adopt (this has only occurred once), ask whether there was any particular reason for not participating. Log the activity with a message like 2017: Declined Feb 1.

If a government responds with a CSV attachment, thank them and ask them when they expect the CSV to be available online in their data catalog. Log the activity with a message like 2017: CSV attached Jan 25.

If a government responds with a link to a CSV in a data catalog, log the activity as Done!
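
The dated log messages in sections 1 and 2 all follow the same shape: year, colon, event, abbreviated month, day. A minimal Python sketch of that format follows, in case you want to generate entries consistently. The names log_entry, EVENTS and MONTHS are hypothetical, and the month abbreviations are hard-coded so the output does not depend on the system locale.

```python
from datetime import date

# Event labels taken from the example messages in sections 1 and 2.
EVENTS = ("Sent", "Responded", "Declined", "CSV attached")

# Hard-coded English abbreviations; strftime("%b") would vary by locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def log_entry(event, when):
    """Format a log message like '2017: Sent Jan 15'."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    return f"{when.year}: {event} {MONTHS[when.month - 1]} {when.day}"


print(log_entry("Sent", date(2017, 1, 15)))  # 2017: Sent Jan 15
```

The Done status is not dated in the examples above, so it is left out of this helper.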

3. Importing the data into Represent

scrapers-ca and scrapers_ca_app

  1. Create a directory in scrapers-ca like ca_on_toronto (lowercase and underscored ISO 3166-2:CA code and Census geographic name). Add a _candidates suffix if it scrapes candidates. If the directory already exists, overwrite its contents.
  2. Copy the __init__.py and people.py files from any module whose people.py file subclasses CSVScraper.
  3. In __init__.py, update the class name† and the division_id, division_name, name, and url properties. Look up the Census geographic name in country-ca.csv to find the division_id and url. If the name matches multiple rows, determine which is the correct one; in most cases, a municipality will have an identifier with a type ID of csd. Open the url in a browser to check it's correct. If there is no url, find the jurisdiction's website, like http://www.toronto.ca. Remove the get_organizations method: CSVScraper will create the correct organization and posts unless the jurisdiction has more than two roles, has multiple members per subdivision, or has at-large members and is not in NS, QC or BC. In those cases, you need to write the get_organizations method, based on the council page at the url.
  4. In people.py, update the class name† and csv_url. For the rest, read the documentation of CSVScraper in utils.py, and read other people.py files that subclass CSVScraper.
  5. Run pupa update ca_xx_xxxxxxxx. Modify the module until it succeeds; however, some CSV errors can't be recovered from. Report any CSV errors to the government. You may also need to add styles of address for the jurisdiction; use the styles of address used by the jurisdiction, while trying to remain consistent with those of other jurisdictions.
  6. Push your changes, then update the scrapers submodule in scrapers_ca_app and deploy: git commit scrapers -m 'update scrapers' && git push && git push heroku
  7. Run the scraper on Heroku: heroku run ./manage.py update ca_xx_xxxxxxxx

† Add a Candidates suffix to the class name if it scrapes candidates.
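
Two of the rules in the steps above can be written out as small Python helpers: the directory naming convention from step 1 and the test in step 3 for when a custom get_organizations method is needed. Both are sketches, not code from scrapers-ca. The function names are hypothetical, stripping accents and collapsing punctuation to underscores is an assumption about how names other than simple ones like Toronto are handled, and the boolean condition is one reading of the sentence in step 3.

```python
import re
import unicodedata


def module_name(iso_code, census_name, candidates=False):
    """Build a directory name like 'ca_on_toronto' from 'CA-ON' and 'Toronto'."""
    # Assumption: accents are dropped (e.g. 'Montréal' becomes 'montreal').
    ascii_name = (
        unicodedata.normalize("NFKD", census_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    raw = f"{iso_code} {ascii_name}".lower()
    # Assumption: any run of non-alphanumeric characters becomes one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", raw).strip("_")
    return slug + ("_candidates" if candidates else "")


def needs_custom_get_organizations(role_count, multiple_members_per_subdivision,
                                   has_at_large_members, province):
    """Return True when step 3 says get_organizations must be written by hand."""
    at_large_outside_exempt = (
        has_at_large_members and province.upper() not in {"NS", "QC", "BC"}
    )
    return (role_count > 2
            or multiple_members_per_subdivision
            or at_large_outside_exempt)


print(module_name("CA-ON", "Toronto"))                   # ca_on_toronto
print(module_name("CA-ON", "Toronto", candidates=True))  # ca_on_toronto_candidates
```

Check the generated name against existing directories in scrapers-ca before relying on it for names with accents or punctuation.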

represent-canada

  1. Create a representative set in the Django admin if it doesn't already exist and run the "Update from data source" action on the representative set.
  2. If the representative set lacks its own unique boundary set because its jurisdiction has no subdivisions, then the CSV Downloads page will not be able to determine which province it belongs to, and will list it under "Uncategorized". To correct this, assign a province to the representative set in uncategorized_map in data.js.
  3. If a representative set introduces a new elected_office, it will have to be added to order in demo.js.

4. Recognizing the government

After successfully importing the data into Represent, recognize the government for adopting the standard. You can celebrate the adoption with a tweet. Most importantly, add the adopter's logo to the page for the Represent CSV Schema.

In your browser:

  • Search <municipality> logo. If different municipalities have the same name, add the province. If the results are irrelevant, add a descriptor like city or town as appropriate.
  • Click Tools, Color, Black and white to check if there are any black-and-white logos. If not, reset Color to Any color.
  • Download a high-resolution logo, preferably in PNG format.

In an image editor:

  • Change the image to grayscale if it is not black-and-white.
  • Trim the borders. The height should be 100 pixels or more. If not, find another image.
  • Resize the height to 100 pixels.
  • If the width is an odd number, widen the canvas by 1 pixel.
  • Save the image as PNG-24 to finder/static/img/logos/<municipality>.png
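
The size arithmetic in the image editor steps can be checked ahead of time with a short Python sketch. It only computes dimensions and does no image processing. The function name logo_dimensions is hypothetical, and rounding the scaled width to the nearest pixel is an assumption, since image editors may round differently.

```python
def logo_dimensions(width, height, target_height=100):
    """Return (scaled_width, canvas_width, target_height) for a trimmed logo.

    width and height are the pixel dimensions after trimming the borders.
    """
    if height < target_height:
        # The steps above say to find another image in this case.
        raise ValueError(f"height {height}px is below {target_height}px")
    # Scale the width by the same factor as the height (rounding is assumed).
    scaled_width = round(width * target_height / height)
    # Widen the canvas by 1 pixel when the scaled width is odd.
    canvas_width = scaled_width + (scaled_width % 2)
    return scaled_width, canvas_width, target_height


print(logo_dimensions(250, 200))  # (125, 126, 100)
```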

In a text editor:

  • Edit finder/templates/government.html and add a new img tag at the end.

In a shell:

git add finder/static/img/logos
git commit finder/static/img/logos finder/templates/government.html -m 'Add logos'
git push
fab ohoh deploy