Will Skora edited this page Mar 18, 2020 · 20 revisions

City Council Records

Summary

Immediate goal: create a searchable record of the City Council records.

Presentation by Eamon Johnson. A broader overview of the project is available at https://opencleveland.org/projects/drocer.

This document is additional institutional knowledge that we've acquired in the process.

Core Team

Who are the core team members, their roles, and contact information?

(Will Skora; Eamon Johnson; Andrew Plumb-Larrick; Anastasia Diamond-Ortiz)

Can't make it to Open Cleveland meetings? Keep up to date through Slack, an online chatroom.

Primary Stakeholders

Who are the primary stakeholders in this project? Who is invested in its success?

  • City of Cleveland City Council Clerk;
  • City of Cleveland Law Office (they often get public inquiries about how to obtain historical City of Cleveland laws; how often is still to be quantified);
  • Cleveland Public Library and its Public Administration Library, which houses the historical documents and ephemera of the City of Cleveland, including those of Cleveland City Council and old copies of the City Record;
  • City Council's Assistants;
  • City of Cleveland Archivists;
  • Residents and community organizers who wish to find out about city meetings in a more efficient and timely manner.

Primary User Stories

What needs does this project address for each stakeholder? E.g. “As a rider of the bus, I need to know when the bus will be at my stop.”

City employees trying to find when a piece of legislation was passed. (E.g. Tony Stella, 2015-04-06, mentioned how he had to go back to old cleveland.com articles to find the date when LGBT-friendly legislation was passed. He then searched each PDF around that date, one at a time, for LGBT terms.)

Journalists (from Nick Castele, WCPN): "And I don't need it super often, but every couple weeks when I do a city council story I'll try to read through City Record stuff. And mostly what I'm interested in is following legislation, searching by keyword or ordinance number. When it was introduced, who voted on it, what's the language, that kind of thing."

Additional stakeholders: concerned residents and others interested in what legislation is happening in their neighborhoods; historians; academics.

Goals and Outcomes

What are the desired outcomes of this project, and how will you measure them?

Upcoming, long-term Goals:

  • Structured data extraction for dates, people, parcels, locations, roles/committees, and legislation described in the semi-structured / unstructured text of council records
  • Scheduled refresh component to fetch the latest record from the city council site and initiate extraction
  • Browse interface for structured data
  • Full-text search interface, linked to structured data
  • Search filters, including date, council member, and parcel number, available through a web interface

2019 goals:

(Cleveland City Council finally implemented its own searchable digital legislation management system, but it only searches from 2010 to the present, even though it says it searches from 2003 to the present.)

Decide how, in the long term, to reconcile the data loaded in Council's Legistar with the data in the drocer web app.

Determine priorities:

Whether to focus more attention on:

Creating a Councilmatic instance here in Cleveland that would scrape our Legistar instance, the official home of Cleveland City Council legislation.

What's councilmatic.org? A project, already deployed in Philly, Los Angeles, NYC, and Chicago, that takes the legislation and meeting notes stored in Legistar (a vendor product for city legislation) and republishes them.

Smaller pre-steps:

  1. Determine the unique uses and features that implementing Councilmatic would give us
  2. Review the Open Civic Data standard (see https://github.com/datamade/councilmatic-starter-template for some information on it); also read this piece by Bill Hunt and do some investigation here.
  3. Ask the city to turn on the Legistar API (Done - April 2019, they declined to; asked us what we wanted to do with it...)
  4. Tell the city why we want them to turn it on

The value of using Councilmatic:

OR

Take the pre-1996 legislation that we have on microfilm; scan it to digital TIFF (determine file quality, file naming schema, etc.; ask CPL's Digitization team for some assistance); then run OCR (optical character recognition) so that the scans become searchable. First process the yearly indexes and load them into the drocer web app.

Resources Needed / How You Can Help

What are the technical, financial, and human resources needed? E.g. ‘a Rails instance’, ‘20 pieces of foamcore’, ‘someone with JavaScript skills’, etc.

  • Data extraction will require developers to implement components in Python

    • text extractor
    • sectionizer
    • hyphenation remover
    • council member identifier
    • ordinance / resolution extractor
    • PPN identifier
  • Front end (browse and search interface) will require HTML/CSS/JS

  • Development of schema and gold standard for testing will require knowledge of data and subject matter (City Council Legislation, City Legal Code)

  • Simple tasks that anyone can tackle, regardless of technical skills:

    • Read over several PDFs to find some common language/patterns

    • Identify incorrect OCR/incorrect text extractions
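To give a feel for the size of one of the Python components listed above, here is a minimal sketch of a hyphenation remover. It is not the drocer implementation: the function name `remove_hyphenation` and the rule it applies (join a word fragment ending in a hyphen at the end of a line with a lowercase continuation on the next line) are assumptions made for illustration.

```python
import re

# Hypothetical rule: a word fragment ending in "-" at the end of a line,
# followed by a lowercase continuation at the start of the next line,
# is treated as one word split by the typesetter.
_HYPHEN_BREAK = re.compile(r"(\w+)-[ \t]*\n[ \t]*([a-z]\w*)")


def remove_hyphenation(text: str) -> str:
    """Join words that were hyphenated across a line break."""
    return _HYPHEN_BREAK.sub(r"\1\2", text)


if __name__ == "__main__":
    sample = "The ordi-\nnance was passed by Coun-\ncil."
    print(remove_hyphenation(sample))
```

This rule will also join genuine compounds that happen to break at the hyphen (for example "well-" / "known"), and it leaves capitalized continuations such as surnames alone. Cases like these are good candidates for the gold standard test set described above.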

Identify a couple of small use cases for an initial prototype that we'd present to CPL and the clerk, to make sure that we're addressing their real needs.

Potential Blockers

What are the possible problems, issues, and blockers?

References

Any references, data sources, articles, additional documents, etc.

Similar projects and resources:

https://talk.beta.nyc/c/working-groups/city-record-online

Bill Hunt's very insightful piece on his experience of spending 3 years working on The State Decoded, a scraper of state laws.

Schema:

aka: How do we structure the raw data from the PDFs so we can build on it?

Open Civic Data has developed a schema, but the project lost most of its funding and momentum when Sunlight withdrew around 2014. The project is still being supported.

A good way to get in touch with them is through: https://groups.google.com/forum/#!forum/open-civic-data

This data is then collected daily using the Open Civic Data standard and platform

http://www.akomantoso.org/

Other:

councilmatic.org - project deployed in Philly, Los Angeles, NYC, and Chicago that takes legislation and meeting notes from Legistar (a vendor product for city legislation) and republishes them.

Parserator - tool to parse names and addresses out of text.

Formatting:

Describing the format, layout of the City Record

Comments from Cleveland Public Administration Librarians on searching the City Record (really valuable)

https://github.com/opencivicdata/scrapers-us-municipal

(developed by NYCBETA's efforts - https://docs.google.com/document/d/1USFMTHfrmBzDvNW08b2f6osyl9I375d7h47uGcvxXjY/edit)

===============================

Terminology:

Ordinances:

Resolutions: they are not legally binding and are not laws. They're used to express the opinions of Council members.

Index: In previous years (cite which), the City Record would have a yearly index, a separate bound file that listed the page number for each subject; extremely useful for cross-referencing and searching past years' City Records.

======

For microfilm (pre-1995 legislation):

3rd floor, public library (main building)

CPL has about 1-2 reels of microfilm for each year; each reel takes about 1 hour to scan (automated). The software used at CPL (3rd floor) can output TIFF, TIFF LZW, PDF, PDF grayscale, or JPEG with varying levels of compression. For reference, the raw JPG images from June-Dec. 1995 total 5 GB (Will Skora has these files).

======

http://www.cleveland.com/cityhall/index.ssf/2014/06/is_cleveland_city_council_wise.html