
Contributing People Data

Person data is maintained in the openstates/people repository. This repository contains YAML files with all the information on given individuals, as well as scripts to work with & maintain the data.

Also, please note that this portion of the project is in the public domain in the United States with all copyright waived via a CC0 dedication. By contributing you agree to waive all copyright claims.

Checking out

Fork and clone the people repository:

Repository overview

The repository consists of a few key components:

  • settings.yml Settings for state legislatures, including the number of seats, and current vacancies.
  • data/ Data files in YAML format on legislators, organized by state & status.
  • scripts Various scripts used to maintain the data.
  • scrape/ Experimental new people scrapers, work-in-progress.
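As a rough illustration of how the seat and vacancy settings could support a sanity check, the sketch below compares the number of active legislators in a chamber against the expected count. The dictionary layout and the helper names `expected_members` and `check_chamber` are assumptions made for this example, not the repository's actual settings format or code.

```python
def expected_members(seats, vacancies):
    """Number of people who should currently hold a seat."""
    return seats - vacancies


def check_chamber(chamber, seats, vacancies, active_count):
    """Return a human-readable problem string, or None if the counts match."""
    expected = expected_members(seats, vacancies)
    if active_count == expected:
        return None
    return f"{chamber}: expected {expected} active members, found {active_count}"


# Hypothetical settings and counts for one state (layout is an assumption).
settings = {
    "upper": {"seats": 50, "vacancies": 1},
    "lower": {"seats": 120, "vacancies": 0},
}
active = {"upper": 49, "lower": 119}

for chamber, cfg in settings.items():
    problem = check_chamber(chamber, cfg["seats"], cfg["vacancies"], active[chamber])
    if problem:
        print(problem)
```

A mismatch usually means either a missing person file or a vacancy that has not been recorded yet.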

To run a script using docker-compose you can run a command like:

docker-compose run --rm people ./scripts/

Common tasks

Updating legislator data by hand

Let's say you call a legislator and find out that they have a new phone number. Contribute it back!

See the schema for details on the acceptable fields. If you're looking to add a lot of data but are unsure where it fits, feel free to ask via an issue and we can either amend the schema or make a recommendation.

  1. Start a new branch for this work
  2. Make the edits you need in the appropriate YAML file. Please keep edits to a minimum (e.g. don't re-order fields)
  3. Submit a PR. Please describe how you came across this information to expedite review.

Retiring a legislator

  1. Start a new branch for this work
  2. Run ./scripts/ on the appropriate legislator file(s)
  3. Review the automatically edited files & submit a PR.
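To make it easier to review the automatic edits, here is a minimal sketch of what retiring a person amounts to: setting `end_date` on every role and committee membership that is still open. The record layout (the `roles` and `committees` keys) and the function name `retire` are assumptions for illustration; the real script works on the YAML files and may differ in detail.

```python
import copy


def retire(person, end_date):
    """Return a copy of `person` with `end_date` set on every open role
    and committee membership. Entries that already ended are left alone."""
    retired = copy.deepcopy(person)
    for key in ("roles", "committees"):  # hypothetical key names
        for entry in retired.get(key, []):
            if not entry.get("end_date"):
                entry["end_date"] = end_date
    return retired


# Hypothetical person record, shown as a plain dict instead of YAML.
person = {
    "name": "Jane Example",
    "roles": [
        {"type": "lower", "district": "12", "end_date": "2017-01-01"},
        {"type": "upper", "district": "3"},
    ],
    "committees": [{"name": "Finance"}],
}
retired = retire(person, "2019-01-01")
```

When reviewing, check that only previously open entries gained an `end_date` and that earlier end dates were not overwritten.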

Updating an entire state via a scrape

Let's say North Carolina has had an election & it makes sense to re-scrape everything for that state.

  1. Start a new branch for this work
  2. Scrape data using Open States' Scrapers
  3. Run ./scripts/ against the generated JSON data; this will populate the incoming/ directory
  4. Check for merge candidates using ./scripts/ --incoming nc
  5. Manually reconcile remaining changes; this will almost certainly require some retirements as well.
  6. Check that data looks clean with ./scripts/ nc --summary and prepare a PR.
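The merge-candidate check in the steps above can be thought of as sorting each incoming record into "identical", "similar", or "different" relative to an existing one. The sketch below shows one way to do that with the standard library's `difflib`. The field names, the 0.8 threshold, and the function name `classify` are assumptions for this example and are not taken from the actual merge script.

```python
from difflib import SequenceMatcher


def classify(incoming, existing, threshold=0.8):
    """Classify an incoming record against an existing one as
    'identical', 'similar', or 'different'."""
    if incoming == existing:
        return "identical"
    # Compare names case-insensitively; ratio() is 1.0 for equal strings.
    score = SequenceMatcher(
        None, incoming["name"].lower(), existing["name"].lower()
    ).ratio()
    same_seat = incoming.get("district") == existing.get("district")
    if same_seat and score >= threshold:
        return "similar"
    return "different"


# Hypothetical records: one scraped, one already in the repository.
scraped = {"name": "Jane Q. Example", "district": "12"}
on_file = {"name": "Jane Example", "district": "12"}
print(classify(scraped, on_file))
```

Records classified as similar are the ones worth reconciling by hand; anything left as different is likely a new person or a retirement.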

Updating a single field for many people

Let's say you want to add foobar_id to a ton of legislators from your own data set or similar.

TBD - We need to create a tool that will aid in this as it will prove a common use case & we can lower the barrier here.
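Until such a tool exists, the core of the task can be sketched as follows: given a mapping from person id to the new value, set the field on every matching record and report ids that matched nobody. This is only one possible shape for the tool; the function name `apply_field`, the flat placement of `foobar_id` on the record, and the example ids are all assumptions.

```python
def apply_field(people, field, values_by_id):
    """Set `field` on each person whose `id` appears in `values_by_id`.

    Returns the sorted ids from `values_by_id` that matched no person,
    so they can be reviewed by hand.
    """
    seen = set()
    for person in people:
        pid = person.get("id")
        if pid in values_by_id:
            person[field] = values_by_id[pid]
            seen.add(pid)
    return sorted(set(values_by_id) - seen)


# Hypothetical records and external data set.
people = [
    {"id": "person-1", "name": "Jane Example"},
    {"id": "person-2", "name": "John Sample"},
]
unmatched = apply_field(people, "foobar_id", {"person-1": "F100", "person-9": "F900"})
```

Returning the unmatched ids keeps the update honest: values from an external data set that do not line up with any record are surfaced instead of being silently dropped.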


Several scripts are provided to help maintain/check the data.

[OPTIONS] INPUT_DIR

  Convert pupa scraped JSON in INPUT_DIR to YAML files for this repo.

Convert a pupa scrape directory to YAML. Will put data into the incoming/ directory for use with the merge script's --incoming option.

[OPTIONS] [ABBREVIATIONS]

  Lint YAML files, optionally also providing a summary of state's data.

  <ABBREVIATIONS> can be provided to restrict linting to select states.

  -v, --verbose
  --summary / --no-summary  Print summary after validation errors.

[OPTIONS]

  Script to assist with merging legislator files.

  Can be used in two modes: incoming or file merge.

  Incoming mode analyzes incoming/ directory files (generated by the pupa-to-YAML conversion script) and discovers identical & similar files to assist with merging.

  File merge mode merges two legislator files.

  --incoming TEXT    Operate in incoming mode, argument should be state abbr
                     to scan.
  --retirement TEXT  Set retirement date for all people marked retired (in
                     incoming mode).
  --old TEXT         Operate in merge mode, this is the older of two files &
                     will be kept.
  --new TEXT         In merge mode, this is the newer file that will be
                     removed after merge.
  --help             Show this message and exit.

[OPTIONS]

Create a new person record.

  Arguments can be passed via command line flags; omitted arguments will be prompted for.

  Be sure to review the file and add any additional data before committing.

  --fname TEXT       First Name
  --lname TEXT       Last Name
  --name TEXT        Optional Name, if not provided First + Last will be used
  --state TEXT       State abbreviation
  --district TEXT    District
  --party TEXT       Party
  --rtype TEXT       Role Type
  --url TEXT         Source URL
  --image TEXT       Image URL

  Retire a legislator, given END_DATE and FILENAME.

  Will set end_date on active roles & committee memberships.

[OPTIONS] [ABBREVIATIONS]

  Sync YAML files to DB.

  --purge / --no-purge  Purge all legislators from DB that aren't in YAML.
  --safe / --no-safe    Operate in safe mode, no changes will be written to
                        database.

[OPTIONS] [ABBREVIATIONS]...

  Download images and sync them to S3.

  <ABBR> can be provided to restrict to single state.

  --skip-existing / --no-skip-existing  Skip processing for files that already exist
                                        on S3. (default: true)

[ABBREVIATIONS]...

  Update CSVs of current legislators.

  <ABBR> can be provided to restrict to single state.