Skip to content

Data relationships Database schema

Ben Cipollini edited this page Dec 31, 2015 · 28 revisions

A proposed database / Django app design / API endpoint schema

Overview

Goals

The goal is to make a modular data architecture such that meaningful, self-contained apps containing subsets of data can be defined, and the interactions across the apps can be simple, minimal, and meaningful.

Each app should expose data feeds limited to that app's scope. For higher-level feeds, such as disclosure information, information across multiple apps is combined. Such an app need not store any data of it's own.

Schema / apps

Below, there are five apps:

  • locality - different ways to reference localities (cities, states, jurisdictions)
  • ballot - generic balloting definition for Y/N and multiple-choice ballots (including elections); uses locality app
  • candidate - metadata about ballot measures that are elections, ballot measure choices that are candidates; uses locality and ballot apps
  • contributions - raw data for political contributions from committees. This is where most of the back-end work is; uses locality and ballot apps
  • disclosure - feeds for querying contribution data by locality, candidate, ballot, etc. This is what the front-end folks will care most about, but will be easy for the back-end folks; uses locality, ballot, candidate, and contributions apps

Some key ideas:

  • disclosure contains no data; it just JOINs data from other apps and exposes the most important feeds for our app. We should keep as few feeds in disclosure as necessary--avoid bloat!
  • ballot is generic for any multiple-choice ballot measure, including elections for offices. This makes connecting money to candidates and any other ballot measure simple.
  • There are many ways to query data based on locality. locality:Jurisdiction contains fips_id, and so is the core linkage system there. Anything else (zip codes, cities, counties, "combined statistical areas" will need to resolve to fips_ids

Looking forward:

Additional apps (for completion, not listed below):

  • netfile_raw - would be great to pull this out (like calaccess_raw) as a separate project later; if we can make it a clean, independent Django app now, it would be easy to do so in the future!
  • zipcode_metro_raw - again, this could be it's own separate project if we like--perhaps abstract out parts of locality above to make a reusable component for CSA/PSA.

Proposed Apps and their feeds/models

locality app

Different ways to reference localities (precinct, city, county, state, as well as jurisdiction, etc)

Dependencies: None

Feeds

Data tables

Locality - Generic table capable to hold every type of "thing", and give it a unique ID

  • ID - some globally unique ID (could be FIPS, could be arbitrary)

Precinct - smallest unit, corresponds to voting area (where do we get these? are they in the data?)

  • Name
  • Number
  • City:ID
  • County:ID
  • State:ID
  • FIPS_ID?
  • Locality:ID

ZipCode - 41,865 of 'em

  • ZipCode
  • City:ID
  • County:ID
  • Locality:ID

City

  • ID
  • Name
  • County:ID
  • State:ID
  • FIPS_ID
  • Locality:ID

County

  • ID
  • Name
  • State:ID
  • FIPS_ID
  • Locality:ID

State

  • ID (FIPS ID)
  • Name
  • Locality:ID

PSA - Primary Statistical Area - 572 of 'em, nice way to aggregate data.

  • ID
  • Name
  • ZipCode
  • Locality:ID

ballot app

Generic balloting definition for Y/N and multiple-choice ballots, including elections

Dependencies: localities

Feeds

  • /ballot/?date=????/??/?? - all ballots for a specific date.
  • /ballot/:locality_id - all elections for a specific location.
  • /ballot/:ballot_id - raw ballot info, plus all ballot measures (and choices?)
  • /ballot/measure/:ballot_measure_id (Mockup 3) - ballot measure info & choices

Data tables

Ballot/Vote/Contest - A voter's ballot, containing all the BallotItems that the voter will vote on in the voting booth on an election day.

  • Locality:ID
  • Date

BallotMeasure also called contest, initiative, referendum, proposition; a single item on the ballot.

  • ID
  • Ballot:ID
  • Text

BallotMeasureChoice (if ballot measure is binary, you'll have two. You should never have more than one)

  • ID
  • BallotMeasure:ID
  • Text
  • Pro_statement
  • Con_statement

candidate app

Metadata about ballot measures representing elections

Dependencies: locality, ballot

Feeds

  • /candidates/ballot/:ballot_id - all candidate info for a specific ballot
  • /candidates/measure/:ballot_measure_id - all candidate info for a specific office
  • /candidate/:candidate_id - detailed information on candidate (party, location, history of running for office)
  • /election/:ballot_id - all info related to a specific election
  • /elections/<type>/:locality_id/ - find all elections in a specific locality (jurisdiction, city, state, PSA, etc.)

Data tables

Person

  • ID
  • Name
  • PartyAffiliation:ID
  • Locality:ID
  • Contact info

Candidate

  • ID
  • Person:ID
  • Office:ID
  • BallotMeasureChoice:ID (gives jurisdiction)

Office

  • ID
  • Locality:ID
  • Other data

Election -

  • ID
  • Office:ID
  • BallotMeasureChoice:ID

contributions app

Raw data for political contributions. This is where most of the back-end work is.

Dependencies: locality, ballot

Feeds

  • /contributions/measure/:ballot_measure_id - all contributions (support&oppose) for all choices in a given ballot measure
  • /contributions/choice/:ballot_measure_choice_id - all contributions (support&oppose) for a single choice in a given ballot measure
  • /contributions/committee/:committee_id - all contributions ever made by a specific committee.
  • /committee/:committee_id - metadata (name, etc) about committee

Data tables

Forms

  • ID
  • Type (city/county/state)

Committee

  • ID
  • Name
  • Jurisdiction:ID (can be null)
  • Contact Info

Contributor

  • ID
  • Name
  • Type (can be individual, committee, company, etc)
  • ForeignKey (for entities)

Contribution

  • Form:ID (needed for purging procedure)
  • PeriodStart (start date of contribution period during which this was received)
  • PeriodEnd
  • Amount
  • Committee:ID
  • BallotMeasure:ID (refers to jurisdiction)
  • BallotMeasureChoice:ID (can be null for binary)
  • Support/Oppose
  • Contributor:ID

ReportingPeriod

  • Ballot:ID (refers to jurisdiction)
  • PeriodStart
  • PeriodEnd

disclosure app

Feeds for querying contribution data by locality, candidate, ballot, etc. This is what the front-end folks will care most about, but will be easy for the back-end folks.

Dependencies: locality, ballot, candidate, contributions, calaccess_raw, netfile_raw

Feeds

  • /city/:city_id (Mockup 5) - returns city name, high-level totals of money within the city's election.
  • /city/:city_id/ballot (Front-end issue #25) - returns the combined contests of all jurisdictions within the city
  • /ballot/:ballot_measure_id (Mockup 2, Mockup 3; Front-end issue #27) - contribution totals and contributors, per referendum
  • /committee/:committee_id (Mockup 1) - Metadata & high-level contribution information (ballot measures, localities, etc)
  • /committee/disclosure/:committee_id (Mockup 8) - Showing contributions to a committee

Additional feeds

Seems this is best handled by another app, combining ballot & locality and independent of funding. Keeping here for the moment, to avoid adding a new app just for this.

Data tables

No models.