Creates a datastore for information on 527s, political organizations which disclose donors and expenditures semiannually with the IRS and can raise and spend unlimited amounts of money without filing reports with the Federal Election Commission.
sql
LICENSE
README.md
file_parser.py
prep_files.sh
requirements.txt
run_this.py


o_o poindexter o_o

A tool for wrangling IRS data on 527s

Poindexter does the following:

  • Downloads and extracts the IRS' bulk Political Organization Filing and Disclosure data file
  • Cleans this file of database errors, errant DOS and UNIX line endings (there are both), and other cruft
  • Repairs lines broken by unsupported characters in the IRS' database dump
  • Logs all the weirdness it encounters and repairs
  • Writes the results into a series of CSVs, one for each table described in the IRS data documentation
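
As an illustration of the cleaning and repair steps above, here is a minimal sketch, not the code in file_parser.py. It assumes a pipe-delimited dump and a known field count per row; the names `normalize_newlines`, `repair_broken_rows`, and `expected_fields` are hypothetical.

```python
import logging

logger = logging.getLogger("repair_sketch")  # hypothetical logger name


def normalize_newlines(text):
    """Convert DOS (\\r\\n) and bare \\r line endings to UNIX (\\n)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def repair_broken_rows(lines, expected_fields, delimiter="|"):
    """Rejoin rows that were split across physical lines.

    A row counts as complete once it has at least `expected_fields`
    delimited fields; shorter lines are glued to the line that follows.
    """
    repaired = []
    buffer = ""
    for line in lines:
        candidate = buffer + " " + line if buffer else line
        if len(candidate.split(delimiter)) >= expected_fields:
            repaired.append(candidate)
            buffer = ""
        else:
            # Log the oddity, then keep accumulating until the row is whole.
            logger.warning("short row, joining with next line: %r", line)
            buffer = candidate
    if buffer:
        logger.error("could not repair trailing fragment: %r", buffer)
    return repaired


raw = "a|b|c\r\nd|e\nf|g\rh|i|j\n"
rows = repair_broken_rows(normalize_newlines(raw).splitlines(), expected_fields=3)
```

A counting heuristic like this cannot detect a break inside the last field of a row, since the first fragment already has enough fields; a real parser needs more context than this sketch uses.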

Poindexter comes complete with the SQL statements necessary to create the corresponding tables in a Postgres database.

To download the bulk data: ./prep_files.sh

To generate the flat files using default settings, writing them to a directory called 'csvs' (which should already exist in the working directory): ./run_this.py & tail -f filemaker.log
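
The one-CSV-per-table idea can be sketched roughly as follows. This is an assumption-laden illustration rather than what run_this.py does: it assumes each cleaned row is pipe-delimited and that its first field names the table it belongs to. `write_tables` and the file naming are hypothetical.

```python
import csv
import os


def write_tables(rows, out_dir, delimiter="|"):
    """Write each row to <out_dir>/<record_type>.csv, keyed on the first field.

    Returns a dict mapping each record type to the number of rows written.
    """
    handles, writers, counts = {}, {}, {}
    try:
        for row in rows:
            fields = row.split(delimiter)
            record_type = fields[0]  # assumption: first field identifies the table
            if record_type not in writers:
                path = os.path.join(out_dir, f"{record_type}.csv")
                handles[record_type] = open(path, "w", newline="", encoding="utf-8")
                writers[record_type] = csv.writer(handles[record_type])
            writers[record_type].writerow(fields[1:])
            counts[record_type] = counts.get(record_type, 0) + 1
    finally:
        # Close every file even if a row raised part-way through.
        for handle in handles.values():
            handle.close()
    return counts
```

Like Poindexter itself, the sketch expects the output directory to exist before it runs.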

From there, you're on your own; the SQL scripts in sql/ create tables in PostgreSQL that you can populate from the flat files with a COPY FROM command.
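
If you'd rather not type the load commands by hand, a small helper can generate them. The sketch below prints one psql `\copy` command (the client-side form of COPY FROM) per CSV. It assumes each table is named after its CSV file and that the files have no header row unless you say otherwise; neither is confirmed by this README, so check the names against the scripts in sql/. `copy_commands` is a hypothetical helper.

```python
from pathlib import Path


def copy_commands(csv_dir, header=False):
    """Return one psql \\copy command per .csv file in csv_dir, sorted by name."""
    options = "FORMAT csv, HEADER" if header else "FORMAT csv"
    commands = []
    for path in sorted(Path(csv_dir).glob("*.csv")):
        table = path.stem  # assumption: table name == file name without .csv
        commands.append(f"\\copy \"{table}\" FROM '{path}' WITH ({options})")
    return commands


if __name__ == "__main__":
    for command in copy_commands("csvs"):
        print(command)
```

You can paste the printed lines into a psql session after running the table-creation scripts.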

Poindexter should log an error when it encounters a row it can't handle.
