Permalink
Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
159 lines (105 sloc) 3.88 KB

Exploring the FEC Expenditures

The Independent Expenditure and Candidate Disbursements data can be found at at these two URLs:

Fun with Python

Let's first explore the Independent Expenditures data. There's a clickable icon for getting the data in CSV format; I've hardcoded it below for 2016:

from datetime import datetime
from collections import defaultdict
import csv
import requests
url = 'http://www.fec.gov/data/IndependentExpenditure.do?format=csv&election_yr=2016'

The data doesn't come in perfect form; notably, we have to clean up the money amounts and date values. So I like creating two little helper functions:

# converts "$123,233.44" to 123233.44 (a float)
def clean_money(xstr):
    xs = xstr.replace('$', '').replace(',', '')
    return float(xs)

# converts '11/04/2016' to date object
def clean_date(xstr):
    return datetime.strptime(xstr, '%m/%d/%Y')

Download and de-serialize

Download the data and de-serialize it:

resp = requests.get(url)
rows = list(csv.DictReader(resp.text.splitlines()))

The data fields will sometimes have stray spaces, so, for each row (and remember that each row is a dictionary), we reassign the value to be the string stripped of trailing whitespaces, i.e. "123 " is converted to "123":

Clean the data strings

for row in rows:
    # strip out trailing whitespaces
    for k in row.keys():
        row[k] = row[k].strip()

And then a second loop to fix up the date and dollar values:

Typecast the values

for row in rows:
    row['agg_amo'] = clean_money(row['agg_amo'])
    row['exp_amo'] = clean_money(row['exp_amo'])
    if row['rec_dat']:
        row['rec_dat'] = clean_date(row['rec_dat'])
    if row['dissem_dt']:
        row['dissem_dt'] = clean_date(row['dissem_dt'])

Of course, you could always just use one main loop and do all the cleaning in one go:

for row in rows:
    # strip out trailing whitespaces
    for k in row.keys():
        row[k] = row[k].strip()
    row['agg_amo'] = clean_money(row['agg_amo'])
    row['exp_amo'] = clean_money(row['exp_amo'])
    if row['rec_dat']:
        row['rec_dat'] = clean_date(row['rec_dat'])
    if row['dissem_dt']:
        row['dissem_dt'] = clean_date(row['dissem_dt'])

Some example queries:

To get the total spent on independent expenditures

sum([r['exp_amo'] for r in rows])

To find how much money was spent against Trump:

x = 0
for r in rows:
    if 'trump' in r['can_nam'].lower() and 'Oppose' == r['sup_opp']:
        x += r['exp_amo']  

The spenders against Trump:

For each spendername, track the spending:

spenders = defaultdict(int)
for r in rows:
    if 'trump' in r['can_nam'].lower() and 'Oppose' == r['sup_opp']:
        spenders[r['spe_nam']] += r['exp_amo']

Get spending by Senate state race

# total by senate state
senate_totals = defaultdict(int)
for row in rows:
    if row['can_off'] == 'S':
        st_name = row['can_off_sta']
        senate_totals[st_name] += int(row['exp_amo'])


for name, total in sorted(senate_totals.items(), key=lambda o: o[1]):
    print(str(total).rjust(12), name)

Get top Senate races by candidate disbursements

Here's a start (downloading from FTP requires a different library, but it's the same process):

from datetime import datetime
from collections import defaultdict
from urllib.request import urlopen
import csv
url = 'ftp://ftp.fec.gov/FEC/data.fec.gov/candidate_disbursement2016/all_senate.csv'

# download and deserialize the data
with urlopen(url, 'r') as u:
    lines = u.read().decode('utf-8').splitlines()

rows = list(csv.DictReader(lines))