# Gather donations information via OpenSecrets API
### Big picture:
For every legislator in the list given at https://github.com/unitedstates/congress-legislators, use OpenSecrets API endpoints to get information about sector-level and top-10 industry-level donations.

## Basic setup

In [None]:
import requests
import json
import yaml
import us
from tqdm import tqdm
from keys import opensecrets_key

## Load data

In [None]:
with open("../data/congress-legislators/alternate_formats/legislators-current.json") as f:
    legislators = json.load(f)

In [None]:
len(legislators)

## Gather industry data

In [None]:
donations_industry = {}

In [None]:
for l in tqdm(legislators):
    try:
        opensecrets_id = l["id"]["opensecrets"]
        # Skip IDs fetched in an earlier run so re-running this cell is safe.
        if opensecrets_id not in donations_industry:
            p = {"method": "candIndustry", "cid": opensecrets_id, "cycle": "2016", "apikey": opensecrets_key, "output": "json"}
            j = json.loads(requests.get("http://www.opensecrets.org/api/", params=p).text)
            donations_industry[opensecrets_id] = j

    # A legislator with no "opensecrets" ID raises KeyError; this is literally just lucas strange.
    except KeyError:
        pass

In [None]:
len(donations_industry)

#### We get rate limited after 200 requests, so this needs to be run in several chunks. The cell above won't repeat requests for IDs it already has.
Below is code for writing to and retrieving from a temp file between runs.
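
The chunked workflow described above can also be sketched as a single resumable helper. This is a minimal sketch under stated assumptions, not the method this notebook uses: `load_checkpoint`, `fetch_in_chunks`, the `fetch` callable, and the `max_requests` parameter are hypothetical names introduced for illustration, and the default budget of 200 is simply the limit observed above.

```python
import json
import os


def load_checkpoint(path):
    """Return the cached results stored at `path`, or an empty dict if none exist."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def fetch_in_chunks(ids, fetch, path, max_requests=200):
    """Fetch results for `ids`, skipping cached IDs and stopping at the request budget.

    `fetch` is any callable (hypothetical) that maps an ID to a
    JSON-serializable result. Returns how many IDs are still missing.
    """
    results = load_checkpoint(path)
    made = 0
    for cid in ids:
        if cid in results:
            continue  # already fetched in an earlier run
        if made >= max_requests:
            break  # budget for this run is used up
        results[cid] = fetch(cid)
        made += 1
    # Persist progress so the next run picks up where this one stopped.
    with open(path, "w") as f:
        json.dump(results, f)
    return sum(1 for cid in ids if cid not in results)
```

Calling `fetch_in_chunks` repeatedly (for example, once per rate-limit window) until it returns 0 would cover every ID without repeating a request.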

In [None]:
with open("tmp/TEMP_donations_industry", 'w') as f:
    json.dump(donations_industry, f)

In [None]:
with open("tmp/TEMP_donations_industry", 'r') as f:
    donations_industry = json.load(f)

## Write finished industry dataset

In [None]:
with open("../data/donations/donations_industry.json", 'w') as f:
    json.dump(donations_industry, f)

## Gather sector data

In [None]:
donations_sector = {}

In [None]:
for l in tqdm(legislators):
    try:
        opensecrets_id = l["id"]["opensecrets"]
        # Skip IDs fetched in an earlier run so re-running this cell is safe.
        if opensecrets_id not in donations_sector:
            p = {"method": "candSector", "cid": opensecrets_id, "cycle": "2016", "apikey": opensecrets_key, "output": "json"}
            j = json.loads(requests.get("http://www.opensecrets.org/api/", params=p).text)
            donations_sector[opensecrets_id] = j

    # A legislator with no "opensecrets" ID raises KeyError; this is literally just lucas strange.
    except KeyError:
        pass

In [None]:
len(donations_sector)

#### We get rate limited after 200 requests, so this needs to be run in several chunks. The cell above won't repeat requests for IDs it already has.
Below is code for writing to and retrieving from a temp file between runs.

In [None]:
with open("tmp/TEMP_donations_sector", 'w') as f:
    json.dump(donations_sector, f)

In [None]:
with open("tmp/TEMP_donations_sector", 'r') as f:
    donations_sector = json.load(f)

## Write finished sector dataset

In [None]:
with open("../data/donations/donations_sector.json", 'w') as f:
    json.dump(donations_sector, f)

## Originally tried to get the list of legislators from OpenSecrets itself
#### But as it turns out, the relevant endpoint only covers the 114th Congress.
Code for that method follows below:

In [None]:
def get_leg_from_state(state):
    """Return the legislators OpenSecrets lists for `state`, or None on failure."""
    legislators = []
    try:
        p = {"method": "getLegislators", "apikey": opensecrets_key, "output": "json", "id": state}
        r = requests.get("http://www.opensecrets.org/api/", params=p).text
        if state != "DC":
            for l in json.loads(r)['response']['legislator']:
                attributes = l['@attributes']
                attributes['state'] = state
                legislators.append(attributes)
        else:
            # For DC, 'legislator' is a single object rather than a list.
            attributes = json.loads(r)['response']['legislator']['@attributes']
            attributes['state'] = state
            legislators.append(attributes)
        return legislators
    except Exception:
        # Any request or parsing failure is reported to the caller as None.
        return None

In [None]:
def get_all_leg():
    states = [s.abbr for s in us.states.STATES]
    all_leg = []
    for s in tqdm(states):
        leg = get_leg_from_state(s)
        if leg is not None:
            all_leg.extend(leg)
        else:
            print("couldn't get legislators for: " + s)
    return all_leg

In [None]:
all_leg = get_all_leg()

In [None]:
# easier to just do this manually
dc = get_leg_from_state('DC')

In [None]:
dc

In [None]:
len(all_leg)

Some investigation reveals that the number given above (539) is greater than expected (536) because OpenSecrets also returns profiles for people who didn't finish their terms or who started midway through a term. This eventually made me realize that this endpoint covers the 114th Congress, not the current Congress.
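
One way to carry out this kind of investigation is to compare the IDs from the two sources directly. The sketch below is illustrative only: `compare_id_sets` is a hypothetical helper, it assumes each OpenSecrets record carries its ID under a `cid` key (an assumption about the `@attributes` layout), and the sample IDs in the usage lines are made up.

```python
def compare_id_sets(opensecrets_legislators, current_legislators, id_key="cid"):
    """Return (only_in_opensecrets, only_in_current) as sorted lists of IDs.

    `opensecrets_legislators` is a list of attribute dicts, assumed to hold
    the ID under `id_key`. `current_legislators` follows the
    congress-legislators layout, with the ID at l["id"]["opensecrets"].
    """
    os_ids = {l[id_key] for l in opensecrets_legislators if id_key in l}
    current_ids = {
        l["id"]["opensecrets"]
        for l in current_legislators
        if "opensecrets" in l.get("id", {})
    }
    return sorted(os_ids - current_ids), sorted(current_ids - os_ids)


# Made-up sample data, for illustration only.
sample_opensecrets = [{"cid": "N001"}, {"cid": "N002"}, {"cid": "N003"}]
sample_current = [
    {"id": {"opensecrets": "N002"}},
    {"id": {"opensecrets": "N004"}},
    {"id": {}},  # no OpenSecrets ID at all
]
extra, missing = compare_id_sets(sample_opensecrets, sample_current)
```

IDs that appear only on the OpenSecrets side are the candidates for members who left early or belong to a different Congress.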