# Final Project, IS 620

## Ranking Centrality of Senators and Bills

In this script, we will rank the centrality of senators and bills associated with the 115th Congress of the United States of America. We define centrality as the number of links a node has via bill cosponsorship. For example, a bill with 20 cosponsors is more central than a bill with a single sponsor. A senator who is linked to 20 other senators by cosponsoring bills with them is more central than a senator who has not sponsored bills with anyone else.

We will construct a bipartite network whose edges represent sponsorship (linking senators with bills). Bill centrality can be measured by the number of cosponsors. We will then project the sponsorship links onto a senator-only graph and rank senators by the number of edges connecting them to other senators (cosponsorship of the same bill).
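
As a minimal sketch of this plan, assuming a toy network of made-up senators and bills (the names `S1`, `billA`, and so on are hypothetical, not API data), the bipartite graph and its senator-only projection might look like this:

```python
import networkx as net
import networkx.algorithms.bipartite as bipartite

# Hypothetical toy data: three senators and two bills (not real API output).
toy_senators = ["S1", "S2", "S3"]
toy_bills = ["billA", "billB"]

toy = net.Graph()
toy.add_nodes_from(toy_senators, bipartite=0)
toy.add_nodes_from(toy_bills, bipartite=1)
# Each edge links a senator to a bill that senator sponsored or cosponsored.
toy.add_edges_from([("S1", "billA"), ("S2", "billA"),
                    ("S2", "billB"), ("S3", "billB")])

# Bill centrality: the number of senators attached to each bill.
bill_degree = {b: toy.degree(b) for b in toy_bills}

# Senator-only projection: two senators are linked if they share at least one bill.
senator_graph = bipartite.projected_graph(toy, toy_senators)

# Rank senators by how many other senators they are connected to.
senator_rank = sorted(dict(senator_graph.degree()).items(),
                      key=lambda pair: pair[1], reverse=True)
```

Here `S2` shares a bill with both `S1` and `S3`, so it ends up with the highest degree in the projected graph.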

### Import Needed Packages

In [1]:
import requests
import json
import networkx as net
import networkx.algorithms.bipartite as bipartite
import pandas as pd
import time
import matplotlib.pyplot as plt
%matplotlib inline

### Set Up Function to Use ProPublica Congress API

In [2]:
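def propublica(url):
    # The ProPublica Congress API expects the key in the X-API-Key header.
    r = requests.get(url, headers={'X-API-Key': 'LykoKm3xz89O6hmvgUAY66coaqYT4wP14wJjepib'})
    r.raise_for_status()  # Stop on HTTP errors instead of parsing an error response
    return r.json()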

### Get List of Senators from ProPublica API

In [3]:
senators = propublica("https://api.propublica.org/congress/v1/115/senate/members.json")['results'][0]['members']
df_senators = pd.DataFrame(senators)
df_senators.head()

Unnamed: 0,api_uri,contact_form,crp_id,cspan_id,date_of_birth,dw_nominate,facebook_account,fax,fec_candidate_id,first_name,...,state,state_rank,title,total_present,total_votes,twitter_account,url,votes_with_party_pct,votesmart_id,youtube_account
0,https://api.propublica.org/congress/v1/members...,http://www.alexander.senate.gov/public/index.c...,N00009888,5,1940-07-03,0.323,senatorlamaralexander,202-228-3398,S2TN00058,Lamar,...,TN,senior,"Senator, 2nd Class",0,280,SenAlexander,https://www.alexander.senate.gov/public/,97.79,15691,lamaralexander
1,https://api.propublica.org/congress/v1/members...,https://www.baldwin.senate.gov/feedback,N00004367,57884,1962-02-11,-0.546,TammyBaldwin,202-225-6942,S2WI00219,Tammy,...,WI,junior,"Senator, 1st Class",1,280,SenatorBaldwin,https://www.baldwin.senate.gov/,95.71,3470,witammybaldwin
2,https://api.propublica.org/congress/v1/members...,https://www.barrasso.senate.gov/public/index.c...,N00006236,1024777,1952-07-21,0.528,johnbarrasso,202-224-1724,S6WY00068,John,...,WY,junior,"Senator, 1st Class",0,280,SenJohnBarrasso,https://www.barrasso.senate.gov/,98.21,52662,barrassowyo
3,https://api.propublica.org/congress/v1/members...,https://www.bennet.senate.gov/?p=contact,N00030608,1031622,1964-11-28,-0.208,senbennetco,202-228-5097,S0CO00211,Michael,...,CO,senior,"Senator, 3rd Class",1,280,SenBennetCo,https://www.bennet.senate.gov/,91.76,110942,SenatorBennet
4,https://api.propublica.org/congress/v1/members...,https://www.blumenthal.senate.gov/contact/,N00031685,21799,1946-02-13,-0.418,SenBlumenthal,202-224-9673,S0CT00177,Richard,...,CT,senior,"Senator, 3rd Class",1,280,SenBlumenthal,https://www.blumenthal.senate.gov/,91.04,1568,SenatorBlumenthal


### Get Recent Active Bills

The API returns at most 20 bills per request, so we have to paginate!

In [4]:
url = "https://api.propublica.org/congress/v1/115/senate/bills/active.json?offset="
offset = 0
apiResults = propublica(url + str(offset))
activeBills = apiResults['results'][0]['bills']
while True:
    offset = offset + 20
    apiResults = propublica(url + str(offset))
    moreBills = apiResults['results'][0]['bills']
    if len(moreBills) == 0:
        break
    activeBills += moreBills
    time.sleep(.25)  # Be friendly and don't overwhelm the server with a bunch of fast queries
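
The same loop could be factored into a reusable helper. This is only a sketch and not part of the original notebook: `fetch_all_pages` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for `propublica` so the example runs without network access.

```python
def fetch_all_pages(fetch_page, page_size=20):
    """Collect items from an offset-paginated source until an empty page comes back."""
    items = []
    offset = 0
    while True:
        page = fetch_page(offset)
        if len(page) == 0:
            break
        items += page
        offset += page_size
    return items

# Hypothetical stand-in for the API: 45 fake bills served 20 at a time.
fake_bills = ["bill%d" % i for i in range(45)]

def fake_fetch(offset):
    return fake_bills[offset:offset + 20]

all_bills = fetch_all_pages(fake_fetch)
```

With the real API, `fetch_page` could wrap `propublica(url + str(offset))['results'][0]['bills']` and keep the short `time.sleep` between requests.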

In [6]:
df_activebills = pd.DataFrame(activeBills)
df_activebills.head()

Unnamed: 0,active,bill_id,bill_slug,bill_type,bill_uri,committee_codes,committees,congressdotgov_url,cosponsors,cosponsors_by_party,...,sponsor_name,sponsor_party,sponsor_state,sponsor_title,sponsor_uri,subcommittee_codes,summary,summary_short,title,vetoed
0,True,sres337-115,sres337,sres,https://api.propublica.org/congress/v1/115/bil...,[],,https://www.congress.gov/bill/115th-congress/s...,2,"{u'R': 1, u'D': 1}",...,Johnny Isakson,R,GA,Sen.,https://api.propublica.org/congress/v1/members...,[],,,"A resolution designating November 26, 2017, as...",
1,True,sres338-115,sres338,sres,https://api.propublica.org/congress/v1/115/bil...,[],,https://www.congress.gov/bill/115th-congress/s...,1,{u'R': 1},...,John Cornyn,R,TX,Sen.,https://api.propublica.org/congress/v1/members...,[],,,A resolution commending and congratulating the...,
2,True,sres339-115,sres339,sres,https://api.propublica.org/congress/v1/115/bil...,[],,https://www.congress.gov/bill/115th-congress/s...,5,"{u'R': 2, u'D': 3}",...,Tammy Duckworth,D,IL,Sen.,https://api.propublica.org/congress/v1/members...,[],,,"A resolution designating November 2017 as ""Nat...",
3,True,sres340-115,sres340,sres,https://api.propublica.org/congress/v1/115/bil...,[],,https://www.congress.gov/bill/115th-congress/s...,2,{u'R': 2},...,David Perdue,R,GA,Sen.,https://api.propublica.org/congress/v1/members...,[],,,A resolution commemorating the 100th anniversa...,
4,True,s2099-115,s2099,s,https://api.propublica.org/congress/v1/115/bil...,[SSAF],"Senate Agriculture, Nutrition, and Forestry Co...",https://www.congress.gov/bill/115th-congress/s...,1,{u'D': 1},...,Pat Roberts,R,KS,Sen.,https://api.propublica.org/congress/v1/members...,[],,,A bill to provide for the management by the Se...,


### Create a Graph of Senators and Bills Using NetworkX

First, we'll simplify our Senators and Bills data!

#### Simplify Senators

In [7]:
simplifiedSenators = [dict.fromkeys(["id", "first_name", "middle_name", "last_name", "party", \
                                     "seniority", "state_rank", "facebook_account", "twitter_account"])]

for senator in senators:
    simplifiedSenator = {key: senator[key] for key in senator if key in \
                      ['id','first_name','middle_name','last_name','party',\
                       'seniority','state_rank', "facebook_account", "twitter_account"]}
    simplifiedSenators.append(simplifiedSenator)

#### Remove "None" row
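The first element of the list is the placeholder created by `dict.fromkeys`, whose values are all `None`. That "None" row is going to cause a problem, so I'll remove it.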

In [8]:
simplifiedSenators.pop(0)
pd.DataFrame(simplifiedSenators).head()

Unnamed: 0,facebook_account,first_name,id,last_name,middle_name,party,seniority,state_rank,twitter_account
0,senatorlamaralexander,Lamar,A000360,Alexander,,R,15,senior,SenAlexander
1,TammyBaldwin,Tammy,B001230,Baldwin,,D,5,junior,SenatorBaldwin
2,johnbarrasso,John,B001261,Barrasso,,R,11,junior,SenJohnBarrasso
3,senbennetco,Michael,B001267,Bennet,,D,9,senior,SenBennetCo
4,SenBlumenthal,Richard,B001277,Blumenthal,,D,7,senior,SenBlumenthal


#### Simplify Bills, Remove "None" Row

In [None]:
simplifiedBills = [dict.fromkeys(['bill_slug','title','sponsor_id','primary_subject','sponsor_party'])]
for activeBill in activeBills:
    simplifiedBill = {key: activeBill[key] for key in activeBill if key in \
                      ['bill_slug','title', 'sponsor_id','primary_subject','sponsor_party']}
    simplifiedBills.append(simplifiedBill)
simplifiedBills.pop(0)
pd.DataFrame(simplifiedBills).head()

Now we have simplified data that's nice and clean.  Let's create a graph that links senators to the bills they sponsored.

In [None]:
g = net.Graph()
for senator in simplifiedSenators:
    # The "bipartite" attribute distinguishes senators (0) from bills (1).
    # Attributes are passed as keyword arguments; NetworkX 2.0+ no longer accepts attr_dict.
    g.add_node(senator['id'], bipartite=0, **senator)
for bill in simplifiedBills:
    g.add_node(bill['bill_slug'], bipartite=1, **bill)
    # Link each bill to its primary sponsor.
    g.add_edge(bill['sponsor_id'], bill['bill_slug'])
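
To illustrate what the "bipartite" attribute buys us, here is a small sketch on a toy graph (the IDs and titles are made up, not API data) that splits the nodes back into senators and bills and checks that every edge joins one of each:

```python
import networkx as net

# Hypothetical toy graph with the same structure as g (IDs are made up).
toy = net.Graph()
toy.add_node("A000001", bipartite=0, last_name="Example")
toy.add_node("s1", bipartite=1, title="A toy bill")
toy.add_node("sres2", bipartite=1, title="A toy resolution")
toy.add_edge("A000001", "s1")
toy.add_edge("A000001", "sres2")

# Split the nodes into the two sets using the "bipartite" attribute.
# .get() is used because add_edge silently creates missing nodes with no attributes.
senator_nodes = {n for n, d in toy.nodes(data=True) if d.get("bipartite") == 0}
bill_nodes = {n for n, d in toy.nodes(data=True) if d.get("bipartite") == 1}

# Sanity check: every edge should join exactly one senator to one bill.
edges_ok = all((u in senator_nodes) != (v in senator_nodes) for u, v in toy.edges())
```

On the real graph, a sponsor ID that is missing from the senators list would show up as a node in neither set, which makes this a cheap consistency check.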

## Get Bill Cosponsors

Some bills also have cosponsors, so we should add those links as well!

First, find the bills that have some cosponsors but aren't runaway whole-party or whole-Senate cosponsorships. We're looking for the sweet spot of, say, 1 to 20 cosponsors, where the group of sponsors is small enough that working together on the bill represents an actual social connection, not just a rubber stamp.

### Get Cosponsored Bills

In [None]:
df_activeBills = pd.DataFrame(activeBills)
cosponsoredBills = df_activeBills[(df_activeBills['cosponsors'] > 0) & (df_activeBills['cosponsors'] <= 20)]
cosponsoredBills = cosponsoredBills[['bill_slug','sponsor_id', 'cosponsors']]
cosponsoredBills.head()

Then, look up the details of each cosponsored bill, collect all of its cosponsors, and link cosponsors to bills with edges.

### Create Edges for Cosponsorship

For each cosponsored bill, link each bill and its cosponsoring senators.
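
A rough sketch of that linking step, using hypothetical in-memory data instead of live API calls, could look like the following. The `cosponsor_id` field name and the shape of `toy_cosponsors` are assumptions about what a cosponsor lookup returns, and all IDs are made up.

```python
import networkx as net

# Hypothetical cosponsor lookups keyed by bill slug. The 'cosponsor_id' field
# name is an assumption about the API response shape, not confirmed here.
toy_cosponsors = {
    "s1": [{"cosponsor_id": "B000002"}, {"cosponsor_id": "C000003"}],
    "sres2": [{"cosponsor_id": "B000002"}],
}

toy = net.Graph()
for senator_id in ["A000001", "B000002", "C000003"]:
    toy.add_node(senator_id, bipartite=0)
for slug in toy_cosponsors:
    toy.add_node(slug, bipartite=1)
    toy.add_edge("A000001", slug)  # primary sponsor edge, as built earlier

# Link every cosponsor to the bill that senator cosponsored.
for slug, cosponsors in toy_cosponsors.items():
    for cosponsor in cosponsors:
        toy.add_edge(cosponsor["cosponsor_id"], slug)
```

After this step, the degree of a bill node equals its sponsor plus its cosponsors, which is the bill centrality measure described at the top.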