# SI 608 Project – Workspace
<span style="font-size: 18px;">General scratchpad workspace that preloads all the dataframes.</span>
<br>See <code>./modules</code> to review how libraries are installed and imported, and where the data is loaded, cleaned, and formatted. This notebook is only a convenience: make a copy and do whatever you'd like, or skip it entirely if you prefer.

[OpenSecrets Data Dictionary Index](../../docs/open_source_data_dictionary.md)
<br><small><em>(View the index with markdown preview)</em></small>

## Environment

#### Settings
Configure certain behaviors in this notebook.

In [172]:
DISPLAY_DF = True # for showdf() -> df.head()
SAVE_DF = True # for to_csv() -> pd.to_csv()

#### Initialize
Init file contains helper functions used throughout the project.

In [174]:
%run modules/init.ipynb

Initializing project...
pandas is already installed.
matplotlib is already installed.
networkx is already installed.
numpy is already installed.
...initialization complete.


#### Datasets

This module provides a single function for all of the *contribution* data from OpenSecrets.

In [176]:
%run modules/data.ipynb

Loading data module...
...data module loaded.


---
## Data

### 527 data

#### cmtes527

In [180]:
# # OpenSecrets Data Definition: 527 Committees
# # https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20527%20Cmtes.htm
# columns_cmtes527 = ['cycle', 'rpt', 'ein', 'crp527name', 'affiliate', 'ultorg', 
#                     'recipcode', 'cmteid', 'cid', 'eccmteid', 'party', 
#                     'primcode', 'source', 'ffreq', 'ctype', 'csource', 'viewpt',
#                     'comments', 'state']

# if not os.path.exists('../../data/open_secrets/527/cmtes527.csv'):
#     process_data('../../data/open_secrets/527/cmtes527.txt', n_expected_fields=len(columns_cmtes527), headers=columns_cmtes527, show_errs=False)

# df_cmtes527 = pd.read_csv('../../data/open_secrets/527/cmtes527.csv', on_bad_lines='skip')

In [181]:
# showdf(df_cmtes527)

#### expends527

In [183]:
# # OpenSecrets Data Dictionary 527 Expenditure Data - from IRS Form 8872B
# # https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20527%20Expenditures.htm
# columns_expends527 = ['rpt', 'formid', 'schbid', 'orgname', 'ein', 'recipient', 
#                     'recipientcrp', 'amount', 'date', 'expcode', 'source', 
#                     'purpose', 'addr1', 'addr2', 'city', 'state', 'zip',
#                     'employer', 'occupation']

# if not os.path.exists('../../data/open_secrets/527/expends527.csv'):
#     process_data('../../data/open_secrets/527/expends527.txt', headers=columns_expends527, n_expected_fields=len(columns_expends527), show_errs=False)

# df_expends527 = pd.read_csv('../../data/open_secrets/527/expends527.csv', on_bad_lines='skip')

In [184]:
# showdf(df_expends527)

#### rcpts527

In [186]:
# # OpenSecrets Data Dictionary 527 Contribution Data - from IRS Form 8872A
# # https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20527%20Receipts.htm
# columns_rcpts527 = ['id', 'rpt', 'formid', 'schaid', 'contribid', 'contrib', 
#                     'amount', 'date', 'orgname', 'ultorg', 'realcode', 
#                     'recipid', 'recipcode', 'party', 'recipient', 'city', 'state',
#                     'zip', 'zip4', 'pmsa', 'employer', 'occupation', 'ytd', 'gender', 'source']

# if not os.path.exists('../../data/open_secrets/527/rcpts527.csv'):
#     process_data('../../data/open_secrets/527/rcpts527.txt', headers=columns_rcpts527, n_expected_fields=len(columns_rcpts527), show_errs=False)

# df_rcpts527 = pd.read_csv('../../data/open_secrets/527/rcpts527.csv', on_bad_lines='skip')

In [187]:
# showdf(df_rcpts527)

---
### Campaign Finance 18 data
#### cands18

In [189]:
# OpenSecrets Data Definition: Candidates
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20Candidates%20Data.htm
columns_cands18 = ['cycle', 'feccandid', 'cid', 'firstlastp', 'party', 'distidrunfor', 
                    'distidcurr', 'currcand', 'cyclecand', 'crpico', 'recipcode', 
                    'nopacs']

if not os.path.exists('../../data/open_secrets/CampaignFin18/cands18.csv'):
    process_data('../../data/open_secrets/CampaignFin18/cands18.txt', headers=columns_cands18, n_expected_fields=len(columns_cands18), show_errs=False)

df_cands18 = pd.read_csv('../../data/open_secrets/CampaignFin18/cands18.csv', on_bad_lines='skip')

# Remove party labels from names: '3', 'R', 'D', 'I', 'L', 'U', 'i'
df_cands18['firstlast__cands18'] = df_cands18['firstlastp__cands18'].apply(
    lambda x: x.replace(" (3)", "").replace(" (R)", "").replace(" (D)", "").replace(" (I)", "").replace(" (L)", "").replace(" (U)", "").replace(" (i)", "") if isinstance(x, str) else x
)

In [190]:
showdf(df_cands18)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18
0,2018,H0AL02087,N00030768,Martha Roby (R),R,AL02,AL02,Y,Y,I,RW,,Martha Roby
1,2018,H0AL03192,N00043592,Hannah Thompson (D),D,AL03,,,,,DN,,Hannah Thompson
2,2018,H0AL05049,N00003042,Bud Cramer (D),D,AL05,,,,,DN,,Bud Cramer
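
The chained `.replace()` calls above could also be written as one regular expression. The following is a minimal sketch, not the project's helper: `PARTY_SUFFIX` and `strip_party_label` are names made up for illustration, and unlike the original it only removes a tag at the very end of the name.

```python
import re

# Matches one trailing party tag such as " (R)" or " (3)" at the end of a name.
PARTY_SUFFIX = re.compile(r"\s\((?:3|R|D|I|L|U|i)\)$")

def strip_party_label(name):
    """Return the name without its trailing party tag; pass non-strings through."""
    if not isinstance(name, str):
        return name
    return PARTY_SUFFIX.sub("", name)

print(strip_party_label("Martha Roby (R)"))
```

It can be used with `df_cands18['firstlastp__cands18'].apply(strip_party_label)` in place of the lambda.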


#### cmtes18

In [192]:
# OpenSecrets Table Definition: Committee table
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20for%20Cmtes.htm
columns_cmtes18 = ['cycle', 'cmteid', 'pacshort', 'affiliate', 'ultorg', 'recipid', 
                    'recipcode', 'feccandid', 'party', 'primcode', 'source', 'sensitive',
                    'foreign', 'active']

if not os.path.exists('../../data/open_secrets/CampaignFin18/cmtes18.csv'):
    process_data('../../data/open_secrets/CampaignFin18/cmtes18.txt', headers=columns_cmtes18, n_expected_fields=len(columns_cmtes18), show_errs=False)

df_cmtes18 = pd.read_csv('../../data/open_secrets/CampaignFin18/cmtes18.csv', on_bad_lines='skip')

In [193]:
print(len(df_cmtes18))
showdf(df_cmtes18)

19270


Unnamed: 0,cycle__cmtes18,cmteid__cmtes18,pacshort__cmtes18,affiliate__cmtes18,ultorg__cmtes18,recipid__cmtes18,recipcode__cmtes18,feccandid__cmtes18,party__cmtes18,primcode__cmtes18,source__cmtes18,sensitive__cmtes18,foreign__cmtes18,active__cmtes18
0,2018,C00000018,IRONWORKERS LOCAL UNION NO. 25 POLITICAL EDUCA...,,,C00000018,,H8TX22313,,,,,0,0
1,2018,C00000059,Hallmark Cards,,Hallmark Cards,C00000059,PB,,,C1400,WebAM,n,0,1
2,2018,C00000422,American Medical Assn,American Medical Assn,American Medical Assn,C00000422,PB,,,H1100,AFP88,n,0,1


#### pac_other18

In [195]:
# OpenSecrets Data Definition for PAC to PAC Data (Pac_other table)
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20PAC%20to%20PAC%20Data.htm
columns_pac_other18 = ['cycle', 'fecrecno', 'filerid', 'donorcmte', 'contriblendtrans', 'city', 'state', 
                            'zip', 'fecoccemp', 'primcode', 'date', 'amount', 'recipid', 'party', 'otherid',
                            'recipcode', 'recipprimcode', 'amend', 'report', 'pg', 'microfilm', 'type',
                            'realcode', 'source']

if not os.path.exists('../../data/open_secrets/CampaignFin18/pac_other18.csv'):
    process_data('../../data/open_secrets/CampaignFin18/pac_other18.txt', headers=columns_pac_other18, n_expected_fields=len(columns_pac_other18), show_errs=False)

df_pac_other18 = pd.read_csv('../../data/open_secrets/CampaignFin18/pac_other18.csv', on_bad_lines='skip')

In [196]:
showdf(df_pac_other18)

Unnamed: 0,cycle__pac_other18,fecrecno__pac_other18,filerid__pac_other18,donorcmte__pac_other18,contriblendtrans__pac_other18,city__pac_other18,state__pac_other18,zip__pac_other18,fecoccemp__pac_other18,primcode__pac_other18,date__pac_other18,amount__pac_other18,recipid__pac_other18,party__pac_other18,otherid__pac_other18,recipcode__pac_other18,recipprimcode__pac_other18,amend__pac_other18,report__pac_other18,pg__pac_other18,microfilm__pac_other18,type__pac_other18,realcode__pac_other18,source__pac_other18
0,2018,1010320180036112556,C00637983,"Nardolillo, Bobby",ROBERT A NARDOLLILO III,GREENE,RI,2827.0,NARDOLILLO FUNERAL HOME,Z1100,03/03/2017,2000.0,N00040819,R,S8RI00110,RN,Z1100,A,Q2,P,201707200200233820,16C,Z1100,PAC
1,2018,1010320180036112568,C00637983,Bobby for Senate,LEADERSHIP CONNECTICUT PAC,TRUMBULL,CT,6611.0,,Z1100,06/26/2017,280.0,C00499863,,C00499863,PI,J1100,A,Q2,P,201707200200233825,24G,Z1100,PAC
2,2018,1010320180036112716,C00443218,Wicker Majority Fund,WICKER MAJORITY FUND,JACKSON,MS,39205.0,,Z4100,09/30/2017,3323.0,N00003280,R,C00646380,RW,Z1100,A,Q3,G,201710200200357062,18G,Z4100,PAC


#### pacs18

In [198]:
# OpenSecrets Data Definition: PAC table (PACs to Candidates)
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20for%20PAC%20to%20Cands%20Data.htm
columns_pacs18 = ['cycle', 'fecrecno', 'pacid', 'cid', 'amount', 'date', 'realcode', 
                            'type', 'di', 'feccandid']

if not os.path.exists('../../data/open_secrets/CampaignFin18/pacs18.csv'):
    process_data('../../data/open_secrets/CampaignFin18/pacs18.txt', headers=columns_pacs18, n_expected_fields=len(columns_pacs18), show_errs=False)

df_pacs18 = pd.read_csv('../../data/open_secrets/CampaignFin18/pacs18.csv', on_bad_lines='skip')

In [199]:
showdf(df_pacs18)

Unnamed: 0,cycle__pacs18,fecrecno__pacs18,pacid__pacs18,cid__pacs18,amount__pacs18,date__pacs18,realcode__pacs18,type__pacs18,di__pacs18,feccandid__pacs18
0,2018,4101920171458697439,C00411553,N00035278,1000,09/18/2017,H1100,24K,D,H4MA05084
1,2018,4061620171409955636,C00142711,N00004357,2500,05/15/2017,D2000,24K,D,H8WI01024
2,2018,4020620181503867907,C00439521,N00034584,2000,09/18/2017,J2200,24K,D,H4ME02234


#### indivs18

In [201]:
# OpenSecrets Data Definition: Individual Contribution Data
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20for%20Individual%20Contribution%20Data.htm
columns_indivs18 = ['cycle', 'fectransid', 'contribid', 'contrib', 'recipid', 'orgname', 
                    'ultorg', 'realcode', 'date', 'amount', 'street', 'city', 'state',
                    'zip', 'recipcode', 'type', 'cmteid', 'otherid', 'gender', 'microfilm',
                    'occupation', 'employer', 'source']

# This dataset is huge, and crashes my computer.
# Takes 6.5min to read the file.

if not os.path.exists('../../data/open_secrets/CampaignFin18/indivs18.csv'):
    process_data('../../data/open_secrets/CampaignFin18/indivs18.txt', headers=columns_indivs18, n_expected_fields=len(columns_indivs18), show_errs=False)

df_indivs18 = pd.read_csv('../../data/open_secrets/CampaignFin18/indivs18.csv', on_bad_lines='skip')

In [202]:
showdf(df_indivs18)

Unnamed: 0,cycle__indivs18,fectransid__indivs18,contribid__indivs18,contrib__indivs18,recipid__indivs18,orgname__indivs18,ultorg__indivs18,realcode__indivs18,date__indivs18,amount__indivs18,street__indivs18,city__indivs18,state__indivs18,zip__indivs18,recipcode__indivs18,type__indivs18,cmteid__indivs18,otherid__indivs18,gender__indivs18,microfilm__indivs18,occupation__indivs18,employer__indivs18,source__indivs18
0,2018,1010320180036112450,q0000744792,"CAZZANI, SERAFINO V",N00040819,Cazzani Power Boats,,Y4000,08/04/2017,1000,,CRANSTON,RI,2920,RN,15,C00637983,,M,201710180200332243,,,
1,2018,1010320180036112466,p0004650540,"JONES, KENNETH",N00040819,Kenneth Jones Construction,,B1500,08/07/2017,250,,WEST GREENWICH,RI,2817,RN,15,C00637983,,M,201710180200332248,,,Name
2,2018,1010320180036112472,i3003283827@,"NARDOLILLO, KIM",N00040819,Nardolillo Funeral Home,,G5400,09/30/2017,1360,,NARRAGANSETT,RI,2882,RN,15,C00637983,,F,201710180200332250,,NARDOLILLO FUNERAL HOME,Name
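
Since the code comments above note that this file is slow to load and can exhaust memory, one option is to read it in chunks and keep only the columns a given analysis needs. This is a sketch under assumptions: the in-memory `sample_csv` stands in for `indivs18.csv`, and the per-recipient total is only an example aggregation.

```python
from io import StringIO
import pandas as pd

# Tiny stand-in for indivs18.csv; swap in the real path when running in the project.
sample_csv = StringIO(
    "recipid__indivs18,amount__indivs18,state__indivs18\n"
    "N00040819,1000,RI\n"
    "N00040819,250,RI\n"
    "N00000153,500,MA\n"
)

wanted_cols = ["recipid__indivs18", "amount__indivs18"]
totals = None
# chunksize makes read_csv return an iterator of DataFrames instead of one large frame.
for chunk in pd.read_csv(sample_csv, usecols=wanted_cols, chunksize=2):
    part = chunk.groupby("recipid__indivs18")["amount__indivs18"].sum()
    # Add this chunk's partial sums to the running totals, aligning on recipient id.
    totals = part if totals is None else totals.add(part, fill_value=0)

print(totals)
```

For the real file a much larger `chunksize` (hundreds of thousands of rows) would be more practical. The value 2 is only there to force several chunks in the toy data.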


---
### Expends18 data

#### expends18

In [205]:
# # OpenSecrets Data Dictionary for Expenditure Data - from FEC electronic filings
# # https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20Expenditures.htm
# columns_expends18 = ['cycle', 'id', 'transid', 'crpfilerid', 
#                      'recipcode', 'pacshort', 'crprecipname', 
#                      'expcode', 'amount', 'date', 'city', 'state', 
#                      'zip', 'cmteid_ef', 'candid', 'type',
#                      'descrip', 'pg', 'elecother', 'enttype',
#                      'source']

# if not os.path.exists('../../data/open_secrets/Expend18/expends18.csv'):
#     process_data('../../data/open_secrets/Expend18/expends18.txt', headers=columns_expends18, nrows=1000, n_expected_fields=len(columns_expends18), show_errs=False)

# df_expends18 = pd.read_csv('../../data/open_secrets/Expend18/expends18.csv', on_bad_lines='skip', nrows=1000)

In [206]:
# # All pac expenditures
# showdf(df_expends18)

---
### Lobby data

#### lob_agency

In [209]:
# OpenSecrets Data Definition for Lobbying Data: Lobby agencies
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_agency.htm
columns_lob_agency = ['uniqid', 'agencyid', 'agency']

if not os.path.exists('../../data/open_secrets/Lobby/lob_agency.csv'):
    process_data('../../data/open_secrets/Lobby/lob_agency.txt', headers=columns_lob_agency, n_expected_fields=len(columns_lob_agency), show_errs=False)

df_lob_agency = pd.read_csv('../../data/open_secrets/Lobby/lob_agency.csv', on_bad_lines='skip')

In [210]:
showdf(df_lob_agency)

Unnamed: 0,uniqid__lob_agency,agencyid__lob_agency,agency__lob_agency
0,BB7367A7-7B60-4DED-AA2D-A94771A9EBE8,1,US Senate
1,BB7367A7-7B60-4DED-AA2D-A94771A9EBE8,2,US House of Representatives
2,04366C6F-B0CE-4C28-87BF-EE1CC8A9BB41,2,US House of Representatives


#### lob_bills

In [212]:
# OpenSecrets Data Definition for Lobbying Data: Lobby bills
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_bills.htm
columns_lob_bills = ['b_id', 'si_id', 'congno', 'bill_name']

if not os.path.exists('../../data/open_secrets/Lobby/lob_bills.csv'):
    process_data('../../data/open_secrets/Lobby/lob_bills.txt', headers=columns_lob_bills, n_expected_fields=len(columns_lob_bills), show_errs=False)

df_lob_bills = pd.read_csv('../../data/open_secrets/Lobby/lob_bills.csv', on_bad_lines='skip')
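# Trim the last two characters of each bill name. Caution: in the preview below, b_id 's1461-117' shows
# bill_name 'S.14', so this slice may be cutting real digits; confirm the raw values before relying on this column.
df_lob_bills['bill_name__lob_bills'] = df_lob_bills['bill_name__lob_bills'].apply(lambda x: x[:-2] if isinstance(x, str) else x)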

In [213]:
showdf(df_lob_bills)

Unnamed: 0,b_id__lob_bills,si_id__lob_bills,congno__lob_bills,bill_name__lob_bills
0,s1461-117,2820018,117,S.14
1,hr463-117,2820018,117,H.R.4
2,s910-116,2820035,116,S.9


#### lob_indus

In [215]:
# OpenSecrets Data Definition for Lobbying Data: Lobby industries
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_indus.htm
columns_lob_indus = ['client', 'sub', 'total', 'year', 'catcode']

if not os.path.exists('../../data/open_secrets/Lobby/lob_indus.csv'):
    process_data('../../data/open_secrets/Lobby/lob_indus.txt', headers=columns_lob_indus, n_expected_fields=len(columns_lob_indus), show_errs=False)

df_lob_indus = pd.read_csv('../../data/open_secrets/Lobby/lob_indus.csv', on_bad_lines='skip')

In [216]:
showdf(df_lob_indus)

Unnamed: 0,client__lob_indus,sub__lob_indus,total__lob_indus,year__lob_indus,catcode__lob_indus
0,National Assn for County Community & Econ Develop,National Assn for County Community & Econ Develop,0,1998,X3000
1,Fox Valley Technical College,Fox Valley Technical College,80000,2015,H5200
2,Employers Cncl on Flexible Compensation,Employers Cncl on Flexible Compensation,580000,2001,J9000


#### lob_issue

In [218]:
# OpenSecrets Data Definition for Lobbying Data: Lobby issues
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_issues.htm
columns_lob_issue = ['si_id', 'uniqid', 'issueid', 'issue', 'specificissue', 'year']

if not os.path.exists('../../data/open_secrets/Lobby/lob_issue.csv'):
    process_data('../../data/open_secrets/Lobby/lob_issue.txt', headers=columns_lob_issue, n_expected_fields=len(columns_lob_issue), show_errs=False)

df_lob_issue = pd.read_csv('../../data/open_secrets/Lobby/lob_issue.csv', on_bad_lines='skip')

In [219]:
showdf(df_lob_issue)

Unnamed: 0,si_id__lob_issue,uniqid__lob_issue,issueid__lob_issue,issue__lob_issue,specificissue__lob_issue,year__lob_issue
0,3001624,02e92bd6-0159-495e-9d00-8a490a0be8be,DIS,Disaster & Emergency Planning,Issues affecting manufacturer of railroad and ...,2022
1,3001625,02e92bd6-0159-495e-9d00-8a490a0be8be,ENV,Environment & Superfund,Issues affecting manufacturer of railroad and ...,2022
2,3001626,02e92bd6-0159-495e-9d00-8a490a0be8be,LBR,"Labor, Antitrust & Workplace",Issues affecting manufacturer of railroad and ...,2022


#### lob_issue_no_specific

In [221]:
# OpenSecrets Data Definition for Lobbying Data: Lobby issues (no specific issue)
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_issues.htm
columns_lob_issue_no_specific = ['si_id', 'uniqid', 'issueid', 'issue', 'year']

if not os.path.exists('../../data/open_secrets/Lobby/lob_issue_NoSpecficIssue.csv'):
    process_data('../../data/open_secrets/Lobby/lob_issue_NoSpecficIssue.txt', headers=columns_lob_issue_no_specific, n_expected_fields=len(columns_lob_issue_no_specific), show_errs=False)

df_lob_issue_no_specific = pd.read_csv('../../data/open_secrets/Lobby/lob_issue_NoSpecficIssue.csv', on_bad_lines='skip')

In [222]:
showdf(df_lob_issue_no_specific)

Unnamed: 0,si_id__lob_issue_NoSpecficIssue,uniqid__lob_issue_NoSpecficIssue,issueid__lob_issue_NoSpecficIssue,issue__lob_issue_NoSpecficIssue,year__lob_issue_NoSpecficIssue
0,3001624,02e92bd6-0159-495e-9d00-8a490a0be8be,DIS,Disaster & Emergency Planning,2022
1,3001625,02e92bd6-0159-495e-9d00-8a490a0be8be,ENV,Environment & Superfund,2022
2,3001626,02e92bd6-0159-495e-9d00-8a490a0be8be,LBR,"Labor, Antitrust & Workplace",2022


#### lob_lobbying

In [224]:
# OpenSecrets Data Definitions for Lobbying Data: Lobbying
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_lobbying.htm
columns_lob_lobbying = ['uniqid','registrant_raw','registrant','isfirm','client_raw','client','ultorg','amount',
                        'catcode','source','self','includensfs','use',
                       'ind', 'year', 'type', 'typelong', 'affiliate']

if not os.path.exists('../../data/open_secrets/Lobby/lob_lobbying.csv'):
    process_data('../../data/open_secrets/Lobby/lob_lobbying.txt', headers=columns_lob_lobbying, n_expected_fields=len(columns_lob_lobbying), show_errs=False)

df_lob_lobbying = pd.read_csv('../../data/open_secrets/Lobby/lob_lobbying.csv', on_bad_lines='skip')

In [225]:
showdf(df_lob_lobbying)

Unnamed: 0,uniqid__lob_lobbying,registrant_raw__lob_lobbying,registrant__lob_lobbying,isfirm__lob_lobbying,client_raw__lob_lobbying,client__lob_lobbying,ultorg__lob_lobbying,amount__lob_lobbying,catcode__lob_lobbying,source__lob_lobbying,self__lob_lobbying,includensfs__lob_lobbying,use__lob_lobbying,ind__lob_lobbying,year__lob_lobbying,type__lob_lobbying,typelong__lob_lobbying,affiliate__lob_lobbying
0,82c5f661-a637-45ad-a3a6-b5ba18cf8962,ASTRAZENECA PHARMACEUTICALS LP,AstraZeneca Pharmaceuticals,,ASTRAZENECA PHARMACEUTICALS LP,AstraZeneca Pharmaceuticals,AstraZeneca PLC,1370000.0,H4300,pac,x,,y,y,2021,q4a,FOURTH QUARTER AMENDMENT,
1,84ad3a9e-5864-4227-a802-e268fbf37237,"DAVID L. HORNE, LLC",David L Horne LLC,y,MULTIFAMILY LENDERS COUNCIL,Multifamily Lenders Council,Multifamily Lenders Council,15000.0,F4600,wda16,n,,y,y,2021,q4,FOURTH QUARTER REPORT,
2,85b111b1-5d2e-4107-bc24-0921316e29a5,ECHELON GOVERNMENT AFFAIRS,Echelon Government Affairs,y,THE ALBERS GROUP,Albers Group,Albers Group,10000.0,Y4000,,n,,y,y,2021,q4,FOURTH QUARTER REPORT,


#### lob_lobbyist

In [227]:
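# OpenSecrets Data Definition for Lobbyists
# https://www.opensecrets.org/resources/datadictionary/Data%20Dictionary%20lob_lobbyists.htm
# NOTE: in the preview below the values look shifted relative to these names (e.g. the lobbyist id lands under
# lobbyist_lastname_raw and the year under lobbyist_firstname_raw); verify this column list against the data dictionary.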
columns_lob_lobbyist = ['uniqid', 'lobbyist_lastname_std', 'lobbyist_firstname_std', 'lobbyist_lastname_raw', 
                     'lobbyist_firstname_raw', 'lobbyist_id', 'year', 'officialposition', 'cid', 'formercongmem']

if not os.path.exists('../../data/open_secrets/Lobby/lob_lobbyist.csv'):
    process_data('../../data/open_secrets/Lobby/lob_lobbyist.txt', headers=columns_lob_lobbyist, n_expected_fields=len(columns_lob_lobbyist), show_errs=False)

df_lob_lobbyist = pd.read_csv('../../data/open_secrets/Lobby/lob_lobbyist.csv', on_bad_lines='skip')

In [228]:
showdf(df_lob_lobbyist)

Unnamed: 0,uniqid__lob_lobbyist,lobbyist_lastname_std__lob_lobbyist,lobbyist_firstname_std__lob_lobbyist,lobbyist_lastname_raw__lob_lobbyist,lobbyist_firstname_raw__lob_lobbyist,lobbyist_id__lob_lobbyist,year__lob_lobbyist,officialposition__lob_lobbyist,cid__lob_lobbyist,formercongmem__lob_lobbyist
0,06C29C84-250F-478B-872A-2F647D9DC044,"O'BRIEN, LAWRENCE F. III","O'Brien, Lawrence F III",Y0000046486L,2004,,,n,,
1,3A22C685-EC94-46AA-9C45-4AA4A7044C28,"BRAGG, PATRICIA DUNMIRE","Bragg, Patricia Dunmire",Y0000020554L,2001,,,n,,
2,5CBE61EC-87F1-401E-9D57-620975C9A1F8,"COSTELLO, RYAN","Costello, Ryan",Y0000027292L,2002,,N00031064,y,,


#### lob_rpt

In [230]:
# OpenSecrets Data Definitions for Lobbying Data: Report types
# No documentation provided on OpenSecrets.com
columns_lob_rpt = ['typelong', 'typeshort']

if not os.path.exists('../../data/open_secrets/Lobby/lob_rpt.csv'):
    process_data('../../data/open_secrets/Lobby/lob_rpt.txt', headers=columns_lob_rpt, n_expected_fields=len(columns_lob_rpt), show_errs=False)

df_lob_rpt = pd.read_csv('../../data/open_secrets/Lobby/lob_rpt.csv', on_bad_lines='skip')

In [231]:
showdf(df_lob_rpt)

Unnamed: 0,typelong__lob_rpt,typeshort__lob_rpt
0,MID-YEAR REPORT,m
1,MID-YEAR AMENDMENT,ma
2,MID-YEAR (NO ACTIVITY),mn


### IDs and categories

#### CRP_ID

In [234]:
install_if_needed('xlrd')
import xlrd

xlrd is already installed.


In [235]:
# Candidate ids
# This dataset is very different, so load it independently.
columns_crp_ids = ['blank_excel_column__crp_ids', 'cid__crp_ids', 'crpname__crp_ids', 'party__crp_ids', 'distidrunfor__crp_ids', 'feccandid__crp_ids'] # Blank excel column is necessary.
columns_crp_ids = dict(enumerate(columns_crp_ids))
df_crp_ids = pd.read_excel('../../data/open_secrets/CRP_IDs.xls', header=None, skiprows=15)
df_crp_ids = df_crp_ids.drop(df_crp_ids.columns[0], axis=1)
df_crp_ids = df_crp_ids.rename(columns=columns_crp_ids)

In [236]:
showdf(df_crp_ids)

Unnamed: 0,cid__crp_ids,crpname__crp_ids,party__crp_ids,distidrunfor__crp_ids,feccandid__crp_ids
0,N00034296,"Aalders, Tim",R,UT03,H2UT03280
1,N00047923,"Aazami, Shervin",D,CA32,H2CA30291
2,N00051397,"Abahsain, Jill",D,MN07,H2MN07162


#### CRP_Categories

In [238]:
from io import StringIO
crp_filepath = '../../data/open_secrets/CRP_Categories.txt'
with open(crp_filepath, 'r') as file:
    lines = file.readlines()

header_line_index = next(i for i, line in enumerate(lines) if line.startswith('Catcode'))
table_data = ''.join(lines[header_line_index:])
df_crp_cats = pd.read_csv(StringIO(table_data), sep='\t')
df_crp_cats.columns = df_crp_cats.columns.str.lower().str.replace(' ', '_')
df_crp_cats.columns = [col + '__crp_cats' for col in df_crp_cats.columns]

In [239]:
showdf(df_crp_cats)

Unnamed: 0,catcode__crp_cats,catname__crp_cats,catorder__crp_cats,industry__crp_cats,sector__crp_cats,sector_long__crp_cats
0,A0000,Agriculture,A11,Misc Agriculture,Agribusiness,Agribusiness
1,A1000,Crop production & basic processing,A01,Crop Production & Basic Processing,Agribusiness,Agribusiness
2,A1100,Cotton,A01,Crop Production & Basic Processing,Agribusiness,Agribusiness


---
## Ways and Means
*Toy network & full network of member campaign contributions from the 2018 election, resulting in the 116th Congress*

#### Dataframe of committee members

In [242]:
wm_dem_members = []
with open('../../data/wm_members_dem.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)
    for row in reader:
        wm_dem_members.append(row)

df_wm_dem_members = df_cands18[df_cands18['firstlast__cands18'].isin(wm_dem_members[0])]

In [243]:
wm_rep_members = []
with open('../../data/wm_members_rep.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)
    for row in reader:
        wm_rep_members.append(row)

df_wm_rep_members = df_cands18[df_cands18['firstlast__cands18'].isin(wm_rep_members[0])]

In [244]:
df_wm_members = pd.concat([df_wm_dem_members, df_wm_rep_members])
df_wm_members = df_wm_members.drop_duplicates(subset='firstlast__cands18', keep='first') # Some are duplicates, safe to remove.

In [245]:
showdf(df_wm_members)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18
6,2018,H0AL07086,N00030622,Terri A Sewell (D),D,AL07,AL07,Y,Y,I,DW,,Terri A Sewell
63,2018,H0CA32101,N00030600,Judy Chu (D),D,CA27,CA27,Y,Y,I,DW,,Judy Chu
548,2018,H0WA08046,N00030693,Suzan DelBene (D),D,WA01,WA01,Y,Y,I,DW,,Suzan DelBene
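
Because `isin` keeps only exact matches, a member whose name is spelled differently in `cands18` drops out without any warning. A quick check such as the sketch below can list the names that found no match. The frames here are toy stand-ins, and "Jane Q Example" is a made-up name used to trigger a miss.

```python
import pandas as pd

# Stand-ins for one row of the members CSV and for df_cands18.
member_names = ["Richard E Neal", "Kevin Brady", "Jane Q Example"]  # last entry is hypothetical
cands = pd.DataFrame({"firstlast__cands18": ["Richard E Neal", "Kevin Brady", "Judy Chu"]})

# Names requested but not present in the candidate table.
missing = sorted(set(member_names) - set(cands["firstlast__cands18"]))
print(missing)
```

In the notebook the same check would use `wm_dem_members[0]` or `wm_rep_members[0]` against `df_cands18['firstlast__cands18']`.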


#### Pre-defined members for toy network

Richard E Neal, chairman

In [248]:
member_cid_1 = 'N00000153'

Kevin Brady, ranking member

In [250]:
member_cid_2 = 'N00005883'

#### Network initialization

In [288]:
df_toy_network = df_wm_members[(df_wm_members['cid__cands18'] == member_cid_1) | (df_wm_members['cid__cands18'] == member_cid_2)] # Limiting to the two members above.
df_full_network = df_wm_members.copy()

In [290]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18
3599,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal
2395,2018,H6TX08100,N00005883,Kevin Brady (R),R,TX08,TX08,Y,Y,I,RW,,Kevin Brady


#### Candidate pacs
Extract the candidate committees (PACs) from `df_cmtes18` and add a `cand_` prefix to the `cmtes18` suffix in their column names.

In [293]:
df_cand_cmtes18 = df_cmtes18[df_cmtes18['recipid__cmtes18'].str.startswith('N')] # Only candidate committees.
df_cand_cmtes18.columns = df_cand_cmtes18.columns.str.replace(r"(cmtes18)", r"cand_\1", regex=True)

#### Non-candidate pacs
While we're at it, extract every PAC that has no party value (neither a party nor a candidate committee) and augment the column names for later steps.

In [296]:
# Excludes party, joint fundraising, leadership, or candidate committees.
# .copy() makes this an independent frame, so the assignment below does not write to a slice of df_cmtes18.
df_noncand_cmtes18 = df_cmtes18[df_cmtes18['party__cmtes18'].isna()].copy()
df_noncand_cmtes18['sensitive__cmtes18'] = df_noncand_cmtes18['sensitive__cmtes18'].apply(lambda x: x.upper() if isinstance(x, str) else x)
df_noncand_cmtes18.columns = df_noncand_cmtes18.columns.str.replace(r"(cmtes18)", r"noncand_\1", regex=True)

#### Join candidates and candidate pacs

In [299]:
df_toy_network = pd.merge(df_toy_network, df_cand_cmtes18, left_on='cid__cands18', right_on='recipid__cand_cmtes18', how='inner')
df_full_network = pd.merge(df_full_network, df_cand_cmtes18, left_on='cid__cands18', right_on='recipid__cand_cmtes18', how='inner')

In [301]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18,cycle__cand_cmtes18,cmteid__cand_cmtes18,pacshort__cand_cmtes18,affiliate__cand_cmtes18,ultorg__cand_cmtes18,recipid__cand_cmtes18,recipcode__cand_cmtes18,feccandid__cand_cmtes18,party__cand_cmtes18,primcode__cand_cmtes18,source__cand_cmtes18,sensitive__cand_cmtes18,foreign__cand_cmtes18,active__cand_cmtes18
0,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1
1,2018,H6TX08100,N00005883,Kevin Brady (R),R,TX08,TX08,Y,Y,I,RW,,Kevin Brady,2018,C00311043,Brady for Congress,,Brady for Congress,N00005883,RW,H6TX08100,R,Z1100,Rept,N,0,1
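
An inner merge drops candidates with no matching committee and multiplies rows when keys repeat, and neither effect is visible in `head()`. The sketch below shows how `indicator` and `validate` in `pd.merge` can surface both. The frames are toy stand-ins, and the id `N00099999` is invented to produce an unmatched row.

```python
import pandas as pd

cands = pd.DataFrame({"cid__cands18": ["N00000153", "N00005883", "N00099999"]})  # last id is hypothetical
cmtes = pd.DataFrame({
    "recipid__cand_cmtes18": ["N00000153", "N00005883"],
    "cmteid__cand_cmtes18": ["C00226522", "C00311043"],
})

checked = pd.merge(
    cands, cmtes,
    left_on="cid__cands18", right_on="recipid__cand_cmtes18",
    how="left",
    indicator=True,          # adds a _merge column: both / left_only / right_only
    validate="one_to_many",  # raises MergeError if a cid repeats on the left side
)

# Candidates that found no committee in the right-hand frame.
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched["cid__cands18"].tolist())
```

Once the unmatched rows have been inspected, switching back to `how='inner'` reproduces the join used in this notebook.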


#### Join inflows for each candidate pac

In [304]:
df_toy_network = pd.merge(df_toy_network, df_pacs18, left_on='cid__cands18', right_on='cid__pacs18', how='inner')
df_full_network = pd.merge(df_full_network, df_pacs18, left_on='cid__cands18', right_on='cid__pacs18', how='inner')

In [306]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18,cycle__cand_cmtes18,cmteid__cand_cmtes18,pacshort__cand_cmtes18,affiliate__cand_cmtes18,ultorg__cand_cmtes18,recipid__cand_cmtes18,recipcode__cand_cmtes18,feccandid__cand_cmtes18,party__cand_cmtes18,primcode__cand_cmtes18,source__cand_cmtes18,sensitive__cand_cmtes18,foreign__cand_cmtes18,active__cand_cmtes18,cycle__pacs18,fecrecno__pacs18,pacid__pacs18,cid__pacs18,amount__pacs18,date__pacs18,realcode__pacs18,type__pacs18,di__pacs18,feccandid__pacs18
0,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4051920171405513361,C00284885,N00000153,2500,04/30/2017,G4500,24K,D,H8MA02041
1,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4061420171409897661,C00072769,N00000153,1000,05/22/2017,H4300,24K,D,H8MA02041
2,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4080320181582420638,C00194746,N00000153,1000,07/13/2018,F3200,24K,D,H8MA02041


#### Join sources of inflows

In [309]:
df_toy_network = pd.merge(df_toy_network, df_noncand_cmtes18, left_on='pacid__pacs18', right_on='cmteid__noncand_cmtes18', how='inner')
df_full_network = pd.merge(df_full_network, df_noncand_cmtes18, left_on='pacid__pacs18', right_on='cmteid__noncand_cmtes18', how='inner')

In [311]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18,cycle__cand_cmtes18,cmteid__cand_cmtes18,pacshort__cand_cmtes18,affiliate__cand_cmtes18,ultorg__cand_cmtes18,recipid__cand_cmtes18,recipcode__cand_cmtes18,feccandid__cand_cmtes18,party__cand_cmtes18,primcode__cand_cmtes18,source__cand_cmtes18,sensitive__cand_cmtes18,foreign__cand_cmtes18,active__cand_cmtes18,cycle__pacs18,fecrecno__pacs18,pacid__pacs18,cid__pacs18,amount__pacs18,date__pacs18,realcode__pacs18,type__pacs18,di__pacs18,feccandid__pacs18,cycle__noncand_cmtes18,cmteid__noncand_cmtes18,pacshort__noncand_cmtes18,affiliate__noncand_cmtes18,ultorg__noncand_cmtes18,recipid__noncand_cmtes18,recipcode__noncand_cmtes18,feccandid__noncand_cmtes18,party__noncand_cmtes18,primcode__noncand_cmtes18,source__noncand_cmtes18,sensitive__noncand_cmtes18,foreign__noncand_cmtes18,active__noncand_cmtes18
0,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4051920171405513361,C00284885,N00000153,2500,04/30/2017,G4500,24K,D,H8MA02041,2018,C00284885,Home Depot,,Home Depot,C00284885,PB,,,G4500,Hoovers,N,0,1
1,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4061420171409897661,C00072769,N00000153,1000,05/22/2017,H4300,24K,D,H8MA02041,2018,C00072769,Hoffmann-La Roche,Roche Holdings,Roche Holdings,C00072769,PB,,,H4300,Hvr08,N,1,1
2,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4080320181582420638,C00194746,N00000153,1000,07/13/2018,F3200,24K,D,H8MA02041,2018,C00194746,Blue Cross & Blue Shield Assn,Blue Cross/Blue Shield,Blue Cross/Blue Shield,C00194746,PB,,,F3200,AFP90,N,0,1


#### Join source's industry category codes

In [314]:
df_toy_network = pd.merge(df_toy_network, df_crp_cats, left_on='primcode__noncand_cmtes18', right_on='catcode__crp_cats', how='inner')
df_full_network = pd.merge(df_full_network, df_crp_cats, left_on='primcode__noncand_cmtes18', right_on='catcode__crp_cats', how='inner')

In [316]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18,cycle__cand_cmtes18,cmteid__cand_cmtes18,pacshort__cand_cmtes18,affiliate__cand_cmtes18,ultorg__cand_cmtes18,recipid__cand_cmtes18,recipcode__cand_cmtes18,feccandid__cand_cmtes18,party__cand_cmtes18,primcode__cand_cmtes18,source__cand_cmtes18,sensitive__cand_cmtes18,foreign__cand_cmtes18,active__cand_cmtes18,cycle__pacs18,fecrecno__pacs18,pacid__pacs18,cid__pacs18,amount__pacs18,date__pacs18,realcode__pacs18,type__pacs18,di__pacs18,feccandid__pacs18,cycle__noncand_cmtes18,cmteid__noncand_cmtes18,pacshort__noncand_cmtes18,affiliate__noncand_cmtes18,ultorg__noncand_cmtes18,recipid__noncand_cmtes18,recipcode__noncand_cmtes18,feccandid__noncand_cmtes18,party__noncand_cmtes18,primcode__noncand_cmtes18,source__noncand_cmtes18,sensitive__noncand_cmtes18,foreign__noncand_cmtes18,active__noncand_cmtes18,catcode__crp_cats,catname__crp_cats,catorder__crp_cats,industry__crp_cats,sector__crp_cats,sector_long__crp_cats
0,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4051920171405513361,C00284885,N00000153,2500,04/30/2017,G4500,24K,D,H8MA02041,2018,C00284885,Home Depot,,Home Depot,C00284885,PB,,,G4500,Hoovers,N,0,1,G4500,Hardware & building materials stores,N03,Retail Sales,Misc Business,Misc Business
1,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4061420171409897661,C00072769,N00000153,1000,05/22/2017,H4300,24K,D,H8MA02041,2018,C00072769,Hoffmann-La Roche,Roche Holdings,Roche Holdings,C00072769,PB,,,H4300,Hvr08,N,1,1,H4300,Pharmaceutical manufacturing,H04,Pharmaceuticals/Health Products,Health,Health
2,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4080320181582420638,C00194746,N00000153,1000,07/13/2018,F3200,24K,D,H8MA02041,2018,C00194746,Blue Cross & Blue Shield Assn,Blue Cross/Blue Shield,Blue Cross/Blue Shield,C00194746,PB,,,F3200,AFP90,N,0,1,F3200,Accident & health insurance,F09,Insurance,Finance/Insur/RealEst,"Finance, Insurance & Real Estate"
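
Because both joins in this step use `how='inner'`, any contribution row whose `primcode__noncand_cmtes18` has no match in `df_crp_cats` is dropped without warning. A minimal sketch of one way to count such rows first, using pandas' `indicator=True` on a left merge. The frames below are toy stand-ins, and the code `XXXXX` is a made-up value used only to force a non-match:

```python
import pandas as pd

# Toy stand-ins for df_toy_network and df_crp_cats (not the real data).
# "XXXXX" is a hypothetical category code with no entry in the lookup table.
network = pd.DataFrame({
    "primcode__noncand_cmtes18": ["G4500", "H4300", "XXXXX"],
    "amount__pacs18": [2500, 1000, 500],
})
cats = pd.DataFrame({
    "catcode__crp_cats": ["G4500", "H4300"],
    "industry__crp_cats": ["Retail Sales", "Pharmaceuticals/Health Products"],
})

# A left merge keeps every contribution row; the added "_merge" column
# records whether a matching category code was found.
checked = pd.merge(
    network, cats,
    left_on="primcode__noncand_cmtes18",
    right_on="catcode__crp_cats",
    how="left",
    indicator=True,
)

# Rows that an inner join would have dropped.
unmatched = checked[checked["_merge"] == "left_only"]
n_unmatched = len(unmatched)
print(f"{n_unmatched} row(s) have no matching category code")
```

If the count is zero, the inner join loses nothing; otherwise the unmatched codes are worth inspecting before they disappear from the network.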


#### Join 2018 lobbying industry details by organization

In [319]:
df_lob_indus_2018 = df_lob_indus[df_lob_indus['year__lob_indus'] == 2018]

In [321]:
df_toy_network = pd.merge(df_toy_network, df_lob_indus_2018, left_on='ultorg__noncand_cmtes18', right_on='client__lob_indus', how='left')
df_full_network = pd.merge(df_full_network, df_lob_indus_2018, left_on='ultorg__noncand_cmtes18', right_on='client__lob_indus', how='left')

In [323]:
showdf(df_toy_network)

Unnamed: 0,cycle__cands18,feccandid__cands18,cid__cands18,firstlastp__cands18,party__cands18,distidrunfor__cands18,distidcurr__cands18,currcand__cands18,cyclecand__cands18,crpico__cands18,recipcode__cands18,nopacs__cands18,firstlast__cands18,cycle__cand_cmtes18,cmteid__cand_cmtes18,pacshort__cand_cmtes18,affiliate__cand_cmtes18,ultorg__cand_cmtes18,recipid__cand_cmtes18,recipcode__cand_cmtes18,feccandid__cand_cmtes18,party__cand_cmtes18,primcode__cand_cmtes18,source__cand_cmtes18,sensitive__cand_cmtes18,foreign__cand_cmtes18,active__cand_cmtes18,cycle__pacs18,fecrecno__pacs18,pacid__pacs18,cid__pacs18,amount__pacs18,date__pacs18,realcode__pacs18,type__pacs18,di__pacs18,feccandid__pacs18,cycle__noncand_cmtes18,cmteid__noncand_cmtes18,pacshort__noncand_cmtes18,affiliate__noncand_cmtes18,ultorg__noncand_cmtes18,recipid__noncand_cmtes18,recipcode__noncand_cmtes18,feccandid__noncand_cmtes18,party__noncand_cmtes18,primcode__noncand_cmtes18,source__noncand_cmtes18,sensitive__noncand_cmtes18,foreign__noncand_cmtes18,active__noncand_cmtes18,catcode__crp_cats,catname__crp_cats,catorder__crp_cats,industry__crp_cats,sector__crp_cats,sector_long__crp_cats,client__lob_indus,sub__lob_indus,total__lob_indus,year__lob_indus,catcode__lob_indus
0,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4051920171405513361,C00284885,N00000153,2500,04/30/2017,G4500,24K,D,H8MA02041,2018,C00284885,Home Depot,,Home Depot,C00284885,PB,,,G4500,Hoovers,N,0,1,G4500,Hardware & building materials stores,N03,Retail Sales,Misc Business,Misc Business,Home Depot,Home Depot,1850000.0,2018.0,G4500
1,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4061420171409897661,C00072769,N00000153,1000,05/22/2017,H4300,24K,D,H8MA02041,2018,C00072769,Hoffmann-La Roche,Roche Holdings,Roche Holdings,C00072769,PB,,,H4300,Hvr08,N,1,1,H4300,Pharmaceutical manufacturing,H04,Pharmaceuticals/Health Products,Health,Health,Roche Holdings,Hoffmann-La Roche,572279.0,2018.0,H4300
2,2018,H8MA02041,N00000153,Richard E Neal (D),D,MA01,MA01,Y,Y,I,DW,,Richard E Neal,2018,C00226522,Richard E Neal for Congress Cmte,,Richard E Neal for Congress Cmte,N00000153,DW,H8MA02041,D,Z1200,Rept,N,0,1,2018,4061420171409897661,C00072769,N00000153,1000,05/22/2017,H4300,24K,D,H8MA02041,2018,C00072769,Hoffmann-La Roche,Roche Holdings,Roche Holdings,C00072769,PB,,,H4300,Hvr08,N,1,1,H4300,Pharmaceutical manufacturing,H04,Pharmaceuticals/Health Products,Health,Health,Roche Holdings,Roche Diagnostics,350000.0,2018.0,H4100
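
Note that rows 1 and 2 in the output above are the same $1,000 contribution (same `fecrecno__pacs18`), repeated because Roche Holdings has two rows in `df_lob_indus_2018` (Hoffmann-La Roche and Roche Diagnostics). Summing `amount__pacs18` after this merge would therefore overcount. The sketch below reproduces the effect on toy frames built from the values shown above, then shows one possible alternative: collapsing the lobbying table to one row per client before merging. Whether per-client totals are the right grain for this project is an assumption, not something the notebook states.

```python
import pandas as pd

# Toy stand-ins using values from the output above (not the full data).
contribs = pd.DataFrame({
    "ultorg__noncand_cmtes18": ["Home Depot", "Roche Holdings"],
    "amount__pacs18": [2500, 1000],
})
lobbying = pd.DataFrame({
    "client__lob_indus": ["Home Depot", "Roche Holdings", "Roche Holdings"],
    "sub__lob_indus": ["Home Depot", "Hoffmann-La Roche", "Roche Diagnostics"],
    "total__lob_indus": [1850000.0, 572279.0, 350000.0],
})

# Plain left merge: the Roche contribution is repeated once per subsidiary.
merged = pd.merge(
    contribs, lobbying,
    left_on="ultorg__noncand_cmtes18",
    right_on="client__lob_indus",
    how="left",
)
naive_total = merged["amount__pacs18"].sum()  # 4500, the 1000 is counted twice

# Alternative: aggregate lobbying to one row per client first.
lobbying_by_client = lobbying.groupby(
    "client__lob_indus", as_index=False
)["total__lob_indus"].sum()

# validate="many_to_one" makes pandas raise MergeError if the right-hand
# keys are not unique, so accidental row duplication fails loudly.
merged_one = pd.merge(
    contribs, lobbying_by_client,
    left_on="ultorg__noncand_cmtes18",
    right_on="client__lob_indus",
    how="left",
    validate="many_to_one",
)
safe_total = merged_one["amount__pacs18"].sum()  # 3500
```

The trade-off is that the aggregated version loses the `sub__lob_indus` and `catcode__lob_indus` detail. If that detail is needed, keeping the duplicated rows and de-duplicating on `fecrecno__pacs18` before summing amounts is another option.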


#### Final cleanup

In [326]:
drop_columns = ['cycle__cands18', 'feccandid__cands18', 'firstlastp__cands18', 'distidrunfor__cands18', 'distidcurr__cands18',
                'currcand__cands18', 'cyclecand__cands18', 'recipcode__cands18', 'nopacs__cands18', 'cycle__cand_cmtes18', 'pacshort__cand_cmtes18',
                'affiliate__cand_cmtes18', 'recipid__cand_cmtes18', 'recipcode__cand_cmtes18',
                'feccandid__cand_cmtes18', 'party__cand_cmtes18', 'primcode__cand_cmtes18', 'source__cand_cmtes18',
                'sensitive__cand_cmtes18', 'foreign__cand_cmtes18', 'active__cand_cmtes18', 'cycle__pacs18', 'fecrecno__pacs18',
                'pacid__pacs18', 'cid__pacs18', 'realcode__pacs18', 'type__pacs18', 'di__pacs18',
                'feccandid__pacs18', 'cycle__noncand_cmtes18', 'pacshort__noncand_cmtes18', 'affiliate__noncand_cmtes18', 'recipid__noncand_cmtes18', 
                'feccandid__noncand_cmtes18', 'party__noncand_cmtes18', 'primcode__noncand_cmtes18', 
                'source__noncand_cmtes18', 'active__noncand_cmtes18', 'catcode__crp_cats', 'catorder__crp_cats', 'year__lob_indus']

df_toy_network = df_toy_network.drop(drop_columns, axis=1)
df_full_network = df_full_network.drop(drop_columns, axis=1)

In [328]:
showdf(df_toy_network)

Unnamed: 0,cid__cands18,party__cands18,crpico__cands18,firstlast__cands18,cmteid__cand_cmtes18,ultorg__cand_cmtes18,amount__pacs18,date__pacs18,cmteid__noncand_cmtes18,affiliate__noncand_cmtes18,ultorg__noncand_cmtes18,recipcode__noncand_cmtes18,sensitive__noncand_cmtes18,foreign__noncand_cmtes18,catname__crp_cats,industry__crp_cats,sector__crp_cats,sector_long__crp_cats,client__lob_indus,sub__lob_indus,total__lob_indus,catcode__lob_indus
0,N00000153,D,I,Richard E Neal,C00226522,Richard E Neal for Congress Cmte,2500,04/30/2017,C00284885,,Home Depot,PB,N,0,Hardware & building materials stores,Retail Sales,Misc Business,Misc Business,Home Depot,Home Depot,1850000.0,G4500
1,N00000153,D,I,Richard E Neal,C00226522,Richard E Neal for Congress Cmte,1000,05/22/2017,C00072769,Roche Holdings,Roche Holdings,PB,N,1,Pharmaceutical manufacturing,Pharmaceuticals/Health Products,Health,Health,Roche Holdings,Hoffmann-La Roche,572279.0,H4300
2,N00000153,D,I,Richard E Neal,C00226522,Richard E Neal for Congress Cmte,1000,05/22/2017,C00072769,Roche Holdings,Roche Holdings,PB,N,1,Pharmaceutical manufacturing,Pharmaceuticals/Health Products,Health,Health,Roche Holdings,Roche Diagnostics,350000.0,H4100


In [330]:
to_csv(df_toy_network)

DataFrame saved as CSV file in ../../outputs/df_toy_network.csv


In [332]:
to_csv(df_full_network)

DataFrame saved as CSV file in ../../outputs/df_full_network.csv


---

#### PAC Research

**Richard Neal Research**

In [None]:
# All of Richard's pacs, found via the "feccandid__cmtes18" field.
# There appears to be only one, unlike Kevin, who has two.
df_cmtes18[df_cmtes18['feccandid__cmtes18'] == 'H8MA02041']

In [None]:
# Richard's lead pac.
showdf(df_cmtes18[df_cmtes18['cmteid__cmtes18'] == member_pacid_1])

In [None]:
# Unlike Kevin, Richard has no victory pac passing part of its money to his lead pac.
# But one row shows another pac, Hartford Financial Services, giving to Richard's lead pac.
showdf(df_pac_other18[df_pac_other18['recipid__pac_other18'] == member_cid_1])

In [None]:
# All other routes of money to Richard, EXCEPT filings from his own lead pac (C00226522)
showdf(df_pac_other18[(df_pac_other18['recipid__pac_other18'] == member_cid_1) & (df_pac_other18['filerid__pac_other18'] != 'C00226522')])

---

**Kevin Brady Research**

In [None]:
# All of Kevin's pacs, found via the "feccandid__cmtes18" field.
# There's a normal lead pac, and then there's a victory pac.
showdf(df_cmtes18[df_cmtes18['feccandid__cmtes18'] == 'H6TX08100'])

In [None]:
# Kevin's lead pac.
showdf(df_cmtes18[df_cmtes18['cmteid__cmtes18'] == member_pacid_2])

In [None]:
# Kevin's victory pac that donates to Kevin's lead pac, in part.
showdf(df_cmtes18[df_cmtes18['cmteid__cmtes18'] == 'C00531285'])

In [None]:
# First row: Kevin's victory pac gives to Kevin's lead pac.
# The other rows show other pacs giving to Kevin directly (not through his lead pac).
showdf(df_pac_other18[df_pac_other18['recipid__pac_other18'] == member_cid_2])

In [None]:
# All other routes of money to Kevin, EXCEPT his victory fund pac (C00531285)
showdf(df_pac_other18[(df_pac_other18['recipid__pac_other18'] == member_cid_2) & (df_pac_other18['filerid__pac_other18'] != 'C00531285')])
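
The same two-condition filter is used for both members, so it could be wrapped in a small helper. The sketch below is only an illustration: `outside_inflows` is a hypothetical name, the toy frame is invented, and the `amount__pac_other18` column and every ID except `C00531285` are assumptions made for the example.

```python
import pandas as pd

def outside_inflows(df, recip_id, exclude_filer_id):
    """Return rows where `recip_id` is the recipient and the filer
    is anything other than `exclude_filer_id`."""
    is_recipient = df["recipid__pac_other18"] == recip_id
    is_other_filer = df["filerid__pac_other18"] != exclude_filer_id
    return df[is_recipient & is_other_filer]

# Invented toy data; only C00531285 (Kevin's victory pac) comes from the notebook.
toy = pd.DataFrame({
    "filerid__pac_other18": ["C00531285", "C11111111", "C22222222"],
    "recipid__pac_other18": ["N_MEMBER", "N_MEMBER", "N_OTHER"],
    "amount__pac_other18": [5000, 1000, 250],  # assumed column name
})

result = outside_inflows(toy, "N_MEMBER", "C00531285")
print(result)
```

In the notebook this would be called as `showdf(outside_inflows(df_pac_other18, member_cid_2, 'C00531285'))`. Note that `!=` evaluates to `True` for missing filer IDs, so rows with a null `filerid__pac_other18` are kept.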

In [None]:
# Pac inflows to Kevin, from df_pacs18
# TODO: how do these rows relate to the ones in df_pac_other18?
showdf(df_pacs18[df_pacs18['cid__pacs18'] == member_cid_2])