## Texas 2020

I'm going to try to wrestle with some of the data provided by the Texas Ethics Commission.

So far, I've found that they provide multiple CSV files covering filings dating back to 2000. Unfortunately, the download includes *all* of those filings: I received 40 files named `contribs_*`, each with almost 500,000 rows of data.

The other problem is that these 'contrib' files don't map contributions one-to-one to candidates, so I have to find where in the data a candidate ID (or something similar) is mapped to a candidate's name or campaign.

Instructions for deciphering the data are located in the TEC-README file, which is a copy of the readme provided by the TEC.

It looks like _cover.csv_ might be the place where I can map a filer's name to the `filerIdent` field. 

Also, check _filers.csv_ 
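
To make the idea concrete, here is a minimal sketch of the lookup I have in mind, using a toy frame instead of the real files. The filer IDs and names are sample values from _filers.csv_; the `contributionAmount` column, its values, and the `name_by_ident` variable are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for filers.csv: only the two columns needed for the lookup.
toy_filers = pd.DataFrame({
    "filerIdent": [10066, 10144, 10191],
    "filerName": [
        "Lucero, Homero R. (Mr.)",
        "Criss, Lloyd W. (Mr.)",
        "Lee, Randy M. (Mr.)",
    ],
})

# Hypothetical lookup: one name per filerIdent.
name_by_ident = (
    toy_filers.drop_duplicates("filerIdent")
    .set_index("filerIdent")["filerName"]
)

# Toy contributions keyed by filerIdent (column name and amounts are made up).
toy_contribs = pd.DataFrame({
    "filerIdent": [10191, 10066, 10191],
    "contributionAmount": [50.0, 25.0, 100.0],
})

# Attach the filer's name to each contribution row.
toy_contribs["filerName"] = toy_contribs["filerIdent"].map(name_by_ident)
print(toy_contribs)
```

If `filerIdent` really is the shared key, the same `map` call should work on a real contributions file once `name_by_ident` is built from the full _filers.csv_.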

In [8]:
import pandas as pd

import matplotlib as pl

import mpld3 as d3

In [9]:
filers = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/filers.csv', low_memory=False)

In [10]:
filers.head()

Unnamed: 0,recordType,filerIdent,filerTypeCd,filerName,unexpendContribFilerFlag,modifiedElectCycleFlag,filerJdiCd,committeeStatusCd,ctaSeekOfficeCd,ctaSeekOfficeDistrict,...,chairMailingAddr2,chairMailingCity,chairMailingStateCd,chairMailingCountyCd,chairMailingCountryCd,chairMailingPostalCode,chairMailingRegion,chairPrimaryUsaPhoneFlag,chairPrimaryPhoneNumber,chairPrimaryPhoneExt
0,FILER,10066,COH,"Lucero, Homero R. (Mr.)",N,N,,,STATEREP,,...,,,,,,,,,,
1,FILER,10144,COH,"Criss, Lloyd W. (Mr.)",N,N,,,STATEREP,23.0,...,,,,,,,,,,
2,FILER,10191,COH,"Lee, Randy M. (Mr.)",N,N,,,STATEREP,52.0,...,,,,,,,,,,
3,FILER,10246,COH,"Herrera, Alfred R. (Mr.)",N,N,,,,,...,,,,,,,,,,
4,FILER,10616,MPAC,Citizens for the Preservation of Rural Lifesty...,N,N,,TERMINATED,,,...,,,,,,,,,,


In [11]:
covers = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/cover.csv', low_memory=False)

In [12]:
covers.head()

Unnamed: 0,recordType,formTypeCd,reportInfoIdent,receivedDt,infoOnlyFlag,filerIdent,filerTypeCd,filerName,reportTypeCd1,reportTypeCd2,...,chairMailingAddr2,chairMailingCity,chairMailingStateCd,chairMailingCountyCd,chairMailingCountryCd,chairMailingPostalCode,chairMailingRegion,chairPrimaryUsaPhoneFlag,chairPrimaryPhoneNumber,chairPrimaryPhoneExt
0,CVR1,COH,132307,20000118.0,N,10066,COH,"Lucero, Homero R. (Mr.)",FINAL,FINAL,...,,,,,,,,,,
1,CVR1,COHUC,188588,20020124.0,N,10066,COH,"Lucero, Homero R. (Mr.)",UNEXPCONT_FINAL,UNEXPCONT_FINAL,...,,,,,,,,,,
2,CVR1,COH,187537,20020115.0,N,10191,COH,"Lee, Randy M. (Mr.)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
3,CVR1,MPAC,18801,19940105.0,N,10616,MPAC,Citizens for the Preservation of Rural Lifesty...,CFJAN,CFJAN,...,,,,,,,,,,
4,CVR1,MPAC,72676,19961202.0,N,10616,MPAC,Citizens for the Preservation of Rural Lifesty...,CFDEC,CFDEC,...,,,,,,,,,,


In [13]:
covers_last_year = covers[covers.receivedDt >= 20181231.0]

In [14]:
covers_last_year

Unnamed: 0,recordType,formTypeCd,reportInfoIdent,receivedDt,infoOnlyFlag,filerIdent,filerTypeCd,filerName,reportTypeCd1,reportTypeCd2,...,chairMailingAddr2,chairMailingCity,chairMailingStateCd,chairMailingCountyCd,chairMailingCountryCd,chairMailingPostalCode,chairMailingRegion,chairPrimaryUsaPhoneFlag,chairPrimaryPhoneNumber,chairPrimaryPhoneExt
251685,CVR1,CORCOH,100263053,20190708.0,N,33149,JCOH,"Hawthorne, Teresa J. (The Honorable)",SEMIJUL,SEMIJUL,...,,,,,,,,,,
259176,CVR1,COH,100620927,20190131.0,N,80343,COH,"Smith, Demetria Y. (Ms.)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
266057,CVR1,CORCOH,100640797,20190114.0,N,80475,SCC,"Ginyard, Cynthia M. (Ms.)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
267067,CVR1,CORCOH,100643121,20190114.0,N,80475,SCC,"Ginyard, Cynthia M. (Ms.)",SEMIJUL,SEMIJUL,...,,,,,,,,,,
272094,CVR1,GPAC,100656431,20190115.0,N,54064,GPAC,Texas Stonewall Democratic Caucus,SEMIJAN,SEMIJAN,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
313628,CVR1,COH,100770162,20200112.0,N,80053,COH,"Wiley, Jay (Mr.)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
313629,CVR1,JCOH,100770163,20200112.0,N,41040,JCOH,"Leeper, Thomas A. (Mr.)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
313630,CVR1,JCOH,100770164,20200112.0,N,58587,JCOH,"Sakai, Peter A. (The Honorable)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
313631,CVR1,COH,100770165,20200112.0,N,80200,COH,"Navarette, Amanda E. (The Honorable)",SEMIJAN,SEMIJAN,...,,,,,,,,,,
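
The filter above compares `receivedDt` as a float such as `20190708.0`. That works because YYYYMMDD numbers sort in date order, but a real date column may be easier to reason about. Below is a rough sketch of one way to convert it, on toy data; the `receivedDate` column name and the row with a missing date are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for cover.csv; the last row (hypothetical) has no receivedDt.
toy_covers = pd.DataFrame({
    "filerIdent": [10066, 33149, 80053, 99999],
    "receivedDt": [20000118.0, 20190708.0, 20200112.0, None],
})

# Drop missing values, strip the trailing ".0", and parse as YYYYMMDD.
as_text = toy_covers["receivedDt"].dropna().astype(int).astype(str)
parsed = pd.to_datetime(as_text, format="%Y%m%d")

# Assignment aligns on the index, so rows without a date become NaT.
toy_covers["receivedDate"] = parsed

# Same cutoff as the float comparison (on or after 2018-12-31).
recent = toy_covers[toy_covers["receivedDate"] >= pd.Timestamp("2018-12-31")]
print(recent)
```

Rows with a missing date compare as False against the cutoff, so they drop out of `recent` just as NaN values do in the float version.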


In [15]:
filers_last_year = pd.merge(filers, covers_last_year, on='filerIdent')

In [16]:
filers_last_year.head()

Unnamed: 0,recordType_x,filerIdent,filerTypeCd_x,filerName_x,unexpendContribFilerFlag,modifiedElectCycleFlag,filerJdiCd,committeeStatusCd,ctaSeekOfficeCd,ctaSeekOfficeDistrict,...,chairMailingAddr2_y,chairMailingCity_y,chairMailingStateCd_y,chairMailingCountyCd_y,chairMailingCountryCd_y,chairMailingPostalCode_y,chairMailingRegion_y,chairPrimaryUsaPhoneFlag_y,chairPrimaryPhoneNumber_y,chairPrimaryPhoneExt_y
0,FILER,11614,GPAC,Southwest Competitive Telecommunications Assn....,N,N,,TERMINATED,,,...,,,,,,,,,,
1,FILER,11614,GPAC,Southwest Competitive Telecommunications Assn....,N,N,,TERMINATED,,,...,,,,,,,,,,
2,FILER,11614,GPAC,Southwest Competitive Telecommunications Assn....,N,N,,TERMINATED,,,...,,,,,,,,,,
3,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,,,,,,,,,,
4,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,,,,,,,,,,


In [17]:
filers_last_year.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 14420 entries, 0 to 14419
Columns: 259 entries, recordType_x to chairPrimaryPhoneExt_y
dtypes: float64(54), int64(3), object(202)
memory usage: 28.6+ MB
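
The merged frame has 259 columns because the two files share many column names, which pandas keeps twice with `_x` and `_y` suffixes, and each filer is repeated once per cover report. Here is a sketch of a narrower merge on toy data, keeping only the columns needed from each side. It assumes _filers.csv_ has exactly one row per `filerIdent`, which I have not verified; `validate="one_to_many"` would raise an error if that assumption is wrong.

```python
import pandas as pd

# Toy stand-ins with only the columns of interest (sample values from the files).
toy_filers = pd.DataFrame({
    "filerIdent": [10066, 10144, 10191],
    "filerName": [
        "Lucero, Homero R. (Mr.)",
        "Criss, Lloyd W. (Mr.)",
        "Lee, Randy M. (Mr.)",
    ],
    "filerTypeCd": ["COH", "COH", "COH"],
})
toy_covers = pd.DataFrame({
    "filerIdent": [10066, 10066, 10191],
    "reportInfoIdent": [132307, 188588, 187537],
    "receivedDt": [20000118.0, 20020124.0, 20020115.0],
})

# Only filerIdent is shared, so no _x/_y suffixes appear in the result.
merged = pd.merge(
    toy_filers,
    toy_covers,
    on="filerIdent",
    how="inner",
    validate="one_to_many",  # assumption: one filers row per filerIdent
)
print(merged)

# Filers still appear once per report; count distinct filers separately.
print(merged["filerIdent"].nunique())
```

On the real data, the equivalent would be to slice both frames down to a short column list before calling `pd.merge`.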


In [19]:
# to_csv returns None when given a path, so there is nothing useful to assign
filers_last_year.to_csv('filtered/filers_last_year.csv', index=False, header=True)

In [43]:
c_35 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_35.csv', low_memory=False)
c_36 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_36.csv', low_memory=False)
c_37 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_37.csv', low_memory=False)
c_38 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_38.csv', low_memory=False)
c_39 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_39.csv', low_memory=False)
c_40 = pd.read_csv('tx-ethics-filings/TEC_CF_011219_CSV/contribs_40.csv', low_memory=False)

In [47]:
c_35 = c_35[c_35.receivedDt >= 20181231.0]

In [49]:
c_35_merged = pd.merge(filers_last_year, c_35, on="filerIdent")

In [50]:
c_35_merged.head()

Unnamed: 0,recordType_x,filerIdent,filerTypeCd_x,filerName_x,unexpendContribFilerFlag,modifiedElectCycleFlag,filerJdiCd,committeeStatusCd,ctaSeekOfficeCd,ctaSeekOfficeDistrict,...,contributorStreetPostalCode,contributorStreetRegion,contributorEmployer,contributorOccupation,contributorJobTitle,contributorPacFein,contributorOosPacFlag,contributorSpouseLawFirmName,contributorParent1LawFirmName,contributorParent2LawFirmName
0,FILER,20257,COH,"Lucio Jr., Eduardo A. (The Honorable)",N,N,,,STATESEN,27,...,78750,,,,,,N,,,
1,FILER,20257,COH,"Lucio Jr., Eduardo A. (The Honorable)",N,N,,,STATESEN,27,...,75205,,,,,,N,,,
2,FILER,20257,COH,"Lucio Jr., Eduardo A. (The Honorable)",N,N,,,STATESEN,27,...,75205,,"ASSOCIATIONS, INC.",CEO,,,N,,,
3,FILER,20257,COH,"Lucio Jr., Eduardo A. (The Honorable)",N,N,,,STATESEN,27,...,78503,,SELF-EMPLOYED,FRANCHISEE OWNER,,,N,,,
4,FILER,20257,COH,"Lucio Jr., Eduardo A. (The Honorable)",N,N,,,STATESEN,27,...,78504,,PATHFINDER,PRINCIPAL,,,N,,,


In [51]:
c_36_merged = pd.merge(filers_last_year, c_36, on="filerIdent")

In [53]:
c_36_merged.head()

Unnamed: 0,recordType_x,filerIdent,filerTypeCd_x,filerName_x,unexpendContribFilerFlag,modifiedElectCycleFlag,filerJdiCd,committeeStatusCd,ctaSeekOfficeCd,ctaSeekOfficeDistrict,...,contributorStreetPostalCode,contributorStreetRegion,contributorEmployer,contributorOccupation,contributorJobTitle,contributorPacFein,contributorOosPacFlag,contributorSpouseLawFirmName,contributorParent1LawFirmName,contributorParent2LawFirmName
0,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,75044,,Self,Chiropractor,,,N,,,
1,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,78746,,self,Doctor of Chiropractic,,,N,,,
2,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,77571,,Self,Chiropractor,,,N,,,
3,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,79424,,self,Chiropractor,,,N,,,
4,FILER,11832,MPAC,Texas Chiropractic Assn. PAC,N,N,,ACTIVE,,,...,77056,,Self,Doctor of Chiropractic,,,N,,,
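
In the cells shown above, `c_35` is filtered by date before merging but `c_36` is merged as-is. If the same cutoff is meant to apply to every `contribs_*` file, a small helper applied in a loop might keep the steps consistent. This is only a sketch on toy frames: `filter_and_merge`, `toy_chunks`, and the toy rows are hypothetical, and real use would pass in frames loaded with `pd.read_csv`.

```python
import pandas as pd

CUTOFF = 20181231.0  # same receivedDt threshold used earlier in this notebook


def filter_and_merge(contribs, filers, cutoff=CUTOFF):
    """Keep contributions received on or after cutoff, then attach filer columns."""
    recent = contribs[contribs["receivedDt"] >= cutoff]
    return pd.merge(filers, recent, on="filerIdent")


toy_filers = pd.DataFrame({
    "filerIdent": [20257, 11832],
    "filerName": [
        "Lucio Jr., Eduardo A. (The Honorable)",
        "Texas Chiropractic Assn. PAC",
    ],
})

# Toy stand-ins for two contribs files (rows are made up).
toy_chunks = {
    "contribs_35": pd.DataFrame({
        "filerIdent": [20257, 20257],
        "receivedDt": [20190115.0, 20170301.0],
        "contributorOccupation": ["CEO", "PRINCIPAL"],
    }),
    "contribs_36": pd.DataFrame({
        "filerIdent": [11832, 99999],
        "receivedDt": [20190708.0, 20190708.0],
        "contributorOccupation": ["Chiropractor", "Chiropractor"],
    }),
}

# Apply the same filter and merge to every chunk, then stack the results.
merged_parts = [filter_and_merge(chunk, toy_filers) for chunk in toy_chunks.values()]
all_merged = pd.concat(merged_parts, ignore_index=True)
print(all_merged)
```

One caveat: `filers_last_year` repeats each filer once per cover report, so merging contributions against it multiplies rows. Merging against a frame with one row per `filerIdent`, as in this sketch, avoids that.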

