Merge pull request #1 from dwillis/master
FTP bulk data additions and other stuff
bycoffe committed Aug 2, 2012
2 parents 362ed23 + 611f015 commit 40f41db
Showing 10 changed files with 83 additions and 2 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -1,3 +1,4 @@
A developer's guide to FEC data
=============

This document is an attempt to describe the range of federal campaign finance data provided by the [Federal Election Commission](http://fec.gov/) and its usage by developers. It assumes no prior knowledge of campaign finance law or practice, but some familiarity with the U.S. federal electoral system, such as the difference between the House and Senate.
29 changes: 29 additions & 0 deletions bulk/detailed.md
@@ -0,0 +1,29 @@
Detailed files
========

The [detailed files](http://www.fec.gov/finance/disclosure/ftpdet.shtml) include cycle-specific data for committees, candidates, individual contributions and committee transactions to candidates and between committees. Each cycle, beginning with 1979-80, has five files with data representing that cycle. The current cycle and the two or three cycles immediately preceding it also have three smaller tables that represent additions, changes and deletions to the individual contributions file. The FEC changed the format of these files from fixed-width to pipe-delimited in July 2012. These files are updated weekly, late Sunday evening or early Monday morning, so information submitted during the preceding week should be reflected in the next update. One advantage of using these files is that they represent "official" data that has been checked by the FEC. Nonetheless, the files have contained some errors.
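
Because the files are pipe-delimited, they can be read with a standard CSV parser. The following is a minimal sketch, not an official loader: the sample records are made up, and turning off quote handling is an assumption based on the possibility that text fields contain stray quotation marks.

```python
import csv
import io


def read_pipe_delimited(lines):
    """Yield each record as a list of strings, split on the pipe character."""
    # QUOTE_NONE treats quotation marks as ordinary data (an assumption
    # about the files, made so that a stray quote cannot swallow delimiters).
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row


# Made-up sample records for illustration; not real FEC data or layout.
sample = io.StringIO(
    "C00000001|EXAMPLE COMMITTEE ONE|H|DEM\n"
    'C00000002|FRIENDS OF "EXAMPLE"|S|REP\n'
)
rows = list(read_pipe_delimited(sample))
```

The same function accepts an open file object in place of the in-memory sample.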

Committees
---------

Known as the [committee master file](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteeMaster.shtml), this has a record for each committee registered with the FEC during a cycle. The ID number assigned by the FEC to each committee is unique within the cycle, but committees often exist for years or even decades. If a committee is a campaign committee for a candidate for the House, Senate or President, it will include that candidate's ID number. A committee will be one of [several types](http://www.fec.gov/finance/disclosure/metadata/CommitteeTypeCodes.shtml). Other committees may have values for party affiliation, frequency of filings and connected organization, although these values are not universally present for each committee.

Candidates
---------

Known as the [candidate master file](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCandidateMaster.shtml), this file contains one row for each candidate who either registered with the FEC or "appeared on a ballot list prepared by a state elections office." Key fields in this file include whether a candidate is an incumbent, challenger or seeking an open seat (although this field is not consistently populated), and the state and district the candidate is running in.

Candidate Committee Linkage
---------

A new offering as of July 2012, [this file](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCandCmteLinkage.shtml) contains one row for each combination of candidate and committee during an election cycle. Candidates can have a primary committee, authorized committees, joint fundraising committees and other relationships. The linkage file helps keep track of all of those relationships. In particular, joint fundraising committees can benefit multiple candidates; previously the committee master file would only list one. Note: it does not include a link between candidates and their leadership committees.

Committee Contributions To Candidates
---------

[This file](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryContributionstoCandidates.shtml) contains one row for each contribution from a committee to a candidate and for each independent expenditure made by a committee about a candidate. It is a subset of the file containing all transactions between one committee and another. This file can be used to calculate things such as which candidate got the most PAC money and which candidates have received contributions from a specific committee.
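
As a rough illustration of the "which candidate got the most PAC money" calculation, the sketch below totals amounts by candidate. The column positions, the sample rows and the candidate IDs are hypothetical; take the real positions from the data dictionary. Because the file also holds independent expenditures, the sketch keeps only one transaction type, `24K`, on the assumption that it marks a direct contribution. Check that against the FEC's transaction type list before relying on it.

```python
from collections import defaultdict
from decimal import Decimal


def total_by_candidate(rows, cand_idx, type_idx, amount_idx, wanted_types):
    """Sum amounts per candidate ID, counting only the wanted transaction types."""
    totals = defaultdict(Decimal)
    for row in rows:
        if row[type_idx] in wanted_types:
            totals[row[cand_idx]] += Decimal(row[amount_idx])
    return dict(totals)


# Hypothetical layout for illustration: candidate ID, transaction type, amount.
sample_rows = [
    ["H0XX00001", "24K", "5000"],
    ["H0XX00002", "24K", "2500"],
    ["H0XX00001", "24E", "10000"],  # assumed independent expenditure, excluded
    ["H0XX00001", "24K", "1000"],
]

totals = total_by_candidate(sample_rows, 0, 1, 2, wanted_types={"24K"})
top_candidate = max(totals, key=totals.get)
```

Amounts are summed as `Decimal` values rather than floats so that totals with cents stay exact.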

Any Transaction from One Committee to Another
---------

[This file](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteetoCommittee.shtml) contains a row for each transaction between two committees, regardless of whether either committee is a candidate committee or not. For example, this file includes contributions from a corporate PAC to a national party committee in addition to contributions to candidate committees. The list of [transaction types](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryTransactionTypeCodes.shtml) describes the kind of transaction each row represents. The quirk here is that in many cases, both sides of the transaction get separate records: one for the committee making the contribution (usually beginning with 2X) and another for the committee receiving the contribution (usually beginning with 1X). Using this file for SQL queries means including transaction type in some manner in pretty much every query; which types you want depends on a couple of factors, such as timeliness (contributing committees can file before recipient committees) or completeness (a recipient committee usually is the definitive account of what it has gotten).
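
To make the double-record quirk concrete, here is a small sketch using an in-memory SQLite database. The table name, column names, committee IDs and amounts are all invented for illustration, and the two type codes (`24K` for the giving side, `18K` for the receiving side) are assumptions to verify against the transaction types list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real file has many more columns.
conn.execute(
    "CREATE TABLE transactions ("
    "filer_id TEXT, other_id TEXT, transaction_type TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        # One $5,000 contribution, reported once by each side.
        ("C00000001", "C00000002", "24K", 5000),  # contributor's record
        ("C00000002", "C00000001", "18K", 5000),  # recipient's record
    ],
)

# Summing every row counts the same money twice.
naive_total = conn.execute(
    "SELECT SUM(amount) FROM transactions"
).fetchone()[0]

# Restricting to receipt-side types (beginning with 1) counts it once.
received_total = conn.execute(
    "SELECT SUM(amount) FROM transactions WHERE transaction_type LIKE ?",
    ("1%",),
).fetchone()[0]
```

Filtering on types beginning with 2 instead would give the contributors' view, which can be more timely but less definitive.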
16 changes: 16 additions & 0 deletions bulk/summary.md
@@ -0,0 +1,16 @@
Summary files
========

The [summary files](http://www.fec.gov/finance/disclosure/ftpsum.shtml) include canonical election-cycle data for candidates and committees - one record for each candidate or committee per two-year election cycle, depending on the file selected. There are two candidate summary files -- one for campaigns that have elections in the current cycle, and one for all candidates no matter if they face election in the cycle or not -- and they can differ in amounts and timeliness.

The [current campaigns file](ftp://ftp.fec.gov/FEC/webl12.zip) may be more timely, but also contains a single total for PAC contributions (compared to totals for different kinds of PACs in the other file) and some of its totals may contain double-counted transactions. More at the [data dictionary](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryWEBL.shtml).

The [all candidates file](ftp://ftp.fec.gov/FEC/weball12.zip) can be updated slightly less frequently than the current campaigns one, but it contains more detailed breakdowns of certain types of transactions as noted above. The possibility of double-counting some kinds of transactions - transfers to and from authorized committees of a candidate - also exists. More at the [data dictionary](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryWEBALL.shtml).

The [PAC summary file](ftp://ftp.fec.gov/FEC/2012/webk12.zip) provides the latest summary information on political action and party committees, including independent expenditures. More at the [data dictionary](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryWEBK.shtml).

The FEC used to generate summary files at the end of the election cycle for candidates and PACs that included totals for different types of PACs, but discontinued these files after the 2005-06 cycle. Files are available from 1979-80 through 2005-06. Party committee-only summary files exist from the 1991-92 cycle through the 2003-04 cycle. One of the most useful files, which contained a record for each combination of candidate recipient and PAC contributor/independent spender, covers the 1991-92 cycle through the 2001-02 cycle. A similar file exists for candidate-party activities during the same time period.

These summary files had been stored in fixed-width format but were converted to pipe-delimited format in late July 2012.

There's one more thing to be aware of: the FEC used to store its data (detailed and summary) using [an "overpunch" character](http://www.fec.gov/finance/disclosure/ftpsum.shtml#overpunch) to represent negative amounts. This [recently changed for the detailed contribution files](http://www.fec.gov/blog/disclosure/entry/indiv_oth_and_pas2_file) and for the candidate and committee files, but older summary files may still contain such characters. If they do, import the amount fields as text and then convert them to numeric columns, accounting for negative amounts. There is a tutorial for [working with the FTP files using Microsoft Access](http://www.fec.gov/finance/disclosure/working_with_data_files.pdf).
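
The text-then-convert approach might look like the sketch below. The character mapping is an assumption (a final `]` or `}` standing for a negative 0, and `j` through `r` in either case standing for negative 1 through 9), so check it against the FEC's overpunch documentation before using it on real files. The sample values are made up.

```python
from decimal import Decimal

# Assumed mapping from an overpunch character to the digit it replaces.
# Verify against the FEC's overpunch documentation.
NEGATIVE_OVERPUNCH = {"]": "0", "}": "0"}
for offset in range(9):
    digit = str(offset + 1)
    NEGATIVE_OVERPUNCH[chr(ord("j") + offset)] = digit  # j..r -> 1..9
    NEGATIVE_OVERPUNCH[chr(ord("J") + offset)] = digit  # J..R -> 1..9


def parse_amount(text):
    """Convert an amount stored as text to a Decimal, or None if blank."""
    text = text.strip()
    if not text:
        return None
    last = text[-1]
    if last in NEGATIVE_OVERPUNCH:
        # Swap the overpunch character for its digit and negate the result.
        return -Decimal(text[:-1] + NEGATIVE_OVERPUNCH[last])
    return Decimal(text)


# Made-up values for illustration.
plain = parse_amount("000001500")
negative_zero_digit = parse_amount("0000012]")
negative_four_digit = parse_amount("0000045m")
```

Values without an overpunch character pass through unchanged, so the same function can be applied to both older and newer files.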
4 changes: 2 additions & 2 deletions amendments.md → electronic-filings/amendments.md
@@ -9,13 +9,13 @@ Once an amendment is filed, it replaces the original filing and any previous ame
Is this an amendment?
---------

In the section on __header rows__, we learned that amendments will have a value in the Report Number field of the first line of the electronic filing. The value of the Report Number represents the number of times the original filing has been amended. The easiest way to determine whether any given filing is an amendment is to check whether the Report Number field is blank. If it is, we can move on, confident that this is an original filing. If the field contains a value, we know that the filing is an amendment.
In the section on [header rows](electronic-filings/headers.md), we learned that amendments will have a value in the Report Number field of the first line of the electronic filing. The value of the Report Number represents the number of times the original filing has been amended. The easiest way to determine whether any given filing is an amendment is to check whether the Report Number field is blank. If it is, we can move on, confident that this is an original filing. If the field contains a value, we know that the filing is an amendment.


What's being amended?
---------

The section on __header rows__ also taught us that the Report ID field will contain a value on amended filings (and will be blank otherwise, just like the Report Number field):
The section on [header rows](electronic-filings/headers.md) also taught us that the Report ID field will contain a value on amended filings (and will be blank otherwise, just like the Report Number field):

> If the filing is an amendment, the report ID will look like this: *FEC-763780*
File renamed without changes.
File renamed without changes.
Empty file added glossary/committee.md
Empty file.
Empty file added glossary/contribution-limits.md
Empty file.
Empty file added glossary/filing.md
Empty file.
35 changes: 35 additions & 0 deletions overview.md
@@ -0,0 +1,35 @@
Overview
==========

The Federal Election Commission offers three general types of downloadable campaign finance data: individual electronic filings in CSV format, covering most committees; pipe-delimited bulk itemized data and summary files via FTP, from all committees and covering two-year election cycles; and specialized CSV or XML files via its [Data Catalog](http://fec.gov/data/DataCatalog.do?format=html). Electronic filings are available in real time as they are filed, while the other two types are updated on a regular basis.

The base element of federal campaign finance data is the [committee](glossary/committee.md). Committees are the recipients and the spenders of money, and there are a number of different kinds of committees. Although most FEC rules apply to all committees, some committees have different limits or rules to follow, and these matter.

The second element is the [filing](glossary/filing.md), a report by a committee to the FEC. There are many different kinds of filings, but in general they cover 3 different types of information:

1. The formation of committees and candidacies, and their details.
2. The raising of money for campaigns.
3. The spending of money on campaigns.

There is a consistent schedule for filings covering the last two types, and filings of the first type can be made at any time. In general, the FEC recognizes a roughly two-year election cycle that corresponds to U.S. House elections (where all 435 seats are up for election every 2 years). In the case of senators, who are elected to six-year terms on a staggered basis, an election cycle may be considered to include the previous four years as well as the current two-year period. The election cycle is used not only as a logical boundary for calculating money raised and spent but also to calculate [contribution limits](glossary/contribution-limits.md) for candidates and committees.
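
The cycle arithmetic can be sketched as follows. This assumes cycles are named by their ending even year and written in the "1979-80" style used for the bulk files; the function names are hypothetical.

```python
def election_cycle(year):
    """Return the even year that ends the two-year cycle containing `year`."""
    return year if year % 2 == 0 else year + 1


def cycle_label(year):
    """Format the cycle containing `year` in the '1979-80' style."""
    end = election_cycle(year)
    return f"{end - 1}-{str(end)[-2:]}"


def senate_cycles(election_year):
    """List the three two-year cycles that make up a six-year Senate term."""
    end = election_cycle(election_year)
    return [end - 4, end - 2, end]
```

For example, `cycle_label(2011)` gives `2011-12`, and `senate_cycles(2012)` gives the 2008, 2010 and 2012 cycles.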

Electronic Filings
========

Most committees registered with the FEC are required to file all of their reports electronically. There are two significant exceptions: committees of U.S. Senate candidates (and two senatorial party committees), which file with the Secretary of the Senate, and [committees that raise or spend less than $50,000 in a calendar year](http://fec.gov/ans/answers_filing.shtml#Do_I_need_to_file_electronically), or expect to do so. Electronic filing applies to U.S. House and presidential candidate committees, non-candidate political committees (typically known as PACs), national, state and local political party committees registered with the FEC and individuals or organizations engaged in independent spending. Electronic filing was mandated in 2001.

Electronic filings can be found via the [FEC's search form](http://www.fec.gov/finance/disclosure/efile_search.shtml), which has a number of options for filtering the results to a particular committee, a particular date or more. The search results include the option to View or Download individual filings, which present the data in HTML and delimited (see [delimiters](electronic-filings/delimiters.md)) formats, respectively. Individual filings contain all of the records for that filing, stacked on top of each other in varying delimited layouts. ([A zip file containing the formats is on FEC.gov](http://www.fec.gov/elecfil/eFilingFormats.zip).)

Even though committees file reports [on a regular schedule](http://www.fec.gov/info/report_dates.shtml), electronic filings occur nearly every day of the year. Some are amendments of previous filings, others are filed in advance of (or after) a deadline, and others are filed as changes warrant. Filings that are [amendments](electronic-filings/amendments.md) are indicated in the data, and serve as complete replacements for the original filings.

Bulk FTP Data
========

The FEC has offered bulk data for years, and [its offerings](http://www.fec.gov/finance/disclosure/ftp_download.shtml) include summary and detailed files covering committees, candidates and contributions. The bulk files are updated weekly, late on Sunday nights, so depending on your timing and needs the bulk files may not be suitable for every task. The advantage of the bulk files is that they are vetted by the FEC, with some of the records standardized (by adding FEC-issued committee ids, for example) and others removed to prevent duplicate records from appearing. The [summary](bulk/summary.md) and [detailed](bulk/detailed.md) files are pipe-delimited, a relatively recent change for the FEC, and some older files may still be in fixed-width format.

Bulk data files are contained inside zip files stored on the FTP server, so retrieving them via a web application requires several steps. The FTP data is updated early Monday morning each week, and previous cycles are updated as well, since committees can amend filings from an earlier election cycle.
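
The retrieval steps (download the zip, open it, split each record on the pipe) could be sketched like this. The function names are hypothetical, the `latin-1` encoding is an assumption about the files, and the demonstration builds a small zip in memory with made-up records instead of contacting the FTP server.

```python
import io
import urllib.request
import zipfile


def fetch_bytes(url):
    """Download a file and return its contents as bytes (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def rows_from_zip(zip_bytes, encoding="latin-1"):
    """Yield (member name, fields) for each non-blank line in a zipped file."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                # The encoding is an assumption; adjust it if fields look garbled.
                for raw in io.TextIOWrapper(member, encoding=encoding):
                    line = raw.rstrip("\r\n")
                    if line:
                        yield name, line.split("|")


# Build a tiny zip in memory so the example runs without the network.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "sample.txt",
        "C00000001|EXAMPLE COMMITTEE\nC00000002|ANOTHER EXAMPLE\n",
    )
rows = list(rows_from_zip(buffer.getvalue()))
```

Against the live server, the bytes would come from `fetch_bytes` called with one of the FTP URLs linked from the summary and detailed file pages.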

Data Catalog
========

The data catalog is a collection of some of the summary files available via FTP as well as other files covering disbursements, independent expenditures and leadership PACs, among other subjects. The files are available in CSV or XML formats, and cover single cycles (mostly 2010 and 2012, although the summary files also include 2008 data). One advantage of the data catalog files is that they can be called directly from a web application without having to unzip them, but there are some drawbacks. [Independent Expenditures](http://fec.gov/data/IndependentExpenditure.do?format=html&election_yr=2012) include both original transactions and amendments, resulting in duplicate records in those cases. In another example, the [listing of leadership PACs](http://fec.gov/data/Leadership.do?format=html&election_yr=2012) contains an entry for the corporate PAC of Interactive Corp. Most of the data catalog files are updated daily, and they are the one place where it's possible to find [candidate disbursements](http://fec.gov/data/CandidateDisbursement.do?format=html&election_yr=2012) in statewide or district-level files. The files themselves are stored on the FEC's FTP server, so it's possible to grab them directly. The FEC also maintains [a blog about its data](http://fec.gov/blog/) that includes changes and additions to its data offerings.
