Graph Data


Follow the Data

Basic importers for loading FEC campaign finance data into the Neo4j graph database.

Requires

Note: Java is required only for the initial batch import of data. The dataset can then be explored with Neo4j's own Cypher query language, or with one of the language drivers listed below.

Follow these Steps

  1. git clone https://github.com/akollegger/FEC_GRAPH.git
  2. cd FEC_GRAPH
  3. ant initialize
  4. ant
  5. ./bin/fec2graph --force --importer=[RAW|CONNECTED|RELATED|LIMITED]
    • choose one of the importers, like ./bin/fec2graph --force --importer=RAW
    • RAW: imports records with no modifications
    • CONNECTED: connects imported records based on cross-referenced IDs
    • RELATED: replaces "join table" records with graph relationships
    • LIMITED: only imports 2012 presidential candidates for a smaller dataset
  6. ant neo4j-start
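If you script the import, step 5 can be wrapped in a small helper. The following is a minimal Python sketch, not part of this repository: `build_import_command` is a hypothetical name, and the only flags it uses are the ones shown in the steps above. It rejects importer names that are not in the list before anything is run.

```python
import shlex

# Importer names as listed in the steps above.
IMPORTERS = ("RAW", "CONNECTED", "RELATED", "LIMITED")


def build_import_command(importer, force=True):
    """Return the fec2graph argument list for one importer (hypothetical helper)."""
    name = importer.upper()
    if name not in IMPORTERS:
        raise ValueError(f"unknown importer {importer!r}; expected one of {IMPORTERS}")
    command = ["./bin/fec2graph"]
    if force:
        command.append("--force")
    command.append(f"--importer={name}")
    return command


if __name__ == "__main__":
    # Print the command instead of running it. To execute it, pass the list
    # to subprocess.run(...) from inside the FEC_GRAPH checkout.
    print(shlex.join(build_import_command("LIMITED")))
```

Starting with LIMITED is a reasonable choice for a first run, since the list above describes it as the smaller dataset.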

Indexed Nodes

  • candidates.CAND_ID
  • candidates.CAND_NAME
  • committees.CMTE_ID
  • committees.CMTE_NM

Sample queries using indexes:

start cand=node:candidates(CAND_ID='P80003627') return cand;
start comm=node:committees(CMTE_ID='C90012980') return comm;
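Each indexed property lives in exactly one of the two indexes listed above, so a lookup has to name the index that matches the key. Here is a minimal Python sketch that picks the index from the key and builds the query text. The helper name `index_lookup` and the `return` clause it appends are assumptions for illustration; only the index and property names come from this README.

```python
# Index name for each indexed property, as listed under "Indexed Nodes".
INDEX_FOR_KEY = {
    "CAND_ID": "candidates",
    "CAND_NAME": "candidates",
    "CMTE_ID": "committees",
    "CMTE_NM": "committees",
}


def index_lookup(key, value, identifier="n"):
    """Build an exact-match index lookup query (hypothetical helper)."""
    if key not in INDEX_FOR_KEY:
        raise ValueError(f"{key!r} is not an indexed property")
    # Keep the sketch simple: refuse values that would need escaping
    # inside the single-quoted string.
    if "'" in value or "\\" in value:
        raise ValueError("value must not contain quotes or backslashes")
    index = INDEX_FOR_KEY[key]
    return f"start {identifier}=node:{index}({key}='{value}') return {identifier};"


if __name__ == "__main__":
    print(index_lookup("CAND_ID", "P80003627", "cand"))
    print(index_lookup("CMTE_ID", "C90012980", "comm"))
```

The generated text can be pasted into the Neo4j console or sent through whichever driver you use.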

Cypher Challenge

Query for this...

// All presidential candidates for 2012

// Most mythical presidential candidate

// Top 10 Presidential candidates according to number of campaign committees

// find President Barack Obama

// lookup Obama by his candidate ID

// find Presidential Candidate Mitt Romney

// lookup Romney by his candidate ID

// find the shortest path of funding between Obama and Romney

// 10 top individual contributions to Obama

// 10 top individual contributions to Romney

Hint: New to all this? Here's how to identify one of the many fake candidates registered with the FEC.

After successfully listing all candidates for the first query, you could page through the listing to look for names that seem... just off. Use skip and limit in the return clause to page through the long listing:

start candidate=node:candidates('CAND_ID:*') 
where candidate.CAND_OFFICE='{fill this in}' AND candidate.CAND_ELECTION_YR='{this too}' 
return candidate.CAND_NAME skip 100 limit 100;
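Paging by hand means editing the skip value for every page. The following minimal Python sketch generates the query for a given page instead. `candidate_page_query` is a hypothetical helper, pages are assumed to be zero-based, and the office and year values are still left for you to fill in.

```python
def candidate_page_query(office, election_year, page, page_size=100):
    """Build the candidate listing query for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # skip counts rows, so page N starts after N full pages.
    skip = page * page_size
    # office and election_year are inserted verbatim; this sketch assumes
    # they contain no quote characters.
    return (
        "start candidate=node:candidates('CAND_ID:*') "
        f"where candidate.CAND_OFFICE='{office}' "
        f"AND candidate.CAND_ELECTION_YR='{election_year}' "
        f"return candidate.CAND_NAME skip {skip} limit {page_size};"
    )


if __name__ == "__main__":
    # Replace the two placeholders before running the generated queries.
    for page in range(3):
        print(candidate_page_query("{fill this in}", "{this too}", page))
```

With the default page size, page 1 reproduces the `skip 100 limit 100` form of the query shown above.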

Once you spot one of the many candidate names that isn't real, you can query for it directly:

start candidate=node:candidates(CAND_NAME='CLAUS, SANTA')
return candidate;

To learn more about querying with Cypher, look to the excellent Neo4j Manual.

Wanna code? Get a Neo4j Driver

References

"RELATED" model
