These map reduce functions use Common Crawl data to look at the spread of congressional legislation on the internet
Latest commit ed95662 Sep 18, 2012 Albert Wavering Removing build folder
bin Updated project files Sep 18, 2012
conf
dist/lib
lib
src
test/java/org/commoncrawl/hadoop/mapred
.DS_Store
README-Amazon-AMI
README.md
VERSION
build.properties
build.xml


CC-Bill-Tracker

These MapReduce functions use Common Crawl data to measure how widely congressional legislation is mentioned across the internet.

Program Tasks:

  1. Count the number of pages that mention a bill in any of its forms
  2. Record the domains of pages that mention a bill in any of its forms, and output the 50 domains that mention it most often, each with its count of mentioning pages
  3. Output the 50 most frequent words across all pages that mention a bill in any of its forms, excluding a set of 100 very common words

These functions are called from the file TotalAnalysis.
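
The three tasks can be illustrated with a minimal single-machine sketch in plain Java. It is not the code in TotalAnalysis and it does not use Hadoop: the class `BillTrackerSketch`, the `Page` type, and every method name below are hypothetical, chosen only to show the counting logic. It also assumes that a "domain" is simply the host part of the page URL and that a "mention" is a case-insensitive substring match on any form of the bill's name.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the project itself.
final class BillTrackerSketch {

    /** A crawled page reduced to its URL and extracted text (assumed shape). */
    static final class Page {
        final String url;
        final String text;
        Page(String url, String text) { this.url = url; this.text = text; }
    }

    /** True if the page text contains any form of the bill's name, ignoring case. */
    static boolean mentions(Page page, Set<String> billForms) {
        String lower = page.text.toLowerCase(Locale.ROOT);
        for (String form : billForms) {
            if (lower.contains(form.toLowerCase(Locale.ROOT))) return true;
        }
        return false;
    }

    /** Task 1: number of pages that mention the bill in any form. */
    static long countMentioningPages(List<Page> pages, Set<String> billForms) {
        long count = 0;
        for (Page p : pages) {
            if (mentions(p, billForms)) count++;
        }
        return count;
    }

    /** Task 2: hosts with the most mentioning pages, at most `limit` of them. */
    static List<Map.Entry<String, Integer>> topDomains(
            List<Page> pages, Set<String> billForms, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (Page p : pages) {
            if (!mentions(p, billForms)) continue;
            try {
                String host = URI.create(p.url).getHost();
                if (host != null) counts.merge(host, 1, Integer::sum);
            } catch (IllegalArgumentException e) {
                // Malformed URL: skip the page rather than fail the whole run.
            }
        }
        return top(counts, limit);
    }

    /** Task 3: most frequent words on mentioning pages, minus the stop words. */
    static List<Map.Entry<String, Integer>> topWords(
            List<Page> pages, Set<String> billForms, Set<String> stopWords, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (Page p : pages) {
            if (!mentions(p, billForms)) continue;
            // Split on anything that is not a lowercase letter or digit.
            for (String w : p.text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!w.isEmpty() && !stopWords.contains(w)) counts.merge(w, 1, Integer::sum);
            }
        }
        return top(counts, limit);
    }

    /** Sorts by count descending, then by key, and keeps the first `limit` entries. */
    private static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return new ArrayList<>(entries.subList(0, Math.min(limit, entries.size())));
    }
}
```

Calling `topDomains(pages, billForms, 50)` and `topWords(pages, billForms, stopWords, 50)` mirrors the limits in tasks 2 and 3. In a MapReduce setting the same logic would typically be split so that the map step emits one count per domain or word and the reduce step sums them, but how this project divides that work is not described here.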