Scraping, parsing and indexing the daily Congressional Record to support phrase search over time, and by legislator and date


useful info goes here

* json or simplejson
* beautifulsoup version 3.0 series (it MUST be 3.0 series, not 3.1)
* solr
* sunlightlabs API key

* cp
* create symlinks to from each of solr/, scraper/ and parser/

* tell solr where to find the schema file. e.g., if running the dev
  environment in apache-solr-1.4.1/example/, it will use the schema.xml in the
  directory /apache-solr-1.4.1/example/solr/conf. the same is true for the
  stopwords file. so set up symlinks to the real things, optionally backing up
  the originals as schema.example.xml and stopwords.example.txt:

cd apache-solr-1.4.1/example/solr/conf
mv schema.{,example.}xml
mv stopwords.{,example.}txt
ln -s /home/cwod/capitolwords/src/solr/schema.xml schema.xml
ln -s /home/cwod/capitolwords/src/solr/stopwords.txt stopwords.txt
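
The backup-and-symlink steps above can also be wrapped in a small helper so they are safe to run more than once. This is only a sketch: the function name link_conf and its arguments are hypothetical and not part of this repository, and it assumes the file name has a single extension (as schema.xml and stopwords.txt do).

```shell
# Hypothetical helper (not part of this repo): back up a stock Solr config
# file as <name>.example.<ext>, then replace it with a symlink to the real file.
# Usage: link_conf <solr_conf_dir> <absolute_path_to_real_file>
link_conf() {
    conf_dir=$1
    src=$2
    name=$(basename "$src")
    target="$conf_dir/$name"

    # Only back up a regular file; if the target is already a symlink
    # (e.g. from an earlier run), there is nothing worth keeping.
    if [ -e "$target" ] && [ ! -L "$target" ]; then
        base=${name%.*}   # "schema" from "schema.xml"
        ext=${name##*.}   # "xml" from "schema.xml"
        mv "$target" "$conf_dir/$base.example.$ext"
    fi

    # -f replaces an existing symlink left behind by a previous run.
    ln -sf "$src" "$target"
}
```

With that defined, `link_conf apache-solr-1.4.1/example/solr/conf /home/cwod/capitolwords/src/solr/schema.xml` would do the same as the mv and ln -s lines for schema.xml, and likewise for stopwords.txt.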

* start up solr. in a dev environment this looks like:
  cd $SOLR_DIR/example
  java -jar start.jar    # uses jetty
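
To confirm that solr actually came up, something like the following can poll it until it answers. This is a sketch under assumptions: it assumes the stock example configuration (jetty listening on port 8983, with the ping handler at /solr/admin/ping) and that curl is installed. The function name wait_for_solr is hypothetical.

```shell
# Hypothetical helper (not part of this repo): poll solr's ping handler
# until it responds or the number of tries is exhausted.
# Usage: wait_for_solr <solr_base_url> [tries]
wait_for_solr() {
    url=$1
    tries=${2:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: quiet, -f: treat HTTP errors as failure, -o: discard the body
        if curl -sf -o /dev/null "$url/admin/ping"; then
            return 0
        fi
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        [ "$i" -lt "$tries" ] && sleep 1
    done
    return 1
}
```

For example, `wait_for_solr http://localhost:8983/solr 30 && echo "solr is up"` waits up to roughly 30 seconds. Adjust the URL if your solr runs on a different host or port.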
