Represent Govhack 2012 project
Requires Python 2.6+ and the PyPI packages listed in require.txt. OpenShift runs Python 2.6, so if you develop on anything newer, check that you aren't relying on version-specific features.
These notes are written from memory. Please correct inaccuracies. See the troubleshooting section if you hit a snag.
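The Python 2.6 floor mentioned above can be checked before anything else is installed. This is a minimal sketch, not part of the project: the helper name `meets_minimum` is made up for illustration, and it only compares major and minor version numbers.

```python
import sys

# Minimum version stated in these notes (OpenShift is said to run 2.6).
MINIMUM = (2, 6)


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum():
        sys.exit("Python %d.%d or newer is required" % MINIMUM)
    print("Python %d.%d is new enough" % tuple(sys.version_info[:2]))
```

Passing this check on a newer interpreter says nothing about 2.6 compatibility, so it is still worth running the code under 2.6 itself before deploying.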
OS X
- Install brew https://github.com/mxcl/homebrew/wiki/installation
Ubuntu
$ sudo apt-get install python2.6 python2.6-dev python-setuptools
OS X / Ubuntu
$ sudo easy_install pip
$ pip install virtualenv virtualenvwrapper
- Install virtualenv and virtualenvwrapper. Configure your .profile as the installation instructions for virtualenvwrapper suggest. http://www.doughellmann.com/projects/virtualenvwrapper/ http://www.doughellmann.com/docs/virtualenvwrapper/
$ mkvirtualenv represent
OR
$ mkvirtualenv -a [/path/to/project]/represent -r [/path/to/project]/represent/require.txt represent
$ workon represent
$ pip install -r require.txt
You only need to do this if you haven't run the full mkvirtualenv command (the one with -r) above.
- Ensure that both MySQL and the MySQL client development files (libmysqlclient-dev) are installed.
$ sudo apt-get install mysql-server libmysqlclient-dev
OR, if you already have MySQL installed
$ sudo apt-get install libmysqlclient-dev
Currently there are no settings. It will pull House of Reps Parliament #43. You can alter main.py to change this.
Cancelling a scrape leaves a settings.pickle file behind, which allows you to resume the scrape later. To start a new scrape, delete this file.
$ cd hansard-getter
$ python main.py
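The resume behaviour described above can be pictured with a small sketch. This is not the code in main.py. Only the file name settings.pickle comes from these notes. The state layout (a `done` list), the function names and the clean-up on completion are assumptions made for illustration.

```python
import os
import pickle

STATE_FILE = "settings.pickle"  # file name from the notes; contents are assumed


def load_state(path=STATE_FILE):
    """Return the saved state, or a fresh one if no scrape is in progress."""
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    return {"done": []}  # hypothetical layout: items already fetched


def save_state(state, path=STATE_FILE):
    """Write the state so an interrupted scrape can pick up where it stopped."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle)


def scrape(items, fetch, path=STATE_FILE):
    """Fetch each item once, recording progress after every item."""
    state = load_state(path)
    for item in items:
        if item in state["done"]:
            continue  # fetched in an earlier run
        fetch(item)
        state["done"].append(item)
        save_state(state, path)
    # Assumed behaviour: a scrape that finishes leaves no state file behind.
    if os.path.exists(path):
        os.remove(path)
```

In this sketch, deleting the pickle file by hand has the same effect as described above: the next run finds no saved state and starts from the first item.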
Currently there are no settings. It is hardwired to parse local files in the data directory. It probably doesn't do this recursively--you should check.
- Create a local MySQL database.
$ cd hansard-parser
$ cp local_settings.py.example local_settings.py
- Configure your database settings in local_settings.py (currently MySQL on localhost only).
$ python main.py
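One way to settle the recursion question for the data directory is to count files both ways and compare. The sketch below is a stand-alone helper, not the parser's own code, and the name `find_files` is hypothetical.

```python
import os


def find_files(root, recursive=True):
    """Return the sorted paths of regular files under `root`."""
    found = []
    if recursive:
        # os.walk descends into every subdirectory of root.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                found.append(os.path.join(dirpath, name))
    else:
        # Top level only: skip anything that is not a regular file.
        for name in os.listdir(root):
            path = os.path.join(root, name)
            if os.path.isfile(path):
                found.append(path)
    return sorted(found)
```

If `find_files("data")` returns more paths than `find_files("data", recursive=False)`, some files sit in subdirectories, and it is worth confirming that the parser actually picks them up.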
Select all speeches by the ALP
SELECT * FROM speech WHERE party = 'ALP';
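The same query can be run from Python with a parameter placeholder instead of a hard-coded party name. To stay runnable without a MySQL server, this sketch uses the standard library's sqlite3 module with an in-memory database as a stand-in. Only the `speech` table and its `party` column come from these notes. The other columns and the sample rows are placeholders, and `speeches_by_party` is a hypothetical helper.

```python
import sqlite3


def speeches_by_party(connection, party):
    """Return every row of `speech` whose party column equals `party`."""
    cursor = connection.cursor()
    # sqlite3 uses ? placeholders; MySQLdb uses %s for the same purpose.
    cursor.execute("SELECT * FROM speech WHERE party = ?", (party,))
    return cursor.fetchall()


# Stand-in database with an assumed, simplified schema and placeholder rows.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE speech (speaker TEXT, party TEXT, text TEXT)")
connection.executemany(
    "INSERT INTO speech VALUES (?, ?, ?)",
    [
        ("Member A", "ALP", "First placeholder speech"),
        ("Member B", "LP", "Second placeholder speech"),
        ("Member C", "ALP", "Third placeholder speech"),
    ],
)

rows = speeches_by_party(connection, "ALP")
print(len(rows))
```

Against the real database, the connection would come from the MySQL driver and the placeholder would change to `%s`. The function body otherwise stays the same.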