News Emote Analyser

Scrapes and analyses headlines from news sources. The results are viewable with news-emote.

Development

First time on a new computer or a fresh clone:

$ git clone https://github.com/ri/news-emote-analyser analyser
$ cd analyser
$ virtualenv .
$ source bin/activate  # enters the project's virtualenv
(analyser) $ pip install -r requirements.txt

Then every time after:

$ source bin/activate  # enters the project's virtualenv
(analyser) $ ...

After installing new dependencies with pip:

(analyser) $ pip install "new-dependency"
(analyser) $ pip freeze > requirements.txt
(analyser) $ git add requirements.txt

Running the scraper:

(analyser) $ python newsemote.py [au|us|all]
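
The region argument above could be handled along the following lines. This is only a sketch of one way to parse such an argument with argparse, not the actual contents of newsemote.py; the REGIONS list and the parse_regions function are hypothetical names introduced for illustration.

```python
import argparse

# Hypothetical region list: the README only names "au", "us", and "all".
REGIONS = ["au", "us"]


def parse_regions(argv):
    """Return the list of regions selected by one positional argument."""
    parser = argparse.ArgumentParser(description="Illustrative region selector")
    # argparse rejects anything other than a known region or "all".
    parser.add_argument("region", choices=REGIONS + ["all"])
    args = parser.parse_args(argv)
    # "all" expands to every known region; otherwise a one-element list.
    return list(REGIONS) if args.region == "all" else [args.region]


print(parse_regions(["all"]))  # prints ['au', 'us']
```

Returning a list in both cases lets the caller loop over regions without special-casing "all".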

You may have to put S3 credentials in ~/.boto (the boto configuration file).
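
To check whether ~/.boto is set up before running the scraper, something like the sketch below may help. It assumes the standard boto layout: a [Credentials] section holding aws_access_key_id and aws_secret_access_key. The function name boto_credentials_present is hypothetical and is not part of this project or of boto.

```python
import configparser
import os


def boto_credentials_present(path="~/.boto"):
    """Return True if the file has a [Credentials] section with both keys."""
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so a missing file yields False.
    parser.read(os.path.expanduser(path))
    return (
        parser.has_option("Credentials", "aws_access_key_id")
        and parser.has_option("Credentials", "aws_secret_access_key")
    )


print(boto_credentials_present())
```

The check only confirms that both keys exist. It does not verify that the credentials are valid for the S3 bucket.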

Deployment

Analyser is deployed on Heroku using the Python and PhantomJS buildpacks.

S3 credentials for storing the resulting data files are in Heroku environment variables and can be viewed with heroku config and changed with heroku config:set.
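
On the Python side, credentials supplied through environment variables could be read roughly as follows. The README does not say which variable names are used, so AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (the conventional AWS names) are an assumption here, and s3_credentials_from_env is a hypothetical helper.

```python
import os


def s3_credentials_from_env(environ=os.environ):
    """Return (access_key_id, secret_access_key) from the environment."""
    # Assumed variable names: check `heroku config` for the real ones.
    key_id = environ.get("AWS_ACCESS_KEY_ID")
    secret = environ.get("AWS_SECRET_ACCESS_KEY")
    if not key_id or not secret:
        raise RuntimeError("S3 credentials are not set in the environment")
    return key_id, secret
```

Failing early with a clear error avoids a scraper run that completes but cannot store its data files.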

The Scheduler add-on runs the scraper for each region daily. To run the scraper manually on Heroku, run:

$ heroku run -a news-emote-analyser python newsemote.py [au|us|all]