An experimental site that makes it easy to add sources to news articles.


Unsourced

Setting up

Unsourced uses Python, MySQL, and memcached.

To bootstrap your setup, just run:

script/bootstrap

Notes

PIL dependency

On Ubuntu 11.10 (64-bit):

pip doesn't build PIL properly under 64-bit Ubuntu: it fails to find zlib, libjpeg and libfreetype.

A cheesy workaround is to install the Ubuntu PIL package, then copy its files into the virtualenv manually:

$ sudo apt-get install python-imaging
$ cp -r /usr/lib/python2.7/dist-packages/PIL {{VIRTUALENV}}/lib/python2.7/site-packages

On Amazon EC2 Linux AMI:

pip compiles PIL from source, so make sure all the image format libraries are installed before installing PIL. Support for any library that is missing at build time is left out, and you'll end up with annoying errors like IOError: decoder jpeg not available.

$ sudo yum install libjpeg libjpeg-devel
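
To confirm which decoders a PIL build actually picked up, you can look for the codec entry points on PIL's C core module. This is only a rough sketch and is not part of Unsourced: the helper name decoder_support is made up for illustration, and it assumes the build exposes jpeg_decoder and zip_decoder on Image.core and ships FreeType support as the _imagingft module.

```python
def decoder_support():
    """Return {codec: bool} for PIL's optional codecs, or None if PIL is missing.

    Hypothetical helper: assumes codecs show up as <name>_decoder
    attributes on Image.core and FreeType as the _imagingft module.
    """
    try:
        from PIL import Image
    except ImportError:
        return None

    core_names = dir(Image.core)
    support = {
        "jpeg": "jpeg_decoder" in core_names,  # libjpeg
        "zip": "zip_decoder" in core_names,    # zlib, needed for PNG
    }
    try:
        from PIL import _imagingft  # noqa: F401 (import is the check)
        support["freetype"] = True
    except ImportError:
        support["freetype"] = False
    return support


if __name__ == "__main__":
    result = decoder_support()
    if result is None:
        print("PIL is not importable in this environment")
    else:
        for name in sorted(result):
            print("%s: %s" % (name, "ok" if result[name] else "MISSING"))
```

Run it with the virtualenv activated. If a codec is reported as missing, install the corresponding library and development package, then reinstall PIL so it is rebuilt.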

Alternate setup (debian/ubuntu):

$ sudo apt-get install python-virtualenv
$ sudo apt-get install libpng-dev libgif-dev libjpeg-dev libmysqlclient-dev

(This assumes you already have MySQL etc. installed.)
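
If you are not sure whether MySQL and memcached are actually up, a quick port check can save some head-scratching later. The sketch below is not part of Unsourced. It assumes both services listen on localhost on their usual default ports (3306 and 11211), and the names port_open and SERVICES are invented for this example, so adjust them to match your setup.

```python
import socket

# Assumed defaults; change these if your services listen elsewhere.
SERVICES = {"mysql": 3306, "memcached": 11211}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


if __name__ == "__main__":
    for name, port in sorted(SERVICES.items()):
        state = "reachable" if port_open("127.0.0.1", port) else "NOT reachable"
        print("%s (port %d): %s" % (name, port, state))
```

This only shows that something is accepting connections on the port. It does not check credentials or that the database exists.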

$ mkdir unsourced.org
$ cd unsourced.org
$ git clone https://github.com/bcampbell/unsourced
$ virtualenv --no-site-packages pyenv
$ . pyenv/bin/activate
$ pip install --upgrade distribute
$ pip install -r unsourced/requirements.txt

$ git clone https://github.com/bcampbell/decruft.git
$ git clone https://github.com/bcampbell/metareadability.git
$ ln -s $PWD/decruft/decruft unsourced/scrapeomat/
$ ln -s $PWD/metareadability/metareadability unsourced/scrapeomat/

$ mkdir uploads

then:

- make sure the uploads dir is writable by the server
- create the database with mysqladmin create
- customise config.py and alembic.ini

$ python unsourced/app.py
$ python unsourced/scrapomat.py
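
The uploads dir has to be writable by whichever user the server runs as. One way to check is to try creating a temporary file in it while running as that user. This is a minimal sketch rather than anything shipped with Unsourced: dir_is_writable is a hypothetical helper, and the "uploads" path assumes you run it from the directory created in the steps above.

```python
import os
import tempfile


def dir_is_writable(path):
    """Return True if a file can be created and removed inside path."""
    if not os.path.isdir(path):
        return False
    try:
        fd, name = tempfile.mkstemp(dir=path)
    except OSError:
        return False
    os.close(fd)
    os.remove(name)
    return True


if __name__ == "__main__":
    # Assumed location: the uploads dir next to the unsourced checkout.
    target = "uploads"
    if dir_is_writable(target):
        print("%s is writable" % target)
    else:
        print("%s is missing or not writable by this user" % target)
```

The result only reflects the user running the script, so run it as the server's user (for example via sudo -u) rather than as yourself.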

The production site runs behind Nginx and uses supervisord to manage processes. See the sample config files in config/.

The Debian Squeeze packages for nginx are a bit stale. Nginx.org has more recent packages available.