Pipeline is a data management and analysis site built by Aquaya. View the full site at pipelinehq.org. Our partners collect information in a variety of formats and this site helps aggregate, de-duplicate, edit and view those data points. Data can be uploaded via an Excel file or automatically imported from several mobile data collection services such as Dimagi's CommCare. Once uploaded, information can be filtered, statistically analyzed, and graphed. Reports can be created and periodically sent to managers via email or as simple SMS notifications.
Tested on Ubuntu 11.10
You'll need a locally running MongoDB instance; see the MongoDB docs for info.
use virtualenv and pip to install the other requirements:
$ virtualenv /path/to/venv
$ . /path/to/venv/bin/activate
(venv)$ pip install -r requirements.txt
after cloning this repo, pull in the dependencies:
$ git submodule init
$ git submodule update
for report generation we use wkhtmltopdf, along with xvfb and a few font packages:
$ sudo apt-get install xvfb
$ sudo apt-get install xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic
$ sudo apt-get install fontconfig
$ wget http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
$ tar xvf wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
$ sudo mv wkhtmltopdf-amd64 /usr/bin/wkhtmltopdf
$ wget http://redis.googlecode.com/files/redis-2.4.14.tar.gz
$ tar xzf redis-2.4.14.tar.gz
$ cd redis-2.4.14
$ make
$ src/redis-server /path/to/redis.conf
$ worker
$ scheduler
You should turn on daemonization in the redis config file. Note also that only one rqscheduler process can be attached to a redis instance. Changing the connected db might work but has not been tested.
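As a quick sanity check for the daemonization setting, here is a minimal sketch that scans the text of a redis config file for the `daemonize` directive. The `is_daemonized` helper is hypothetical and not part of this repo. It assumes the usual `directive value` line format, assumes a later directive overrides an earlier one, and does not follow `include` lines.

```python
def is_daemonized(conf_text):
    """Return True if the config text turns daemonization on."""
    daemonize = False  # redis runs in the foreground unless told otherwise
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if parts[0].lower() == "daemonize" and len(parts) >= 2:
            # Assumption: the last occurrence of the directive wins.
            daemonize = parts[1].lower() == "yes"
    return daemonize


if __name__ == "__main__":
    sample = "# example config\nport 6379\ndaemonize yes\n"
    print(is_daemonized(sample))
```

To check a real file, read it and pass the contents in, e.g. `is_daemonized(open("/path/to/redis.conf").read())`.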
we use Lettuce for some BDD tests - improving coverage is a high priority.
unit testing is via nose.
set up a real config file outside of source control
$ cp conf/application_settings_sample.py /path/to/real/settings.py
edit that new config, then point an env var at it
$ export PIPELINE_SETTINGS=/path/to/real/settings.py
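To illustrate the env-var pattern, here is a minimal sketch of loading a Python settings file from the path in `PIPELINE_SETTINGS`. This is not necessarily how the app reads its config; `load_settings` is a hypothetical helper, it targets Python 3, and it assumes the settings file has a `.py` extension.

```python
import importlib.util
import os


def load_settings(env_var="PIPELINE_SETTINGS"):
    """Load the settings file named by env_var and return it as a module."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError("%s is not set" % env_var)
    if not os.path.isfile(path):
        raise RuntimeError("%s points at a missing file: %s" % (env_var, path))
    # Assumption: the file is plain Python, like the sample settings file.
    spec = importlib.util.spec_from_file_location("pipeline_settings", path)
    settings = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(settings)
    return settings
```

Failing loudly when the variable is unset or stale makes a misconfigured shell easier to spot than a silent fall back to defaults.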
activate your virtualenv and create the default admin with
$ . /path/to/venv/bin/activate
(venv)$ python
>>> import application
>>> application.controllers.seed()
start the server
(venv)$ python run.py
 * Running on http://127.0.0.1:8000/
Usage in production
we have some example config files for supervisord, gunicorn, nginx, and fabric -- check those out
Bootstrapping a new server:
- install virtualenv and pip
- copy over config files for supervisord, gunicorn, nginx and this app
- make a dir for the config files and the log files
- point the PIPELINE_SETTINGS env var at the app config file and put the export in your .zshrc
- install the requirements using pip and requirements.txt
- use fabric to install the app
- reload/reread/restart supervisord until it picks up the config file (annoyingly imprecise, I know..)
- start the server and check with supervisorctl
- edit the nginx config file at nginx.conf and the sites-available dir (symlink to sites-enabled); restart nginx
- seed the db from a shell
- update your DNS
Accessing data from other services
Only CommCare is supported at the moment. Hoping to add IVRHub and formhub soon. CommCare instructions follow:
- invite a new web user to your CommCare project with read-only access
- create a new "Connection" in the system and add the appropriate credentials
- Pipeline will periodically request new data from CommCare using your domain and export tag info
CommCare can include a lot of helpful metadata about each submission. When creating a 'Connection' there is an option to include or exclude this metadata. If the metadata is excluded, any future manual file uploads will have to match the new schema.
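To make the schema-matching requirement concrete, here is a minimal sketch that compares the column names of a manual upload against the columns a connection already uses. Everything in it is an assumption for illustration: `schema_mismatch` is a hypothetical helper, the column names are made up, and Pipeline's actual matching rules may differ.

```python
def schema_mismatch(expected_columns, upload_columns):
    """Return (missing, unexpected) column names, each as a sorted list."""
    expected = set(expected_columns)
    uploaded = set(upload_columns)
    missing = sorted(expected - uploaded)      # in the schema, not in the file
    unexpected = sorted(uploaded - expected)   # in the file, not in the schema
    return missing, unexpected


if __name__ == "__main__":
    # Hypothetical schema with metadata excluded, and an upload that
    # still carries one metadata column.
    schema = ["name", "village", "ph"]
    upload = ["name", "ph", "device_id"]
    print(schema_mismatch(schema, upload))
```

An upload matches when both returned lists are empty.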
Editing an entry automatically triggers the following:
- a locked comment is automatically created to describe the changes
- possible duplicates are checked and processed: if edits make the entry a dupe or unique, it is converted appropriately. If a unique value that had a duplicate is edited, one of the old duplicates is shifted to unique.
- if the entry was hidden, a warning is raised telling the user to consider un-hiding this entry
- the entry is marked as having been edited
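The duplicate bookkeeping in the list above can be sketched as follows. This is a simplified model, not the app's code: `apply_edit` and the `id`/`value`/`status` fields are hypothetical, and it assumes two entries are duplicates when their values are equal. The locked comment and the hidden-entry warning are left out.

```python
def apply_edit(entries, entry_id, new_value):
    """Change one entry's value in place and re-balance unique/duplicate flags."""
    entry = next(e for e in entries if e["id"] == entry_id)
    old_value, old_status = entry["value"], entry["status"]
    if new_value == old_value:
        return entry  # nothing changed, so nothing to re-check

    others = [e for e in entries if e["id"] != entry_id]

    # A unique entry is leaving its group: promote one of its old duplicates.
    if old_status == "unique":
        for other in others:
            if other["value"] == old_value and other["status"] == "duplicate":
                other["status"] = "unique"
                break

    # The edited entry is a dupe if another entry already holds the new value.
    clash = any(o["value"] == new_value for o in others)
    entry["value"] = new_value
    entry["status"] = "duplicate" if clash else "unique"
    entry["edited"] = True
    return entry


if __name__ == "__main__":
    entries = [
        {"id": 1, "value": "A", "status": "unique"},
        {"id": 2, "value": "A", "status": "duplicate"},
        {"id": 3, "value": "B", "status": "unique"},
    ]
    apply_edit(entries, 1, "B")
    print([e["status"] for e in entries])  # -> ['duplicate', 'unique', 'unique']
```

In the example, entry 1 becomes a dupe of entry 3, and entry 2 is shifted to unique because the unique entry it duplicated has moved away.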