Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

WeSTCAT - Web Science and Technologies Content Analysis Toolkit


WeSTCAT is an extended version of the AmCAT software for human rating ("coding") of texts, originally customized by Christoph Kling, now maintained by Aemal Sayer.

WeSTCAT is part of the Komepol project at the Institute for Web Science and Technologies at the University of Koblenz–Landau.

Installation and Configuration


Most of the (python) prerequisites for AmCAT are automatically installed using pip (see below). To install the non-python requirements, you can use the following (on ubuntu):

$ sudo apt-get install unrtf rabbitmq-server python-pip python-dev libxml2-dev libxslt-dev lib32z1-dev postgresql postgresql-server-dev-9.1 postgresql-contrib-9.1

It is probably best to install AmCAT in a virtual environment. Run the following commands to setup and activate a virtual environment for AmCAT: (on ubuntu)

$ sudo apt-get install python-virtualenv
$ virtualenv amcat-env
$ source amcat-env/bin/activate

If you use a virtual environment, every time you start working with AmCAT you need to repeat the source line to load the environment. If you don't use a virtual environment, you will need to run most pip command below using sudo.


AmCAT requires a database to store its documents in. The default settings look for a postgres database 'amcat' on localhost. To set up the current user as a superuser in postgres and create the database, use:

$ sudo -u postgres createuser -s $USER
$ createdb amcat


AmCAT uses elasticsearch for searching articles. Since we use a custom similarity to provide hit counts instead of relevance, this needs to be installed 'by hand'. You can probably skip this and rely on a pre-packaged elasticsearch if you don't care about hit counts, although you still need to install the elasticsearch plugins.

First, install oracle java (from

$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer

Next, download and extract elasticsearch and our custom hitcount jar, and install the required plugins:

curl | tar xvz
elasticsearch-1.0.0/bin/plugin -install elasticsearch/elasticsearch-lang-python/1.2.0
elasticsearch-1.0.0/bin/plugin -install elasticsearch/elasticsearch-analysis-icu/1.12.0
elasticsearch-1.0.0/bin/plugin -install mobz/elasticsearch-head

Now you are ready to start elasticsearch. I usually start it in a secondary terminal if I am running AmCAT locally so I can directly inspect error messages, but you could also run it as a daemon:

ES_CLASSPATH=hitcount.jar elasticsearch-1.0.0/bin/elasticsearch

If you faced any problem in installing elasticsearch, the following link can be also helpful:

Installing AmCAT (pip install from git)

Now you are ready to install AmCAT. The easiest way to do this is to pip install it direct from github. This is not advised unless you use a virtual environment.

pip install git+

Installing AmCAT (clone)

Alternatively, clone the project from github and pip install the requirements. If you plan to make changes to AmCAT, this is probably the best thing to do.

git clone
pip install -r amcat/requirements.txt

If you install amcat via cloning, be sure to add the new directory to the pythonpath, e.g.:


Setting up the database

Whichever way you installed AmCAT, you need tocall the syncdb command to populate the database and set the elasticsearch mapping:

python -m amcat.manage syncdb

Start celery worker

Finally, to use the query screen you need to start a celery worker. In a new terminal, type:

celery -A amcat.amcatcelery worker -l info -Q amcat

(if you are using a virtual environment, make sure to activate that first)

Configuring AmCAT

The main configuration parameters for AmCAT reside in the settings folder. In many places, these settings are defaults that can be overridden with environment variables.

Start WeSTCAT web server

Make sure that elasticsearch and celery worker are running and then execute the following in terminal:

python -m amcat.manage runserver


Smart document coding tool







No releases published


No packages published