FrankGrimm/omen

License: MIT

OMEN - A collaborative annotation platform

Introduction

OMEN is a self-hosted annotation platform with multi-user support. It is deployed as a Docker container and, by default, backed by a PostgreSQL database. It is maintained by the Semantic Computing Group at the Center for Cognitive Interaction Technology (CITEC), Bielefeld University.

  • Simple dataset management for document-level annotation tasks.
  • Role-based dataset and task management: Users interact with datasets as either owner, curator, or annotator (each providing different levels of functionality and access).
  • Work package definitions for annotation tasks based on subsets of your dataset.

Visual Overview

A simple annotation task with three labels. Each label is configured to use a background color and display an icon. Annotators can either click the label buttons or press a hot-key (here the number keys 1-3; hot-keys give quick access to up to 9 labels):

Annotation Overview

Users with the curator or owner role can access further management functionality, e.g. browsing the dataset, viewing the distribution of all annotations, and inspecting the overall inter-annotator agreement on the dataset:

Dataset Curation
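
This README does not state which agreement measure OMEN reports. As a rough illustration of what inter-annotator agreement means for two annotators, the following sketch computes Cohen's kappa in plain Python. The function name and the example labels are made up for this illustration and are not taken from the OMEN code base.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same samples.

    Illustrative only: OMEN may compute agreement differently.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty sample list")

    n = len(labels_a)
    # Observed agreement: fraction of samples with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    annotator_1 = ["pos", "pos", "neg", "neg"]
    annotator_2 = ["pos", "neg", "neg", "neg"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance.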

Configuring a dataset is as easy as uploading a CSV file, choosing columns to identify samples and their content, and configuring the possible labels:

Dataset Creation
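
Preparing such a file can be sketched as follows. This is a minimal example, assuming a two-column layout with one column that identifies each sample and one that holds its content. The column names sample_id and text are hypothetical; OMEN lets you choose the relevant columns when configuring the dataset.

```python
import csv
import io


def build_dataset_csv(samples):
    """Serialize (sample_id, text) pairs to CSV text with a header row.

    The column names are placeholders chosen for this example.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["sample_id", "text"])
    writer.writeheader()

    seen_ids = set()
    for sample_id, text in samples:
        # An identifier column is only useful if its values are unique.
        if sample_id in seen_ids:
            raise ValueError(f"duplicate sample id: {sample_id}")
        seen_ids.add(sample_id)
        writer.writerow({"sample_id": sample_id, "text": text})

    return buffer.getvalue()


if __name__ == "__main__":
    content = build_dataset_csv([
        ("doc-1", "Great, thanks!"),
        ("doc-2", "Not helpful."),
    ])
    # newline="" avoids doubled line endings when writing CSV text to disk.
    with open("dataset.csv", "w", newline="", encoding="utf-8") as handle:
        handle.write(content)
```

Using the csv module rather than joining strings by hand ensures that commas and quotes inside the sample text are escaped correctly.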

Getting started

To use OMEN on your own tasks, please make sure to install the following prerequisites:

The software is made available as a Docker container via GitHub Packages. For regular deployments, the omen-prod package should be used.

Note that pulling images from GitHub's infrastructure, even public ones, requires authentication with a personal access token that has the read:packages permission. Please see the GitHub documentation on how to set this up. With the token stored in ~/TOKEN.txt, you can log in as follows:
    cat ~/TOKEN.txt | docker login https://docker.pkg.github.com -u USERNAME --password-stdin

Alternatively, you can pull the production image from Docker Hub.

To pull the image using the command line:
    docker pull docker.pkg.github.com/frankgrimm/omen/omen-prod:latest

Our standard deployment model uses Docker Compose. An example docker-compose.yml configuration that sets up a database and an OMEN instance can be found in the examples/ directory of the repository. Note that this example requires mapping the database files to a volume so that the data is retained when the infrastructure is restarted.

After pulling the image and configuring your preferred deployment method, make sure to:

  • a) Adjust your configuration with the mandatory parameters (e.g. database connection and credentials)
  • b) Provide it to the container by mapping a volume, and expose the web server (TCP port 5000 by default) so that you can reach the web application.
  • c) Create a first user via the command line and try to log in (by default at http://yourhost.domain.tld:5000). You only have to do this once; additional users can be created in the application itself.
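
Before trying to log in, it can help to confirm that the exposed port is actually reachable. The following is a minimal sketch using only the Python standard library. The host name is the placeholder from the step above, and the helper port_is_open is written for this example; it is not part of OMEN.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host; replace with the address of your deployment.
    host, port = "yourhost.domain.tld", 5000
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

This only checks that something accepts TCP connections on the port; it does not verify that the OMEN web application itself started correctly.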

Note that this deployment example ends at the container level. It should be used behind an HTTPS-enabled reverse proxy (e.g. nginx, Apache, Caddy 2, or similar).

The port and volume mappings from step b) look like this in a docker-compose.yml:

   [...]
   ports:
       - "5000:5000"
   volumes:
       - ${PWD}/config.json:/home/omenuser/app/config.json

Getting started (Developer edition)

Check out our milestones and issues to see what is going on with the project. If you want to get started, go ahead and fork the project. We provide two ready-to-go docker-compose configurations. These should work for most setups and are also used to configure the CI (via GitHub Actions) and package releases:

  • docker-compose.dev.yml, which runs the current OMEN branch using Flask debug mode (featuring auto-reloading) and configures a local database within the same Compose network. This should be the default configuration for development.
    docker-compose --env-file /dev/null -f docker-compose.dev.yml up
  • docker-compose.prod.yml, which runs a full gunicorn instance and is used to create the production release. Apart from that, it is mostly used directly when setting up test and staging environments.
    docker-compose --env-file /dev/null -f docker-compose.prod.yml up

3rd party licenses