
Annotation Tutorial

Lee Dong Ho edited this page Jul 27, 2019 · 28 revisions

Create annotator accounts

After installation, start both the annotation server and the model server. Make sure you can open the annotation server's address: if it is 127.0.0.1:8080, for example, you should be able to visit "http://0.0.0.0:8080/login". Note that "http://0.0.0.0:8080/" redirects to our homepage.

First, the administrator should log in to the user management system (http://127.0.0.1:8080/admin/) with the admin account name and password that were created when the framework was installed.

Then, the admin can add accounts for annotators (with permission only to annotate sentences) by following the guide on the page.

Set up a new project

Create Project

The admin can then go to the project management system (http://127.0.0.1:8080/projects/) to create a new project that hosts the annotation of a corpus.

Upload Datasets

The admin needs to convert the dataset into CSV or JSON format and then upload it to the server. For example, the following script writes one JSON object per line:

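import json

# Example input; replace with your own list of sentences.
sentences = ['First sentence.', 'Second sentence.']

# Open in write mode so that re-running the script does not append duplicates.
with open('dataset.txt', 'w', encoding='utf-8') as dataset_file:
    for i, sent in enumerate(sentences):
        # One JSON object per line: the sentence text and a unique id.
        data = {'text': sent, 'external_id': i}
        json.dump(data, dataset_file)
        dataset_file.write('\n')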
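
Before uploading, it can help to check that every line of the file parses as JSON and carries the expected keys. The following is a minimal sketch: the helper name `validate_jsonl` is hypothetical, and the required keys `text` and `external_id` are taken from the example above rather than from a documented schema.

```python
import json

def validate_jsonl(lines, required_keys=('text', 'external_id')):
    """Return a list of (line_number, problem) pairs; empty if all lines are valid."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((line_number, f'invalid JSON: {err.msg}'))
            continue
        if not isinstance(record, dict):
            problems.append((line_number, 'not a JSON object'))
            continue
        # Keys are assumed from the example script, not from an official schema.
        missing = [key for key in required_keys if key not in record]
        if missing:
            problems.append((line_number, f'missing keys: {missing}'))
    return problems
```

To check a file, pass the open file object, for example `validate_jsonl(open('dataset.txt', encoding='utf-8'))`; an empty list means no problems were found.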

Define labels

When you first create the project, you need to define the labels in advance for online learning, because online learning needs the mappings from words and labels to feature indices at the initial stage.
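
To illustrate why the labels must be known up front: a sequence tagger typically gives each tag a fixed integer index before training starts, so the set of labels determines the size of the model's output. The sketch below builds such mappings in plain Python. The function name `build_index`, the BIO tagging scheme, the example labels, and the `<PAD>`/`<UNK>` tokens are all assumptions for illustration, not the framework's actual implementation.

```python
def build_index(items, specials=()):
    """Map each distinct item to a stable integer index, specials first."""
    index = {}
    for item in list(specials) + list(items):
        if item not in index:
            index[item] = len(index)
    return index

# Hypothetical project labels, expanded into BIO tags (an assumed scheme).
labels = ['PER', 'LOC', 'ORG']
bio_tags = ['O'] + [f'{prefix}-{label}' for label in labels for prefix in ('B', 'I')]
tag_to_index = build_index(bio_tags)

# Words get indices the same way; unseen words would map to '<UNK>'.
sentence = 'Alice visited Seoul'.split()
word_to_index = build_index(sentence, specials=('<PAD>', '<UNK>'))
```

With three labels this yields seven tags, which is why adding a label later would change the shape of what the model predicts.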

Annotate the sentences with the intelligent recommendations

The annotation page has two sections: the Annotation section (upper) and the Recommendation section (lower).

  1. You can annotate directly in the upper section by selecting spans.
  2. To use a recommendation in the lower section, click a suggested span, which is underlined. The suggested type is outlined in red. To confirm the annotation, click a suggested type or press a shortcut key.

Settings

You can optionally enable embeddings, recommendation options, and active learning methods, and set the batch size, number of epochs, and acquire size.


  • Supported embeddings: GloVe, Word2Vec, FastText, ELMo, GPT, BERT
  • Recommendation options: Noun Chunk, Model Inference, Dictionary Match
  • Active learning options: MNLP (Maximum Normalized Log-Probability)
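
As a rough illustration of MNLP, the score of a sentence is the log-probability of its predicted tag sequence divided by the sentence length, and the sentences with the lowest scores are the ones the model is least sure about. The sketch below assumes the per-token log-probabilities are already available; the function names and the way `acquire_size` is used are assumptions for illustration, not the framework's code.

```python
import math

def mnlp_score(token_log_probs):
    """Length-normalized log-probability of the predicted tag sequence."""
    return sum(token_log_probs) / len(token_log_probs)

def select_for_annotation(candidates, acquire_size):
    """Return the ids of the `acquire_size` sentences with the lowest MNLP score."""
    ranked = sorted(candidates, key=lambda sid: mnlp_score(candidates[sid]))
    return ranked[:acquire_size]

# Hypothetical per-token probabilities of the predicted tags.
candidates = {
    'confident': [math.log(0.9), math.log(0.95)],
    'uncertain': [math.log(0.4), math.log(0.5), math.log(0.6)],
    'middling': [math.log(0.7), math.log(0.8)],
}
selected = select_for_annotation(candidates, acquire_size=2)
```

Normalizing by length keeps long sentences from being selected merely because their total log-probability is lower.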

Recommendation Details

Final recommendations are merged from the three options in the following priority order (lowest to highest):
Noun Chunk < Model Inference < Dictionary Match
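
One way to read this priority order is that when suggestions from different sources overlap, the one from the higher-priority source is kept. The sketch below implements that reading. The span format `(start, end, label)` with an exclusive end, the overlap rule, and the function names are assumptions for illustration; the actual merge logic may differ.

```python
def overlaps(a, b):
    """True if two (start, end, label) spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]

def merge_recommendations(noun_chunk, model_inference, dictionary_match):
    """Merge spans, letting higher-priority sources win on overlap."""
    merged = []
    # Visit sources from highest to lowest priority.
    for source in (dictionary_match, model_inference, noun_chunk):
        for span in source:
            if not any(overlaps(span, kept) for kept in merged):
                merged.append(span)
    return sorted(merged)

# Hypothetical suggestions for one sentence (token offsets, end exclusive).
noun_chunk = [(0, 2, 'CHUNK'), (3, 5, 'CHUNK'), (8, 9, 'CHUNK')]
model_inference = [(0, 1, 'PER'), (6, 7, 'LOC')]
dictionary_match = [(3, 4, 'ORG')]
merged = merge_recommendations(noun_chunk, model_inference, dictionary_match)
```

Here the noun chunks at positions 0-2 and 3-5 are dropped because they overlap higher-priority suggestions, while the chunk at 8-9 survives because nothing else covers it.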