# Gold Annotation Support
This demo uses the demo CATMA project.
To use it for your own annotations, you first have to clone your CATMA project locally.\
For further information about cloning your CATMA project, see this [Notebook](https://github.com/forTEXT/catma_gitlab/tree/main/demo_notebooks/load_project_from_gitlab.ipynb).

## Table Of Contents
* [Introduction](#1-bullet)
* [Load the CATMA Project](#2-bullet)
* [Create Automated Gold Annotation](#3-bullet)

## Introduction <a class="anchor" id="1-bullet"></a>
To support the creation of Gold Annotations, the `catma_gitlab` package can copy the matching annotations of two annotators into a Gold Annotation Collection.
Annotations that were not copied can then be added manually in the CATMA user interface, where the automatically created Gold Annotation can also be revised.

What counts as a matching annotation can be customized.
The relevant parameters are explained below.

Before you proceed, an empty annotation collection must already exist in the CATMA UI. The gold annotations will be written to this collection.

## Load the CATMA Project <a class="anchor" id="2-bullet"></a>

In [2]:
from catma_gitlab import CatmaProject

In [3]:
my_project = CatmaProject(
    project_directory='../test/demo_project/',
    project_uuid='test_corpus_root'
)

Loading Tagsets ...
	 Found 1 Tagset(s).
Loading Documents ...
	 Found 1 Document(s).
Loading Annotation Collections ...
	 Loading Annotation Collection 'ac_2' for Kafka Franz Das Urteil
	-> with 6 Annotations.
	 Loading Annotation Collection 'ac_1' for Kafka Franz Das Urteil
	-> with 6 Annotations.
	 Loading Annotation Collection 'gold_annotation' for Kafka Franz Das Urteil
	-> with 0 Annotations.


As can be seen above, the project contains an empty annotation collection named `gold_annotation`.
We will use this annotation collection as the target location for the gold annotations.

## Create Automated Gold Annotation <a class="anchor" id="3-bullet"></a>
The method `create_gold_annotations()` compares two Annotation Collections.
These are named by the first two arguments: `ac_1_name` and `ac_2_name`.
They are followed by:
- the name of the Gold Annotation Collection: `gold_ac_name`
- a list of tag names that should not be considered when creating the gold annotations: `excluded_tags`
- the minimal annotation overlap that is considered a match: `min_overlap`. This should be set to at least `0.5`.
- whether only annotations with the same tag are considered a match: `same_tag`.
 If your annotation project focuses on segmentation or unitizing, this might be set to `False`.
- whether `'matching'` or `'none'` Property Values should be included in the gold annotations: `property_values`
- whether the newly created gold annotations should be pushed to the CATMA GitLab: `push_to_gitlab`
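
To make the matching criteria more concrete, here is a minimal sketch of how an overlap threshold, a tag check and a list of excluded tags could interact. It is not the code used by `catma_gitlab`: the `Span` class, the functions `overlap_ratio()` and `is_match()`, and the definition of overlap as shared characters divided by the length of the longer annotation are all assumptions made for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for an annotation: a tag plus character offsets."""
    tag: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def overlap_ratio(a: Span, b: Span) -> float:
    """Shared characters divided by the length of the longer span (assumed definition)."""
    shared = max(0, min(a.end, b.end) - max(a.start, b.start))
    longest = max(a.end - a.start, b.end - b.start)
    return shared / longest if longest else 0.0


def is_match(a: Span, b: Span, min_overlap: float = 0.95,
             same_tag: bool = True, excluded_tags: tuple = ()) -> bool:
    """Illustrative match test mirroring the parameters described above."""
    if a.tag in excluded_tags or b.tag in excluded_tags:
        return False
    if same_tag and a.tag != b.tag:
        return False
    return overlap_ratio(a, b) >= min_overlap


first = Span('character', 0, 100)
second = Span('character', 4, 100)
print(overlap_ratio(first, second))              # 0.96
print(is_match(first, second))                   # True
print(is_match(first, second, min_overlap=1.0))  # False
```

Under this assumed definition, two annotations that differ by only a few characters still count as a match at `min_overlap=0.95`, but not at `1.0`, which would require identical spans.
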

In [4]:
my_project.create_gold_annotations(
    ac_1_name='ac_1',               # change to your collection name if you don't use the demo project
    ac_2_name='ac_2',               # change to your collection name if you don't use the demo project
    gold_ac_name='gold_annotation',
    excluded_tags=[],
    min_overlap=0.95,               # raise to 1.0 to include only annotation pairs whose spans match exactly
    same_tag=True,
    property_values='matching',
    push_to_gitlab=False            # pushing to GitLab will not work with the demo project
)


Found 6 annotations in Annotation Collection: 'ac_1'.
Found 6 annotations in Annotation Collection: 'ac_2'.
-------------
Wrote 2 gold annotations in Annotation Collection 'gold_annotation'.



In addition to the newly created gold annotations, the method `create_gold_annotations()` returns a short report.
In the demo project, only two annotations met the defined criteria for a match.

If you set `push_to_gitlab` to `True`, you can continue to work with the gold annotations after synchronizing the project in the CATMA user interface:
<img src="demo_img/project_synchronize.PNG">