# Gold annotation support
This demo uses the demo CATMA project.
To use it with your own annotations, you first have to clone your CATMA project locally.
For further information about cloning your CATMA project see [this notebook](https://github.com/forTEXT/gitma/blob/main/demo/notebooks/load_project_from_gitlab.ipynb).

## Table of contents
* [Introduction](#1-bullet)
* [Load the CATMA project](#2-bullet)
* [Create automated gold annotations](#3-bullet)

## Introduction <a class="anchor" id="1-bullet"></a>
To support the creation of gold annotations, this package can be used to copy the matching annotations of two annotators into a gold annotation collection.
In the CATMA user interface the missing annotations can be created manually and the automatically created gold annotations can be revised.

What counts as a matching annotation can be customized; the available options are explained below.

Before you proceed, create an empty annotation collection in the CATMA UI. The gold annotations will be written into this collection.

## Load the CATMA project <a class="anchor" id="2-bullet"></a>

In [None]:
from gitma import CatmaProject

In [None]:
my_project = CatmaProject(
    projects_directory='../projects/',
    project_name='CATMA_9385E190-13CD-44BE-8A06-32FA95B7EEFA_GitMA_Demo_Project'
)

As can be seen above, an empty annotation collection has already been created.
We will use this annotation collection as the target location for the gold annotations.

## Create automated gold annotations <a class="anchor" id="3-bullet"></a>
The method `create_gold_annotations()` compares two annotation collections.
These are specified by the first two arguments: `ac_1_name` and `ac_2_name`.
The remaining arguments are:
- `gold_ac_name`: the name of the gold annotation collection
- `excluded_tags`: a list of tag names that should not be considered when creating the gold annotations
- `min_overlap`: the minimal annotation overlap that is required for a match. This should be set to at least `0.5`.
- `same_tag`: whether only annotations with the same tag are considered a match.
  If your annotation project focuses on segmentation or unitizing, you may want to set this to `False`.
- `property_values`: whether `'matching'` or `'none'` property values should be included in the gold annotations
- `push_to_gitlab`: whether the newly created gold annotations should be pushed to CATMA's GitLab backend *(Note that to use this option you currently need a separate Git installation with valid saved credentials for your CATMA account)*
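
To make the interplay of `min_overlap`, `same_tag` and `excluded_tags` more concrete, here is a minimal, self-contained sketch of how such a match test could work. It is not GitMA's implementation: the `Span` class and the functions `overlap_ratio()` and `is_match()` are hypothetical, and the overlap definition used here (shared characters divided by the length of the longer span) is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for an annotation: a tag plus character offsets."""
    tag: str
    start: int  # first character, inclusive
    end: int    # last character, exclusive


def overlap_ratio(a: Span, b: Span) -> float:
    """Shared characters divided by the length of the longer span (assumed definition)."""
    shared = max(0, min(a.end, b.end) - max(a.start, b.start))
    longest = max(a.end - a.start, b.end - b.start)
    return shared / longest if longest else 0.0


def is_match(a: Span, b: Span, min_overlap: float = 0.95,
             same_tag: bool = True, excluded_tags=()) -> bool:
    """Apply the three criteria in turn: excluded tags, tag equality, overlap."""
    if a.tag in excluded_tags or b.tag in excluded_tags:
        return False
    if same_tag and a.tag != b.tag:
        return False
    return overlap_ratio(a, b) >= min_overlap


# Two annotators marked almost the same passage; a third annotation uses another tag.
first = Span("character", 0, 20)
second = Span("character", 2, 20)  # shares 18 of 20 characters with `first`
third = Span("place", 0, 20)

print(is_match(first, second, min_overlap=0.5))   # lenient threshold
print(is_match(first, second, min_overlap=0.95))  # strict threshold
print(is_match(first, third, same_tag=False))     # tags are ignored
```

Under these assumptions, the pair `first` and `second` only counts as a match when the threshold is lenient, and `first` and `third` only match when `same_tag` is `False`, which is why `same_tag=False` suits projects that care about span boundaries rather than tag choice.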

In [None]:
my_project.create_gold_annotations(
    ac_1_name='ac_1',               # change to your collection name if you don't use the demo project
    ac_2_name='ac_2',               # change to your collection name if you don't use the demo project
    gold_ac_name='gold_annotation',
    excluded_tags=[],
    min_overlap=0.95,               # raise to 1.0 to include only annotation pairs whose spans match exactly
    same_tag=True,
    property_values='matching',
    push_to_gitlab=False            # the push to GitLab will not work with the demo project
)

In addition to the newly created gold annotations, the method `create_gold_annotations()` returns a short report.
In the demo project, 16 annotations met the defined criteria for matches.

If you set `push_to_gitlab` to `True`, you can continue to work with the gold annotations after you have synchronized the project in the CATMA user interface:
<img src="img/project_synchronize.png">