# Explore Annotations
This demo uses the demo CATMA project.
To use it with your own annotations, you first have to clone your CATMA project locally.\
For more information about cloning a CATMA project, see this [Notebook](https://github.com/forTEXT/gitma/tree/main/demo_notebooks/load_project_from_gitlab.ipynb).

## Table Of Contents
* [Load a CATMA project](#1-bullet)
* [General Project Stats](#2-bullet)
* [Annotation Overview for the complete Project](#3-bullet)
* [Plot Annotations for specified Annotation Collection](#4-bullet)
* [Annotations as Pandas DataFrame](#5-bullet)
* [Annotation Stats by Tags](#6-bullet)
* [Annotation Stats by Properties / Property Values](#7-bullet)

## Load a CATMA project <a class="anchor" id="1-bullet"></a>

In [1]:
from gitma import CatmaProject

In [2]:
my_project = CatmaProject(
    project_directory='../test/demo_project/',
    project_uuid='test_corpus_root'
)

Loading Tagsets ...
	 Found 1 Tagset(s).
Loading Documents ...
	 Found 1 Document(s).
Loading Annotation Collections ...
	 Loading Annotation Collection 'ac_2' for Kafka Franz Das Urteil
	-> with 6 Annotations.
	 Loading Annotation Collection 'ac_1' for Kafka Franz Das Urteil
	-> with 6 Annotations.
	 Loading Annotation Collection 'gold_annotation' for Kafka Franz Das Urteil
	-> with 0 Annotations.


## General Project Stats <a class="anchor" id="2-bullet"></a>
The `stats()` method shows metadata about your annotation collections.

In [3]:
my_project.stats()

| | annotations | authors | tags | first_annotation | last_annotation | uuid |
|---|---|---|---|---|---|---|
| ac_1 | 6 | {MVauth} | {process, stative_event} | 2021-08-19 13:45:10 | 2021-08-19 13:48:40 | C_8AB3C081-E606-40B5-98D9-A51CBFE29E6D |
| ac_2 | 6 | {MMeister, MVauth} | {process, stative_event} | 2021-08-19 14:08:20 | 2021-08-19 14:13:02 | C_7EE5CB3F-41E8-4EA3-A7C7-41F0B19579C0 |
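
The columns of this table can be reproduced by hand from per-annotation records. The following is a minimal sketch, not gitma's implementation: it uses only the standard library, takes the annotator, tag and date values that the DataFrame section of this notebook shows for `ac_2`, and derives the same summary. The helper name `collection_summary` is hypothetical.

```python
from datetime import datetime

# (annotator, tag, date) for the six annotations of 'ac_2',
# copied from the DataFrame output shown in this notebook.
AC_2 = [
    ("MMeister", "stative_event", "2021-08-19 14:08:20"),
    ("MVauth", "stative_event", "2021-08-19 14:08:31"),
    ("MVauth", "stative_event", "2021-08-19 14:10:46"),
    ("MMeister", "process", "2021-08-19 14:11:05"),
    ("MVauth", "process", "2021-08-19 14:12:36"),
    ("MVauth", "stative_event", "2021-08-19 14:13:02"),
]


def collection_summary(records):
    """Hypothetical helper: summarise one annotation collection."""
    dates = [datetime.strptime(d, "%Y-%m-%d %H:%M:%S") for _, _, d in records]
    return {
        "annotations": len(records),
        "authors": {annotator for annotator, _, _ in records},
        "tags": {tag for _, tag, _ in records},
        "first_annotation": min(dates),
        "last_annotation": max(dates),
    }


summary = collection_summary(AC_2)
for key, value in summary.items():
    print(key, value)
```

The result agrees with the `ac_2` row above: six annotations, two authors, two tags, and the earliest and latest annotation timestamps.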


## Annotation Overview for the complete Project <a class="anchor" id="3-bullet"></a>
The `plot_interactive()` method plots the annotations of each annotation collection and each document as a separate subplot.

The demo project contains only one annotated document.

In [4]:
my_project.plot_interactive(color_col='annotator')

In [5]:
my_project.plot_interactive(color_col='annotation collection')

In [6]:
my_project.plot_interactive(color_col='tag')

## Plot Annotations for specified Annotation Collection <a class="anchor" id="4-bullet"></a>
The annotations of a single annotation collection can also be plotted as an interactive [Plotly](https://plotly.com/python/) plot.
The annotations can be explored with respect to
- their tag: y-axis
- their text position: x-axis
- the annotated text passages: mouse over
- their properties: mouse over

In [7]:
my_project.ac_dict['ac_2'].plot_annotation_overview()

You can customize the plot by choosing annotation properties for the y-axis and the scatter color.

In [8]:
my_project.ac_dict['ac_2'].plot_annotation_overview(
    prop='mental')

In [9]:
my_project.ac_dict['ac_2'].plot_annotation_overview(
    y_axis='annotator', color_prop='unpredictable')

## Annotations as Pandas DataFrame <a class="anchor" id="5-bullet"></a>

In [10]:
my_project.ac_dict['ac_2'].df

| | document | annotation collection | annotator | tag | left_context | annotation | right_context | start_point | end_point | date | prop:unpredictable | prop:mental | prop:persistent | prop:intentional |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Kafka Franz Das Urteil | ac_2 | MMeister | stative_event | ESCHICHTE VON FRANZ KAFKA\nfür Fräulein Felice... | Es war an einem Sonntagvormittag im schönsten ... | Georg Bendemann, ein junger Kaufmann, saß in ... | 67 | 122 | 2021-08-19 14:08:20 | [no] | [no] | [nan] | [nan] |
| 1 | Kafka Franz Das Urteil | ac_2 | MVauth | stative_event | an einem Sonntagvormittag im schönsten Frühja... | Georg Bendemann, ein junger Kaufmann, saß in ... | . Er hatte gerade einen Brief an einen sich im... | 123 | 356 | 2021-08-19 14:08:31 | [no] | [no] | [nan] | [nan] |
| 2 | Kafka Franz Das Urteil | ac_2 | MVauth | stative_event | er Höhe und Färbung unterschieden, sich hinzog... | Er hatte gerade einen Brief an einen sich im ... | , verschloß ihn in spielerischer Langsamkeit u... | 358 | 443 | 2021-08-19 14:10:46 | [no] | [no] | [nan] | [nan] |
| 3 | Kafka Franz Das Urteil | ac_2 | MMeister | process | sich im Ausland befindenden Jugendfreund beend... | verschloß ihn in spielerischer Langsamkeit | und sah dann, den Ellbogen auf den Schreibtis... | 445 | 487 | 2021-08-19 14:11:05 | [no] | [no] | [no] | [no] |
| 4 | Kafka Franz Das Urteil | ac_2 | MVauth | process | endet, verschloß ihn in spielerischer Langsamk... | und sah dann aus dem Fenster auf den Fluß, di... | .\n\nEr dachte darüber nach, wie dieser Freund... | 488 | 643 | 2021-08-19 14:12:36 | [no] | [yes] | [no] | [yes] |
| 5 | Kafka Franz Das Urteil | ac_2 | MVauth | stative_event | oß ihn in spielerischer Langsamkeit und sah da... | den Ellbogen auf den Schreibtisch gestützt | , aus dem Fenster auf den Fluß, die Brücke und... | 502 | 544 | 2021-08-19 14:13:02 | [no] | [no] | [nan] | [nan] |


## Annotation Stats by Tags <a class="anchor" id="6-bullet"></a>
For each tag, the `tag_stats()` method reports
- the number of annotations
- the full text span annotated by the tag
- the average text span of the annotations
- the most frequent tokens (a stopword list can be supplied)

In [11]:
my_project.ac_dict['ac_2'].tag_stats(ranking=5)

| | annotations | text_span | text_span_mean | token1 | token2 | token3 | token4 | token5 |
|---|---|---|---|---|---|---|---|---|
| stative_event | 4 | 415 | 103.75 | im: 3 | in: 3 | an: 2 | der: 2 | sich: 2 |
| process | 2 | 197 | 98.5 | und: 2 | die: 2 | verschloß: 1 | ihn: 1 | in: 1 |
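
The `annotations`, `text_span` and `text_span_mean` columns can be checked against the `start_point` and `end_point` values listed in the DataFrame section. This is a minimal sketch using only the standard library; it assumes that the span of one annotation is `end_point - start_point`, which matches the figures in this notebook but is not taken from gitma's source. The helper name `span_stats` is hypothetical.

```python
from collections import defaultdict

# (tag, start_point, end_point) for the six annotations of 'ac_2',
# copied from the DataFrame output shown in this notebook.
SPANS = [
    ("stative_event", 67, 122),
    ("stative_event", 123, 356),
    ("stative_event", 358, 443),
    ("process", 445, 487),
    ("process", 488, 643),
    ("stative_event", 502, 544),
]


def span_stats(spans):
    """Hypothetical helper: count, total span and mean span per tag."""
    lengths = defaultdict(list)
    for tag, start, end in spans:
        # Assumption: an annotation's span is end_point - start_point.
        lengths[tag].append(end - start)
    return {
        tag: {
            "annotations": len(values),
            "text_span": sum(values),
            "text_span_mean": sum(values) / len(values),
        }
        for tag, values in lengths.items()
    }


stats = span_stats(SPANS)
for tag, row in stats.items():
    print(tag, row)
```

Grouping by another column, such as the annotator, works the same way: only the first element of each tuple changes.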


You can also apply the method to properties, if any were used during annotation, and to annotators:

In [12]:
my_project.ac_dict['ac_2'].tag_stats(tag_col='prop:mental', ranking=3, stopwords=['in', 'im'])

| | annotations | text_span | text_span_mean | token1 | token2 | token3 |
|---|---|---|---|---|---|---|
| no | 5 | 457 | 91.4 | an: 2 | der: 2 | sich: 2 |
| yes | 1 | 155 | 155.0 | und: 2 | die: 2 | sah: 1 |


Here, each row shows the data for one property value.
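
The effect of the `ranking` and `stopwords` arguments on the token columns can be illustrated with a small frequency count. This is only a sketch under stated assumptions: it splits on whitespace, which may differ from the tokenization gitma uses, and it uses only the two annotation texts that the DataFrame section shows in full. The helper name `top_tokens` is hypothetical.

```python
from collections import Counter

# The two annotation texts of 'ac_2' that are not truncated
# in the DataFrame output shown in this notebook.
TEXTS = [
    "verschloß ihn in spielerischer Langsamkeit",
    "den Ellbogen auf den Schreibtisch gestützt",
]


def top_tokens(texts, ranking=3, stopwords=()):
    """Hypothetical helper: most frequent tokens, ignoring stopwords."""
    stop = set(stopwords)
    counts = Counter(
        token
        for text in texts
        for token in text.split()  # assumption: whitespace tokenization
        if token not in stop
    )
    return counts.most_common(ranking)


result = top_tokens(TEXTS, ranking=3, stopwords=["in", "im"])
print(result)
```

With `in` on the stopword list, it no longer appears among the top tokens, just as `in` and `im` are missing from the table above.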

In [13]:
my_project.ac_dict['ac_2'].tag_stats(tag_col='annotator', ranking=3)

| | annotations | text_span | text_span_mean | token1 | token2 | token3 |
|---|---|---|---|---|---|---|
| MMeister | 2 | 97 | 48.5 | Es: 1 | war: 1 | an: 1 |
| MVauth | 4 | 515 | 128.75 | in: 3 | die: 3 | und: 3 |


Here, each row shows the data for one annotator.

## Annotation Stats by Properties / Property Values <a class="anchor" id="7-bullet"></a>

In [14]:
my_project.ac_dict['ac_2'].property_stats()

| | nan | no | yes |
|---|---|---|---|
| prop:unpredictable | | 6.0 | |
| prop:mental | | 5.0 | 1.0 |
| prop:persistent | 4.0 | 2.0 | |
| prop:intentional | 4.0 | 1.0 | 1.0 |
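
These counts follow directly from the `prop:` columns of the DataFrame shown earlier in this notebook. The following is a minimal sketch of the tally using only the standard library, not gitma's implementation. The property values are copied from that DataFrame output, with the string `"nan"` standing in for missing values; the helper name `property_value_counts` is hypothetical.

```python
from collections import Counter

# Property values per annotation of 'ac_2' (rows 0 to 5), copied from
# the DataFrame output; each annotation carries a list of values.
PROPS = {
    "prop:unpredictable": [["no"], ["no"], ["no"], ["no"], ["no"], ["no"]],
    "prop:mental": [["no"], ["no"], ["no"], ["no"], ["yes"], ["no"]],
    "prop:persistent": [["nan"], ["nan"], ["nan"], ["no"], ["no"], ["nan"]],
    "prop:intentional": [["nan"], ["nan"], ["nan"], ["no"], ["yes"], ["nan"]],
}


def property_value_counts(props):
    """Hypothetical helper: count every value of every property."""
    return {
        name: Counter(value for values in rows for value in values)
        for name, rows in props.items()
    }


counts = property_value_counts(PROPS)
for name, counter in counts.items():
    print(name, dict(counter))
```

The four `nan` entries for `prop:persistent` and `prop:intentional` belong to the `stative_event` annotations, which carry no value for these two properties in the demo data.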
