# Explore Annotations
This demo uses the demo CATMA project.
To use it for your own annotations, you first have to clone your CATMA project locally.\
For more information about cloning your CATMA project, see this [Notebook](https://github.com/forTEXT/gitma/tree/main/demo_notebooks/load_project_from_gitlab.ipynb).

## Table Of Contents
* [Load your CATMA project](#1-bullet)
* [General Project Stats](#2-bullet)
* [Annotation Overview for the Project](#3-bullet)
* [Plot Annotations for specified Annotation Collection](#4-bullet)
* [Annotation Collection as Pandas DataFrame](#5-bullet)
* [Annotation Stats by Tags](#6-bullet)
* [Annotation Stats by Properties](#7-bullet)

## Load a CATMA project <a class="anchor" id="1-bullet"></a>

In [None]:
from gitma import CatmaProject

In [None]:
my_project = CatmaProject(
    project_directory='../test/demo_project/',
    project_name='test_corpus'
)

## General Project Stats <a class="anchor" id="2-bullet"></a>
The method `stats()` shows some metadata about your annotation collections.

In [None]:
my_project.stats()

## Annotation Overview for the complete Project <a class="anchor" id="3-bullet"></a>
The method `plot_interactive()` plots the annotations of each annotation collection and each document as a single subplot.

The demo project contains only one annotated document but two annotation collections.
Clicking a legend entry deactivates the corresponding annotation collection within the plot.

Hovering over a scatter point lets you explore the individual annotation.

In [None]:
my_project.plot_interactive()

The plot can be customized with the `color_col` parameter,
for example to visualize property annotations...

In [None]:
my_project.plot_interactive(color_col='prop:intentional')

... or the annotators...

In [None]:
my_project.plot_interactive(color_col='annotator')

## Plot Annotations for specified Annotation Collection <a class="anchor" id="4-bullet"></a>

### Scatter Plot
The annotations of a single annotation collection can also be plotted as an interactive [Plotly scatter plot](https://plotly.com/python/).
The annotations can be explored with respect to
- their tag: y-axis
- their text position: x-axis
- the annotated text passages: mouse over
- their properties: mouse over

In [None]:
my_project.ac_dict['ac_2'].plot_annotations()

You can customize the plot by choosing annotation properties for the y_axis and the scatter color.

In [None]:
my_project.ac_dict['ac_2'].plot_annotations(prop='mental')

In [None]:
my_project.ac_dict['ac_1'].plot_annotations(
    y_axis='annotator',
    color_prop='unpredictable'
)

### Cooccurrence Networks
An alternative way to visualize annotation collections is as networks.
They can be used to gain insight into the cooccurrence of annotations.

In [None]:
my_project.ac_dict['ac_1'].cooccurrence_network()

The networks can be customized with the following optional parameters:

- character_distance: the text span within which two annotations are considered cooccurrent. The default is 100 characters.
- included_tags: a list of tags to include when drawing the graph
- excluded_tags: a list of tags to exclude when drawing the graph

Since the demo project contains only 6 annotations with 2 tags in both annotation collections, these parameters don't make a difference:

In [None]:
my_project.ac_dict['ac_1'].cooccurrence_network(
    character_distance=50,
    included_tags=['process', 'stative_event'],
    excluded_tags=None
)

This example from a larger annotation collection shows that the edge weights visualize the number of cooccurrences:

<img src="demo_img/network_example.png">
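
To make the idea behind `character_distance`, `included_tags` and `excluded_tags` more concrete, here is a minimal sketch in plain Python. It is not gitma's implementation: it assumes that two annotations count as cooccurrent when the gap between their character spans is at most `character_distance`, and the helper `cooccurrence_counts` as well as the toy offsets are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy annotations as (tag, start, end) character offsets; all values are invented.
annotations = [
    ("process", 0, 20),
    ("stative_event", 30, 45),
    ("process", 200, 230),
    ("stative_event", 240, 260),
    ("process", 500, 520),
]


def cooccurrence_counts(annotations, character_distance=100,
                        included_tags=None, excluded_tags=None):
    """Count tag pairs whose spans are at most `character_distance` apart.

    Hypothetical helper: the distance definition is an assumption,
    not taken from gitma.
    """
    # Apply the optional tag filters first.
    selected = [
        a for a in annotations
        if (included_tags is None or a[0] in included_tags)
        and (excluded_tags is None or a[0] not in excluded_tags)
    ]
    counts = Counter()
    for (tag_a, start_a, end_a), (tag_b, start_b, end_b) in combinations(selected, 2):
        # Gap between the two spans; zero or negative if they touch or overlap.
        gap = max(start_a, start_b) - min(end_a, end_b)
        if gap <= character_distance:
            # Sort the pair so that (a, b) and (b, a) share one edge.
            counts[tuple(sorted((tag_a, tag_b)))] += 1
    return counts


edge_weights = cooccurrence_counts(annotations, character_distance=50)
print(edge_weights)
```

Each key of `edge_weights` corresponds to an edge in the network and each value to its weight, which is what the line thickness in the image above expresses.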

## Annotations as Pandas DataFrame <a class="anchor" id="5-bullet"></a>

In [None]:
my_project.ac_dict['ac_2'].df
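
Because the annotations are available as a regular Pandas DataFrame, the usual pandas operations apply. The following sketch uses a small hand-made DataFrame rather than the demo project; the column names `annotator` and `prop:mental` are the ones used as `tag_col` values later in this notebook, while the `tag` column and all cell values are assumptions for illustration.

```python
import pandas as pd

# Hand-made stand-in for an annotation DataFrame; values are invented.
df = pd.DataFrame({
    "tag": ["process", "stative_event", "process"],
    "annotator": ["annotator_a", "annotator_b", "annotator_a"],
    "prop:mental": ["yes", "no", "no"],
})

# Keep only the annotations with a given tag.
process_only = df[df["tag"] == "process"]

# Count the annotations per annotator.
per_annotator = df["annotator"].value_counts()

print(process_only)
print(per_annotator)
```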

## Annotation Stats by Tags <a class="anchor" id="6-bullet"></a>
The `tag_stats()` method reports for each tag
- the number of annotations
- the full text span annotated by the tag
- the average text span of the annotations
- the most frequent tokens (here, it is possible to define a stopword list)

In [None]:
my_project.ac_dict['ac_2'].tag_stats(ranking=5)
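
For orientation, the first three figures in the list above can be approximated by hand with a pandas `groupby`. This is only a sketch under assumptions, not the code behind `tag_stats()`: the column names `tag`, `start_point` and `end_point` and all values are hypothetical, and the token ranking is left out.

```python
import pandas as pd

# Toy data; column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "tag": ["process", "process", "stative_event"],
    "start_point": [0, 50, 100],
    "end_point": [20, 80, 110],
})

# Length of each annotation in characters.
df["span"] = df["end_point"] - df["start_point"]

# One row per tag: number of annotations, summed span and mean span.
stats = df.groupby("tag")["span"].agg(
    annotations="count",
    full_text_span="sum",
    mean_text_span="mean",
)
print(stats)
```

Grouping by another column, such as an annotator or property column, would yield the same figures per annotator or per property value.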

Additionally, you can apply the method to properties, if you used any in the annotation process, and to different annotators:

In [None]:
my_project.ac_dict['ac_2'].tag_stats(tag_col='prop:mental', ranking=3, stopwords=['in', 'im'])

Here, each row shows the data for one property value.

In [None]:
my_project.ac_dict['ac_2'].tag_stats(tag_col='annotator', ranking=3)

Here, each row shows the data for one annotator.

## Annotation Stats by Properties / Property Values <a class="anchor" id="7-bullet"></a>

In [None]:
my_project.ac_dict['ac_2'].property_stats()