# Annotations

This notebook is used to annotate a sample of utterances from the dataset by Svanes and Gunstad (2020). It is a small experiment to get a sense of the quality of the data and to calculate inter-annotator agreement using Fleiss' kappa.
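
For reference, Fleiss' kappa compares the observed agreement between annotators with the agreement expected by chance. The sketch below is a minimal pure-Python version and is not necessarily how the agreement was computed for this experiment. The function name `fleiss_kappa` and the input layout (one list of labels per utterance, one label per annotator) are assumptions made for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=(1, 2, 3, 4, 5)):
    """Fleiss' kappa for `ratings`: one list per utterance, one label per annotator.

    Hypothetical helper; assumes every utterance was labelled by the same
    number of annotators (at least two) using the category codes 1-5.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every utterance needs the same number (>= 2) of labels")

    # counts[i][j] = number of annotators who gave utterance i category j
    counts = [[Counter(r)[c] for c in categories] for r in ratings]

    # Observed agreement per utterance, averaged over all utterances
    p_items = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of each category
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1:
        return float("nan")  # all labels in one category: kappa is undefined
    return (p_bar - p_e) / (1 - p_e)


# Made-up labels from three annotators for three utterances
example = [[1, 1, 1], [3, 3, 3], [5, 5, 5]]
print(fleiss_kappa(example))  # complete agreement, so kappa is 1
```

With the combined annotation files, `ratings` could be built from the per-annotator columns, for example `df[["annotator_a", "annotator_b"]].values.tolist()`, where the column names are placeholders.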

# Definitions from Svanes and Gunstad
The annotation guidelines are attached in the appendix of their paper.

The following section is adapted from Svanes and Gunstad (2020).

<img src="images/hateful1.png">

<img src="images/hateful2.png">

<img src="images/moderately_hateful.png">

<img src="images/offensive1.png">

<img src="images/offensive2.png">

<img src="images/provocative1.png">

<img src="images/provocative2.png">

<img src="images/neutral.png">

# Code

### Imports

In [None]:
import pandas as pd
import os

import warnings
warnings.filterwarnings("ignore")

### Import data to annotate

In [None]:
df = pd.read_csv(os.getcwd() + "/annotated_combined.csv")
df_offensive = pd.read_csv(os.getcwd() + "/annotated_offensive_combined.csv")

In [None]:
display(df.head(3))
display(df_offensive.head(3))

In [None]:
# Write the offensive subset back to its own file, not to the five-class file
df_offensive.to_csv(os.getcwd() + "/annotated_offensive_combined.csv", index = False)

In [None]:
ANNOTATOR = "vilde"

In [None]:
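# Annotation function
def annotate(df = df, i = 0):
    """Interactively annotate the rows of df, starting at row position i."""
    IDS = df.id.values

    while i < len(IDS):
        ID = IDS[i]
        row = df[df.id == ID]

        print()
        print("ID: ", ID)
        print(row.text.values[0], "\n")

        # Read the command: 1-5 = category, -1 = go back, -10 = stop
        grp = input("Category [1-5], -1 to go back, -10 to stop: ").strip()
        print()

        if grp == "":
            continue  # empty input: show the same utterance again
        try:
            grp = int(grp)
        except ValueError:
            print("Invalid input:", grp)
            continue

        if grp == -1:
            i = max(i - 1, 0)  # step back to redo the previous annotation
            continue
        elif grp == -10:
            print("Stopped annotating at:", i, "  |  ID: ", ID)
            print()
            break
        elif grp not in range(1, 6):
            print("Invalid category:", grp)
            continue

        # Use the row's own index label so this also works for a non-default index
        df.at[row.index[0], ANNOTATOR] = grp
        i += 1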

# Run annotations

The ```annotate(i)``` function takes ```i``` as input, which is the row to start annotating from. The default value is 0, which corresponds to the first row.


Run the cell below to start annotating with the following commands:
* Annotate with a category: an integer from ```1``` to ```5```, where 1 is neutral, 2 provocative, 3 offensive, 4 moderately hateful, and 5 hateful.
* Go back and redo the previous annotation: ```-1```
* Stop annotating: ```-10```
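
The commands above can be sketched as a small lookup plus a parsing step. The names `CATEGORY_LABELS` and `parse_command` are hypothetical and not part of the notebook's own code; they only illustrate how a typed command maps to an action.

```python
# Category codes as listed above
CATEGORY_LABELS = {
    1: "neutral",
    2: "provocative",
    3: "offensive",
    4: "moderately hateful",
    5: "hateful",
}


def parse_command(raw):
    """Map a typed command to an (action, value) pair.

    Hypothetical helper. Returns ("label", code) for a category from 1 to 5,
    ("back", None) for -1, ("stop", None) for -10, and ("invalid", None)
    for anything else, including empty input.
    """
    try:
        value = int(raw.strip())
    except ValueError:
        return ("invalid", None)
    if value == -1:
        return ("back", None)
    if value == -10:
        return ("stop", None)
    if value in CATEGORY_LABELS:
        return ("label", value)
    return ("invalid", None)


# Show how a few typed inputs are interpreted
for typed in ["3", "-1", "-10", "7", ""]:
    print(repr(typed), "->", parse_command(typed))
```

The same dictionary can be used to turn saved numeric annotations into readable names, for example with `df[ANNOTATOR].map(CATEGORY_LABELS)`.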

### Five classes 

In [None]:
#annotate() 

#### Save annotations
Uncomment the lines below to save the annotations.

In [None]:
#df.to_csv(os.getcwd() + '/annotated_' + ANNOTATOR +".csv", index = False)
#test = pd.read_csv(os.getcwd() + '/annotated_' + ANNOTATOR +".csv")
#test.tail()

### Only offensive classes
The same annotation process but for 25 samples from each of the offensive classes.

In [None]:
annotate(df = df_offensive, i = 0)

In [None]:
display(df_offensive.head()) # Check the dataframe to confirm that the annotations are included
display(df_offensive.tail()) # None of these should be 0

#### Save annotations 

In [None]:
df_offensive.to_csv(os.getcwd() + '/annotated_offensive_' + ANNOTATOR +".csv", index = False)
test = pd.read_csv(os.getcwd() + '/annotated_offensive_' + ANNOTATOR +".csv")

In [None]:
test.tail(5) # Check that the file was saved, can be read back in, and looks as expected

In [None]:
test.head() # Same check as above

# Notes