# Notebook: Calculate Agreement

This notebook calculates the inter-rater agreement between two annotators using Krippendorff's Alpha and Fleiss' Kappa.
<br>**Contributors:** [Nils Hellwig](https://github.com/NilsHellwig/) | [Markus Bink](https://github.com/MarkusBink/)

## Packages

In [1]:
from statsmodels.stats import inter_rater as irr
import krippendorff as kd
import pandas as pd
import numpy as np
import glob
import os

## Parameters

In [2]:
ANNOTATED_DATASET_PATH = "../Datasets/annotated_dataset/"
LABEL_CODING = {'NEUTRAL': 3, 'NEGATIVE': 2, 'POSITIVE': 1, 'MIXED': 4}

## Code

### 1. Load Annotations

In [3]:
file_list = sorted(glob.glob(ANNOTATED_DATASET_PATH + "*.xlsx"))

In [4]:
# read and concatenate annotator 1's session 1 and 2 data
df_annotator_1 = pd.concat([
    pd.read_excel(ANNOTATED_DATASET_PATH + "tweets_session_1_1.xlsx"),
    pd.read_excel(ANNOTATED_DATASET_PATH + "tweets_session_2_1.xlsx")
])

In [5]:
# rename sentiment column and recode labels
df_annotator_1.rename(columns={'sentiment': 'sentiment_1'}, inplace=True)
df_annotator_1['sentiment_1'] = df_annotator_1['sentiment_1'].map(LABEL_CODING)

In [6]:
# read and concatenate annotator 2's session 1 and 2 data
df_annotator_2 = pd.concat([
    pd.read_excel(ANNOTATED_DATASET_PATH + "tweets_session_1_2.xlsx"),
    pd.read_excel(ANNOTATED_DATASET_PATH + "tweets_session_2_2.xlsx")
])

In [7]:
# rename sentiment column and recode labels
df_annotator_2.rename(columns={'sentiment': 'sentiment_2'}, inplace=True)
df_annotator_2['sentiment_2'] = df_annotator_2['sentiment_2'].map(LABEL_CODING)

In [8]:
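# place annotator 1 and 2 labels side by side, aligned by row position
# (each frame still carries duplicate index values from its two sessions,
# so the index is reset before the column-wise concat)
df_all_annotations = pd.concat([
    df_annotator_1[['sentiment_1']].reset_index(drop=True),
    df_annotator_2[['sentiment_2']].reset_index(drop=True)
], axis=1)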

In [9]:
# check for missing values in sentiment columns
print(df_all_annotations[df_all_annotations['sentiment_1'].isnull()])
print(df_all_annotations[df_all_annotations['sentiment_2'].isnull()])

Unnamed: 0,sentiment_1,sentiment_2
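
Missing values can appear at this point because `Series.map` with a dictionary returns `NaN` for any label that is not a key of `LABEL_CODING` (for example a lowercase or misspelled label). A minimal sketch of the same check on plain Python lists; the helper `unmapped_labels` and the toy labels are hypothetical and not part of the dataset:

```python
LABEL_CODING = {'NEUTRAL': 3, 'NEGATIVE': 2, 'POSITIVE': 1, 'MIXED': 4}


def unmapped_labels(raw_labels, coding):
    """Return labels absent from `coding`, in order of first appearance."""
    return list(dict.fromkeys(label for label in raw_labels if label not in coding))


# Hypothetical raw labels; two of them would become NaN after .map(LABEL_CODING)
toy_raw = ["POSITIVE", "neutral", "MIXED", "NEGATVE", "neutral"]
toy_unmapped = unmapped_labels(toy_raw, LABEL_CODING)
print(toy_unmapped)
```

An empty result means every raw label has a code, which corresponds to the empty output of the missing-value check above.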


In [10]:
df_all_annotations = df_all_annotations.reset_index(drop=True)

In [11]:
# number of tweets on which both annotators assigned the same label
equal_values = len(df_all_annotations[df_all_annotations['sentiment_1'] == df_all_annotations['sentiment_2']])
equal_values

1674
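
The count of matching labels can be turned into a raw percent agreement by dividing by the number of items. A minimal sketch on toy data, assuming both annotators labeled the same items in the same order; `percent_agreement` and the toy lists are hypothetical names introduced for illustration:

```python
def percent_agreement(labels_a, labels_b):
    """Share of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical toy labels in the notebook's coding
# (1 = POSITIVE, 2 = NEGATIVE, 3 = NEUTRAL, 4 = MIXED)
toy_annotator_1 = [1, 2, 3, 3, 4]
toy_annotator_2 = [1, 2, 3, 2, 4]
toy_agreement = percent_agreement(toy_annotator_1, toy_annotator_2)
print(toy_agreement)  # 4 of 5 items match
```

Raw agreement does not correct for agreement expected by chance, which is why the chance-corrected coefficients in the following sections are reported.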

### 2. Calculate Krippendorff's Alpha

In [12]:
# kd.alpha expects reliability data of shape (coders, items):
# rows are the coders (annotators), columns are the individual items (tweets)
reliability_data = df_all_annotations.loc[:, df_all_annotations.columns != 'id']
reliability_data = reliability_data.to_numpy().transpose()
kd.alpha(reliability_data=reliability_data, level_of_measurement="nominal")

0.7266937120270791
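
To make the coefficient less of a black box, the following is a minimal sketch of the standard coincidence-matrix formulation of Krippendorff's Alpha for nominal data, written with the standard library only. It is not the `krippendorff` package's implementation; the function name `nominal_alpha` and the toy data are assumptions made for illustration:

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of per-item label lists; None marks a missing rating.
    """
    coincidences = Counter()
    for ratings in units:
        values = [v for v in ratings if v is not None]
        m = len(values)
        if m < 2:
            continue  # items with fewer than two ratings cannot be paired
        # every ordered pair of ratings within an item counts 1 / (m - 1)
        for c, k in permutations(values, 2):
            coincidences[(c, k)] += 1 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = n * n - sum(t * t for t in totals.values())
    if expected == 0:
        raise ValueError("alpha is undefined without variation in the pairable labels")
    return 1 - (n - 1) * observed / expected


# Hypothetical toy data: five items, two annotators, one disagreement
toy_units = [[1, 1], [2, 2], [3, 3], [3, 2], [4, 4]]
toy_alpha = nominal_alpha(toy_units)
print(toy_alpha)
```

For the toy data the observed disagreement is 2, the expected term is 10² − (2² + 3² + 3² + 2²) = 74, and alpha is 1 − 9·2/74 = 28/37.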

### 3. Calculate Fleiss' Kappa

In [13]:
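# convert the (items x raters) labels into an (items x categories) count table;
# aggregate_raters returns a tuple (table, categories)
agg = irr.aggregate_raters(df_all_annotations.loc[:, df_all_annotations.columns != 'id'])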

In [14]:
irr.fleiss_kappa(agg[0], method='fleiss')

0.7266253683691716
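
As a cross-check of what the coefficient measures, here is a minimal sketch of the textbook Fleiss' Kappa formula using only the standard library. It assumes every item was labeled by the same number of raters; `fleiss_kappa_from_labels` and the toy data are hypothetical and are not the `statsmodels` implementation:

```python
from collections import Counter


def fleiss_kappa_from_labels(units):
    """Fleiss' kappa for items that were each labeled by the same number of raters."""
    if not units:
        raise ValueError("no items to compare")
    n_raters = len(units[0])
    if n_raters < 2 or any(len(ratings) != n_raters for ratings in units):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    n_items = len(units)
    category_totals = Counter()
    per_item_agreement = []
    for ratings in units:
        counts = Counter(ratings)
        category_totals.update(counts)
        # share of rater pairs that agree on this item
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_item_agreement.append(agreeing_pairs / (n_raters * (n_raters - 1)))

    observed = sum(per_item_agreement) / n_items
    # chance agreement from the overall category proportions
    expected = sum((t / (n_items * n_raters)) ** 2 for t in category_totals.values())
    if expected == 1:
        raise ValueError("kappa is undefined when only one label is used")
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: five items, two annotators, one disagreement
toy_units = [[1, 1], [2, 2], [3, 3], [3, 2], [4, 4]]
toy_kappa = fleiss_kappa_from_labels(toy_units)
print(toy_kappa)
```

On the toy data the observed agreement is 0.8 and the chance agreement is 0.26, giving (0.8 − 0.26) / (1 − 0.26) = 27/37. With complete data the two coefficients differ only in how chance agreement is estimated, so they move closer together as the number of ratings grows, which is consistent with the nearly identical values reported in this notebook.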