This notebook calculates inter-annotator agreement for two kinds of data: categorical labels (using Cohen's $\kappa$) and real-valued ratings (using Krippendorff's $\alpha$). We calculate agreement for two tasks: subjectivity/objectivity and suspense.

#### Cell 2: Importing Libraries
This first code block imports the necessary classes and functions from the `nltk` (Natural Language Toolkit) library. It also imports the `sys` module, although it is not used in the provided code.

In [None]:
# Import AnnotationTask, which is the main class for calculating agreement scores.
from nltk.metrics.agreement import AnnotationTask

# Import distance metrics. interval_distance is for real-valued data (like 1.0, 2.5, 5.0).
# binary_distance is for categorical data where labels are either the same or different (e.g., "A" vs "B").
from nltk.metrics import interval_distance, binary_distance 

# Import the sys module for system-specific parameters and functions (not used in this notebook).
import sys
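
To make the difference between the two distance metrics concrete, here is a minimal pure-Python sketch of what each one computes. The `_sketch` function names are hypothetical stand-ins written for illustration; they are not the NLTK functions themselves, only an approximation of their behavior as I understand it (0/1 for match/mismatch, and squared difference for numeric labels).

```python
def binary_distance_sketch(label1, label2):
    """Categorical distance: 0.0 if the labels match, 1.0 otherwise."""
    return 0.0 if label1 == label2 else 1.0


def interval_distance_sketch(label1, label2):
    """Interval distance: squared difference between two numeric labels."""
    return (label1 - label2) ** 2


# For labels coded 0/1 the two distances coincide.
print(binary_distance_sketch(1.0, 0.0), interval_distance_sketch(1.0, 0.0))  # 1.0 1.0

# For wider numeric scales, the interval distance grows with the gap.
print(binary_distance_sketch(2.0, 5.0), interval_distance_sketch(2.0, 5.0))  # 1.0 9.0
```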

#### Cell 3: Krippendorff's Alpha Function
This cell defines a function that calculates Krippendorff's $\alpha$ (alpha). Krippendorff's $\alpha$ can be applied to nominal, ordinal, or interval data depending on the distance function, and it can handle more than two raters and missing ratings. Here it is paired with `interval_distance`, which suits numerical (integer or real-valued) ratings.

In [None]:
# Define a function that takes a list of annotation triples as input.
# A triple is in the format (annotator_id, item_id, label).
def krippendorff_alpha(annotation_triples):

    # Create an AnnotationTask instance. 
    # The 'distance' parameter is set to interval_distance because we're dealing with real-valued data.
    t = AnnotationTask(annotation_triples, distance=interval_distance)
    
    # Calculate Krippendorff's alpha for the task.
    result = t.alpha()
    
    # Print the result, formatted to three decimal places.
    print("%.3f" % result)
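
As a rough illustration of what the score measures, the sketch below computes $\alpha = 1 - D_o / D_e$ in pure Python, assuming exactly two raters, no missing ratings, and a squared-difference distance. The function name and the toy ratings are made up for this example; this is not NLTK's implementation, only a simplified version of the same idea.

```python
from itertools import permutations


def interval_alpha_two_raters(ratings_a, ratings_b):
    """Krippendorff's alpha (squared-difference distance), two raters, complete data."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n_items = len(ratings_a)

    # Observed disagreement D_o: mean squared difference within each item.
    observed = sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)) / n_items

    # Expected disagreement D_e: mean squared difference over all ordered
    # pairs of ratings, pooled across both raters and all items.
    pooled = list(ratings_a) + list(ratings_b)
    n = len(pooled)
    expected = sum((x - y) ** 2 for x, y in permutations(pooled, 2)) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when every rating is identical")

    return 1 - observed / expected


# Toy ratings (invented): the raters differ by one point on the last item only.
toy_a = [1, 2, 3, 4]
toy_b = [1, 2, 3, 5]
print("%.3f" % interval_alpha_two_raters(toy_a, toy_b))  # 0.937
```

Values near 1 indicate that within-item differences are small relative to the overall spread of ratings; values near 0 indicate agreement no better than chance.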

#### Cell 4: Cohen's Kappa Function
This cell defines a function to calculate Cohen's $\kappa$ (kappa). This metric is typically used to measure agreement between **two** annotators on a **categorical** task.

*Note: The distance is set to `binary_distance`, which scores two labels as 0 when they match and 1 otherwise, the usual choice for Cohen's $\kappa$ over binary or nominal categories. For labels coded 0/1, `interval_distance` (the squared difference) would give the same agreement values; for other numeric codings it would penalize disagreements by their size, which is not what a plain $\kappa$ measures.*

In [None]:
# Define a function that takes a list of annotation triples as input.
def cohens_kappa(annotation_triples):

    # Create an AnnotationTask instance.
    # binary_distance treats labels as categories: 0 if equal, 1 otherwise.
    t = AnnotationTask(annotation_triples, distance=binary_distance)
    
    # Calculate Cohen's kappa for the task.
    result = t.kappa()
    
    # Print the result, formatted to three decimal places.
    print("%.3f" % result)
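
For intuition about what the score means, here is a minimal pure-Python sketch of Cohen's $\kappa$ for two annotators, $\kappa = (A_o - A_e) / (1 - A_e)$, where $A_o$ is the observed agreement and $A_e$ the agreement expected by chance. The function name and the toy labels are hypothetical and exist only for illustration; this is not the NLTK code path.

```python
from collections import Counter


def cohens_kappa_from_scratch(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement: probability that both annotators pick the same
    # label by chance, given each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum((freq_a[k] / n) * (freq_b[k] / n) for k in freq_a)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both annotators use one identical label")

    return (observed - expected) / (1 - expected)


# Toy binary labels (invented): the annotators disagree on one of eight items.
toy_a = [1, 1, 0, 0, 1, 0, 1, 1]
toy_b = [1, 0, 0, 0, 1, 0, 1, 1]
print("%.3f" % cohens_kappa_from_scratch(toy_a, toy_b))  # 0.750
```

Here the observed agreement is 7/8 and the chance agreement is 0.5, so $\kappa = (0.875 - 0.5) / 0.5 = 0.75$.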

#### Cell 5: File Reading Function
This function, `read_annos`, reads a tab-separated values (`.tsv`) file. It assumes a header row (which is skipped) followed by rows with two columns: the annotation value and the corresponding sentence.

In [None]:
# Define a function to read annotations from a given filename.
def read_annos(filename):
    # Initialize an empty list to store annotation scores.
    annos=[]
    # Initialize an empty list to store the sentences.
    sentences=[]
    # Open the specified file with UTF-8 encoding.
    with open(filename, encoding="utf-8") as file:

        # Read the first line (header) and split it by tabs, but don't use it.
        header=file.readline().rstrip().split("\t")
            
        # Loop through the rest of the lines in the file.
        for line in file:
            # Strip whitespace from the end of the line and split it into columns by tabs.
            cols=line.rstrip().split("\t")
            # Convert the first column to a float and add it to the 'annos' list.
            annos.append(float(cols[0]))
            # Add the second column (the sentence) to the 'sentences' list.
            sentences.append(cols[1])
    # Return the two lists: one for annotations and one for sentences.
    return annos, sentences
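
The sketch below shows the file layout this reader expects, using an in-memory string instead of a real file. The header names (`label`, `sentence`), the sample rows, and the helper name `read_annos_from_handle` are all invented for illustration; the helper follows the same parsing steps but also skips rows that lack a second column, which is an assumption about how one might want to handle malformed lines.

```python
import io

# Hypothetical file contents: one header row, then "value<TAB>sentence" rows.
sample_tsv = (
    "label\tsentence\n"
    "1\tThe door creaked open.\n"
    "0\tShe was born in Ohio.\n"
)


def read_annos_from_handle(handle):
    """Parse annotation values and sentences from an open text handle."""
    annos, sentences = [], []
    handle.readline()  # skip the header row
    for line in handle:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue  # skip blank or malformed rows
        annos.append(float(cols[0]))
        sentences.append(cols[1])
    return annos, sentences


annos, sentences = read_annos_from_handle(io.StringIO(sample_tsv))
print(annos)      # [1.0, 0.0]
print(sentences)  # ['The door creaked open.', 'She was born in Ohio.']
```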

#### Cell 6: Data Conversion Function
This function, `convert_anno_list`, transforms a simple list of annotation scores into the structured format required by the NLTK `AnnotationTask` class. The required format is a list of tuples, where each tuple is `(annotator_id, item_id, label)`.

In [None]:
# Define a function that takes a list of annotations and an annotator ID.
def convert_anno_list(annos, annotator_id):
    # Initialize an empty list to store the converted data.
    converted=[]
    # Loop through the annotations list with both index (idx) and value (anno).
    for idx, anno in enumerate(annos):
        # Create a tuple with the annotator's ID, the item's index, and the annotation value.
        # Then, append this tuple to the 'converted' list.
        converted.append((annotator_id, idx, anno))
    # Return the newly formatted list of annotation triples.
    return converted
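
To show the shape of the converted data, here is a small sketch that builds the same triples with a list comprehension. The name `to_triples` and the toy scores are hypothetical; the output format follows the `(annotator_id, item_id, label)` layout described above.

```python
def to_triples(annos, annotator_id):
    """Build (annotator_id, item_index, label) triples from a list of scores."""
    return [(annotator_id, idx, anno) for idx, anno in enumerate(annos)]


# Two annotators rating the same three items (toy scores).
triples_a = to_triples([1.0, 0.0, 1.0], 0)
triples_b = to_triples([1.0, 1.0, 1.0], 1)

print(triples_a)  # [(0, 0, 1.0), (0, 1, 0.0), (0, 2, 1.0)]
print(triples_b)  # [(1, 0, 1.0), (1, 1, 1.0), (1, 2, 1.0)]
```

Because the item ID is the list index, the two annotators' ratings are matched by position, so both files need to list the items in the same order.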

#### Cell 7 & 8: File Path Placeholders
These cells are placeholders where you must specify the paths to your two annotation files. One file is for the first annotator, and the other is for the second.

In [None]:
# Assign the file path for the first annotator's data to a variable.
# YOU MUST REPLACE "path to your filename name" WITH YOUR ACTUAL FILE PATH.
anno1_filename="path to your filename name"

In [None]:
# Assign the file path for the second annotator's data to a variable.
# YOU MUST REPLACE "path to group annotation file here" WITH YOUR ACTUAL FILE PATH.
anno2_filename="path to group annotation file here"

#### Cell 9 & 10: Loading Annotation Data
These cells use the `read_annos` function defined earlier to load the data from the specified file paths into memory.

In [None]:
# Call the read_annos function to load data for the first annotator.
# 'anno1' will contain the scores, and 'sentences' will contain the corresponding text.
anno1, sentences=read_annos(anno1_filename)

In [None]:
# Call the read_annos function to load data for the second annotator.
# We only need the scores ('anno2'), so we use '_' to ignore the sentences list.
anno2, _=read_annos(anno2_filename)

#### Cell 11: Data Validation
This is a sanity check that both annotators rated the same number of items. Items are paired by their position in each file, so if the counts differ, inter-annotator agreement cannot be calculated correctly.

In [None]:
# Check if the number of annotations in the first list is different from the second.
if len(anno1) != len(anno2):
    # If they are different, print an error message showing the two different counts.
    # The two counts must be passed together as a tuple to fill both placeholders.
    print("Different number of annotations: %s vs. %s" % (len(anno1), len(anno2)))
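
A stricter variant of this check could stop the notebook as soon as the counts differ, rather than letting later cells run on mismatched data. The sketch below is one way to do that; the function name `check_same_length` is hypothetical, and raising `ValueError` is a design choice made here, not something the original notebook does.

```python
def check_same_length(first, second):
    """Raise ValueError if the two annotation lists differ in length."""
    if len(first) != len(second):
        raise ValueError(
            "Different number of annotations: %s vs. %s" % (len(first), len(second))
        )


# Passes silently when the lengths match.
check_same_length([1.0, 0.0], [1.0, 1.0])

# Raises when they do not.
try:
    check_same_length([1.0, 0.0], [1.0])
except ValueError as err:
    message = str(err)
    print(message)  # Different number of annotations: 2 vs. 1
```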

#### Cell 12: Error Analysis
This loop is for manual error analysis. It iterates through the annotations and prints the items on which the two annotators' scores differ by 1.0 or more; for a binary task, that is every disagreement. Reviewing these items helps identify ambiguous items or potential misunderstandings of the annotation guidelines.

In [None]:
# Print out sentences with different annotations to see where annotators disagreed.
# Loop through the indices from 0 to the total number of annotations.
for idx in range(len(anno1)):
    # Check if the absolute difference between the two annotators' scores is 1 or greater.
    if abs(anno1[idx]-anno2[idx]) >= 1:
        # If the disagreement is large, print the scores from both annotators and the sentence itself.
        print("%s\t%s\t%s" % (anno1[idx], anno2[idx], sentences[idx]))
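
When there are many disagreements, it can help to look at the largest gaps first. The following sketch sorts the disagreeing items by the size of the gap; the function name, the `threshold` parameter, and the toy data are assumptions made for illustration. Like the loop above, it is meant to run on the raw score lists, before they are converted to triples.

```python
def rank_disagreements(scores_a, scores_b, texts, threshold=1.0):
    """Return (gap, score_a, score_b, text) rows with gap >= threshold, largest gap first."""
    rows = [
        (abs(a - b), a, b, text)
        for a, b, text in zip(scores_a, scores_b, texts)
        if abs(a - b) >= threshold
    ]
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Toy scores and placeholder texts (invented).
toy_a = [1.0, 4.0, 2.5]
toy_b = [1.0, 1.0, 4.0]
toy_texts = ["first item", "second item", "third item"]

for gap, a, b, text in rank_disagreements(toy_a, toy_b, toy_texts):
    print("%s\t%s\t%s\t%s" % (gap, a, b, text))
```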

#### Cell 13 & 14: Formatting Data for NLTK
Here, the loaded annotation lists are converted into the `(annotator, item, label)` triple format that the `AnnotationTask` class requires. Annotator 1 is assigned ID `0`, and Annotator 2 is assigned ID `1`.

In [None]:
# Convert the first annotator's list into the required format, using annotator ID 0.
anno1=convert_anno_list(anno1, 0)

In [None]:
# Convert the second annotator's list into the required format, using annotator ID 1.
anno2=convert_anno_list(anno2, 1)

Objectivity is a binary rating, so use Cohen's $\kappa$.

#### Cell 16: Executing Cohen's Kappa
This cell combines the two formatted annotation lists and passes them to the `cohens_kappa` function to calculate and print the agreement score for the binary task.

In [None]:
# Concatenate the two formatted annotation lists into a single list.
# Then, pass this combined list to the cohens_kappa function to calculate the score.
cohens_kappa(anno1 + anno2)

Suspense is a real-valued rating, so use Krippendorff's $\alpha$.

#### Cell 18: Executing Krippendorff's Alpha
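Finally, this cell calculates the agreement for the real-valued task (suspense) by passing the combined data to the `krippendorff_alpha` function. Both agreement cells operate on the same `anno1 + anno2` triples, so this score is meaningful for suspense only when the files loaded above contain the suspense ratings.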

In [None]:
# Concatenate the two formatted annotation lists into a single list.
# Then, pass this combined list to the krippendorff_alpha function to calculate the score.
krippendorff_alpha(anno1 + anno2)