# HW 1 Annotation aggregation and exploratory analysis

- Can "Run All Cells" 

# TODOs

1. [x] Imports + Load Data 
2. [ ] Compute the Fleiss’ kappa inter-annotator agreement on the set of videos in the provided tabulatedVotes.csv. Report and interpret the results using Table 1.
3. [x] Go through a subset of 20 IPD videos in IPD 20.csv, see the [video clips here](https://drive.google.com/drive/folders/1-at00XhJTzUwV6T7j6V8ukLe4e-DcxVs?usp=sharing). Please pay attention to the facial expressions. Choose one facial cue which you believe to have a positive correlation with Joy and one facial cue with Surprise. For example “raised eyebrows” may have a positive correlation with the perception that the person is signaling Surprise. Don’t limit yourself when selecting the facial cues.
4. [ ] Confirm your suspicions and subjective observations from the previous task with statistical analysis.
    1. [x] You can do this by manually annotating the videos for each facial cue and comparing your annotations to the raters’ votes.
    2. [ ] You need to annotate and calculate the **p-value** from the **Student t-test** between the group of videos with the facial cue of your choice and the group without. 
    3. [ ] For simplicity, annotate the videos in IPD 20.csv. In column I (cue surprise) and column J (cue joy), **mark Y or N** for the facial cue of your choice. For example, if you choose “raised eyebrows” as the cue for Surprise, and you see the person raising eyebrows in video 6 (first row), mark Y in column I (cue surprise) for the first row. 
    4. [ ] Once you finish the annotations, perform **t-test** to compare the `mean` number of votes *from the raters* for Surprise between the group you marked with Y in column I (cue surprise) and the group with N.
    5. [ ] Repeat the process to annotate and perform **t-test** for Joy
5. [ ] Report the details of your work for the previous tasks and describe your findings in a few paragraphs. *Note*: You should implement your own function to compute the Fleiss’ kappa inter-annotator agreement. You can use any packages for tasks 2-4.

## 1. Imports + Load Data

In [1]:
import pandas as pd

In [2]:
base = "/Users/brinkley97/Documents/development/"
class_path = "classes/csci_535_multimodal_probabilistic_learning/datasets/ipd_data/"
dataset = "tabulatedVotes.csv"
data = base + class_path + dataset 

In [3]:
def load_data(file):
    original_data = pd.read_csv(file)
    # original_data = pd.DataFrame(file)
    copy_of_data = original_data.copy()
    return copy_of_data

In [4]:
ipd_df_copy = load_data(data)
categories_df = ipd_df_copy.drop(columns=["videoID"])
categories_df

Unnamed: 0,Anger,Disgust,Fear,Joy,Neutral,Sadness,Surprise
0,0,0,0,17,0,1,2
1,0,0,1,15,3,0,1
2,2,1,2,4,10,1,0
3,0,0,0,15,1,2,2
4,1,0,0,14,2,1,2
...,...,...,...,...,...,...,...
95,0,0,0,13,2,1,4
96,0,0,1,14,1,0,4
97,0,1,0,12,4,1,2
98,2,4,0,10,1,3,0


In [19]:
ipd_20_csv = "ipd_20.csv"
ipd_20_data = base + class_path + ipd_20_csv
ipd_20_df_copy = load_data(ipd_20_data)
ipd_20_df_copy

Unnamed: 0,videoID,Anger,Disgust,Fear,Joy,Neutral,Sadness,Surprise,cue_surprise,cue_joy
0,6,0,0,0,17,0,1,2,Y,
1,173,3,1,0,6,1,1,8,Y,
2,193,0,1,0,5,6,1,7,N,
3,622,1,2,0,4,4,2,7,,
4,869,0,4,2,4,7,1,2,,
5,1489,0,0,0,7,0,2,11,,
6,1918,1,0,1,11,0,1,6,,
7,2406,0,0,1,15,0,0,4,,
8,2416,1,1,0,10,1,1,6,,
9,2889,0,1,1,9,6,1,2,,


## 2. Compute the Fleiss’ kappa

- N is the total #items/videos (rows)
- n is the #ratings per item/video = 20 [see section 3 Dataset in project description]
    - $ i $ indexes a single/specific item from 1...N (single row)
    - $ j $ indexes a single/specific category from 1...k (single column)
    - $ n_{ij} $ is the #raters who assigned the $i$-th item to the $j$-th category
- k is the #categories into which assignments are made (columns)
- $ p_{j} $ is the proportion of all assignments which were to the $j$-th category
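
With this notation, the quantities computed in the rest of this section can be written as follows. These are the standard Fleiss' kappa definitions; the statistic is written as $ \kappa $ here so that it is not confused with $ k $, the number of categories.

$$
\begin{aligned}
p_j &= \frac{1}{N n} \sum_{i=1}^{N} n_{ij} \\
P_i &= \frac{1}{n(n-1)} \left( \sum_{j=1}^{k} n_{ij}^2 - n \right) \\
\bar{P} &= \frac{1}{N} \sum_{i=1}^{N} P_i \\
\bar{P_e} &= \sum_{j=1}^{k} p_j^2 \\
\kappa &= \frac{\bar{P} - \bar{P_e}}{1 - \bar{P_e}}
\end{aligned}
$$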

---

## TODOs

1. [x] Calculate $ p_{j} $
2. [x] Calculate $ P_{i} $
3. [ ] Calculate $ \bar{P} $
4. [ ] Calculate $ \bar{P_e} $
5. [ ] Calculate $ \kappa $
6. Maybe state the numerical values for 1 - 5 in a DataFrame
    - Create DataFrame at 1, then concat at 2 - 5?

In [5]:
test_data = [
            [1, 4],
            [2, 3],
            [5, 0]
            ]
test_df = pd.DataFrame(test_data, columns=["A", "D"])
print(test_df)
test_columns = list(test_df.columns)
# print(test_columns)
test_N = len(test_df)
print()
print(test_N)
test_n = 5

   A  D
0  1  4
1  2  3
2  5  0

3


In [6]:
N = len(categories_df)
# print(N)
n = 20
# print(n)
columns = list(categories_df.columns)
# print(columns)
k = len(columns)
# print(k)

### 2.1 Calculate $ p_j $

- The output shows the proportion of all the assignments for each category.

In [7]:
def proportion_per_category(category, N, n):
    """Calculate the proportion of votes for each category
    
    Arguments:
    category -- pd Series (a specific category/column from the main DataFrame)
    N -- int (total #items/rows)
    n -- int (20 votes per item/row)
    
    Return:
    category_proportion -- float (the proportion of votes for that specific category)
    """
    
    category_values = category.values
    # print(category_values)
    sum_category_values = category_values.sum()
    # print(sum_category_values)
    category_proportion = sum_category_values / (N * n)
    # print(type(category_proportion))
    
    return category_proportion

In [8]:
def calculate_p_j(categories_df, categories, N, n):
    """Calculate p_j for every category
    
    Arguments:
    categories_df -- pd DataFrame (vote counts, one column per category)
    categories -- list (category/column names)
    N -- int (total #items/rows)
    n -- int (20 votes per item/row)
    
    Functions:
    proportion_per_category()
    
    Return:
    category_proportions_df -- pd DataFrame (the proportion of votes for each category, plus their sum in column "p_j")
    
    """
    
    store_proportions = []
    
    for specific_category in categories:
        
        specific_category_series = categories_df[specific_category]
        
        proportion_for_specific_category = proportion_per_category(specific_category_series, N, n)
        store_proportions.append(proportion_for_specific_category)
    
    # Build a new list so the caller's `categories` list is not mutated
    column_names = list(categories) + ["p_j"]
    # The "p_j" column holds the sum of all proportions, which should equal 1.0
    sum_proportions = sum(store_proportions) 
    store_proportions.append(sum_proportions)
    
    category_proportions_df = pd.DataFrame([store_proportions], columns=column_names)
    
    return category_proportions_df

In [9]:
# calculate_p_j(test_df, test_columns, test_N, test_n)
p_j = calculate_p_j(categories_df, columns, N, n)
p_j

Unnamed: 0,Anger,Disgust,Fear,Joy,Neutral,Sadness,Surprise,p_j
0,0.023,0.059,0.0505,0.4815,0.164,0.0575,0.1645,1.0


### 2.2 Calculate $ P_i $

- The output shows, for each video, the proportion of rater pairs that agree, out of all $ n(n-1) $ possible ordered pairs of raters.
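
As a small worked example, the arithmetic for a single video can be checked by hand. The sketch below uses the vote counts of row 0 from the table printed in section 1; the variable names are illustrative only and are not part of the notebook's functions.

```python
# Vote counts for row 0 across the seven categories
# (Anger, Disgust, Fear, Joy, Neutral, Sadness, Surprise)
votes = [0, 0, 0, 17, 0, 1, 2]
n = sum(votes)  # 20 ratings for this video

# Ordered pairs of raters that chose the same category: sum(n_ij^2) - n
agreeing_pairs = sum(v * v for v in votes) - n  # 294 - 20 = 274

# All ordered pairs of distinct raters: n * (n - 1)
possible_pairs = n * (n - 1)  # 20 * 19 = 380

P_0 = agreeing_pairs / possible_pairs
print(round(P_0, 6))  # 0.721053
```

A video where all 20 raters pick the same category gives a value of 1, and a video where the votes are spread evenly gives a value close to 0.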

In [10]:
# test_df
# categories_df

In [11]:
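def agreement_proportion_per_video(categories_df, row, n):
    """Calculate P_i, the proportion of agreeing rater pairs for one video
    
    Arguments:
    categories_df -- pd DataFrame (vote counts, one column per category)
    row -- int (position of the video/row)
    n -- int (20 votes per item/row)
    
    Return:
    raters_agree_per_video_ratio -- float (agreeing rater pairs / all possible rater pairs)
    """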
    
    square_votes = categories_df.iloc[row] ** 2
    # print(square_votes)
    
    sum_squared_votes = square_votes.sum()
    # print(sum_squared_votes)
    
    rater_rater_agree_pairs = sum_squared_votes - n
    # print(rater_rater_agree_pairs)
    
    all_possible_rater_rater_agree_pairs = n * (n - 1)
    # print(all_possible_rater_rater_agree_pairs)
    
    raters_agree_per_video_ratio = rater_rater_agree_pairs / all_possible_rater_rater_agree_pairs
    # print(raters_agree_per_video_ratio)
    
    return raters_agree_per_video_ratio

In [12]:
import numpy as np
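def calculate_P_i(categories_df, categories, N, n):
    """Calculate P_i for every video
    
    Arguments:
    categories_df -- pd DataFrame (vote counts, one column per category)
    categories -- list (category/column names; currently unused)
    N -- int (total #items/rows)
    n -- int (20 votes per item/row)
    
    Return:
    video_agree_df -- pd DataFrame (categories_df with an added "P_i" column)
    """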
    
    store_video_agree_proportions = []
    for row in range(N):
        
        agree_proportion = agreement_proportion_per_video(categories_df, row, n)
        store_video_agree_proportions.append(agree_proportion)
    
    # print(store_video_agree_proportions)
    # categories.append("P_i")
    # print(store_video_agree_proportions)
    
    video_agree_series = pd.Series(store_video_agree_proportions, name="P_i")   
    # print(video_agree_series)
    
    video_agree_df = pd.concat([categories_df, video_agree_series], axis=1)
    
    return video_agree_df

In [13]:
# calculate_P_i(test_df, test_columns, test_N, test_n)
P_i_df = calculate_P_i(categories_df, columns, N, n)
P_i_df

Unnamed: 0,Anger,Disgust,Fear,Joy,Neutral,Sadness,Surprise,P_i
0,0,0,0,17,0,1,2,0.721053
1,0,0,1,15,3,0,1,0.568421
2,2,1,2,4,10,1,0,0.278947
3,0,0,0,15,1,2,2,0.563158
4,1,0,0,14,2,1,2,0.489474
...,...,...,...,...,...,...,...,...
95,0,0,0,13,2,1,4,0.447368
96,0,0,1,14,1,0,4,0.510526
97,0,1,0,12,4,1,2,0.384211
98,2,4,0,10,1,3,0,0.289474


### 2.3 Compute $ \bar{P} $

- The output shows the mean of all $ P_i $'s

In [14]:
P_bar = (P_i_df.loc[0:, "P_i"].sum()) / N
P_bar

0.36573684210526325

### 2.4 Compute $ \bar{P_e} $

- The output shows the agreement expected by chance: the sum of the squared category proportions $ p_j $
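
The chance-agreement term can be reproduced directly from the proportions printed in section 2.1. This is a minimal sketch that hard-codes those printed values in an illustrative dictionary instead of reading them from the DataFrame.

```python
# Category proportions p_j as printed in section 2.1
p_j_values = {
    "Anger": 0.023,
    "Disgust": 0.059,
    "Fear": 0.0505,
    "Joy": 0.4815,
    "Neutral": 0.164,
    "Sadness": 0.0575,
    "Surprise": 0.1645,
}

# Expected agreement by chance: sum of squared proportions
P_e_bar_check = sum(p ** 2 for p in p_j_values.values())
print(round(P_e_bar_check, 6))  # 0.295665

# Joy dominates the total because almost half of all votes went to it
print(round(p_j_values["Joy"] ** 2, 6))  # 0.231842
```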

In [15]:
p_j_df = p_j.drop(columns=["p_j"])
square_p_js = (p_j_df ** 2)
P_e_bar = square_p_js.iloc[0].sum()
P_e_bar

0.295665

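### 2.5 Compute $ \kappa $

- This output shows how far the raters' agreement exceeds the agreement expected by chance: $ \kappa = (\bar{P} - \bar{P_e}) / (1 - \bar{P_e}) $
- A value of about 0.10 means the raters agree only slightly more often than chance would predict; on the commonly used Landis and Koch scale, 0.00 to 0.20 is read as slight agreement

In [16]:
# Named kappa so it does not overwrite k, the number of categories
kappa = (P_bar - P_e_bar) / (1 - P_e_bar)
round(kappa, 4)

0.0995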
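
The five steps of this section can be cross-checked on the small `test_df` table defined earlier (3 items, 5 ratings each). The function below is a compact sketch in plain Python, not the notebook's pandas implementation; `fleiss_kappa` and `toy_table` are illustrative names.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a list of rows of vote counts.

    Each row is one item; each entry is the number of raters who chose
    that category. Every row must sum to the same number of ratings n.
    """
    N = len(table)          # number of items
    n = sum(table[0])       # ratings per item
    k = len(table[0])       # number of categories
    if any(sum(row) != n for row in table):
        raise ValueError("every item must have the same number of ratings")

    # p_j: share of all assignments that went to category j
    p_j = [sum(row[j] for row in table) / (N * n) for j in range(k)]
    # P_i: share of agreeing rater pairs for item i
    P_i = [(sum(v * v for v in row) - n) / (n * (n - 1)) for row in table]

    P_bar = sum(P_i) / N                # observed agreement
    P_e_bar = sum(p * p for p in p_j)   # agreement expected by chance
    return (P_bar - P_e_bar) / (1 - P_e_bar)


toy_table = [[1, 4], [2, 3], [5, 0]]  # same values as test_df
toy_kappa = fleiss_kappa(toy_table)
print(round(toy_kappa, 4))  # 0.3304
```

For this table the exact value is 37/112, which makes it a convenient regression check before running the same steps on the full vote table.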

## 3. My Annotations + 4. Calculate T-Test

- Use an independent two-sample (Student) t-test, because the videos marked Y (group 1) and the videos marked N (group 2) are two separate groups: no video appears in both

---

TODOs
1. [x] [3] Manually annotate videos; See reasoning [1]
2. [ ] [4B] Student T-Test for cue_surprise Y and N
3. [ ] [4B] Student T-Test for cue_joy Y and N

### My annotations

In [21]:
# ipd_20_df_copy

In [22]:
ipd_20_df_copy.loc[[0, 1, 2, 5, 12, 13, 14, 16, 17, 18], "cue_surprise"] = "Y"
ipd_20_df_copy.loc[[3, 4, 6, 7, 8, 9, 10, 11, 15, 19], "cue_surprise"] = "N"
ipd_20_df_copy.loc[[3, 4, 6, 7, 8, 9, 10, 11, 15, 19], "cue_joy"] = "Y"
ipd_20_df_copy.loc[[0, 1, 2, 5, 12, 13, 14, 16, 17, 18], "cue_joy"] = "N"
ipd_20_df_copy

Unnamed: 0,videoID,Anger,Disgust,Fear,Joy,Neutral,Sadness,Surprise,cue_surprise,cue_joy
0,6,0,0,0,17,0,1,2,Y,N
1,173,3,1,0,6,1,1,8,Y,N
2,193,0,1,0,5,6,1,7,Y,N
3,622,1,2,0,4,4,2,7,N,Y
4,869,0,4,2,4,7,1,2,N,Y
5,1489,0,0,0,7,0,2,11,Y,N
6,1918,1,0,1,11,0,1,6,N,Y
7,2406,0,0,1,15,0,0,4,N,Y
8,2416,1,1,0,10,1,1,6,N,Y
9,2889,0,1,1,9,6,1,2,N,Y


### T-Test

In [25]:
cues_df = ipd_20_df_copy.loc[0:, ["cue_surprise", "cue_joy"]]
cues_df

Unnamed: 0,cue_surprise,cue_joy
0,Y,N
1,Y,N
2,Y,N
3,N,Y
4,N,Y
5,Y,N
6,N,Y
7,N,Y
8,N,Y
9,N,Y


In [35]:
cue_s_count = cues_df.loc[0:, "cue_surprise"].value_counts()
print(cue_s_count)
cue_s_mean = cue_s_count.mean()
print(cue_s_mean)
cue_s_variance = cue_s_count.var()
print(cue_s_variance)

Y    10
N    10
Name: cue_surprise, dtype: int64
10.0
0.0
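
The counts above only describe how many videos fall into each group; the t-test itself compares the raters' Surprise votes between the Y and N groups. The sketch below shows one way to do that in plain Python, using only the ten rows visible in the annotated table, so its result is an illustration and not the answer for all 20 videos. The function names are hypothetical, and the p-value is obtained by numerically integrating the t density; `scipy.stats.ttest_ind(a, b)` with its default `equal_var=True` performs the same test.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule on the density over [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * area


def student_t_test(group_a, group_b):
    """Equal-variance t-test for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    return t_stat, df, two_sided_p_value(t_stat, df)


# Surprise votes from the ten rows shown above, split by cue_surprise
surprise_votes_y = [2, 8, 7, 11]        # rows 0, 1, 2, 5
surprise_votes_n = [7, 2, 6, 4, 6, 2]   # rows 3, 4, 6, 7, 8, 9

t_stat, df, p_value = student_t_test(surprise_votes_y, surprise_votes_n)
print(round(t_stat, 3), df)  # 1.354 8
print(p_value)
```

The same function can be applied to the Joy votes split by `cue_joy`. A p-value below the chosen significance level (commonly 0.05) would support the claim that the cue is associated with more votes for that emotion.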


# References

1. [Detravious' Notion](https://detraviousjbrinkley.notion.site/HW-1-Annotation-aggregation-and-exploratory-analysis-Due-e60bdf637f5b40d98c296a7280e1b520)
    - Can use to further understand computations