This notebook computes, for each NLP task, the most frequent human-annotated value and the accuracy of the ChatGPT annotations measured against the human annotations.

* Input: 'AgoraSpeech_preprocessed.csv'
* Output: printed results only (no files are written)
* Actions:
    1. Finds and prints the most frequent human-annotated value for each NLP task, along with the percentage of paragraphs that carry it
    2. Finds and prints the accuracy of the ChatGPT annotations compared to the human annotations

In [1]:
import pandas as pd

In [2]:
# read data
data = pd.read_csv('AgoraSpeech_preprocessed.csv')

In [3]:
tasks = ['criticism_or_agenda', 'topic', 'sentiment', 'polarization', 'populism']
# tasks whose annotation columns carry a '_category' suffix
category_tasks = {'sentiment', 'polarization', 'populism'}
for task in tasks:
    suffix = '_category' if task in category_tasks else ''
    gpt = data[f'{task}_gpt{suffix}']
    human = data[f'{task}_human{suffix}']

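    # find and print the most frequent human-annotated value along with the percentage of paragraphs that carry it
    # (value_counts() skips missing values, while the denominator is the total number of paragraphs)
    counts = human.value_counts()
    print(f'{counts.idxmax()} {round(counts.max()/len(data)*100,0)} %')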

    # find and print the accuracy of the ChatGPT annotations compared to the human annotations 
    # (as the proportion of paragraphs where ChatGPT's annotation remains unchanged by human reviewers, relative to the total number of paragraphs)
    equal = gpt == human
    print(f'{task}: {round(sum(equal)/len(data)*100,0)} %')
    print('\n')

political agenda 61.0 %
criticism_or_agenda: 89.0 %


elections 25.0 %
topic: 61.0 %


positive 40.0 %
sentiment: 93.0 %


low 88.0 %
polarization: 88.0 %


low 97.0 %
populism: 93.0 %
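
The two figures printed for each task can be checked by hand on a small example. The sketch below is illustrative only: the helper `summarize_task` and the four-row `toy` frame are hypothetical and not part of the notebook or the dataset. It assumes the same definitions as the loop above, with the total number of rows as the denominator for both percentages.

```python
import pandas as pd


def summarize_task(df, human_col, gpt_col):
    """Return (most frequent human label, its share in %, GPT/human agreement in %).

    Hypothetical helper mirroring the notebook's loop; not part of the original code.
    """
    human = df[human_col]
    gpt = df[gpt_col]
    counts = human.value_counts()  # missing labels are excluded from the counts
    top_label = counts.idxmax()
    # Denominator is all rows, as in the notebook
    top_share = counts.max() / len(df) * 100
    # Element-wise comparison; a missing value never equals anything,
    # so rows with missing labels count as disagreement
    agreement = (gpt == human).mean() * 100
    return top_label, float(top_share), float(agreement)


# Toy data invented for illustration (not taken from AgoraSpeech)
toy = pd.DataFrame({
    'sentiment_human_category': ['positive', 'positive', 'negative', 'neutral'],
    'sentiment_gpt_category': ['positive', 'negative', 'negative', 'neutral'],
})

label, share, acc = summarize_task(
    toy, 'sentiment_human_category', 'sentiment_gpt_category'
)
print(f'{label} {share:.1f} %')   # positive 50.0 %
print(f'sentiment: {acc:.1f} %')  # sentiment: 75.0 %
```

Here 2 of the 4 human labels are `positive` (50 %), and ChatGPT matches the human label in 3 of the 4 rows (75 %).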


