#### Author
Yury Kashnitsky

#### Reference
[Notion ticket](https://www.notion.so/a74951e4e815480584dea7d61ddce6cc?v=dbfdb1207d0e451b827d3c5041ed0cfd&p=e2e0f4f076c54dd6b1c82ba66e277fad)

#### Idea
Measure annotator agreement on the 400 news titles sampled earlier in [this notebook](../20220413_scrape_bitcointicker_news_prepare_batches_for_annotation.ipynb).

#### Data
Two annotation batches from Amazon Mechanical Turk. [Data](https://drive.google.com/file/d/1e77NWDtm38tSpYxoZfJbC1UXLAizv9wL/view?usp=sharing) (password-protected)

#### Result
Annotator accuracy, measured against the majority vote, ranges from 81% to 90%, while the Tf-Idf & logistic regression baseline reaches only 69%.

In [1]:
import pandas as pd
pd.set_option('display.max_columns', 50)

In [2]:
df = pd.concat([pd.read_csv('../data/20220417_Batch_353286_trial1_results.csv', index_col='HITId'),
               pd.read_csv('../data/20220417_Batch_353289_trial2_results.csv', index_col='HITId')])

In [3]:
df.columns

Index(['HITTypeId', 'Title', 'Description', 'Keywords', 'Reward',
       'CreationTime', 'MaxAssignments', 'RequesterAnnotation',
       'AssignmentDurationInSeconds', 'AutoApprovalDelayInSeconds',
       'Expiration', 'NumberOfSimilarHITs', 'LifetimeInSeconds',
       'AssignmentId', 'WorkerId', 'AssignmentStatus', 'AcceptTime',
       'SubmitTime', 'AutoApprovalTime', 'ApprovalTime', 'RejectionTime',
       'RequesterFeedback', 'WorkTimeInSeconds', 'LifetimeApprovalRate',
       'Last30DaysApprovalRate', 'Last7DaysApprovalRate', 'Input.content',
       'Answer.sentiment', 'Approve', 'Reject'],
      dtype='object')

In [4]:
df.head(2)

HITId,HITTypeId,Title,Description,Keywords,Reward,CreationTime,MaxAssignments,RequesterAnnotation,AssignmentDurationInSeconds,AutoApprovalDelayInSeconds,Expiration,NumberOfSimilarHITs,LifetimeInSeconds,AssignmentId,WorkerId,AssignmentStatus,AcceptTime,SubmitTime,AutoApprovalTime,ApprovalTime,RejectionTime,RequesterFeedback,WorkTimeInSeconds,LifetimeApprovalRate,Last30DaysApprovalRate,Last7DaysApprovalRate,Input.content,Answer.sentiment,Approve,Reject
3WYZV0QBFJLDHXPSRKVEGRG4ANEBXS,38EEJYEY9O9OX0A7G3JZLH9BP4310L,Sentiment analysis of news on Bitcoin,Analyze the sentiment of the provided content ...,"sentiment, news, bitcoin",$0.00,Wed Apr 13 05:12:15 PDT 2022,3,BatchId:353286;OriginalHitTemplateId:921332617;,1209600,604800,Thu Apr 28 05:12:15 PDT 2022,,,384PI804XT96OCYCPWX4MMY9YK60S2,A3KE73FIKEXFG9,Submitted,Sat Apr 16 06:34:15 PDT 2022,Sat Apr 16 06:35:38 PDT 2022,Sat Apr 23 06:35:38 PDT 2022,,,,83,100% (367/367),100% (120/120),100% (120/120),Why Isn't Bitcoin (BTC) Doing Better In Light ...,Negative,,
3WYZV0QBFJLDHXPSRKVEGRG4ANEBXS,38EEJYEY9O9OX0A7G3JZLH9BP4310L,Sentiment analysis of news on Bitcoin,Analyze the sentiment of the provided content ...,"sentiment, news, bitcoin",$0.00,Wed Apr 13 05:12:15 PDT 2022,3,BatchId:353286;OriginalHitTemplateId:921332617;,1209600,604800,Thu Apr 28 05:12:15 PDT 2022,,,3HMVI3QICK03RNV3KLTTID5KKXW1Y8,A37V9Y0507BD97,Submitted,Thu Apr 14 12:32:34 PDT 2022,Thu Apr 14 12:33:09 PDT 2022,Thu Apr 21 12:33:09 PDT 2022,,,,35,100% (85/85),100% (85/85),100% (85/85),Why Isn't Bitcoin (BTC) Doing Better In Light ...,Negative,,


In [5]:
len(df)

1193

In [6]:
df['WorkerId'].value_counts()

A3KE73FIKEXFG9    375
A3SNZUC6ZCTWFF    251
A37V9Y0507BD97    249
ARZUW61D1HIHW     215
AUJFPBJZCAO6C     102
AK8BJE69LD69K       1
Name: WorkerId, dtype: int64

In [7]:
df['WorkerId'] = df['WorkerId'].map({'A3KE73FIKEXFG9': 'Yury', 'A37V9Y0507BD97': 'Lina',
                                     'A3SNZUC6ZCTWFF': 'Victor', 'ARZUW61D1HIHW': 'Senya',
                                     'AUJFPBJZCAO6C': 'Nikita', 'AK8BJE69LD69K': 'Unknown'})

In [8]:
ANNOTATORS = ['Lina', 'Nikita', 'Senya', 'Victor', 'Yury']

In [9]:
df = df[df['WorkerId'] != 'Unknown']

Number of annotations per annotator

In [10]:
df['WorkerId'].value_counts()

Yury      375
Victor    251
Lina      249
Senya     215
Nikita    102
Name: WorkerId, dtype: int64

Cross-tabulation of all annotations by title

In [11]:
result_df = pd.crosstab(index=df['Input.content'], columns=df['WorkerId'],
                        values=df['Answer.sentiment'],
                        # crosstab requires an aggfunc whenever values is passed;
                        # 'first' keeps the label, assuming one label per (title, annotator) pair
                        aggfunc='first'
                       )
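
The cross-tabulation above assumes each annotator labeled a given title at most once. The following is a minimal sketch of how that assumption could be checked before pivoting. The `toy` frame and its values are made up for illustration; only the column names are taken from the notebook.

```python
import pandas as pd

# Toy long-format annotations (hypothetical data, same column names as the MTurk export).
toy = pd.DataFrame({
    'Input.content': ['title A', 'title A', 'title B', 'title B'],
    'WorkerId': ['Lina', 'Yury', 'Lina', 'Lina'],
    'Answer.sentiment': ['Positive', 'Neutral', 'Negative', 'Neutral'],
})

key = ['Input.content', 'WorkerId']

# Rows where the same annotator labeled the same title more than once.
dupes = toy[toy.duplicated(subset=key, keep=False)]
print(f'{len(dupes)} rows belong to repeated (title, annotator) pairs')

# Keep the first label per pair, then reshape to one row per title.
wide = toy.drop_duplicates(subset=key).pivot(
    index='Input.content', columns='WorkerId', values='Answer.sentiment')
print(wide)
```

If `dupes` is empty on the real data, the pivot and the cross-tabulation should give the same table.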

In [12]:
result_df.head()

Input.content,Lina,Nikita,Senya,Victor,Yury
"""Not as Anonymous as People Think,"" Shows DOJ Bitcoin Recovery of $3.6 Billion",,Positive,,Neutral,Positive
'Bitcoin Centrist' Muneeb Ali: Why NFTs Are Triggering to Normies,,Neutral,Neutral,,Neutral
"19 millionth Bitcoin to be mined, now only 2 million remain unmined: Cointelegraph",Positive,Negative,,,Neutral
3 reasons why Bitcoin can rally back to $60K despite erasing last week's gains,,,Neutral,Positive,Positive
"6,000 Bitcoin Was Just Transferred From Gemini To Coinbase",,,Neutral,Positive,Neutral


In [13]:
result_df['num_votes'] = result_df[ANNOTATORS].notnull().sum(axis=1)
result_df['unique_labels'] = result_df[ANNOTATORS].apply(
    lambda x: sorted([el for el in set(x) if not pd.isnull(el)]), 
    axis=1).values

Getting majority votes and controversial examples

In [14]:
result_df['majority'] = result_df[ANNOTATORS].mode(axis=1)[0]
result_df.loc[(result_df['num_votes'] == 3) & (result_df['unique_labels'].apply(len) == 3), 'majority'] = 'Controversial'
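
Column `0` of `mode(axis=1)` holds only one of the tied labels when there is a tie, and the cell above only relabels three-way splits among exactly three votes. A title with two votes that disagree would therefore still get a "majority" label. Below is a minimal sketch of a tie-aware alternative; `majority_label` is a hypothetical helper, not part of the original notebook.

```python
from collections import Counter


def majority_label(labels, tie_label='Controversial'):
    """Return the most common label, or tie_label when there is no unique winner."""
    # Missing votes appear as NaN (a float) or None, so keep strings only.
    votes = [label for label in labels if isinstance(label, str)]
    if not votes:
        return None
    counts = Counter(votes).most_common()
    # A tie for first place means no label has a majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


nan = float('nan')
print(majority_label([nan, 'Positive', nan, 'Neutral', 'Positive']))  # Positive
print(majority_label(['Positive', 'Negative', nan, nan, 'Neutral']))  # Controversial
print(majority_label(['Positive', nan, nan, nan, 'Neutral']))         # Controversial
```

On the notebook's table this could be applied row-wise, for example `result_df[ANNOTATORS].apply(majority_label, axis=1)`.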

In [15]:
result_df.head(10)

Input.content,Lina,Nikita,Senya,Victor,Yury,num_votes,unique_labels,majority
"""Not as Anonymous as People Think,"" Shows DOJ Bitcoin Recovery of $3.6 Billion",,Positive,,Neutral,Positive,3,"[Neutral, Positive]",Positive
'Bitcoin Centrist' Muneeb Ali: Why NFTs Are Triggering to Normies,,Neutral,Neutral,,Neutral,3,[Neutral],Neutral
"19 millionth Bitcoin to be mined, now only 2 million remain unmined: Cointelegraph",Positive,Negative,,,Neutral,3,"[Negative, Neutral, Positive]",Controversial
3 reasons why Bitcoin can rally back to $60K despite erasing last week's gains,,,Neutral,Positive,Positive,3,"[Neutral, Positive]",Positive
"6,000 Bitcoin Was Just Transferred From Gemini To Coinbase",,,Neutral,Positive,Neutral,3,"[Neutral, Positive]",Neutral
90% of Americans Live 5 Miles From a Bitcoin ATM,,Neutral,,Positive,Negative,3,"[Negative, Neutral, Positive]",Controversial
"A New World Monetary Order is Emerging, and Bitcoin is Poised to Be a Part of It",Positive,,Positive,,Positive,3,[Positive],Positive
"A retest is expected, but most analysts expect Bitcoin price to extend much higher",Positive,,Positive,,Positive,3,[Positive],Positive
"Abra CEO Bullish on Ethereum, Predicts ETH Could Hit $40000 – Markets and Prices Bitcoin News",,,Positive,Neutral,Positive,3,"[Neutral, Positive]",Positive
"Abra CEO Explains Why He Thinks Bitcoin Is Going to $250,000 and Ethereum to $40,000",Positive,,,Positive,Positive,3,[Positive],Positive


Controversial cases

In [16]:
result_df.loc[result_df['majority'] == 'Controversial']

Input.content,Lina,Nikita,Senya,Victor,Yury,num_votes,unique_labels,majority
"19 millionth Bitcoin to be mined, now only 2 million remain unmined: Cointelegraph",Positive,Negative,,,Neutral,3,"[Negative, Neutral, Positive]",Controversial
90% of Americans Live 5 Miles From a Bitcoin ATM,,Neutral,,Positive,Negative,3,"[Negative, Neutral, Positive]",Controversial
Abra's Barhydt: Ethereum's 'Merge' only 'hardens Bitcoin's use case',Positive,Neutral,,Negative,,3,"[Negative, Neutral, Positive]",Controversial
Anonymous Bitcoin Whale Just Moved $35M Worth Of BTC Off Coinbase,Positive,,Neutral,,Negative,3,"[Negative, Neutral, Positive]",Controversial
Bitcoin hovers at $43K on Wall Street open amid growing fever over Terra's $3B BTC buy-in,Negative,,,Neutral,Positive,3,"[Negative, Neutral, Positive]",Controversial
"Bitcoin pulls back to around $39,000 following Wednesday surge",,Positive,,Negative,Neutral,3,"[Negative, Neutral, Positive]",Controversial
"Crypto markets slow down; Bitcoin trades over $40,000",,,Negative,Neutral,Positive,3,"[Negative, Neutral, Positive]",Controversial
"Hacker reveals he has $7 billion in Bitcoin, nearly 200000 BTC coins",,,Neutral,Negative,Positive,3,"[Negative, Neutral, Positive]",Controversial


For fun, let's run model predictions as well

In [17]:
import joblib

model = joblib.load('../models/tf-idf-logreg-baseline.joblib')

In [18]:
result_df['model_pred'] = model.predict(result_df.index)

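Annotator accuracy relative to the majority vote. Note that the majority vote includes the annotator's own label, and that titles marked 'Controversial' count as errors for every annotator and for the model.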

In [19]:
for annotator in ANNOTATORS + ['model_pred']:
    num_correct = (result_df[annotator] == result_df['majority']).sum()
    num_total = result_df[annotator].notnull().sum()
    acc = round(num_correct / num_total, 3)
    print(f'{acc} is the accuracy for {annotator}.')

0.88 is the accuracy for Lina.
0.824 is the accuracy for Nikita.
0.898 is the accuracy for Senya.
0.813 is the accuracy for Victor.
0.893 is the accuracy for Yury.
0.692 is the accuracy for model_pred.
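
Accuracy against the majority vote does not correct for agreement that would happen by chance. One common chance-corrected measure for a pair of annotators is Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected from each annotator's label frequencies. The sketch below is an illustration only: `cohen_kappa` is a hypothetical helper, the labels are toy data, and no numbers from the real batches are implied.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators, using only items both of them labeled."""
    # Drop items where either vote is missing (NaN or None).
    pairs = [(a, b) for a, b in zip(labels_a, labels_b)
             if isinstance(a, str) and isinstance(b, str)]
    n = len(pairs)
    if n == 0:
        return float('nan')
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in pairs) / n
    # Expected agreement from each annotator's own label frequencies.
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return float('nan')  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Toy labels for two annotators; None marks a title the annotator did not see.
ann_a = ['Positive', 'Positive', 'Neutral', 'Negative', None, 'Neutral']
ann_b = ['Positive', 'Neutral', 'Neutral', 'Negative', 'Positive', None]
print(round(cohen_kappa(ann_a, ann_b), 3))
```

For the notebook's table, a pairwise call might look like `cohen_kappa(result_df['Lina'], result_df['Yury'])`; pairs of annotators who share few titles would give unstable estimates.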
