The goal of this notebook is to draw some qualitative conclusions about the nature of the errors our model makes.

Findings:
- The vocabulary of false negatives (527 unique tokens) is more varied than that of false positives (275 unique tokens), while their volume is comparable (860 vs 813 tokens). This may indicate that the model is "cautious" and prefers to highlight sure options, while human annotators are more creative in their analysis.
- The most frequent false positive words characterize incompetence or a lack of mental capacity: "stupid", "idiot", "ignorant", "moron", "dumb", etc. Other frequent false positives are derogatory ("pathetic", "ridiculous", "ass", "garbage", "loser", etc.), denounce particular misdeeds ("liar", "troll", "racist", "hypocrite", etc.), or express general negativity ("damn", "fuck", etc.). It's not obvious why human annotators label them as toxic in some cases and as non-toxic in others. We suspect that inter-annotator agreement on such words is low.
- The most frequent false negative words are mostly function words: "and", "the", "are", "a", "you", etc. This happens because annotators sometimes label the whole text or a large chunk of it as toxic. The more meaningful false negatives belong to the same classes as the false positives ("ignorant", "racist", "loser", etc.).


In [2]:
import os
import numpy as np
import pandas as pd
from ast import literal_eval
import re
import nltk
import matplotlib.pyplot as plt
from nltk.tokenize import word_tokenize

path = 'data/'

In [3]:
trial = pd.read_csv(path + 'tsd_trial.csv')
train = pd.read_csv(path + 'tsd_train.csv')
final_test = pd.read_csv(path + 'tsd_test_gt.csv')

train['spans'] = train.spans.apply(literal_eval)
trial['spans'] = trial.spans.apply(literal_eval)
final_test['spans'] = final_test.spans.apply(literal_eval)
trial.shape, train.shape, final_test.shape

((690, 2), (7939, 2), (2000, 2))

In [4]:
final_pred = pd.read_csv('results/spans-pred-best-sk.txt', sep='\t', header=None)
final_pred.columns = ['idx', 'spans']
final_pred['spans'] = final_pred.spans.apply(literal_eval)
final_pred.shape

(2000, 2)

In [5]:
final_test['pred_spans'] = final_pred.spans

Document-level accuracy: the share of test documents whose predicted spans exactly match the gold spans.

In [6]:
print((final_pred.spans == final_test.spans).mean())

0.544


In [7]:
from semeval2021 import f1

In [8]:
import spans_utils
from importlib import reload
reload(spans_utils)
from spans_utils import display_spans, spans2labels, labels2spans

In [9]:
print(np.mean([f1(p, y) for p, y in zip(final_pred.spans, final_test.spans)]))

0.6832488454134622
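
The score above is averaged over documents. As a rough illustration of what a per-document span F1 can look like, here is a minimal sketch that assumes the metric compares sets of character offsets. The helper `char_f1` and its handling of empty gold spans are our assumptions for illustration; they are not taken from `semeval2021.f1`, whose source is not shown in this notebook.

```python
def char_f1(pred_offsets, gold_offsets):
    """Character-offset F1 for one document (hypothetical helper, not semeval2021.f1)."""
    pred, gold = set(pred_offsets), set(gold_offsets)
    if not gold:
        # Assumed convention: with no gold spans, only an empty prediction is perfect.
        return 1.0 if not pred else 0.0
    if not pred:
        return 0.0
    overlap = len(pred & gold)
    # With precision = overlap / |pred| and recall = overlap / |gold|,
    # F1 simplifies to 2 * overlap / (|pred| + |gold|).
    return 2 * overlap / (len(pred) + len(gold))

# Two of the four predicted characters are also in the gold span.
print(char_f1([10, 11, 12, 13], [12, 13, 14, 15]))  # 0.5
```

Under this definition a prediction that is off by a few characters still gets partial credit, which may explain why the mean F1 is noticeably higher than the exact-match accuracy.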


- ABNORMAL
+ bigot
+ ignorant 
+ ignorant 
- dishonest people
- dump  # Perhaps we should dump you in Mexico along with all the illegal trash you let in
- trash 
+ morons
- Trump  # Trump troll...
- Bunch of  # Bunch of cowards
- anti-Canadian # Rabidly anti-Canadian troll. 
+ troll 
+ moron 
+ ass  # Chris Birch is a mean, self-centered, contrary ass.
+ dirty # I always knew you were a dirty cop.
- benighted 
~ simpleton -> ton  # poor decoding
- racist, misgynistic
+ liar # a racist, misgynistic, liar
~ amok -> am # poor decoding
- Total # Total rubbish
+ damn 
- dumb 
- ignorant blockheads
+ lying  # weak, lying, racist whiners
- whiners
~ PATHETIC LIB LOSER -> ET LOS
~ dopey -> pey

In [10]:
total = 0
for i, row in final_test.iterrows():
    if row.spans != row.pred_spans:
        display_spans(row.spans, row.text)
        display_spans(row.pred_spans, row.text)
        print()
        total += 1
        if total >= 20:
            break

In [11]:
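def get_substrings(character_ids, text):
    """Extract the contiguous substrings of `text` covered by the sorted offsets in `character_ids`."""
    if not character_ids:
        return []
    prev = -100  # sentinel: no previous offset seen yet
    w = []       # characters of the current contiguous run
    result = []
    for idx in character_ids:
        # a gap in the offsets closes the current run
        if prev >= 0 and idx > prev + 1:
            result.append(''.join(w))
            w = []
        # ignore offsets that point past the end of the text
        if idx < len(text):
            w.append(text[idx])
        prev = idx
    if w:
        result.append(''.join(w))
    return result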

get_substrings([2, 3, 5, 6, 7], 'abcdefghijkl')

['cd', 'fgh']

In [12]:
from collections import Counter
fp_count = Counter()
fn_count = Counter()

for i, row in final_test.iterrows():
    if row.spans != row.pred_spans:
        missed = sorted(set(row.spans).difference(set(row.pred_spans)))
        extra = sorted(set(row.pred_spans).difference(set(row.spans)))
        
        #if len(row.spans) > 0.5 * len(row.text) and len(row.spans) > 20:
            # people just highlight the whole text as toxic
        #    continue
        
        if extra:
            fp_count.update(get_substrings(extra, row.text))
        if missed:
            fn_count.update(get_substrings(missed, row.text))
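
To make the counting logic concrete, here is a small self-contained sketch on a made-up sentence. The text, the offsets and the helper `contiguous_substrings` are hypothetical and only mirror the set-difference idea used in the cell above. It shows how a partially matched word can produce a fragment such as "est" among the false negatives.

```python
from itertools import groupby

def contiguous_substrings(offsets, text):
    """Group character offsets into contiguous runs and return the matching substrings."""
    runs = []
    # Offsets in one contiguous run share the same (offset - rank) value.
    for _, group in groupby(enumerate(sorted(offsets)), key=lambda p: p[1] - p[0]):
        idxs = [i for _, i in group]
        runs.append(text[idxs[0]:idxs[-1] + 1])
    return runs

text = "the stupidest idea"        # made-up example, not from the dataset
gold = list(range(4, 13))          # annotators marked "stupidest"
pred = list(range(4, 10))          # the model marked only "stupid"

missed = sorted(set(gold) - set(pred))   # gold characters the model did not predict
extra = sorted(set(pred) - set(gold))    # predicted characters that are not in gold

print(contiguous_substrings(missed, text))  # ['est']
print(contiguous_substrings(extra, text))   # []
```

Unlike `get_substrings`, this sketch assumes that all offsets lie inside the text.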

In [13]:
len(fp_count), sum(fp_count.values()), len(fn_count), sum(fn_count.values())

(300, 777, 378, 460)

In [14]:
fp_count.most_common(20)

[('stupid', 92),
 ('idiot', 38),
 ('ignorant', 25),
 ('idiots', 23),
 ('moron', 22),
 ('dumb', 20),
 ('fool', 19),
 ('stupidity', 16),
 ('pathetic', 12),
 ('ridiculous', 10),
 ('fools', 10),
 ('troll', 9),
 ('garbage', 9),
 ('crap', 9),
 ('ass', 8),
 ('liar', 8),
 (' ', 8),
 ('damn', 7),
 ('loser', 7),
 ('racist', 6)]

In [15]:
fn_count.most_common(20)

[('ignorant', 10),
 ('racist', 8),
 ('crap', 6),
 ('loser', 6),
 ('est', 5),
 (' and ', 5),
 ('S', 4),
 ('id', 4),
 ('dumb', 3),
 ('garbage', 3),
 ('st', 3),
 ('ST', 3),
 (', ', 3),
 ('vagina', 3),
 ('sexual predator', 3),
 ('bullshit', 3),
 ('trash', 2),
 ('sucks', 2),
 ('t', 2),
 ('disgusting', 2)]

In [16]:
fp_word_count = Counter()
fn_word_count = Counter()
for span, n in fp_count.items():
    for w in span.lower().split():
        if w:
            fp_word_count[w] += n
for span, n in fn_count.items():
    for w in span.lower().split():
        if w:
            fn_word_count[w] += n
            
len(fp_word_count), sum(fp_word_count.values()), len(fn_word_count), sum(fn_word_count.values())

(275, 813, 527, 860)
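
One simple way to compare the diversity of the two vocabularies is the type-token ratio, i.e. the number of unique words divided by the total number of word occurrences. The sketch below only reuses the four numbers printed above; the helper name `type_token_ratio` is ours.

```python
def type_token_ratio(n_types, n_tokens):
    """Share of distinct words among all word occurrences (0.0 if there are none)."""
    return n_types / n_tokens if n_tokens else 0.0

# Word-level counts taken from the output above.
fp_types, fp_tokens = 275, 813
fn_types, fn_tokens = 527, 860

fp_ttr = type_token_ratio(fp_types, fp_tokens)
fn_ttr = type_token_ratio(fn_types, fn_tokens)
print(f"FP type-token ratio: {fp_ttr:.3f}")  # 0.338
print(f"FN type-token ratio: {fn_ttr:.3f}")  # 0.613
```

The ratio for false negatives is almost twice that for false positives, in line with the first finding at the top of the notebook.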

In [17]:
fp_word_count.most_common(30)

[('stupid', 96),
 ('idiot', 40),
 ('ignorant', 31),
 ('moron', 24),
 ('dumb', 24),
 ('idiots', 23),
 ('fool', 21),
 ('pathetic', 18),
 ('stupidity', 16),
 ('ridiculous', 12),
 ('ass', 10),
 ('liar', 10),
 ('garbage', 10),
 ('loser', 10),
 ('fools', 10),
 ('troll', 9),
 ('crap', 9),
 ('damn', 8),
 ('fuck', 8),
 ('clown', 7),
 ('losers', 6),
 ('suck', 6),
 ('racist', 6),
 ('trash', 6),
 ('hypocrite', 6),
 ('jerk', 6),
 ('morons', 6),
 ('dumber', 6),
 ('scum', 6),
 ('ignorance', 6)]

In [18]:
fn_word_count.most_common(30)

[('and', 18),
 ('the', 13),
 ('are', 13),
 ('a', 13),
 ('ignorant', 11),
 ('racist', 10),
 ('you', 9),
 ('in', 9),
 ('is', 9),
 ('to', 8),
 ('of', 7),
 ('have', 7),
 ('loser', 7),
 ('st', 7),
 ('crap', 6),
 ('all', 6),
 ('chemical', 6),
 ('s', 5),
 ('est', 5),
 ('id', 5),
 ('that', 5),
 ('not', 5),
 ('bunch', 4),
 ('dumb', 4),
 ('disgusting', 4),
 ('this', 4),
 ('f', 4),
 ('white', 4),
 ('he', 4),
 ('incompetent', 4)]

In [30]:
fptop = [w[0] for w in fp_word_count.most_common(20)]
fntop = [w[0] for w in fn_word_count.most_common(20)]

# print the top words as rows of a LaTeX table
for p0, p1, p2, p3 in zip(fptop[:10], fptop[10:], fntop[:10], fntop[10:]):
    print(f'{p0} & {p1} & {p2} & {p3} \\\\')

stupid & ass & and & of \\
idiot & liar & the & have \\
ignorant & garbage & are & loser \\
moron & loser & a & st \\
dumb & fools & ignorant & crap \\
idiots & troll & racist & all \\
fool & crap & you & chemical \\
pathetic & damn & in & s \\
stupidity & fuck & is & est \\
ridiculous & clown & to & id \\


In [19]:
for i, row in final_test.iterrows():
    if row.spans != row.pred_spans:
        missed = sorted(set(row.spans).difference(set(row.pred_spans)))
        extra = sorted(set(row.pred_spans).difference(set(row.spans)))
        extra_subs = get_substrings(extra, row.text)
        missed_subs = get_substrings(missed, row.text)
        if any('chemical' in p for p in missed_subs):
            display_spans(row.spans, row.text)
            display_spans(row.pred_spans, row.text)

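Some texts are labelled as toxic as a whole or in large part. The cell below displays such cases (gold spans covering more than half of the text and more than 20 characters). As written, the matching filter in the counting cell above is commented out, so these texts still contribute to the counts.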

In [20]:
for i, row in final_test.iterrows():
    if row.spans != row.pred_spans:
        if len(row.spans) > 0.5 * len(row.text) and len(row.spans) > 20:
            display_spans(row.spans, row.text)
            display_spans(row.pred_spans, row.text)
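
The condition used in this cell can be wrapped into a small predicate, which would make it easier to reuse in the counting loop. This is only a sketch: the function name `is_whole_text_label` and its parameter names are hypothetical, while the default thresholds (0.5 and 20) are the ones used above.

```python
def is_whole_text_label(spans, text, min_share=0.5, min_chars=20):
    """True if the gold span covers most of the text and is long in absolute terms."""
    return len(spans) > min_share * len(text) and len(spans) > min_chars

# 30 of 40 characters labelled: treated as a whole-text annotation.
print(is_whole_text_label(list(range(30)), "x" * 40))  # True
# 10 of 12 characters labelled: large share, but too short to count.
print(is_whole_text_label(list(range(10)), "x" * 12))  # False
```

The second threshold keeps short, fully toxic comments (for example a single insult) in the analysis.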

# Examples of texts

Successful examples

In [26]:
total = 0
for i, row in final_test.iterrows():
    if row.spans == row.pred_spans:
        display_spans(row.spans, row.text)
        display_spans(row.pred_spans, row.text)
        print()
        total += 1
        if total >= 100:
            break

Bad examples

In [24]:
total = 0
for i, row in final_test.iterrows():
    if row.spans != row.pred_spans:
        display_spans(row.spans, row.text)
        display_spans(row.pred_spans, row.text)
        print()
        total += 1
        if total >= 20:
            break