In [None]:
!pip install datasets
!pip install bertopic

# Introduction (1 point)

**(1 point) Pick one of the datasets between hate and offensive, and justify your choice. Remember that it is for a commercial application (there is a good and a bad answer).**

The "hate" dataset is subject to copyright restrictions for commercial use. Given that our project has commercial implications, we are unable to utilize it. Therefore, we'll proceed with the "offensive" dataset instead.

In [14]:
from datasets import load_dataset
from bertopic import BERTopic

dataset = load_dataset("tweet_eval", "offensive")
dataset




DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 11916
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 860
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 1324
    })
})

# Evaluating the dataset (5 points)

**(1 point) Describe the dataset. Look at the splits, proportion of classes, and see what you can figure out by just looking at the text.**

In [15]:
train_df = dataset['train']
test_df = dataset['test']
val_df = dataset['validation']

print(train_df.shape)
print(test_df.shape)
print(val_df.shape)

print("Train data has {}% positive labels".format(sum(train_df['label'])/len(train_df)*100))
print("Test data has {}% positive labels".format(sum(test_df['label'])/len(test_df)*100))
print("Validation data has {}% positive labels".format(sum(val_df['label'])/len(val_df)*100))

(11916, 2)
(860, 2)
(1324, 2)
Train data has 33.07317891910037% positive labels
Test data has 27.906976744186046% positive labels
Validation data has 34.66767371601209% positive labels


In [16]:
for i in range(5):
    print(train_df['text'][i])

@user Bono... who cares. Soon people will understand that they gain nothing from following a phony celebrity. Become a Leader of your people instead or help and support your fellow countrymen.
@user Eight years the republicans denied obama’s picks. Breitbarters outrage is as phony as their fake president.
@user Get him some line help. He is gonna be just fine. As the game went on you could see him progressing more with his reads. He brought what has been missing. The deep ball presence. Now he just needs a little more time
@user @user She is great. Hi Fiona!
@user She has become a parody unto herself? She has certainly taken some heat for being such an....well idiot. Could be optic too  Who know with Liberals  They're all optics.  No substance



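The dataset is divided into three subsets (training, validation, and testing), which is a standard approach in machine learning. Each subset contains roughly one third offensive tweets: 33% in training, 35% in validation, and 28% in testing. The classes are therefore imbalanced, and the test split contains slightly fewer offensive tweets than the other two.

A cursory look at the text reveals frequent occurrences of the "@user" tag, which replaces the original usernames. This should not present any significant challenge for our purposes: the preprocessing expected by the pretrained model used later maps every mention to the same "@user" placeholder.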

**(3 points) Use BERTopic to extract the topics within the data, and the main topics within each class. Please, think about fixing the random seed.**

In [17]:
from umap import UMAP

topic_model = BERTopic(language="english", embedding_model="all-MiniLM-L6-v2", calculate_probabilities=True, verbose=True, nr_topics='auto', umap_model=UMAP(random_state=42))

topics, _ = topic_model.fit_transform(train_df["text"])


2023-06-25 13:01:27,170 - BERTopic - Transformed documents to Embeddings
2023-06-25 13:01:46,380 - BERTopic - Reduced dimensionality
2023-06-25 13:01:55,814 - BERTopic - Clustered reduced embeddings
2023-06-25 13:01:56,767 - BERTopic - Reduced number of topics from 88 to 53


In [18]:
topic_model.get_topic_info()

| | Topic | Count | Name | Representation | Representative_Docs |
|---|---|---|---|---|---|
| 0 | -1 | 3396 | -1_she_is_the_to | [she, is, the, to, and, user, you, of, are, he] | [@user Y’all need to know that she is human be... |
| 1 | 0 | 3319 | 0_he_you_is_user | [he, you, is, user, are, she, and, to, so, the] | [@user She is 😭😭😭, @user he is, @user She"" is... |
| 2 | 1 | 1119 | 1_antifa_user_the_and | [antifa, user, the, and, of, they, to, that, a... | [@user Yes that and ANTIFA., @user @user That ... |
| 3 | 2 | 1023 | 2_gun_control_guns_laws | [gun, control, guns, laws, the, to, user, in, ... | [@user @user @user Yes we need gun control. It... |
| 4 | 3 | 703 | 3_liberals_conservatives_the_they | [liberals, conservatives, the, they, are, user... | [@user That they're liberals, @user and THEY a... |
| 5 | 4 | 500 | 4_maga_trump_wwg1wga_president | [maga, trump, wwg1wga, president, the, to, qan... | [#MAGA much?, Now this is how we MAGA, The MAG... |
| 6 | 5 | 225 | 5_he_nfl_football_is | [he, nfl, football, is, the, game, him, his, b... | [@user @user And he is out for game 2. He has... |
| 7 | 6 | 163 | 6_brexit_uk_the_tories | [brexit, uk, the, tories, eu, labour, tory, co... | [@user @user @user @user And there's #Brexit 👇... |
| 8 | 7 | 152 | 7_user_you_treph_sanctions | [user, you, treph, sanctions, gt, follow, lt, ... | [@user @user @user @user @user @user @user @us... |
| 9 | 8 | 138 | 8_kavanaugh_judge_to_liberals | [kavanaugh, judge, to, liberals, this, the, ma... | [@user @user @user If liberals are against Kav... |


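As observed, Topic -1 is described by common words such as 'she', 'is', 'the', 'to', 'and', 'you', 'of', 'are', 'he', and particularly 'user'. In BERTopic, Topic -1 is not a real topic: it collects the outlier tweets (3396 here) that the clustering step could not assign to any cluster. Topic 0 is dominated by the same kind of words, so neither of the two carries a usable theme. We set both aside and focus on the remaining topics, whose keywords are more informative.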
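Stop words dominate these keyword lists because BERTopic builds its topic representations from raw token counts by default. One possible remedy, sketched below, is to pass a `CountVectorizer` that filters English stop words (plus the `user` placeholder) as `vectorizer_model`. This only changes the keywords shown for each topic, not the embeddings or the clustering. The helper name `build_topic_model` and the choice to add `user` to the stop list are our own assumptions and were not part of the run above.

```python
def build_topic_model(seed=42):
    """Return a BERTopic model whose topic keywords skip English stop words.

    Hypothetical helper: it mirrors the settings used above and only adds
    a custom vectorizer for the topic representations.
    """
    # Local imports keep the sketch readable without the heavy dependencies.
    from bertopic import BERTopic
    from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS
    from umap import UMAP

    # Assumption: "user" is treated as a stop word because of the @user placeholder.
    stop_words = list(ENGLISH_STOP_WORDS.union({"user"}))
    vectorizer = CountVectorizer(stop_words=stop_words)

    return BERTopic(
        language="english",
        embedding_model="all-MiniLM-L6-v2",
        vectorizer_model=vectorizer,
        umap_model=UMAP(random_state=seed),  # fixed seed, as in the cell above
        nr_topics="auto",
    )
```

Calling `build_topic_model().fit_transform(train_df["text"])` would then replace the model construction used above; the resulting topics may differ from the ones reported here.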

In [19]:
topic_model.get_topic(0)

[('he', 0.023418632950577365),
 ('you', 0.021589497409464063),
 ('is', 0.020800025630667174),
 ('user', 0.01808105834217001),
 ('are', 0.016475764347177388),
 ('she', 0.016361305716226026),
 ('and', 0.01266172987269982),
 ('to', 0.0125368160112681),
 ('so', 0.011598849684100843),
 ('the', 0.011578331761238158)]

In [20]:
topics_per_class = topic_model.topics_per_class(train_df["text"], classes=train_df["label"])
topic_model.visualize_topics_per_class(topics_per_class, top_n_topics=10)



**(1 point) What do you think about the results? How do you think it could impact a model trained on these data?**

As we can see, the top topics are:
  + He, she, is, you, user (set aside because these words are ubiquitous and carry little meaning)
  + Antifa
  + Gun control, laws
  + Liberals, conservatives
  + Trump, president
  + Football
  + Brexit
  + Women, sexual

One potential issue is that a model might wrongly treat tweets mentioning 'women' as offensive, since this topic appears about as often in label 1 as in label 0. Given that only about a third of the training tweets are offensive, the topic is over-represented in the offensive class.

Moreover, it's interesting to note that the most frequent topics appear less often in offensive tweets than in non-offensive ones. This suggests that the offensiveness of such tweets isn't primarily driven by their subject matter, but rather by the presence of disrespectful language or insults.
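One way to check these observations numerically is to compare, for each topic, the share of offensive tweets with the overall base rate (about 33% in the training split). The sketch below assumes `topics` is the list returned by `fit_transform` and `labels` the matching 0/1 labels; the function name and the toy data are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def offensive_rate_per_topic(topics, labels):
    """Map each topic id to (number of tweets, share labeled offensive)."""
    counts = defaultdict(lambda: [0, 0])  # topic -> [total, offensive]
    for topic, label in zip(topics, labels):
        counts[topic][0] += 1
        counts[topic][1] += label
    return {t: (n, off / n) for t, (n, off) in counts.items()}


# Toy data for illustration only (not taken from the notebook).
toy_topics = [0, 0, 0, 1, 1, 2, 2, 2, 2]
toy_labels = [0, 0, 1, 1, 1, 0, 0, 0, 1]

base_rate = sum(toy_labels) / len(toy_labels)
rates = offensive_rate_per_topic(toy_topics, toy_labels)
for topic, (n, rate) in sorted(rates.items()):
    print(f"topic {topic}: n={n}, offensive={rate:.2f}, base rate={base_rate:.2f}")
```

A topic whose offensive share sits well above the base rate is one where a classifier could learn the subject itself as a shortcut for the label.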

# Evaluate a model (6 points)

In [22]:
from transformers import AutoModelForSequenceClassification
from transformers import TFAutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
from scipy.special import softmax
import csv
import urllib.request

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

# Tasks:
# emoji, emotion, hate, irony, offensive, sentiment
# stance/abortion, stance/atheism, stance/climate, stance/feminist, stance/hillary

task='offensive'
MODEL = f"cardiffnlp/twitter-roberta-base-{task}"

tokenizer = AutoTokenizer.from_pretrained(MODEL)

In [23]:
labels=[]
mapping_link = f"https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/{task}/mapping.txt"
with urllib.request.urlopen(mapping_link) as f:
    html = f.read().decode('utf-8').split("\n")
    csvreader = csv.reader(html, delimiter='\t')
labels = [row[1] for row in csvreader if len(row) > 1]

In [24]:
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.save_pretrained(MODEL)

**(2 points) Evaluate their model on the test split of the dataset you picked, using precision, recall, and F1-score.**

In [29]:
from transformers import pipeline

# Initialize text classification pipeline with our model and tokenizer
classification_pipeline = pipeline("text-classification", model=model, tokenizer=tokenizer)

# List to store the predicted labels
predicted_labels = []
predicted_scores = []

# Loop over test dataset and make predictions
for entry in test_df:
    preprocessed_text = preprocess(entry['text'])
    prediction_output = classification_pipeline(preprocessed_text)[0]
    prediction = prediction_output['label']
    prediction = str(prediction)
    predicted_labels.append(prediction)
    score = prediction_output['score']  # get predicted score
    predicted_scores.append(score)

# Map 'offensive' to 1 and 'not-offensive' to 0 in predicted labels
predicted_labels = [1 if label == 'offensive' else 0 for label in predicted_labels]

# Extract actual labels from test dataset
actual_labels = [item['label'] for item in test_df]

In [30]:
from sklearn.metrics import precision_score, recall_score, f1_score

# Calculate Precision
precision = precision_score(actual_labels, predicted_labels)

# Calculate Recall
recall = recall_score(actual_labels, predicted_labels)

# Calculate F1 Score
f1 = f1_score(actual_labels, predicted_labels)

print("Precision: ", precision)
print("Recall: ", recall)
print("F1 Score: ", f1)

Precision:  0.7960199004975125
Recall:  0.6666666666666666
F1 Score:  0.7256235827664399


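So as we can see:
 + Precision: 0.80
 + Recall: 0.67
 + F1 score: 0.73

Precision is clearly higher than recall: when the model flags a tweet as offensive it is right about 80% of the time, but it misses about a third of the offensive tweets. For a moderation use case, this means few wrongly flagged tweets but a substantial number of offensive tweets slipping through. Overall the model performs reasonably well, with recall as its main weakness.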
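The three scores can be traced back to a single confusion matrix. The test split has 860 tweets, of which 27.907% (240 tweets) are offensive. A recall of 2/3 then corresponds to 160 true positives and 80 false negatives, and a precision of 0.796 corresponds to 160 out of 201 positive predictions, hence 41 false positives. These counts are inferred from the printed metrics, not read from the notebook, so they should be taken as a consistency check rather than as reported results.

```python
# Confusion counts inferred from the reported metrics (an assumption, see above).
n_total = 860
n_positive = 240            # 27.907% of the 860 test tweets
tp = 160                    # recall 2/3 of the 240 offensive tweets
fn = n_positive - tp        # offensive tweets the model missed
fp = 41                     # 160 / (160 + 41) matches the reported precision
tn = n_total - tp - fn - fp

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Read this way, the model wrongly flags 41 of the 620 non-offensive tweets, while 80 of the 240 offensive tweets go undetected.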

**Look for prediction failures. Extract the top 5 misclassified tweets (highest score in wrong class) for each class and discuss what could be wrong with the model.**

In [35]:
# Flag misclassified tweets (1 where the prediction and the gold label differ)
differences = [abs(true - pred) for true, pred in zip(actual_labels, predicted_labels)]
# Keep (index, error flag, confidence of the predicted class), errors only
errors = [(i, diff, score) for i, (diff, score) in enumerate(zip(differences, predicted_scores))]
errors = [error for error in errors if error[1] != 0]
# Most confident mistakes first
errors.sort(key=lambda x: x[2], reverse=True)
top_5_error_indices = [error[0] for error in errors[:5]]

for idx in top_5_error_indices:
  print("Sentence :")
  print(test_df["text"][idx])
  print("Expected : {}, Got : {}, Score : {}".format(actual_labels[idx], predicted_labels[idx], predicted_scores[idx]))

Sentence :
#Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway
Expected : 1, Got : 0, Score : 0.9338217973709106
Sentence :
#Liberals Are Reaching Peak Desperation To Call On #PhillipRuddock To Talk With #Turnbull To Convince Him To Help with #WentworthVotes 18 Sept 2018  @user #Auspol #LNP #NSWpol @user  @user @user #LNPMemes
Expected : 1, Got : 0, Score : 0.919756293296814
Sentence :
#NoPasaran: Unity demo to oppose the far-right in #London – #antifa #Oct13 — Enough is Enough!
Expected : 1, Got : 0, Score : 0.9112220406532288
Sentence :
#BREXIT deal HAS been reached - and will be unveiled at special summit in NOVEMBER, Has @user sold out the #UK to the eu??? She better have not or the @user are finished!! @user
Expected : 1, Got : 0, Score : 0.9081718325614929
Sentence :
Are you fucking serious?
Expected : 0, Got : 1, Score : 0.9010689854621887


Here are the top failures:
  + #Liberals / #Democrats THIS is what you stand for. If not, then #WalkAway (Got non offensive)
  + #Liberals Are Reaching Peak Desperation To Call On #PhillipRuddock To Talk With #Turnbull To Convince Him To Help with #WentworthVotes 18 Sept 2018  @user #Auspol #LNP #NSWpol @user  @user @user #LNPMemes (Got non offensive)
  + #NoPasaran: Unity demo to oppose the far-right in #London – #antifa #Oct13 — Enough is Enough! (Got non offensive)
  + #BREXIT deal HAS been reached - and will be unveiled at special summit in NOVEMBER, Has @user sold out the #UK to the eu??? She better have not or the @user are finished!! @user (Got non offensive)
  + Are you fucking serious? (Got offensive)

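Four of the five most confident errors are political tweets that are labeled offensive in the dataset but that the model scores as non-offensive with more than 90% confidence. None of them contains an explicit insult, so a model that seems to rely mostly on offensive vocabulary has little to pick up on. Political tweets thus appear to be somewhat blacklisted in the annotation, or to be judged on context that is not visible in the text.

The final tweet is the opposite case: it is labeled non-offensive, but the model predicts offensive, most likely because of the profanity "fucking". Without more context, the model cannot tell an exclamation from an insult aimed at someone.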
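The cell above ranks all errors together, so the list is dominated by false negatives. Since the task asks for the top 5 per class, the ranking could also be split by predicted class. A minimal sketch, assuming lists shaped like `actual_labels`, `predicted_labels`, and `predicted_scores` above (the function name and the toy data are hypothetical):

```python
def top_errors_per_class(actual, predicted, scores, k=5):
    """Return {predicted_class: indices of the k most confident errors}."""
    result = {}
    for cls in sorted(set(predicted)):
        # Errors where the model predicted `cls` but the gold label differs.
        wrong = [
            i for i, (a, p) in enumerate(zip(actual, predicted))
            if p == cls and a != p
        ]
        # Highest confidence in the wrong class first.
        wrong.sort(key=lambda i: scores[i], reverse=True)
        result[cls] = wrong[:k]
    return result


# Toy data for illustration only.
toy_actual = [1, 1, 0, 0, 1, 0]
toy_predicted = [0, 0, 1, 1, 1, 0]
toy_scores = [0.93, 0.60, 0.90, 0.70, 0.95, 0.99]

print(top_errors_per_class(toy_actual, toy_predicted, toy_scores, k=1))
```

On the toy data this prints `{0: [0], 1: [2]}`: index 0 is the most confident false negative and index 2 the most confident false positive.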


In [38]:
import numpy as np

# Convert list to numpy arrays for easier manipulation
predicted_labels = np.array(predicted_labels)
predicted_scores = np.array(predicted_scores)

# Find indices of positive and negative predictions
positive_indices = np.where(predicted_labels == 1)[0]
negative_indices = np.where(predicted_labels == 0)[0]

# Sort positive and negative indices by their respective scores
positive_indices = positive_indices[np.argsort(predicted_scores[positive_indices])[::-1]] # descending order
negative_indices = negative_indices[np.argsort(predicted_scores[negative_indices])[::-1]] # descending order

# Get top 10 confident predictions for each class
top_10_positive = positive_indices[:10]
top_10_negative = negative_indices[:10]

# For the most uncertain predictions, we consider the scores closest to 0.5
uncertainty_scores = np.abs(predicted_scores - 0.5)
uncertain_indices = np.argsort(uncertainty_scores)[:10] # ascending order

print("Positive :")
for id in top_10_positive:
  print("Sentence :")
  print(test_df["text"][id])
  print("Expected : {}, Got : {}, Score : {}".format(actual_labels[id], predicted_labels[id], predicted_scores[id]))

print("Neutral :")
for id in top_10_negative:
  print("Sentence :")
  print(test_df["text"][id])
  print("Expected : {}, Got : {}, Score : {}".format(actual_labels[id], predicted_labels[id], predicted_scores[id]))

print("Uncertain :")
for id in uncertain_indices:
  print("Sentence :")
  print(test_df["text"][id])
  print("Expected : {}, Got : {}, Score : {}".format(actual_labels[id], predicted_labels[id], predicted_scores[id]))

Positive :
Sentence :
@user nigga are you stupid your trash dont play with him play with your bitch 😂
Expected : 1, Got : 1, Score : 0.9518308639526367
Sentence :
#ArianaAsesina? Is that serious?! Holy shit, please your fucking assholes, don't blame someone for the death of other one. She is sad enough for today, don't you see? It isn't fault of none, he had an overdose and died. End. Stop wanting someone to blame, fuckers.
Expected : 1, Got : 1, Score : 0.9391582012176514
Sentence :
@user Damn I felt this shit. Why you so loud lol
Expected : 1, Got : 1, Score : 0.9266199469566345
Sentence :
$1500 for a phone. You all are fucking dumb.
Expected : 1, Got : 1, Score : 0.9251351952552795
Sentence :
All these sick ass ppl from school gave me something and now I have to chug down this nasty drink so it can go away🙃
Expected : 1, Got : 1, Score : 0.921259343624115
Sentence :
#SugarDaddy Retweet if you are under 30 and would like share memes with each other and we can talk about dumb shit. Cu

The model does well on its most confident predictions for both the offensive and the neutral class, although it's fair to mention that these are the more straightforward cases: the high-confidence offensive tweets shown above all contain explicit profanity.

The uncertain category, on the other hand, tends to encompass longer tweets, many of which discuss factual information and recurrently return to the subject of politics. It appears that when individuals offer their perspectives on political matters without resorting to offensive language, the model struggles. The confusion likely arises because politics is inherently a controversial topic, yet the absence of explicit derogatory terms in these cases makes classification more challenging.

# Annotate Data (7 points)

**(1 point) Extract about 100 tweets containing at least 20% of your target class (offensive/hateful), from the 10K tweets provided. You can use the pretrained model to help you find tweets in the target class.**

In [50]:
import json
import pandas as pd

with open('tweets.json') as f:
    data = json.load(f)
tweets = [item['text'] for item in data]

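# Placeholder column; it is overwritten with the model's predictions below
tweets_df = pd.DataFrame({'text': tweets, 'label': 0})

# Apply the same preprocessing as for the test split before classifying
tweets_df['label'] = tweets_df['text'].apply(lambda x: classification_pipeline(preprocess(x))[0]['label'])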


In [57]:
tweets_copy = tweets_df.copy()
tweets_copy['label'] = tweets_copy['label'].apply(lambda x: 1 if x == 'offensive' else 0)
tweets_copy

| | text | label |
|---|---|---|
| 0 | YOU BETTER SUCK HIS DICK KOZY I SEE YOU WITH K... | 1 |
| 1 | I still canr believe it.😭😭😭😭😭 | 0 |
| 2 | You should raise the webform....how would they... | 0 |
| 3 | im tired too but this is so entertaining i cant | 0 |
| 4 | Fuckof | 1 |
| ... | ... | ... |
| 9995 | Because It’s My Business: Hear Tabitha Brown’s... | 0 |
| 9996 | comer pipoca enquanto assisto girl from nowher... | 0 |
| 9997 | They will be mad with me if they’re not 504Boy... | 1 |
| 9998 | Omg so beautiful 😍😍😍 | 0 |


Finally, we reduce the set to 100 tweets: 20 predicted offensive and 80 predicted non-offensive, which gives the required 20% of the target class according to the model.

In [60]:
dataframe_label1 = tweets_copy[tweets_copy['label'] == 1]
dataframe_label0 = tweets_copy[tweets_copy['label'] == 0]

label1 = dataframe_label1.sample(n=20)
label0 = dataframe_label0.sample(n=80)

# Concatenate the two samples to obtain a new DataFrame of 100 rows
tweets_100 = pd.concat([label1, label0])

# Shuffle the rows so that the two classes are mixed
tweets_100 = tweets_100.sample(frac=1).reset_index(drop=True)
tweets_100

| | text | label |
|---|---|---|
| 0 | OMG so CUTE!!!😢😢 | 0 |
| 1 | What is a piccle | 0 |
| 2 | I have some NFTs but this just seems kinda sad... | 1 |
| 3 | tempered emotion? | 0 |
| 4 | They look so beautiful omg ☹️💕 | 0 |
| ... | ... | ... |
| 95 | SAME I GOING TO CRY | 0 |
| 96 | ITS SUCH A FUN MOVIE | 0 |
| 97 | FUCK FUCK FUCK FUCK CALM VAML PKAY OKA6 | 1 |
| 98 | Done, wish me luck | 0 |
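The sample drawn above changes on every run because no seed is set. A reproducible variant is sketched here in plain Python; with pandas, the same effect could be obtained by passing `random_state` to `DataFrame.sample`. The function name, the default seed, and the toy data are hypothetical.

```python
import random


def sample_with_quota(items, labels, n_pos=20, n_neg=80, seed=42):
    """Draw n_pos items labeled 1 and n_neg items labeled 0, then shuffle."""
    rng = random.Random(seed)  # local generator, so the draw is repeatable
    pos = [(x, 1) for x, y in zip(items, labels) if y == 1]
    neg = [(x, 0) for x, y in zip(items, labels) if y == 0]
    # rng.sample raises ValueError if a class has fewer items than requested.
    picked = rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    rng.shuffle(picked)
    return picked


# Toy data for illustration only: 1000 fake tweets, 10% predicted offensive.
toy_items = [f"tweet {i}" for i in range(1000)]
toy_labels = [1 if i % 10 == 0 else 0 for i in range(1000)]

subset = sample_with_quota(toy_items, toy_labels)
print(len(subset), sum(label for _, label in subset))
```

Because the quotas rely on predicted labels, the true share of offensive tweets in the sample is only known once the annotation is done.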


**(3 points) Altogether, write down an annotation guideline (which should be at least 2/3 of a page long).
What does the target class look like?
Any examples you could provide for ambiguous cases?
Keep "Can't tell / not annotable" class. Make sure you document what this class means in your guideline.**

+ Offensive tweets (Label: 1)
    - Direct insults.
    - Use of profanity.
    - Offensive jokes about minorities.
    - Cyberbullying or harassment.
+ Neutral (Label: 0)
    - Statements of fact that do not hurt anybody.
    - Positive comments about a person.
    - Jokes, as long as they do not aim to hurt a person or a group.
+ Not annotable (Label: -1)
   - Tweets that cannot be understood, for example because they are written in another language.
   - Tweets whose meaning is not clear.
   - Tweets that cannot be judged without their context, which is unknown.


+ Examples :
  - Offensive :
    + nigga are you stupid your trash dont play with him play with your bitch
    + Why don't you just go away? No one wants you here.
    + I can't believe people like you still exist. Disgusting.
  - Not offensive :
    + You’re amazing just the way you are 💟☮️
    + What a beautiful day! The sun is shining and the birds are singing.
    + Just finished a workout session.
  - Not annotable :
    + Dieser Sonnenuntergang ist wirklich schön.
    + OMG bffl, idk wtg. lol #yolo
    + Yeah, that's exactly what I'm talking about.



**(1 point) Every person in your group is going to annotate these tweets separately. So if you are 3, annotate them 3 times.
Typically, create a Google sheet or an Excel document, one tab per person, in each tab one column for the text, and another for the class.**

**(2 points) Evaluate your inter-annotator agreement using Fleiss' kappa.
statsmodels provides an easy-to-use implementation.
What does the score mean? Are you doing a good job annotating the data and, if not, why?**
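As a starting point for this step, here is a minimal pure-Python sketch of Fleiss' kappa, written so that the computation is visible. It assumes every tweet was labeled by the same number of annotators, using the three labels of the guideline (1, 0, -1). The annotations below are made up for illustration and are not our actual annotations. The statsmodels route suggested in the task (`aggregate_raters` followed by `fleiss_kappa` from `statsmodels.stats.inter_rater`) should give the same value on the same data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of shape (n_items, n_categories).

    counts[i][j] is the number of annotators who put item i in category j;
    every row must sum to the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items
    # Expected agreement by chance, from the overall category proportions.
    totals = [sum(col) for col in zip(*counts)]
    p_expected = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return (p_observed - p_expected) / (1 - p_expected)


def to_counts(annotations, categories=(1, 0, -1)):
    """Turn per-item label lists into a per-item count table."""
    return [[row.count(c) for c in categories] for row in annotations]


# Made-up labels from 3 annotators for 5 tweets (illustration only).
toy_annotations = [
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
    [-1, 0, 0],
]
kappa = fleiss_kappa(to_counts(toy_annotations))
print(round(kappa, 3))
```

On the toy data the result is 0.444. A kappa of 1 means perfect agreement and 0 means agreement no better than chance; under the commonly used Landis and Koch rule of thumb, values between 0.41 and 0.60 count as moderate agreement. A low score would typically point to ambiguous cases in the guideline, for example profanity used without a target or heavy use of the "not annotable" class.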