# STATE TWITTER TROLL DETECTION USING TRANSFORMERS


## REPO STRUCTURE

* Notebooks 1.0 - 1.2: Data collection, cleaning and preparation. Optional if you just want to experiment with the final dataset.

* Notebook 1.3: Setting a baseline with Hugging Face's Zero-shot Classifier.

* Notebooks 2.0 - 2.1: Fine-tuning DistilBERT with my custom dataset, and detailed testing with an unseen validation dataset.

* app.py + folders for "static" and "template": a simple app for use on a local machine to demonstrate how a state troll tweet detector can be used in deployment. Unfortunately, free hosting accounts can't accommodate the disk size required for PyTorch and the fine-tuned model, so I've not deployed this online.

# PART 1D: SETTING BASELINE PERFORMANCE WITH ZERO-SHOT CLASSIFIER

In "classic" machine learning approaches, a simple baseline performance for a model can be assessed with a dummy estimator, such as scikit-learn's DummyClassifier or DummyRegressor. Unfortunately, there is no direct equivalent for transformer models.
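For reference, the idea behind such a dummy baseline is simple enough to sketch with the standard library alone. The snippet below is a minimal sketch, not scikit-learn's implementation: `majority_baseline_accuracy` is a hypothetical helper, and the toy labels are invented for illustration (1 = troll, 0 = real).

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    # Find the label that occurs most often in the training data
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    # "Predict" that label for every test row and count the hits
    hits = sum(1 for label in test_labels if label == majority_label)
    return majority_label, hits / len(test_labels)


# Toy labels, invented for illustration only (not taken from the dataset)
train = [1, 1, 1, 0, 0]
test = [1, 0, 1, 0]

label, acc = majority_baseline_accuracy(train, test)
print(label, acc)
```

Any real model should beat this kind of baseline comfortably; if it doesn't, it has learned little beyond the class distribution.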

The recently introduced [zero-shot classifier](https://discuss.huggingface.co/t/new-pipeline-for-zero-shot-text-classification/681) by Hugging Face is a possible option, in my view, though I'm pretty sure this wasn't the original intention for the feature.

In any case, this is a pretty quick way to get a sense of how a transformer model that has not been fine-tuned on this particular task (not an easy one, in my view) performs.

I'll just test the zero-shot classifier against 10% of the validation data, or 1,000 rows.

In [None]:
from __future__ import print_function

import ipywidgets as widgets
import numpy as np
import pandas as pd
import re
from transformers import pipeline

In [2]:
tweets = (
    pd.read_csv("../data/validate.csv")
    .dropna(subset=["clean_text"])
    .sample(n=1000, random_state=42, replace=False)
)

In [3]:
# The pipeline assumes by default that only one of the candidate labels is true, 
# returning a list of scores for each label which add up to 1.

corpus = list(tweets['clean_text'].values)

candidate_labels = ["real_tweet", "troll_tweet"]

In [None]:
classifier = pipeline("zero-shot-classification")

In [5]:
%%time

tweets["HF_pred"] = classifier(corpus, candidate_labels)

CPU times: user 18min 30s, sys: 47.1 s, total: 19min 17s
Wall time: 6min 20s


In [6]:
tweets.head()

| Unnamed: 0 | tweetid | user_display_name | tweet_text | clean_text | troll_or_not | HF_pred |
|---|---|---|---|---|---|---|
| 6252 | 620920600157138944 | derrickmc | Teenage girl fucked, I Want To Have Sex In Hig... | Teenage girl fucked I Want To Have Sex In High... | 1 | {'sequence': 'Teenage girl fucked I Want To Ha... |
| 4684 | 1296895872639348738 | TheEconomist | In the absence of understanding, doctors must ... | In the absence of understanding doctors must f... | 0 | {'sequence': 'In the absence of understanding ... |
| 1731 | 1053507862956294150 | JiayangFan | @NastyGalHelp @NastyGal Just sent a DM. Hope s... | Just sent a DM Hope someone can respond soonest | 0 | {'sequence': 'Just sent a DM Hope someone can ... |
| 4742 | 650861613516324864 | Room Of Rumor | Ichiro takes mound for Marlins, Phillies win ... | Ichiro takes mound for Marlins Phillies win #s... | 1 | {'sequence': 'Ichiro takes mound for Marlins P... |
| 4521 | 577881696416108544 | 5838c3c419e0a51b6af6d63faad6688de4ac7a6f74fbba... | It is better to have less thunder in the mouth... | It is better to have less thunder in the mouth... | 1 | {'sequence': 'It is better to have less thunde... |


In [7]:
# let's extract the individual labels and scores for a clearer look

tweets['Pred_Label1'] = [x.get('labels')[0] for x in tweets['HF_pred']]
tweets['Pred_Label2'] = [x.get('labels')[1] for x in tweets['HF_pred']]

tweets['Pred_Score1'] = [x.get('scores')[0] for x in tweets['HF_pred']]
tweets['Pred_Score2'] = [x.get('scores')[1] for x in tweets['HF_pred']]

### NOTE: The zero-shot classifier returns its labels sorted by score, highest first, so the first label and first score are the predicted values.
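To make the ordering concrete, here is a minimal sketch of reading the top prediction out of one result. The dictionary is written by hand in the shape the pipeline returns (keys `sequence`, `labels`, `scores`), using the values shown for row 1731 in this notebook. `top_prediction` is a hypothetical helper, not part of transformers.

```python
def top_prediction(result):
    """Return the (label, score) pair with the highest score."""
    # Pair each label with its score, then pick the highest-scoring pair.
    # This does not rely on the order of the lists, so it stays correct
    # even if a result has been reordered somewhere along the way.
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


# Hand-written stand-in for one pipeline result (values from row 1731)
result = {
    "sequence": "Just sent a DM Hope someone can respond soonest",
    "labels": ["troll_tweet", "real_tweet"],
    "scores": [0.53171, 0.46829],
}

label, score = top_prediction(result)
print(label, score)
```

Because the lists are already sorted, this gives the same answer as taking index 0 of `labels` and `scores`, which is what the cell above does.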

In [8]:
# keep only the columns needed for a clearer look at the results

cols = ['troll_or_not', 'clean_text','Pred_Label1', 'Pred_Score1', 'Pred_Label2', 'Pred_Score2']

tweets_baseline = tweets[cols].copy()

In [9]:
tweets_baseline.head()

| Unnamed: 0 | troll_or_not | clean_text | Pred_Label1 | Pred_Score1 | Pred_Label2 | Pred_Score2 |
|---|---|---|---|---|---|---|
| 6252 | 1 | Teenage girl fucked I Want To Have Sex In High... | real_tweet | 0.535819 | troll_tweet | 0.464181 |
| 4684 | 0 | In the absence of understanding doctors must f... | real_tweet | 0.640341 | troll_tweet | 0.359659 |
| 1731 | 0 | Just sent a DM Hope someone can respond soonest | troll_tweet | 0.53171 | real_tweet | 0.46829 |
| 4742 | 1 | Ichiro takes mound for Marlins Phillies win #s... | real_tweet | 0.847504 | troll_tweet | 0.152496 |
| 4521 | 1 | It is better to have less thunder in the mouth... | real_tweet | 0.511084 | troll_tweet | 0.488916 |


In [10]:
# re-labelling the origins of the tweets (trolls or real) for clarity and comparison
# as noted earlier, troll tweets were labelled 1 and real ones as 0

tweets_baseline['Status'] = np.where(tweets_baseline['troll_or_not'] == 1, "troll_tweet", "real_tweet")

In [11]:
tweets_baseline["Compare_Results"] = np.where(
    tweets_baseline["Status"] == tweets_baseline["Pred_Label1"],
    "correct prediction",
    "wrong prediction",
)


In [12]:
tweets_baseline['Compare_Results'].value_counts()

correct prediction    505
wrong prediction      495
Name: Compare_Results, dtype: int64

In [13]:
tweets_baseline.head(20)

| Unnamed: 0 | troll_or_not | clean_text | Pred_Label1 | Pred_Score1 | Pred_Label2 | Pred_Score2 | Status | Compare_Results |
|---|---|---|---|---|---|---|---|---|
| 6252 | 1 | Teenage girl fucked I Want To Have Sex In High... | real_tweet | 0.535819 | troll_tweet | 0.464181 | troll_tweet | wrong prediction |
| 4684 | 0 | In the absence of understanding doctors must f... | real_tweet | 0.640341 | troll_tweet | 0.359659 | real_tweet | correct prediction |
| 1731 | 0 | Just sent a DM Hope someone can respond soonest | troll_tweet | 0.53171 | real_tweet | 0.46829 | real_tweet | wrong prediction |
| 4742 | 1 | Ichiro takes mound for Marlins Phillies win #s... | real_tweet | 0.847504 | troll_tweet | 0.152496 | troll_tweet | wrong prediction |
| 4521 | 1 | It is better to have less thunder in the mouth... | real_tweet | 0.511084 | troll_tweet | 0.488916 | troll_tweet | wrong prediction |
| 6340 | 1 | #sports Best postgame Cavs quotes after Game l... | real_tweet | 0.875354 | troll_tweet | 0.124646 | troll_tweet | wrong prediction |
| 576 | 1 | null It is UTC now | real_tweet | 0.546614 | troll_tweet | 0.453386 | troll_tweet | wrong prediction |
| 5202 | 0 | You didnt tell him that before the flight | real_tweet | 0.558011 | troll_tweet | 0.441989 | real_tweet | correct prediction |
| 6363 | 0 | We are livechatting A cast including | real_tweet | 0.62695 | troll_tweet | 0.37305 | real_tweet | correct prediction |
| 439 | 1 | Want a cheap flight to Miami Heres how to make... | real_tweet | 0.666553 | troll_tweet | 0.333447 | troll_tweet | wrong prediction |


## NOTE:

Looks like the zero-shot classifier labelled just about half of the tweets correctly: 505 of 1,000, or 50.5%, which is close to what random guessing would give on a balanced two-class task. Let's see how DistilBERT performs after fine-tuning with our custom dataset.
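A single accuracy figure hides which class the errors fall on. As a rough sketch, the true and predicted labels of the ten rows displayed in the table above can be tallied into confusion counts. `confusion_counts` is a hypothetical helper, and ten rows are far too few to draw conclusions from; the same tally could be run over the full `Status` and `Pred_Label1` columns.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count (actual, predicted) label pairs and compute overall accuracy."""
    pairs = Counter(zip(actual, predicted))
    # A pair is a correct prediction when both labels match
    correct = sum(n for (a, p), n in pairs.items() if a == p)
    return pairs, correct / len(actual)


# Status and Pred_Label1 for the ten rows shown in the table above
actual = ["troll_tweet", "real_tweet", "real_tweet", "troll_tweet", "troll_tweet",
          "troll_tweet", "troll_tweet", "real_tweet", "real_tweet", "troll_tweet"]
predicted = ["real_tweet", "real_tweet", "troll_tweet", "real_tweet", "real_tweet",
             "real_tweet", "real_tweet", "real_tweet", "real_tweet", "real_tweet"]

pairs, accuracy = confusion_counts(actual, predicted)
for (a, p), n in sorted(pairs.items()):
    print(f"actual={a:<12} predicted={p:<12} count={n}")
print("accuracy:", accuracy)
```

In this small sample, six of the seven errors are troll tweets labelled as real, which hints that the classifier leans towards `real_tweet`.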