# TweetNLP Implementation
This colab notebook is an adaptation of the demo notebook for TweetNLP, in which the developers provide a short introduction of [`tweetnlp`](https://github.com/cardiffnlp/tweetnlp), a python library of NLP models for tweets. 

For my project, I am going to be using the topic classification, offensive langugae, irony, and emotion classification models. So, I am going to focus on these areas as I demonstrate their function and test their accuracy.

In order to test the accuracy of these various tasks, I am going to use the testing datasets from the parent paper and evaluate each task by F1 measure, as they did in the parent paper.

# Load Datasets and Packages

In [52]:
import pandas as pd

import sklearn
from sklearn.metrics import f1_score

#Emotion Classification Dataset
e_url = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/emotion/test_text.txt"
emotions_txt = pd.read_csv(e_url, delimiter = "\n", header = None)
e_url2 = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/emotion/test_labels.txt"
emotions_true = pd.read_csv(e_url2, header = None)

#Irony Detection Dataset
i_url = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/irony/test_text.txt"
irony_txt = pd.read_csv(i_url, delimiter = "\n", header = None)
i_url2 = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/irony/test_labels.txt"
irony_true = pd.read_csv(i_url2, header = None)

#Offensive Language Detection Dataset
o_url = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/offensive/test_text.txt"
offensive_txt = pd.read_csv(o_url, delimiter = "\n", header = None)
o_url2 = "https://raw.githubusercontent.com/cardiffnlp/tweeteval/main/datasets/offensive/test_labels.txt"
offensive_true = pd.read_csv(o_url2, header = None)

#Topic Classification Dataset
#topic_url = ""
#topic_txt = pd.read_csv(topic_url)
#topic_url2 = ""
#topic_true = pd.read_csv(topic_url2)

## Installation
TweetNLP is available on pip or can be installed from source.


In [2]:
# Fix Colab Error
!pip install --upgrade google-cloud-storage

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting google-cloud-storage
  Downloading google_cloud_storage-2.6.0-py2.py3-none-any.whl (105 kB)
[K     |████████████████████████████████| 105 kB 5.1 MB/s 
Collecting google-resumable-media>=2.3.2
  Downloading google_resumable_media-2.4.0-py2.py3-none-any.whl (77 kB)
[K     |████████████████████████████████| 77 kB 4.5 MB/s 
Collecting google-cloud-core<3.0dev,>=2.3.0
  Downloading google_cloud_core-2.3.2-py2.py3-none-any.whl (29 kB)
Collecting google-crc32c<2.0dev,>=1.0
  Downloading google_crc32c-1.5.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32 kB)
Installing collected packages: google-crc32c, google-resumable-media, google-cloud-core, google-cloud-storage
  Attempting uninstall: google-resumable-media
    Found existing installation: google-resumable-media 0.4.1
    Uninstalling google-resumable-media-0.4.1:
      Successfully uninstalled google-resumable-med

In [3]:
# via pip
!pip install tweetnlp

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tweetnlp
  Downloading tweetnlp-0.1.2.tar.gz (25 kB)
Collecting allennlp
  Downloading allennlp-2.10.1-py3-none-any.whl (730 kB)
[K     |████████████████████████████████| 730 kB 7.4 MB/s 
Collecting urlextract
  Downloading urlextract-1.7.1-py3-none-any.whl (20 kB)
Collecting transformers
  Downloading transformers-4.24.0-py3-none-any.whl (5.5 MB)
[K     |████████████████████████████████| 5.5 MB 59.4 MB/s 
[?25hCollecting sentence_transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
[K     |████████████████████████████████| 85 kB 4.4 MB/s 
Collecting wandb<0.13.0,>=0.10.0
  Downloading wandb-0.12.21-py2.py3-none-any.whl (1.8 MB)
[K     |████████████████████████████████| 1.8 MB 54.1 MB/s 
Collecting traitlets>5.1.1
  Downloading traitlets-5.5.0-py3-none-any.whl (107 kB)
[K     |████████████████████████████████| 107 kB 69.7 MB/s 
[?25hCollecting tensorboar

In [4]:
! pip list | grep tweetnlp

tweetnlp                      0.1.2


All you need is to import `tweetnlp` !

In [5]:
import tweetnlp

## Tweet/Sentence Classification
The classification module originally consisted of six different tasks (Sentiment Analysis, Irony Detection, Hate Detection, Offensive Detection, Emoji Prediction, and Emotion Analysis).

In my project, I am going to be mainly using the emotion recognition and topic classification tasks to conduct my research. I am also going to attempt to utilize the hate, irony, and offensive language detection tasks and evaluate how this additional information is useful in my research. I will demonstrate the utility of each task and the accuracy of the results using a sample of the datasets used in the parent paper. 

In each example, the model is instantiated by `tweetnlp.load("task-name")`, and run the prediction by giving a text or a list of 
texts.

### Topic Classification
The aim of this task is, given a tweet to assign topics related to its content. The task is formed as a supervised multi-label classification problem where each tweet is assigned one or more topics from a total of 19 available topics. The topics were carefully curated based on Twitter trends with the aim to be broad and general and consist of classes such as: arts and culture, music, or sports. Our internally-annotated dataset contains over 10K manually-labeled tweets.

Sample Topic Classification Demonstration

In [7]:
model = tweetnlp.load('topic_classification')  # Or `model = tweetnlp.TopicClassification()`
model.topic("I went to Spain with my mom in December.")  # Or `model.predict`

{'label': ['diaries_&_daily_life', 'travel_&_adventure'],
 'probability': {'arts_&_culture': 0.07455697655677795,
  'business_&_entrepreneurs': 0.01791427470743656,
  'celebrity_&_pop_culture': 0.024104885756969452,
  'diaries_&_daily_life': 0.653227686882019,
  'family': 0.1428426057100296,
  'fashion_&_style': 0.018940361216664314,
  'film_tv_&_video': 0.033042386174201965,
  'fitness_&_health': 0.027939127758145332,
  'food_&_dining': 0.0473339818418026,
  'gaming': 0.01866302080452442,
  'learning_&_educational': 0.032467249780893326,
  'music': 0.029539255425333977,
  'news_&_social_concern': 0.0279399361461401,
  'other_hobbies': 0.02932649292051792,
  'relationships': 0.0952962189912796,
  'science_&_technology': 0.023787589743733406,
  'sports': 0.020187407732009888,
  'travel_&_adventure': 0.8862668871879578,
  'youth_&_student_life': 0.029908694326877594}}

Test for Accuracy

In [74]:
#Predict on the testing data
'''
topic_pred = []
tweet_list = topic_txt[topic_txt.columns[0]].values.tolist()
label_list = ["arts_&_culture", "business_&_entrepreneurs", "celebrity_&_pop_culture", "diaries_&_daily_life", "family", "fashion_&_style", "film_tv_&_video", "fitness_&_health", "food_&_dining", "gaming", "learning_&_educational", "music", "news_&_social_concern", "other_hobbies", "relationships", "science_&_technology", "sports", "travel_&_adventure", "youth_&_student_life"]

for tweet in tweet_list:
  output = model.topic(tweet)
  label = output["label"]
  label = label_list.index(label)
  topic_pred.append(label)


#Calculate F1 Measure
topic_F1 = sklearn.metrics.f1_score(y_true=topic_true, y_pred=topic_pred, labels=None, pos_label=1, average='binary')
'''

print("Cannot test accuracy because of a lack of a labeled dataset")

Cannot test accuracy because of a lack of a labeled dataset


### Irony Detection
This is a binary classification task where given a tweet, the goal is to detect whether it is ironic or not. It is based on the Irony Detection dataset from the SemEval 2018 task.

Sample Irony Detection Demonstration

In [61]:
# single input
model = tweetnlp.load('irony')  # Or `model = tweetnlp.Irony()` 
model.irony('Wow I love walking to class when its snowing')  # Or `model.predict`

{'label': 'irony', 'probability': 0.9887392520904541}

Test for Accuracy

In [54]:
#Predict on the testing data
irony_pred = []
tweet_list = irony_txt[irony_txt.columns[0]].values.tolist()

for tweet in tweet_list:
  output = model.irony(tweet)
  label = output["label"]
  if label == "not-irony":
    label = 0
  else:
    label = 1
  irony_pred.append(label)


#Calculate F1 Measure
irony_F1 = sklearn.metrics.f1_score(y_true=irony_true, y_pred=irony_pred, labels=None, pos_label=1, average='binary')

### Offensive Language Identification
This task consists in identifying whether some form of offensive language is present in a tweet. For our benchmark we rely on the SemEval2019 OffensEval dataset.

Sample Offensive Language Identification Demonstration

In [64]:
# single input
model = tweetnlp.load('offensive')  # Or `model = tweetnlp.Offensive()` 
model.offensive("Ohio State fans are losers")  # Or `model.predict`

{'label': 'offensive', 'probability': 0.8413265347480774}

Test for Accuracy

In [57]:
#Predict on the testing data
offensive_pred = []
tweet_list = offensive_txt[offensive_txt.columns[0]].values.tolist()

for tweet in tweet_list:
  output = model.offensive(tweet)
  label = output["label"]
  if label == "not-offensive":
    label = 0
  else:
    label = 1
  offensive_pred.append(label)

offensive_F1 = sklearn.metrics.f1_score(y_true=offensive_true, y_pred=offensive_pred, labels=None, pos_label=1, average='binary')

### Emotion Recognition
Given a tweet, this task consists of associating it with its most appropriate emotion. As a reference dataset we use the SemEval 2018 task on Affect in Tweets, simplified to only four emotions used in TweetEval: anger, joy, sadness and optimism.

Sample Emotion Recognition Demonstration

In [75]:
# single input
model = tweetnlp.load('emotion')  # Or `model = tweetnlp.Emotion()` 
model.emotion('I cant wait to meet my new puppy')  # Or `model.predict`

{'label': 'joy', 'probability': 0.9166305661201477}

Test for Accuracy

In [59]:
#Predict on the testing data
emotions_pred = []
tweet_list = emotions_txt[emotions_txt.columns[0]].values.tolist()

for tweet in tweet_list:
  output = model.emotion(tweet)
  label = output["label"]
  if label == "anger":
    label = 0
  elif label == "joy":
    label = 1
  elif label == "optimism":
    label = 2
  else:
    label = 3
  emotions_pred.append(label)

emotions_F1 = sklearn.metrics.f1_score(y_true=emotions_true, y_pred=emotions_pred, labels=None, pos_label=1, average=None)


# Compare Results

In [73]:
print(f"Emotions: {sum(emotions_F1)/4}, \nIrony: {irony_F1}, \nOffensive: {offensive_F1}")

Emotions: 0.803827092044503, 
Irony: 0.5680365296803653, 
Offensive: 0.7239819004524886
