# SemEval-2018 Task 1: Affect in Tweets (AIT-2018)

## An emotion intensity ordinal classification task

*Joana Ferreira |
joanaferreira0011@gmail.com |
Faculty of Engineering, University of Porto, R. Dr. Roberto Frias, 4200-465 Porto, Portugal*

### Abstract:
This notebook presents a solution for SemEval-2018 Task 1: Affect in Tweets (AIT-2018). Given a tweet and an emotion (sadness, anger, fear or joy), the model should output an intensity class (0, 1, 2 or 3). Two preprocessing approaches were implemented: a knowledge-based one and one using word embeddings (BERT). Several classification models were then trained on each representation and compared.


### 1. Introduction
Given a tweet and an emotion (sadness, anger, fear or joy), the task is to classify the tweet into one of four classes according to the intensity of that emotion: 0 (no), 1 (low), 2 (moderate) or 3 (high). The task was solved for English only.

The task was proposed in (Mohammad et al., 2018).

### 2. Description of the dataset
There are four training and test sets of labeled data – one for each emotion. 

The data creation is described in (Mohammad and Kiritchenko, 2018).

In [None]:
import pandas as pd

# Load the joy dev and train files (tab-separated; the header row is skipped).
# A list is used instead of a set so the files are always read in the same order.
all_files = ['../datasets/2018-EI-oc-En-joy-dev.txt', '/home/joana/feup/iart/el-oc-tweets/datasets/EI-oc-En-joy-train.txt']

frames = []
for filename in all_files:
    df = pd.read_csv(filename, sep="\t", header=None, skiprows=1)
    frames.append(df)

# Merge both files into a single DataFrame and name the columns
dataset = pd.concat(frames, axis=0, ignore_index=True)
dataset.columns = ['date', 'text', 'emotion', 'level']

# Shuffle the rows
dataset = dataset.sample(frac=1)
print(dataset)
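
The last column stores the intensity class as text, and later cells keep only its leading digit. A minimal sketch of that conversion follows. The example label strings are an assumption about the file format (a digit, a colon, then a description), and `parse_level` is a hypothetical helper, not part of the notebook.

```python
def parse_level(label: str) -> int:
    """Return the ordinal class encoded at the start of a label string.

    Assumes labels look like '<digit>: <description>'; this is an
    assumption about the dataset files, not something verified here.
    """
    prefix, _, _ = label.partition(":")
    level = int(prefix.strip())
    if level not in (0, 1, 2, 3):
        raise ValueError(f"unexpected intensity class: {label!r}")
    return level


# Hypothetical label strings used only for illustration
examples = ["0: no joy can be inferred", "3: high amount of joy can be inferred"]
levels = [parse_level(text) for text in examples]
print(levels)
```

Validating the range up front makes a malformed row fail loudly instead of silently producing a fifth class.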


### 3. Approach 
Two alternatives were implemented for preprocessing: a knowledge-based approach (section 3.1.1) and one using word embeddings (BERT) (section 3.1.2). After preprocessing, the data from each approach was fed to the same models: Naive Bayes, SVM, Logistic Regression, SGD classifier, Multi-layer Perceptron classifier, Perceptron, Decision Tree and Random Forest.

#### 3.1.1. Preprocessing using a knowledge-based approach
In this approach, the following processes were implemented:
* **Tokenization**: tokens were created using the Python string split() method. Other tokenization methods, such as NLTK word_tokenize(), were tested, but no significant improvement was observed.
* **Lower case**: tokens were converted to lower case.
* **Stop words**: English stop words were removed using the NLTK stop-word list.
* **Lemmatization**: tokens were lemmatized using NLTK WordNetLemmatizer.
* **User**: all user mentions were replaced with '@user'.
* **Emojis**: all emojis were converted to keywords using the 'emoji' library.
* **TF-IDF**: scikit-learn's TfidfVectorizer was used to compute TF-IDF features.
* **Bag of Words (BoW)**: Bag of Words was tried first but eventually replaced by TF-IDF, which gave better overall accuracy.
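
As an illustration of the user-mention step, the sketch below uses only the standard `re` module. The sample tweet is made up, and the pattern `@\w+` is one reasonable choice rather than the only one: it matches a single handle, whereas a greedy pattern such as `@.*` would swallow the rest of the tweet.

```python
import re

MENTION = re.compile(r"@\w+")  # '@' followed by letters, digits or underscores


def mask_mentions(tweet: str) -> str:
    """Replace every user mention with the generic token '@user'."""
    return MENTION.sub("@user", tweet)


sample = "@john_doe what a great day with @mary!"  # made-up tweet
masked = mask_mentions(sample)
greedy = re.sub(r"@.*", "@user", sample)  # greedy variant, for comparison

print(masked)
print(greedy)
```

The greedy variant discards every word after the first mention, so the classifier would never see the actual content of such tweets.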


*Skip to section 3.1.2. and do not run the following 3 cells if you want to see the results from using word embeddings (BERT).*


In [None]:
import re
import nltk
from nltk.corpus import stopwords
import emoji
from nltk.stem import WordNetLemmatizer

# Requires the NLTK 'stopwords' and 'wordnet' corpora (see nltk.download)
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))  # built once, not once per token

corpus = []

for index, row in dataset.iterrows():
    tweet = row['text']

    # Replace each user mention with the generic token '@user'
    # ('@\w+' matches one handle; '@.*' would swallow the rest of the tweet)
    tweet = re.sub(r'@\w+', '@user', tweet)

    # Lower-case and tokenize on whitespace
    tokens = tweet.lower().split()
    #tokens = nltk.word_tokenize(tweet.lower())

    # Lemmatization and stop-word removal
    tweet = ' '.join(lemmatizer.lemmatize(w) for w in tokens if w not in stop_words)

    # Convert emojis to text keywords
    tweet = emoji.demojize(tweet)

    corpus.append(tweet)

print(corpus)


In [None]:
# TF-IDF
from sklearn.feature_extraction.text import TfidfVectorizer

# Learn the vocabulary and IDF weights from the corpus, then vectorize it
tfidf_vectorizer = TfidfVectorizer(use_idf=True)
tfidf_vectorizer_vectors = tfidf_vectorizer.fit_transform(corpus)

print(tfidf_vectorizer_vectors)
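
To make the weighting concrete, here is a pure-Python sketch of the computation, assuming scikit-learn's documented defaults (smoothed IDF, `idf = ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization of each row). The three documents are made up and `tfidf` is a hypothetical helper; it ignores the tokenization rules of `TfidfVectorizer` and simply splits on whitespace.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) with L2-normalized TF-IDF weights."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}

    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        raw = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0
        rows.append([v / norm for v in raw])
    return vocab, rows


docs = ["happy happy joy", "sad day", "happy day"]  # made-up corpus
vocab, rows = tfidf(docs)
for doc, row in zip(docs, rows):
    print(doc, [round(v, 3) for v in row])
```

In the second document, "sad" appears in fewer documents than "day", so it receives the larger weight even though both occur once.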

In [None]:
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np

# Bag-of-Words alternative (tried first, then replaced by TF-IDF)
vectorizer = CountVectorizer(max_features = 1500)
#X = vectorizer.fit_transform(corpus).toarray()

# Feature matrix: dense TF-IDF vectors
X = tfidf_vectorizer_vectors.toarray()

# Target vector: the intensity class is the first character of the 'level' column
y = dataset['level'].str[0].astype(int).to_numpy()

#print(tfidf_vectorizer.get_feature_names_out())
#print(type(X[0]), y.shape)

#### 3.1.2. Preprocessing using BERT
In this approach, word embeddings were used to represent the data. BERT was chosen, and the library *bert_embedding* was used because of its simplicity. The model was pre-trained on the following dataset: *book_corpus_wiki_en_cased*.

*Skip to section 3.2 and do not run the following 3 cells if you want to see the results from using the knowledge-based approach described in 3.1.1.*


In [None]:
from bert_embedding import BertEmbedding

# Raw tweet texts; no manual preprocessing is applied in this approach
corpus = dataset['text'].tolist()

# Pre-trained BERT model with 24 layers, 1024 hidden units and 16 attention heads
bert = BertEmbedding(model='bert_24_1024_16', dataset_name='book_corpus_wiki_en_cased')

# One (tokens, token embeddings) pair per tweet
results = bert(corpus)
print(results)

In [None]:
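import numpy as np

# Average the token embeddings of each tweet into one fixed-size vector
averaged = []
for sent in results:
    averaged.append(np.mean(sent[1], axis=0, dtype=np.float64))

# Feature matrix: one averaged embedding per tweet
X = np.array(averaged)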
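
The averaging step turns a variable number of token vectors into one vector per tweet. The sketch below shows the same arithmetic on toy 3-dimensional vectors, written without NumPy so that each step is visible; `mean_pool` is a hypothetical helper and the numbers are made up. `np.mean(..., axis=0)` computes the same result.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    if not token_vectors:
        raise ValueError("cannot pool an empty sentence")
    n_tokens = len(token_vectors)
    # zip(*...) iterates over dimensions, i.e. column-wise
    return [sum(dim) / n_tokens for dim in zip(*token_vectors)]


# Toy "embeddings" for a two-token tweet and a three-token tweet
tweet_a = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
tweet_b = [[0.0, 3.0, 3.0], [3.0, 0.0, 3.0], [3.0, 3.0, 0.0]]

pooled = [mean_pool(t) for t in (tweet_a, tweet_b)]
print(pooled)
```

Both tweets end up with vectors of the same length regardless of how many tokens they contain, which is what the classifiers in the next section require.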

In [None]:
import numpy as np

# Target vector: the intensity class is the first character of the 'level' column
y = dataset['level'].str[0].astype(int).to_numpy()

#print(type(X[0]), y.shape)

### 3.2. Classification
After preprocessing, the data was fed to the following models (all from scikit-learn):
* Naive Bayes
* Support Vector Machine (SVM)
* Logistic Regression
* SGD classifier
* Multi-layer Perceptron classifier
* Perceptron
* Decision Tree
* Random Forest

To evaluate and compare them, accuracy, precision, recall and F1 were measured, with the last three averaged over the classes using support weighting.
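
To show what the `average='weighted'` setting used in the cells below computes, here is a pure-Python sketch: each class gets its own precision, recall and F1, and these are averaged with weights proportional to the number of true samples of that class. `weighted_scores` is a hypothetical helper and the labels are made up; a class that is never predicted contributes a precision of 0.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) with support-weighted averaging."""
    n = len(y_true)
    support = Counter(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / n  # share of true samples belonging to this class
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return accuracy, precision, recall, f1


y_true = [0, 0, 1, 1, 2, 3]  # made-up intensity classes
y_pred = [0, 1, 1, 1, 2, 2]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

With this weighting, recall always equals accuracy, so the two numbers reported for each model carry the same information.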


In [None]:
# Split dataset into training and test sets

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state = 0)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

In [None]:
# Naive Bayes

from sklearn.naive_bayes import GaussianNB

from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score

classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# SVM

from sklearn.svm import SVC

classifier = SVC()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# Logistic Regression

from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# SGD classifier (linear model trained with stochastic gradient descent)

from sklearn.linear_model import SGDClassifier

classifier = SGDClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# Multi-layer Perceptron classifier

from sklearn.neural_network import MLPClassifier

classifier = MLPClassifier(random_state=1, max_iter=300)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# Perceptron
from sklearn.linear_model import Perceptron

classifier = Perceptron() 
classifier.fit(X_train, y_train) 
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred)) 
print('Accuracy: ', accuracy_score(y_test, y_pred)) 
print('Precision: ', precision_score(y_test, y_pred, average='weighted')) 
print('Recall: ', recall_score(y_test, y_pred, average='weighted')) 
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# Decision Tree

from sklearn.tree import DecisionTreeClassifier

classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))

In [None]:
# Random Forest

from sklearn.ensemble import RandomForestClassifier

classifier = RandomForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print('Accuracy: ', accuracy_score(y_test, y_pred))
print('Precision: ', precision_score(y_test, y_pred, average='weighted'))
print('Recall: ', recall_score(y_test, y_pred, average='weighted'))
print('F1: ', f1_score(y_test, y_pred, average='weighted'))
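
The cells above repeat the same fit, predict and score steps for each model. One way to reduce the repetition is a small loop over named estimators, sketched below. `evaluate_models` and `MajorityClassifier` are hypothetical names introduced for illustration; the stand-in classifier only mimics the `fit`/`predict` interface so that the sketch runs without scikit-learn, and any of the estimators above could be passed instead.

```python
from collections import Counter


class MajorityClassifier:
    """Toy baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its accuracy on the test split."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        correct = sum(int(p == t) for p, t in zip(y_pred, y_test))
        scores[name] = correct / len(y_test)
    return scores


# Made-up data; the features are irrelevant for the majority baseline
toy_X_train, toy_y_train = [[0], [1], [2], [3]], [1, 1, 2, 0]
toy_X_test, toy_y_test = [[4], [5]], [1, 3]

scores = evaluate_models({"majority": MajorityClassifier()},
                         toy_X_train, toy_y_train, toy_X_test, toy_y_test)
print(scores)
```

A majority-class baseline of this kind also gives a reference point: a model whose accuracy is close to it has learned little beyond the class distribution.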

In [None]:
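# Test here with your own input
# (uses the TF-IDF vectorizer from section 3.1.1, so run that path first;
#  'classifier' is whichever model was fitted last)

tweet = input("Enter tweet: ")

# Apply the same preprocessing as in section 3.1.1
tweet = re.sub(r'@\w+', '@user', tweet)
tokens = tweet.lower().split()
english_stop_words = set(stopwords.words('english'))
tweet = ' '.join(lemmatizer.lemmatize(w) for w in tokens if w not in english_stop_words)
tweet = emoji.demojize(tweet)

# Reuse the fitted vectorizer so the features match the training vocabulary
X_new = tfidf_vectorizer.transform([tweet]).toarray()

print(X_new.shape)
print(X_new)

print("Intensity level: ", classifier.predict(X_new))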

### 4. Experimental evaluation

#### 4.1 Sadness

| Stretch/Untouched | ProbDistribution | Accuracy |
| --- | --- | --- |
| Stretched | Gaussian | .843 |


### References

Priban, P., Hercig, T., & Lenc, L. (2018). UWB at SemEval-2018 Task 1: Emotion Intensity Detection in Tweets. pp. 133-140. doi:10.18653/v1/S18-1018.