# Sentiment Analysis - Multilayer Perceptron

The Multilayer Perceptron (**MLP**) is one of the most basic building blocks of neural networks. While a simple perceptron takes a data vector as input and computes a single output value, an MLP groups many perceptrons together, so the output of a single layer is a new vector instead of a single value.

In PyTorch, this is done simply by setting the number of output features in the <code>Linear</code> layer. Additionally, an MLP stacks multiple layers, with a nonlinearity between each pair of layers.
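The two ideas above (a `Linear` layer with several output features, and a nonlinearity between layers) can be sketched as follows. This is a minimal illustration, not the actual `SentimentClassifierMLP` from `ModelMLP`: the class name `TwoLayerMLP`, the choice of ReLU, and the second layer `fc2` are assumptions. Only the constructor arguments, the `fc1` attribute and the `apply_softmax` flag mirror how the model is used later in this notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoLayerMLP(nn.Module):
    """Hypothetical two-layer MLP; the real SentimentClassifierMLP may differ."""

    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        # each Linear layer acts as a group of perceptrons:
        # it maps one input vector to a vector with out_features values
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x_in, apply_softmax=False):
        # nonlinearity between the two linear layers (ReLU is an assumption)
        hidden = F.relu(self.fc1(x_in))
        output = self.fc2(hidden)
        if apply_softmax:
            # turn the raw scores into probabilities over the classes
            output = F.softmax(output, dim=1)
        return output


sketch_model = TwoLayerMLP(input_dim=10, hidden_dim=4, output_dim=2)
batch = torch.rand(3, 10)  # 3 examples with 10 features each
logits = sketch_model(batch)
probabilities = sketch_model(batch, apply_softmax=True)
print(logits.shape)
```

Each of the 3 input rows is mapped to a vector of 2 scores, one per class, and with `apply_softmax=True` every row sums to 1.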

<img src="files/mlp.png" width="600" height="300" align="center"/>

For the **Sentiment Analysis task** we are solving, everything except the model itself stays the same as in the previous example: building the dataset, the vectorizer, the vocabulary, the data loader, the training loop and the evaluation. As in the previous example, we use one-hot encoding to represent the text of the tweet.
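As a reminder of what the one-hot representation of a tweet looks like, here is a small sketch. The function `one_hot_vectorize` and the toy vocabulary are made up for illustration. The actual `TwitterVectorizer` may tokenize differently and may handle unknown tokens in another way.

```python
def one_hot_vectorize(text, token_to_index):
    """Return a binary vector with 1.0 at the index of every known token in the text."""
    vector = [0.0] * len(token_to_index)
    for token in text.lower().split():
        index = token_to_index.get(token)
        if index is not None:  # tokens outside the vocabulary are skipped in this sketch
            vector[index] = 1.0
    return vector


# hypothetical vocabulary; the real one is built from the training tweets
toy_vocabulary = {"this": 0, "is": 1, "a": 2, "good": 3, "day": 4, "sad": 5}
encoded = one_hot_vectorize("this is a good day", toy_vocabulary)
print(encoded)  # [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]
```

The length of the vector equals the size of the vocabulary, which is why the model's `input_dim` is set from the text vocabulary.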

## Setup

First, set up the path to the (preprocessed) dataset:

In [1]:
# Path to the preprocessed data
import os

# the string '__file__' (not the __file__ variable, which notebooks do not define)
# resolves relative to the current working directory
fileDir = os.path.dirname(os.path.realpath('__file__'))
absFilePathToPreprocessedDataset = os.path.join(fileDir, '../Data/training.1600000.processed.noemoticon_preprocessed.csv')
pathToPreprocessedDataset = os.path.abspath(os.path.realpath(absFilePathToPreprocessedDataset))
print(pathToPreprocessedDataset)

c:\Users\v-tastan\source\repos\PetnicaNLPWorkshop\Data\training.1600000.processed.noemoticon_preprocessed.csv


Choose the device to run the training on:

In [2]:
device = "cpu"

Set the learning rate parameter:

In [3]:
learningRate = 0.001

Set the size of the hidden layer for the MLP model:

In [5]:
hidden_dim = 100

## Initialization

In [4]:
import torch.nn as nn
import torch.optim as optim
from TwitterDataset import TwitterDataset
from ModelMLP import SentimentClassifierMLP

# Step #1: Instantiate the dataset and get its vectorizer
dataset = TwitterDataset.load_dataset_and_make_vectorizer(pathToPreprocessedDataset)
vectorizer = dataset.get_vectorizer()

# Step #2: Instantiate the model
# the input size is the size of the text vocabulary, the output size is the number of target classes
model = SentimentClassifierMLP(input_dim=len(vectorizer.text_vocabulary), hidden_dim=hidden_dim, output_dim=len(vectorizer.target_vocabulary))
# send the model to the chosen device
model = model.to(device)

# Step #3: Instantiate the loss function
loss_func = nn.CrossEntropyLoss()

# Step #4: Instantiate the optimizer
optimizer = optim.Adam(model.parameters(), lr=learningRate)

## Training Loop

In [6]:
from Trainer import Trainer

sentiment_analysis_trainer = Trainer(
    dataset=dataset,
    model=model,
    loss_func=loss_func,
    optimizer=optimizer
)

In [7]:
# set the number of epochs
num_epochs = 200
# set the batch size
batch_size = 16

report = sentiment_analysis_trainer.train(num_epochs=num_epochs, batch_size=batch_size, device=device)
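The `Trainer` class is imported from a separate module, so its internals are not shown in this notebook. As a rough sketch of what a training run typically involves in PyTorch, here is a loop on made-up data. Everything in it (the toy data, the `nn.Sequential` model, the full-batch updates) is an assumption for illustration. The real `Trainer.train` presumably iterates over mini-batches of size `batch_size` and may differ in other ways.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# toy data: 32 examples with 10 features; the label depends on the first feature
features = torch.rand(32, 10)
targets = (features[:, 0] > 0.5).long()

toy_model = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 2))
toy_loss_func = nn.CrossEntropyLoss()
toy_optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(100):
    toy_optimizer.zero_grad()                    # clear gradients from the previous step
    predictions = toy_model(features)            # forward pass, produces raw scores
    loss = toy_loss_func(predictions, targets)   # compare the scores with the labels
    loss.backward()                              # backpropagate the loss
    toy_optimizer.step()                         # update the weights
    epoch_losses.append(loss.item())

print("first loss: {:.3f}, last loss: {:.3f}".format(epoch_losses[0], epoch_losses[-1]))
```

Note that `nn.CrossEntropyLoss` expects raw scores rather than probabilities, so no softmax is applied during training.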

## Evaluate the results

In [8]:
def evaluate(split):
    loss, accuracy = sentiment_analysis_trainer.evaluate(split=split, device=device, batch_size=batch_size)

    print("Loss: {:.3f}".format(loss))
    print("Accuracy: {:.3f}".format(accuracy))

#### Training Set

In [9]:
evaluate(split="train")

Loss: 0.030
Accuracy: 0.985


#### Validation Set

In [10]:
evaluate(split="validation")

Loss: 3.601
Accuracy: 0.635


#### Test Set

In [11]:
evaluate(split="test")

Loss: 2.807
Accuracy: 0.562


### More detailed evaluation on the Test Set

In [17]:
from sklearn.metrics import classification_report, confusion_matrix

# run the model on the tweets from the test set
# (predict() is defined in the "Inference and classifying new data points" section below)
y_predicted = dataset.test_df.text.apply(lambda x: predict(text=x, model=model, vectorizer=vectorizer)[0])

# compare the predictions with the labels
print(classification_report(y_true=dataset.test_df.target, y_pred=y_predicted))

# print the confusion matrix
print("Confusion matrix:")
print(confusion_matrix(y_true=dataset.test_df.target, y_pred=y_predicted))

              precision    recall  f1-score   support

         0.0       0.57      0.62      0.59        47
         1.0       0.63      0.58      0.61        53

    accuracy                           0.60       100
   macro avg       0.60      0.60      0.60       100
weighted avg       0.60      0.60      0.60       100

Confusion matrix:
[[29 18]
 [22 31]]
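The numbers in the classification report can be derived directly from the confusion matrix printed above. The sketch below recomputes them by hand, using the scikit-learn convention that rows are true labels and columns are predicted labels. The helper `per_class_metrics` is made up for this illustration.

```python
# confusion matrix from the output above: rows are true labels, columns are predicted labels
matrix = [[29, 18],
          [22, 31]]


def per_class_metrics(matrix, cls):
    """Compute precision, recall and F1 for one class from a confusion matrix."""
    true_positive = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)  # column sum
    actually_cls = sum(matrix[cls])                     # row sum (the support)
    precision = true_positive / predicted_as_cls
    recall = true_positive / actually_cls
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


for cls in range(2):
    precision, recall, f1 = per_class_metrics(matrix, cls)
    print("class {}: precision={:.2f} recall={:.2f} f1={:.2f}".format(cls, precision, recall, f1))

accuracy = sum(matrix[i][i] for i in range(2)) / sum(map(sum, matrix))
print("accuracy={:.2f}".format(accuracy))
```

For class 0, for example, precision is 29 / (29 + 22) and recall is 29 / (29 + 18), which round to the 0.57 and 0.62 shown in the report.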


## Inference and classifying new data points

Let's run inference on new data. This is another way to evaluate the model: a qualitative check of whether it behaves sensibly.

In [13]:
import torch

def predict(text, model, vectorizer):
    """
    Predict the sentiment of the tweet

    Args:
        text (str): the text of the tweet
        model (SentimentClassifierMLP): the trained model
        vectorizer (TwitterVectorizer): the corresponding vectorizer
    Returns:
        sentiment of the tweet (int), probability of that prediction (float)
    """
    # vectorize the text of the tweet
    vectorized_text = vectorizer.vectorize(text)

    # make a tensor with the expected size (1, number of input features)
    vectorized_text = torch.Tensor(vectorized_text).view(1, -1)

    # run the model on the vectorized text and apply the softmax activation function to the outputs
    result = model(vectorized_text, apply_softmax=True)

    # find the best class as the one with the highest probability
    probability_values, indices = result.max(dim=1)

    # extract the plain Python value from the indices tensor
    index = indices.item()

    # decode the predicted target index into the sentiment, using the target vocabulary
    predicted_target = vectorizer.target_vocabulary.find_index(index)

    # extract the plain Python value from the probability_values tensor
    probability_value = probability_values.item()

    return predicted_target, probability_value
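The core of `predict` is the softmax followed by picking the most probable class. The sketch below shows those two steps in plain Python on made-up scores. The values in `raw_scores` are hypothetical and do not come from the trained model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # subtracting the largest score keeps the exponentials numerically stable
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


raw_scores = [-1.2, 2.3]  # hypothetical model outputs for classes 0 and 1
probabilities = softmax(raw_scores)

# the predicted class is the index of the highest probability
best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
print(best_index, probabilities[best_index])
```

The larger the gap between the raw scores, the closer the winning probability gets to 1.0, which is why a confident model can report values such as 0.99999.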

Let's try the model on some examples:

In [14]:
text = "This is a good day."

predict(text, model, vectorizer)

(1.0, 0.9999960660934448)

In [15]:
text = "I was very sad yesterday."

predict(text, model, vectorizer)

(0.0, 1.0)

In [16]:
text = "This is a book."

predict(text, model, vectorizer)

(0.0, 0.9160420894622803)

## Inspecting model weights

In [18]:
# row 0 of fc1.weight: the weights connecting every vocabulary token to the first hidden unit
fc1_weights = model.fc1.weight.detach()[0]

fc1_weights

tensor([-0.1231,  0.2401,  0.1812,  0.1450,  0.2999, -0.0496,  0.0606, -0.1524,
        -0.0403, -0.2370, -0.1232, -0.2028, -0.1009,  0.1240,  0.1100,  0.1449,
        -0.1224,  0.1463,  0.0079,  0.1855,  0.1215, -0.0800, -0.1753, -0.1551,
         0.0904, -0.1401, -0.1256,  0.1441, -0.3846,  0.0922, -0.1228,  0.0410,
        -0.1010,  0.2857,  0.1378, -0.0380,  0.0135,  0.2073,  0.1644,  0.3477,
         0.0840,  0.3954, -0.0077,  0.1322, -0.0057,  0.0275, -0.2876, -0.0634,
        -0.1748,  0.1780, -0.0715,  0.1252,  0.0127, -0.1594, -0.0555,  0.1680,
         0.1182, -0.1301,  0.2109,  0.0600,  0.2014, -0.2095,  0.0212, -0.1024,
        -0.2291,  0.2514, -0.0104,  0.1280,  0.3569,  0.0088, -0.0157,  0.3473,
        -0.1694,  0.2535, -0.2317,  0.5992,  0.0851, -0.0855, -0.4178,  0.4111,
         0.2885,  0.1157,  0.1052,  0.1233,  0.1112, -0.1665, -0.0203, -0.1434,
         0.0453, -0.2255,  0.0025, -0.0549, -0.1113,  0.1339, -0.0691,  0.0537,
        -0.2077,  0.1635,  0.2467,  0.40

In [19]:
import torch

_, indices = torch.sort(fc1_weights, dim=0, descending=True)

indices = indices.numpy().tolist()

#### Top 20 most influential words in Negative Tweets

In [20]:
for i in range(20):
    print(vectorizer.text_vocabulary.find_index(indices[i]))

as
sad
little
his
again
can't
still
please
didn't
over
hope
do
off
something
with
am
never
she
why
lol


#### Top 20 most influential words in Positive Tweets

In [21]:
indices.reverse()

for i in range(20):
    print(vectorizer.text_vocabulary.find_index(indices[i]))

about
back
u
i've
you're
should
can
more
love
night
say
need
sure
watching
..
but
big
better
feel
yeah
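The ranking behind the two word lists above (sort the weights in descending order, read the top of the list, then reverse it and read the top again) can be illustrated on a toy example. The tokens and weights below are made up and have no relation to the trained model.

```python
toy_tokens = ["sad", "love", "book", "never", "better"]
toy_weights = [0.9, -0.7, 0.05, 0.4, -0.5]  # made-up values

# indices ordered from the largest weight to the smallest
order = sorted(range(len(toy_weights)), key=lambda i: toy_weights[i], reverse=True)

top_k = 2
largest = [toy_tokens[i] for i in order[:top_k]]
smallest = [toy_tokens[i] for i in order[::-1][:top_k]]

print(largest)   # ['sad', 'never']
print(smallest)  # ['love', 'better']
```

Tokens with weights near zero, such as `book` here, end up in the middle of the ordering and appear in neither list.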
