# Sentiment Analysis Project

This project performs five-class sentiment analysis on the Stanford Sentiment Treebank (SST): https://nlp.stanford.edu/sentiment/treebank.html

Here is what it uses:

- pytreebank to load and preprocess the dataset
- GloVe embeddings
- google_trans_new (Google Translate) for data augmentation
- Three data augmentation strategies: back translation, random deletion, and random swap
- A two-layer LSTM
- Dropout

## Dataset Preview

This is a first step into deep learning for NLP. We will mostly use PyTorch. Just as torchvision does for images, PyTorch provides an official library, torchtext, for handling text-processing pipelines. 

We will use the sentences of the Stanford Sentiment Treebank. The cells below install the required packages, define the augmentation helpers, and then load and preview the dataset.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!pip install google_trans_new

Collecting google_trans_new
  Downloading https://files.pythonhosted.org/packages/f9/7b/9f136106dc5824dc98185c97991d3cd9b53e70a197154dd49f7b899128f6/google_trans_new-1.1.9-py3-none-any.whl
Installing collected packages: google-trans-new
Successfully installed google-trans-new-1.1.9


## Define functions for data augmentation

In [3]:
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import stopwords
import pandas as pd
import string
import nltk
import random
import google_trans_new
from google_trans_new import google_translator

nltk.download('stopwords')
nltk.download('wordnet')
lem = WordNetLemmatizer()
translator = google_translator()

def clean_text(text):
    ## non-string values (e.g. NaN) are returned as plain strings, uncleaned
    if not isinstance(text, str):
        return str(text)

    ## lower case
    cleaned = text.lower()
    
    ## remove punctuation
    punctuations = string.punctuation
    cleaned_temp = "".join(character for character in cleaned if character not in punctuations)
    
    ## split into words (stopword removal is currently disabled)
    words = cleaned_temp.split()
    #stopword_lists = stopwords.words("english")
    #cleaned = [word for word in words if word not in stopword_lists]
    cleaned = words
    
    ## normalization - lemmatization (as verbs first, then as nouns)
    cleaned = [lem.lemmatize(word, "v") for word in cleaned]
    cleaned = [lem.lemmatize(word, "n") for word in cleaned]
    
    ## join 
    cleaned = " ".join(cleaned)
    return cleaned

def back_translate(sentence):
  
  available_langs = list(google_trans_new.LANGUAGES.keys()) 
  trans_lang = random.choice(available_langs) 
  #print(f"Translating to {google_trans_new.LANGUAGES[trans_lang]}")
  translations = translator.translate(sentence, lang_tgt=trans_lang) 
  #print(translations)

  translations_en_random = translator.translate(translations, lang_src=trans_lang, lang_tgt='en') 
  # print(translations_en_random)
  return translations_en_random

def random_deletion(sentence, p=0.5): 
    words = sentence.split()
    if len(words) <= 1: # nothing to delete from an empty or single-word sentence
        return sentence
    # keep each word with probability 1 - p
    remaining = [word for word in words if random.uniform(0, 1) > p]
    if len(remaining) == 0: # if nothing is left, sample a random word
        remaining = [random.choice(words)]
    return " ".join(remaining)

def random_swap(sentence, n=5):
    words = sentence.split()
    if len(words) < 2: # need at least two words to swap
        return sentence
    positions = range(len(words))
    for _ in range(n):
        idx1, idx2 = random.sample(positions, 2)
        words[idx1], words[idx2] = words[idx2], words[idx1] 
    return " ".join(words)



[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


In [4]:
!pip install pytreebank

Collecting pytreebank
  Downloading https://files.pythonhosted.org/packages/e0/12/626ead6f6c0a0a9617396796b965961e9dfa5e78b36c17a81ea4c43554b1/pytreebank-0.2.7.tar.gz
Building wheels for collected packages: pytreebank
  Building wheel for pytreebank (setup.py) ... done
  Created wheel for pytreebank: filename=pytreebank-0.2.7-cp36-none-any.whl size=37070 sha256=9be60a3eeda2140e9547306250bc0024b6032a7a6e5f623f6a934d8661f484a3
  Stored in directory: /root/.cache/pip/wheels/e0/b6/91/e9edcdbf464f623628d5c3aa9de28888c726e270b9a29f2368
Successfully built pytreebank
Installing collected packages: pytreebank
Successfully installed pytreebank-0.2.7


In [5]:
import pytreebank
import sys 
import os

out_path = 'sst_{}.txt'
dataset = pytreebank.load_sst()
for category in ['train', 'test', 'dev']:
  with open(out_path.format(category),'w' ) as outfile:
    for item in dataset[category]:
      outfile.write("__label__{}\t{}\n".format(item.to_labeled_lines()[0][0] + 1, item.to_labeled_lines()[0][1] ))


print(len(dataset['train']))

8544


In [6]:
import pandas as pd

train_df = pd.read_csv('sst_train.txt', sep='\t', header=None, names=['label', 'text'])
train_df['label'] = train_df['label'].str.replace('__label__', '')
train_df['label'] = train_df['label'].astype('int').astype('category')
train_df.head()

| | label | text |
|---|---|---|
| 0 | 4 | The Rock is destined to be the 21st Century 's... |
| 1 | 5 | The gorgeously elaborate continuation of `` Th... |
| 2 | 4 | Singer/composer Bryan Adams contributes a slew... |
| 3 | 3 | You 'd think by now America would have had eno... |
| 4 | 4 | Yet the act is still charming here . |


## Augmentation Strategies

### Random Deletion

As the name suggests, random deletion deletes words from a sentence. Given a probability parameter p, it goes through the sentence and drops each word independently with probability p. Think of it as the text counterpart of pixel dropout for images.
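
A minimal sketch of the idea, written with a seeded `random.Random` so that a run can be repeated. The helper name `delete_words` and the sample sentence are illustrative assumptions, not part of the notebook:

```python
import random

def delete_words(sentence, p=0.5, rng=None):
    """Drop each word independently with probability p (illustrative helper)."""
    rng = rng or random.Random()
    words = sentence.split()
    if len(words) <= 1:
        return sentence  # nothing sensible to delete
    kept = [w for w in words if rng.random() > p]
    # never return an empty sentence: fall back to one random word
    return " ".join(kept) if kept else rng.choice(words)

sentence = "yet the act is still charming here"
augmented = delete_words(sentence, p=0.5, rng=random.Random(0))
print(augmented)
```

Whatever the seed, the result keeps the surviving words in their original order and always contains at least one word.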

### Random Swap
The random swap augmentation takes a sentence and swaps two words within it n times, with each iteration working on the result of the previous swap. Each swap samples two distinct positions from the sentence and exchanges the words at those positions.
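
The sketch below illustrates the same idea with a seeded generator. `swap_words` is a hypothetical helper name and the sentence is made up for the example:

```python
import random
from collections import Counter

def swap_words(sentence, n=5, rng=None):
    """Swap two randomly chosen positions, n times in a row (illustrative helper)."""
    rng = rng or random.Random()
    words = sentence.split()
    if len(words) < 2:
        return sentence  # need at least two positions to swap
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

original = "a warm funny engaging film"
swapped = swap_words(original, n=5, rng=random.Random(0))
print(swapped)
# The words are only reordered, never added or removed
print(Counter(swapped.split()) == Counter(original.split()))
```

Because only the order changes, the bag of words (and usually the sentiment) is preserved while the LSTM sees a different sequence.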

### Back Translation

Another popular approach for augmenting text datasets is back translation. This involves translating a sentence from its original language into another language and then translating the result back. The round trip usually yields a paraphrase of the original sentence. Here we use the Python library google_trans_new for this purpose, with a randomly chosen intermediate language for each sentence.
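
To show the round trip without calling a translation service, here is a small offline sketch. The `translate(text, src, tgt)` signature, the helper `back_translate_with`, and the toy word table are all assumptions made for illustration; a real run would plug in an actual translator:

```python
import random

def back_translate_with(translate, sentence, languages, rng=None):
    """Round-trip a sentence through one randomly chosen pivot language.

    `translate(text, src, tgt)` is a caller-supplied function with a
    hypothetical signature; a real one would call a translation service.
    """
    rng = rng or random.Random()
    pivot = rng.choice(languages)
    pivot_text = translate(sentence, "en", pivot)
    return translate(pivot_text, pivot, "en")

# Toy stand-in for a translation service, just enough to run offline.
TOY_TABLE = {
    ("en", "fr"): {"good": "bon", "film": "film"},
    ("fr", "en"): {"bon": "nice", "film": "movie"},
}

def toy_translate(text, src, tgt):
    table = TOY_TABLE[(src, tgt)]
    return " ".join(table.get(word, word) for word in text.split())

print(back_translate_with(toy_translate, "a good film", ["fr"]))  # a nice movie
```

The point of the toy table is that the way back need not mirror the way out, which is where the paraphrase comes from.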

In [7]:
# Collect rows in a list and build the DataFrame once;
# DataFrame.append is slow inside a loop and was removed in pandas 2.0.
augmented_rows = []
for index, row in train_df.iterrows():
  # one random-deletion and one random-swap variant for every sentence
  augmented_rows.append({'label': row['label'], 'text': clean_text(random_deletion(row['text']))})
  augmented_rows.append({'label': row['label'], 'text': clean_text(random_swap(row['text']))})
  # back translation is applied to roughly 20% of the sentences
  if random.random() <= 0.2:
    augmented_rows.append({'label': row['label'], 'text': clean_text(back_translate(row['text']))})

train_augment_df = pd.DataFrame(augmented_rows, columns=['label', 'text'])


In [8]:
train_augment_df.head()

| | label | text |
|---|---|---|
| 0 | 4 | rock destine 21st century s that splash arnold... |
| 1 | 4 | be rock be destine to the the 21st s new conan... |
| 2 | 5 | the gorgeously of the ring trilogy huge that a... |
| 3 | 5 | the gorgeously elaborate continuation of be th... |
| 4 | 5 | the gorgeous expansion of the lord of the ring... |


In [9]:
train_df["text"] = train_df["text"].apply(lambda x : clean_text(x))

In [10]:
train_df_concat = pd.concat([train_df, train_augment_df],axis=0)

In [11]:
train_df_concat.head()

| | label | text |
|---|---|---|
| 0 | 4 | the rock be destine to be the 21st century s n... |
| 1 | 5 | the gorgeously elaborate continuation of the l... |
| 2 | 4 | singercomposer bryan adam contribute a slew of... |
| 3 | 3 | you d think by now america would have have eno... |
| 4 | 4 | yet the act be still charm here |


In [12]:
train_df = train_df_concat

In [13]:
train_df.shape

(27381, 2)

In [14]:
train_df.label.value_counts()

4    7469
2    7109
3    5180
5    4126
1    3497
Name: label, dtype: int64

In [15]:
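dev_df = pd.read_csv('sst_dev.txt', sep='\t', header=None, names=['label', 'text'])
dev_df['label'] = dev_df['label'].str.replace('__label__', '')
dev_df['label'] = dev_df['label'].astype('int').astype('category')
# clean the text the same way as the training set (the preview shows cleaned text)
dev_df['text'] = dev_df['text'].apply(clean_text)
dev_df.head()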

| | label | text |
|---|---|---|
| 0 | 4 | it s a lovely film with lovely performance by ... |
| 1 | 3 | no one go unindicted here which be probably fo... |
| 2 | 4 | and if you re not nearly move to tear by a cou... |
| 3 | 5 | a warm funny engage film |
| 4 | 5 | use sharp humor and insight into human nature ... |


In [16]:
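test_df = pd.read_csv('sst_test.txt', sep='\t', header=None, names=['label', 'text'])
test_df['label'] = test_df['label'].str.replace('__label__', '')
test_df['label'] = test_df['label'].astype('int').astype('category')
# clean the text the same way as the training set (the preview shows cleaned text)
test_df['text'] = test_df['text'].apply(clean_text)
test_df.head()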

| | label | text |
|---|---|---|
| 0 | 3 | effective but tootepid biopic |
| 1 | 4 | if you sometimes like to go to the movie to ha... |
| 2 | 5 | emerge a something rare an issue movie that s ... |
| 3 | 3 | the film provide some great insight into the n... |
| 4 | 5 | offer that rare combination of entertainment a... |


## Defining Fields

Next we define the two torchtext fields. `Label` is a LabelField, a subclass of Field that sets sequential to False (it holds our sentiment class). `Text` is a standard Field that uses the spaCy tokenizer. The text has already been lowercased by `clean_text`, and `include_lengths=True` makes each batch carry the sentence lengths alongside the token ids.

In [17]:
# Import Library
import random
import torch, torchtext
from torchtext import data 

# Manual Seed
SEED = 43
torch.manual_seed(SEED)

<torch._C.Generator at 0x7f6732a279d8>

In [18]:
Text = data.Field(sequential = True, tokenize = 'spacy', batch_first =True, include_lengths=True)
Label = data.LabelField(tokenize ='spacy', is_target=True, batch_first =True, sequential =False)

Having defined those fields, we now pair each one with a column name. The order of this list must match the order of the values passed for each example:

In [20]:
fields = [('text', Text),('label',Label)]

In [22]:
train_df.shape[0]

27381

Armed with our declared fields, let's convert the pandas DataFrames into lists of torchtext examples. We could also use TabularDataset to apply the field definitions to a CSV file directly; the conversion below is an alternative approach.

In [53]:
train_field_value_list = list(zip(train_df['text'], train_df['label']))

In [54]:
train_example = [data.Example.fromlist([text_label[0], text_label[1]], fields ) for text_label in train_field_value_list if len(text_label[0]) > 0 ] 

In [55]:
print(train_example)

[<torchtext.data.example.Example object at 0x7f667e59a7b8>, <torchtext.data.example.Example object at 0x7f667e59a048>, <torchtext.data.example.Example object at 0x7f667e59aac8>, ...

In [56]:
dev_example = [data.Example.fromlist([dev_df.text[i],dev_df.label[i]], fields) for i in range(dev_df.shape[0])] 
test_example = [data.Example.fromlist([test_df.text[i],test_df.label[i]], fields) for i in range(test_df.shape[0])] 

In [57]:
# Creating dataset
#twitterDataset = data.TabularDataset(path="tweets.csv", format="CSV", fields=fields, skip_header=True)
train = data.Dataset(train_example, fields)
valid = data.Dataset(dev_example, fields)
test = data.Dataset(test_example, fields)

In [58]:
(len(train), len(valid), len(test))

(27314, 1101, 2210)

An example from the dataset:

In [59]:
vars(train.examples[10])

{'label': 5,
 'text': ['good',
  'fun',
  'good',
  'action',
  'good',
  'act',
  'good',
  'dialogue',
  'good',
  'pace',
  'good',
  'cinematography']}

## Building Vocabulary

At this point we would normally have to build a one-hot encoding of each word that is present in the dataset, which is a rather tedious process. Thankfully, torchtext will do this for us, and will also allow a max_size parameter to be passed in to limit the vocabulary to the most common words. This is normally done to prevent the construction of a huge, memory-hungry model. We don't want our GPUs too overwhelmed, after all. 

Let's limit the vocabulary to a maximum of 25,000 words from our training set and attach the pretrained 300-dimensional GloVe vectors (glove.840B.300d):


In [60]:
MAX_VOCAB_SIZE = 25_000

Text.build_vocab(train, 
                 max_size = MAX_VOCAB_SIZE, 
                 #vectors = "glove.6B.100d", 
                 vectors = "glove.840B.300d",
                 unk_init = torch.Tensor.normal_)

#Text.build_vocab(train)
Label.build_vocab(train)

By default, torchtext will add two more special tokens, <unk> for unknown words and <pad>, a padding token that will be used to pad all our text to roughly the same size to help with efficient batching on the GPU.
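
As a rough picture of what the vocabulary does, the following sketch counts tokens, keeps the most frequent ones, and reserves the first two ids for the special tokens. `build_vocab` and `encode` are hypothetical helpers for illustration; torchtext's own implementation differs in details such as tie-breaking:

```python
from collections import Counter

def build_vocab(tokenized_texts, max_size=None, specials=("<unk>", "<pad>")):
    """Map the most frequent tokens to integer ids, special tokens first."""
    counts = Counter(tok for text in tokenized_texts for tok in text)
    itos = list(specials) + [tok for tok, _ in counts.most_common(max_size)]
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return stoi, itos

texts = [["a", "warm", "funny", "film"], ["a", "funny", "film"], ["a", "film"], ["a"]]
stoi, itos = build_vocab(texts, max_size=3)

def encode(tokens):
    # unseen or pruned tokens fall back to the <unk> id
    return [stoi.get(tok, stoi["<unk>"]) for tok in tokens]

print(itos)                           # ['<unk>', '<pad>', 'a', 'film', 'funny']
print(encode(["a", "warm", "film"]))  # [2, 0, 3]
```

Here "warm" is pruned by `max_size=3`, so it is encoded with the `<unk>` id, just as rare words would be in the real vocabulary.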

In [61]:
print('Size of input vocab : ', len(Text.vocab))
print('Size of label vocab : ', len(Label.vocab))
print('Top 10 most frequent words :', list(Text.vocab.freqs.most_common(10)))
print('Labels : ', Label.vocab.stoi)

Size of input vocab :  13982
Size of label vocab :  5
Top 10 most frequent words : [('the', 19893), ('a', 17733), ('and', 12142), ('be', 11999), ('of', 11945), ('it', 9051), ('to', 8135), ('s', 6386), ('that', 5251), ('in', 5126)]
Labels :  defaultdict(<function _default_unk_index at 0x7f66d67d90d0>, {4: 0, 2: 1, 3: 2, 5: 3, 1: 4})


**Lots of stopwords!!**
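
Note that the label vocabulary printed above does not map label 1 to index 0. The sketch below assumes the label vocabulary is ordered by class frequency, which is consistent with the printed `Label.vocab.stoi`; the counts are copied from the `value_counts()` output earlier in this README:

```python
from collections import Counter

# Class counts copied from the value_counts() output shown earlier
label_counts = Counter({4: 7469, 2: 7109, 3: 5180, 5: 4126, 1: 3497})

# Most frequent label gets index 0, the next one index 1, and so on
itos = [label for label, _ in label_counts.most_common()]
stoi = {label: idx for idx, label in enumerate(itos)}
print(stoi)  # {4: 0, 2: 1, 3: 2, 5: 3, 1: 4}

predicted_index = 1           # e.g. the argmax of the model output
print(itos[predicted_index])  # 2
```

In other words, a predicted index has to be passed through `itos` before it can be read as a 1 to 5 sentiment label.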

Now we need to create a data loader to feed into our training loop. torchtext provides the BucketIterator class, which batches sentences of similar length together to minimise padding and yields what it calls a Batch, which is almost, but not quite, like the batches from the data loader we used on images.

First, declare the device we are using.

In [62]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [63]:
train_iterator, valid_iterator = data.BucketIterator.splits((train, valid), batch_size = 32, 
                                                            sort_key = lambda x: len(x.text),
                                                            sort_within_batch=True, device = device)
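
The effect of bucketing can be sketched in plain Python. This is a simplified illustration under stated assumptions (no shuffling, a hypothetical `bucket_batches` helper), not how torchtext implements it:

```python
def bucket_batches(examples, batch_size, pad="<pad>"):
    """Group token lists of similar length and pad each batch to its longest item."""
    ordered = sorted(examples, key=len)  # neighbours now have similar lengths
    batches = []
    for start in range(0, len(ordered), batch_size):
        # longest first inside the batch, as sort_within_batch=True does
        chunk = sorted(ordered[start:start + batch_size], key=len, reverse=True)
        longest = len(chunk[0])
        padded = [toks + [pad] * (longest - len(toks)) for toks in chunk]
        lengths = [len(toks) for toks in chunk]  # true lengths before padding
        batches.append((padded, lengths))
    return batches

examples = [["good"], ["a", "warm", "funny", "film"], ["good", "fun"],
            ["yet", "the", "act"]]
for padded, lengths in bucket_batches(examples, batch_size=2):
    print(lengths, padded)
```

Sorting each batch by decreasing length matters later on: `pack_padded_sequence` with its default `enforce_sorted=True` expects the sequences in that order.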

Save the vocabulary for later use:

In [64]:
import os, pickle
with open('tokenizer.pkl', 'wb') as tokens: 
    pickle.dump(Text.vocab.stoi, tokens)

## Defining Our Model

We use the Embedding and LSTM modules in PyTorch to build a simple model for classifying movie-review sentences.

The model has three parts. 
1. First, the words in each sentence are pushed into an Embedding layer, which produces a 300-dimensional vector per word (initialised from the GloVe vectors further below). 
2. These vectors are then fed into two stacked LSTM layers with 256 hidden features each (again, we're compressing down from the 300-dimensional input like we did with images). We stack two layers so that dropout has an effect, because PyTorch applies LSTM dropout only between stacked layers.
3. Finally, the hidden state of the LSTM after processing the incoming sentence is pushed through a standard fully connected layer with five outputs, one per sentiment class (very negative, negative, neutral, positive, very positive).

In [91]:
import torch.nn as nn
import torch.nn.functional as F

class classifier(nn.Module):
    
    # Define all the layers used in model
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, n_layers, dropout, pad_idx ):
        
        super().__init__()          
        
        # Embedding layer
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx = pad_idx)
        
        # LSTM layer
        self.encoder = nn.LSTM(embedding_dim, 
                           hidden_dim, 
                           num_layers=n_layers, 
                           dropout=dropout,
                           batch_first=True)
        # try using nn.GRU or nn.RNN here and compare their performances
        # try bidirectional and compare their performances
        
        # Dense layer
        self.fc = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, text, text_lengths):
        
        # text = [batch size, sent_length]
        embedded = self.embedding(text)
        # embedded = [batch size, sent_len, emb dim]
      
        # packed sequence (the lengths tensor must live on the CPU)
        packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths.cpu(), batch_first=True)
        
        packed_output, (hidden, cell) = self.encoder(packed_embedded)
        # hidden = [num layers * num directions, batch size, hid dim]
        # cell   = [num layers * num directions, batch size, hid dim]
        # (batch_first only affects the input and output tensors, not hidden/cell)
    
        # dense_outputs = [num layers * num directions, batch size, output dim]
        dense_outputs = self.fc(hidden)   
        
        # Softmax over the classes. Index 0 selects the first LSTM layer's final
        # hidden state (hidden[-1] would be the top layer).
        output = F.softmax(dense_outputs[0], dim=1)
            
        return output

In [92]:
# Define hyperparameters
size_of_vocab = len(Text.vocab)
#embedding_dim = 100
embedding_dim = 300
#num_hidden_nodes = 100
num_hidden_nodes = 256
num_output_nodes = 5
#num_layers = 3
num_layers = 2
#dropout = 0.2
dropout = 0.35

PAD_IDX = Text.vocab.stoi[Text.pad_token]


# Instantiate the model
model = classifier(size_of_vocab, embedding_dim, num_hidden_nodes, num_output_nodes, num_layers, dropout = dropout, pad_idx=PAD_IDX)

In [93]:
print(size_of_vocab)

13982


In [94]:
print(model)

# No. of trainable parameters
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
    
print(f'The model has {count_parameters(model):,} trainable parameters')

classifier(
  (embedding): Embedding(13982, 300, padding_idx=1)
  (encoder): LSTM(300, 256, num_layers=2, batch_first=True, dropout=0.35)
  (fc): Linear(in_features=256, out_features=5, bias=True)
)
The model has 5,293,613 trainable parameters
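
The reported parameter count can be checked by hand from the layer sizes printed above. The sketch assumes the standard PyTorch LSTM layout of four gates per layer, each with input weights, recurrent weights, and two bias vectors:

```python
vocab_size, emb_dim, hidden, n_classes = 13982, 300, 256, 5

def lstm_layer_params(input_size, hidden_size):
    # 4 gates, each with input weights, recurrent weights and two bias vectors
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)

embedding = vocab_size * emb_dim
lstm = lstm_layer_params(emb_dim, hidden) + lstm_layer_params(hidden, hidden)
linear = hidden * n_classes + n_classes

total = embedding + lstm + linear
print(f"{embedding:,} + {lstm:,} + {linear:,} = {total:,}")
# 4,194,600 + 1,097,728 + 1,285 = 5,293,613
```

Roughly four fifths of the parameters sit in the embedding table, which is why the vocabulary size matters so much for model size.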


In [95]:
pretrained_embeddings = Text.vocab.vectors
print(pretrained_embeddings.shape)



torch.Size([13982, 300])


In [96]:
model.embedding.weight.data.copy_(pretrained_embeddings)

tensor([[ 1.0944e+00, -6.3519e-01, -3.4619e-01,  ..., -8.9674e-01,
          3.5588e-01,  3.6185e-02],
        [-6.4213e-02, -1.1156e-01,  1.2462e-01,  ...,  2.9916e-01,
          4.7750e-01,  6.8963e-01],
        [ 2.7204e-01, -6.2030e-02, -1.8840e-01,  ...,  1.3015e-01,
         -1.8317e-01,  1.3230e-01],
        ...,
        [ 1.5684e+00,  2.7859e+00,  8.7370e-01,  ..., -1.9300e+00,
         -1.3648e+00, -4.9309e-01],
        [ 9.8665e-01, -2.9219e-01,  1.1042e-01,  ...,  2.1543e-01,
          3.0969e-01, -1.6442e+00],
        [ 1.6314e-03, -6.5206e-02, -2.8048e-01,  ...,  1.2960e-02,
          7.0722e-04, -6.6269e-02]])

In [97]:
UNK_IDX = Text.vocab.stoi[Text.unk_token]

model.embedding.weight.data[UNK_IDX] = torch.zeros(embedding_dim)
model.embedding.weight.data[PAD_IDX] = torch.zeros(embedding_dim)

print(model.embedding.weight.data)

tensor([[ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
          0.0000e+00,  0.0000e+00],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  ...,  0.0000e+00,
          0.0000e+00,  0.0000e+00],
        [ 2.7204e-01, -6.2030e-02, -1.8840e-01,  ...,  1.3015e-01,
         -1.8317e-01,  1.3230e-01],
        ...,
        [ 1.5684e+00,  2.7859e+00,  8.7370e-01,  ..., -1.9300e+00,
         -1.3648e+00, -4.9309e-01],
        [ 9.8665e-01, -2.9219e-01,  1.1042e-01,  ...,  2.1543e-01,
          3.0969e-01, -1.6442e+00],
        [ 1.6314e-03, -6.5206e-02, -2.8048e-01,  ...,  1.2960e-02,
          7.0722e-04, -6.6269e-02]])


## Model Training and Evaluation

First define the optimizer and loss functions

In [99]:
import torch.optim as optim

# define optimizer and loss
optimizer = optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.CrossEntropyLoss()

# define metric (multi-class accuracy, despite the function name)
def binary_accuracy(preds, y):
    # pick the class with the highest score for each example
    _, predictions = torch.max(preds, 1)
    
    correct = (predictions == y).float() 
    acc = correct.sum() / len(correct)
    return acc
    
# push to cuda if available
model = model.to(device)
criterion = criterion.to(device)
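
One detail worth knowing about this pairing: `nn.CrossEntropyLoss` applies log-softmax internally, while the `forward` method above already returns softmax probabilities. Assuming that combination, the loss cannot fall below a fixed floor, as this plain-Python sketch of the per-example loss shows (`cross_entropy` here is a hand-written stand-in, not the PyTorch function):

```python
import math

def cross_entropy(scores, target):
    """Loss for one example: log-softmax of the scores, then negative log-likelihood."""
    log_norm = math.log(sum(math.exp(s) for s in scores))
    return log_norm - scores[target]

# Best case when the scores are raw logits: the loss can approach zero
print(round(cross_entropy([20.0, 0.0, 0.0, 0.0, 0.0], target=0), 4))  # 0.0

# Best case when the scores are already probabilities (softmax applied twice):
# the correct class is capped at 1.0 and the other four cannot go below 0.0
floor = cross_entropy([1.0, 0.0, 0.0, 0.0, 0.0], target=0)
print(round(floor, 4))  # 0.9048
```

This may explain why the training loss in the log further below levels off a little above 1.0 even as training accuracy keeps climbing. Returning the raw `dense_outputs` values from `forward` instead of their softmax would remove the cap.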

The main thing to be aware of in this training loop is that we have to reference `batch.text` and `batch.label` to get the particular fields we're interested in; they don't fall out quite as nicely from the iterator as they do in torchvision. Because `Text` was declared with `include_lengths=True`, `batch.text` is a tuple of the padded token ids and the sentence lengths.

**Training Loop**

In [100]:
def train(model, iterator, optimizer, criterion):
    
    # initialize every epoch 
    epoch_loss = 0
    epoch_acc = 0
    
    # set the model in training phase
    model.train()  
    
    for batch in iterator:
        
        # resets the gradients after every batch
        optimizer.zero_grad()   
        
        # retrieve text and no. of words
        text, text_lengths = batch.text 
        
        # predictions = [batch size, num classes]; squeeze() is a no-op unless the batch holds a single example
        predictions = model(text, text_lengths).squeeze()  
        
        # compute the loss
        loss = criterion(predictions, batch.label)        
        
        # compute the accuracy
        acc = binary_accuracy(predictions, batch.label)   
        
        # backpropagate the loss and compute the gradients
        loss.backward()       
        
        # update the weights
        optimizer.step()      
        
        # loss and accuracy
        epoch_loss += loss.item()  
        epoch_acc += acc.item()    
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

**Evaluation Loop**

In [101]:
def evaluate(model, iterator, criterion):
    
    # initialize every epoch
    epoch_loss = 0
    epoch_acc = 0

    # deactivating dropout layers
    model.eval()
    
    # deactivates autograd
    with torch.no_grad():
    
        for batch in iterator:
        
            # retrieve text and no. of words
            text, text_lengths = batch.text
            
            # predictions = [batch size, num classes]
            predictions = model(text, text_lengths).squeeze()
            
            # compute loss and accuracy
            loss = criterion(predictions, batch.label)
            acc = binary_accuracy(predictions, batch.label)
            
            # keep track of loss and accuracy
            epoch_loss += loss.item()
            epoch_acc += acc.item()
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)
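
Both loops divide the summed batch accuracies by the number of batches, so a smaller final batch counts as much as a full one. The sketch below uses made-up per-batch counts (only the validation-set size of 1,101 and the batch size of 32 come from this README) to show how that differs from a per-example average:

```python
# Hypothetical per-batch results: (number of correct predictions, batch size)
batches = [(16, 32)] * 34 + [(13, 13)]  # 34 full batches plus a final batch of 13

mean_of_batch_means = sum(c / n for c, n in batches) / len(batches)
per_example = sum(c for c, _ in batches) / sum(n for _, n in batches)

print(f"mean of batch accuracies: {mean_of_batch_means:.4f}")  # 0.5143
print(f"per-example accuracy:     {per_example:.4f}")          # 0.5059
```

The gap is small here, but the reported accuracies are best read as approximate; accumulating correct counts and dividing by the dataset size would give the exact figure.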

**Let's Train and Evaluate**

In [102]:
N_EPOCHS = 25
best_valid_acc = float('-inf')
MODEL_PATH='/content/drive/My Drive/stanford_sentiment_analysis.pth'

for epoch in range(N_EPOCHS):
     
    # train the model
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
    
    # evaluate the model
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
    
    # save the best model
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        torch.save(model.state_dict(), MODEL_PATH)
    print(f'\tEpoch: {epoch + 1}')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}% \n')

print(f'Best Validation Accuracy: {best_valid_acc}')

	Epoch: 1
	Train Loss: 1.518 | Train Acc: 37.24%
	 Val. Loss: 1.479 |  Val. Acc: 42.31% 

	Epoch: 2
	Train Loss: 1.440 | Train Acc: 45.69%
	 Val. Loss: 1.477 |  Val. Acc: 40.79% 

	Epoch: 3
	Train Loss: 1.371 | Train Acc: 53.44%
	 Val. Loss: 1.473 |  Val. Acc: 41.50% 

	Epoch: 4
	Train Loss: 1.314 | Train Acc: 59.76%
	 Val. Loss: 1.477 |  Val. Acc: 41.69% 

	Epoch: 5
	Train Loss: 1.254 | Train Acc: 65.76%
	 Val. Loss: 1.473 |  Val. Acc: 41.82% 

	Epoch: 6
	Train Loss: 1.209 | Train Acc: 70.16%
	 Val. Loss: 1.488 |  Val. Acc: 39.73% 

	Epoch: 7
	Train Loss: 1.174 | Train Acc: 73.69%
	 Val. Loss: 1.489 |  Val. Acc: 39.95% 

	Epoch: 8
	Train Loss: 1.147 | Train Acc: 76.26%
	 Val. Loss: 1.503 |  Val. Acc: 38.96% 

	Epoch: 9
	Train Loss: 1.124 | Train Acc: 78.46%
	 Val. Loss: 1.500 |  Val. Acc: 38.69% 

	Epoch: 10
	Train Loss: 1.107 | Train Acc: 80.21%
	 Val. Loss: 1.503 |  Val. Acc: 39.14% 

	Epoch: 11
	Train Loss: 1.092 | Train Acc: 81.57%
	 Val. Loss: 1.499 |  Val. Acc: 39.18% 

	Epoch: 

## Model Testing

In [103]:
#load weights and tokenizer
model.load_state_dict(torch.load(MODEL_PATH));
model.eval();
# use a context manager so the file is closed after loading
with open('./tokenizer.pkl', 'rb') as tokenizer_file:
    tokenizer = pickle.load(tokenizer_file)

#inference 

import spacy
nlp = spacy.load('en')

def classify_tweet(tweet):
    
    categories = {1: "Very Negative", 2:"Negative", 3:"Neutral", 4: "Positive", 5: "Very Positive"}
    
    # tokenize the tweet 
    tokenized = [tok.text for tok in nlp.tokenizer(tweet)] 
    # convert to integer sequence using predefined tokenizer dictionary
    indexed = [tokenizer[t] for t in tokenized]        
    # compute no. of words        
    length = [len(indexed)]
    # convert to tensor                                    
    tensor = torch.LongTensor(indexed).to(device)   
    # reshape in form of batch, no. of words           
    tensor = tensor.unsqueeze(1).T  
    # convert to tensor                          
    length_tensor = torch.LongTensor(length)
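    # Get the model prediction                  
    prediction = model(tensor, length_tensor)

    _, pred = torch.max(prediction, 1) 
    
    # pred indexes the label vocabulary (ordered by frequency), so map it back to the 1-5 label
    label = int(Label.vocab.itos[pred.item()])
    return categories[label]

In [104]:
classify_tweet("A valid explanation for why Trump won't let women on the golf course.")

'Negative'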