# IMDb Sentiment Classification - Deep Learning Approach
This notebook extends the IMDb sentiment classification project to a deep learning approach. As a baseline, an LSTM model is trained with default parameters and without pretrained embeddings (`glove = False`). This initial test achieves a validation accuracy of 69%.

In [None]:
from imdb_classification.data import load_imdb_data
from imdb_classification.data_dl import IMDbSentimentLSTM, seed_everything
from imdb_classification.data_dl import create_loaders, train_model, plot_training

seed_everything(4)
data_dir = 'imdb_classifier/data/' # Replace with the path to your local copy of the dataset
data_train = load_imdb_data(data_dir, subset = 'train')
data_test = load_imdb_data(data_dir, subset = 'test')

In [None]:
max_len = 200
batch_size = 32 
train_fraction = 0.8
embed_dim = 100 
hidden_dim = 128
num_layers = 1
dropout = 0.3

train_loader, val_loader, word2idx = create_loaders(
    data_train, max_len = max_len, batch_size = batch_size,
    train_fraction = train_fraction)

model = IMDbSentimentLSTM(word2idx = word2idx, embed_dim = embed_dim, 
                          hidden_dim = hidden_dim, num_layers = num_layers, 
                          dropout = dropout, glove = False, 
                          bidirectional = False)
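
What `create_loaders` does internally is not shown in this notebook. As a rough sketch only, assuming it builds a vocabulary from whitespace tokens and pads or truncates every review to `max_len`, the encoding step might look like the following. The special tokens `<pad>` and `<unk>` and the helper names `build_vocab` and `encode` are hypothetical and are not taken from the package.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens (assumption)


def build_vocab(texts, min_freq=1):
    """Map each sufficiently frequent token to an integer index."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    word2idx = {PAD: 0, UNK: 1}  # reserve 0 for padding, 1 for unknown words
    for word, freq in counts.most_common():
        if freq >= min_freq:
            word2idx[word] = len(word2idx)
    return word2idx


def encode(text, word2idx, max_len):
    """Turn a review into exactly max_len indices (truncate, then right-pad)."""
    ids = [word2idx.get(tok, word2idx[UNK]) for tok in text.lower().split()]
    ids = ids[:max_len]
    return ids + [word2idx[PAD]] * (max_len - len(ids))


reviews = ["a great great film", "a dull film"]
vocab = build_vocab(reviews)
encoded = encode("a great unseen film", vocab, max_len=6)
print(encoded)
```

Fixing every sequence to the same length is what lets reviews be stacked into batches of `batch_size`. The real tokenizer may well differ (for example in how it handles punctuation).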

In [None]:
device = 'cuda' # Use 'cpu' if no GPU is available; GPU was 5-10x faster than CPU on the test machine
epochs = 10
lr = 1e-3

history = train_model(model, train_loader, val_loader = val_loader, 
                      epochs = epochs, lr = lr, device = device, printout = False)

In [None]:
plot_training(history)

# GloVe
The model is rebuilt using pretrained embeddings from [GloVe](https://nlp.stanford.edu/projects/glove/), which place related words close to one another in vector space. This speeds up learning during the first few epochs, but ultimately does not meaningfully improve the final validation accuracy.
Some things I learned by changing parameters:
<ul>
    <li>Decreasing <code>hidden_dim</code> to encourage simpler, more general features did not meaningfully change the behavior on its own.</li>
    <li>Increasing dropout from 0.3 to 0.5 did not make a meaningful difference.</li>
    <li>Decreasing the learning rate from 0.001 to 0.0005 did not make a meaningful difference on its own.</li>
    <li>Switching to a bidirectional LSTM did not make a meaningful difference on its own.</li>
    <li>Lowering the learning rate to 0.0005 while decreasing <code>hidden_dim</code> to 64 did not make a meaningful difference.</li>
    <li>Lowering the learning rate to 0.0005 while increasing dropout to 0.5 did not meaningfully change the behavior.</li>
    <li>Lowering the learning rate to 0.0005, increasing dropout to 0.5, and decreasing <code>hidden_dim</code> to 64 all at once did not meaningfully change the behavior.</li>
    <li>Mean pooling did not meaningfully change the behavior.</li>
    <li>Increasing the number of layers to 2 did not meaningfully change the behavior.</li>
</ul>
The best validation accuracy this model achieves is 84%. Tuning the parameters shortened training to just a couple of epochs, but did not improve the overall accuracy. The parameters are left in the state that achieves this accuracy.
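
How `IMDbSentimentLSTM` consumes `glove_path` is not shown here. As a hedged sketch, GloVe text files store one token per line followed by its space-separated vector components, so an embedding matrix aligned with `word2idx` could be assembled roughly as below. The helper names, the assumption that indices run from 0 to `len(word2idx) - 1` with 0 reserved for padding, and the random initialization range for missing words are all illustrative choices, not the package's behavior.

```python
import random


def load_glove_vectors(lines, embed_dim):
    """Parse GloVe's text format: a token followed by embed_dim floats per line."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != embed_dim + 1:
            continue  # skip malformed or wrong-dimension lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word2idx, vectors, embed_dim, seed=4):
    """Return one row per vocabulary index, using GloVe where available."""
    rng = random.Random(seed)
    matrix = [[0.0] * embed_dim for _ in range(len(word2idx))]
    for word, idx in word2idx.items():
        if word in vectors:
            matrix[idx] = vectors[word]
        elif idx != 0:  # index 0 is assumed to be padding and stays all zeros
            matrix[idx] = [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
    return matrix


# Toy stand-in for the lines of a GloVe file (values are made up)
toy_glove = ["good 0.1 0.2 0.3", "bad -0.1 -0.2 -0.3"]
toy_word2idx = {"<pad>": 0, "<unk>": 1, "good": 2, "bad": 3, "meh": 4}

vectors = load_glove_vectors(toy_glove, embed_dim=3)
matrix = build_embedding_matrix(toy_word2idx, vectors, embed_dim=3)
```

With a real file, an open file handle such as `open(glove_path, encoding="utf-8")` could be passed as `lines`. In PyTorch, a matrix like this is typically converted to a tensor and loaded with `torch.nn.Embedding.from_pretrained`.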

In [None]:
from imdb_classification.data import load_imdb_data
from imdb_classification.data_dl import IMDbSentimentLSTM, seed_everything
from imdb_classification.data_dl import create_loaders, train_model, plot_training

seed_everything(4)
data_dir = 'imdb_classifier/data/' # Replace with the path to your local copy of the dataset
glove_path = 'imdb_classifier/data/glove.6B.100d.txt' # Replace with the path to your GloVe file
data_train = load_imdb_data(data_dir, subset = 'train')
data_test = load_imdb_data(data_dir, subset = 'test')

In [None]:
max_len = 200
batch_size = 32 
train_fraction = 0.8
embed_dim = 100 
hidden_dim = 32 
num_layers = 2 
dropout = 0.5
bidirectional = True
pool = True

train_loader, val_loader, word2idx = create_loaders(
    data_train, max_len = max_len, batch_size = batch_size,
    train_fraction = train_fraction)

model = IMDbSentimentLSTM(word2idx = word2idx, embed_dim = embed_dim, 
                          hidden_dim = hidden_dim, num_layers = num_layers, 
                          dropout = dropout, glove = True, pool = pool,
                          glove_path = glove_path, bidirectional = bidirectional)
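
Assuming `pool = True` turns on the mean pooling mentioned in the list of experiments, the idea can be sketched without any framework: rather than classifying from the LSTM's final hidden state alone, the per-token outputs are averaged over the real (non-padding) positions. Whether the package masks padding this way is an assumption, and `masked_mean_pool` is a hypothetical helper name.

```python
def masked_mean_pool(outputs, token_ids, pad_idx=0):
    """Average per-timestep output vectors, ignoring padded positions."""
    dim = len(outputs[0])
    total = [0.0] * dim
    n_real = 0
    for vec, tok in zip(outputs, token_ids):
        if tok == pad_idx:
            continue  # padding carries no information about the review
        n_real += 1
        for j in range(dim):
            total[j] += vec[j]
    n_real = max(n_real, 1)  # guard against an all-padding sequence
    return [t / n_real for t in total]


# Three timesteps of 2-dimensional outputs; the last position is padding
outputs = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
token_ids = [5, 7, 0]

pooled = masked_mean_pool(outputs, token_ids)
print(pooled)
```

Pooling gives every real token a say in the final representation, which can help on long reviews where the last hidden state has forgotten early words.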

In [None]:
device = 'cuda' # Use 'cpu' if no GPU is available; GPU was 5-10x faster than CPU on the test machine
epochs = 10
lr = 1e-3 

history = train_model(model, train_loader, val_loader = val_loader, 
                      epochs = epochs, lr = lr, device = device, printout = False)

In [None]:
fig, axs = plot_training(history)

# Run evaluation on test dataset
The model achieves 82% accuracy on the test dataset, close to the 84% it achieves on the validation dataset. Further tuning might raise these numbers by a few percentage points, but roughly 85% accuracy appears to be about the limit of what this model can achieve.
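
The internals of `evaluate_model` are not shown in this notebook. As an illustrative sketch, assuming the model emits one raw logit per review, the loss and accuracy could be computed as below. The loss uses the numerically stable form of binary cross-entropy on logits, and a logit above 0 corresponds to a sigmoid probability above 0.5. The helper names and the per-sample averaging are assumptions.

```python
import math


def bce_with_logits(logit, label):
    """Numerically stable binary cross-entropy on a raw logit.

    Equivalent to -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def evaluate(logits, labels):
    """Return (mean loss, accuracy) for a list of logits and 0/1 labels."""
    losses = [bce_with_logits(x, y) for x, y in zip(logits, labels)]
    preds = [1 if x > 0 else 0 for x in logits]  # logit > 0 means p > 0.5
    correct = sum(p == y for p, y in zip(preds, labels))
    return sum(losses) / len(losses), correct / len(labels)


# Made-up logits and labels, for illustration only
logits = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 0, 0]

loss, acc = evaluate(logits, labels)
print(f"Loss: {loss:.4f}, Accuracy: {acc:.4f}")
```

Working on logits instead of probabilities avoids taking the logarithm of values that round to 0 or 1 for very confident predictions.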

In [None]:
from imdb_classification.data_dl import IMDbDataset, evaluate_model
from torch.utils.data import DataLoader
import torch

# Reuse the training vocabulary and settings so test reviews are encoded the same way
test_dataset = IMDbDataset(data_test, max_len = max_len, word2idx = word2idx)
test_loader = DataLoader(test_dataset, batch_size = batch_size, shuffle = False)
criterion = torch.nn.BCEWithLogitsLoss() # Binary cross-entropy computed on raw logits
device = 'cuda' # Use 'cpu' if no GPU is available

In [None]:
test_loss, test_acc = evaluate_model(model, test_loader, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_acc:.4f}")