# Convolutional Neural Networks
- IMDB review sentiment classification with CNN
  - So far, we have used CNNs to classify images in the CIFAR-10 dataset. However, CNNs are useful not only for classifying and recognizing images but also for processing data with temporal (sequential) dependencies, such as text.
  - Here, let's classify movie reviews with a CNN.

In [1]:
!pip3 install torch torchvision


In [1]:
import numpy as np
import pandas as pd
import torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
torch.__version__

'1.0.0'

## 1. Import & process dataset
- IMDB review dataset for sentiment analysis
  - [source](http://ai.stanford.edu/~amaas/data/sentiment/)
  - For convenience, let's take a shortcut and use the preprocessed copy of the dataset provided by Keras

In [2]:
from keras.datasets import imdb
from keras.preprocessing import sequence

num_words = 10000
maxlen = 50

(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words = num_words)

X_train = sequence.pad_sequences(X_train, maxlen = maxlen, padding = 'pre')
X_test = sequence.pad_sequences(X_test, maxlen = maxlen, padding = 'pre')
    
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

Using TensorFlow backend.


(25000, 50) (25000, 50) (25000,) (25000,)
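
To make the preprocessing step concrete, here is a minimal pure-Python sketch of what `pad_sequences` does to a single review with `padding = 'pre'`. It assumes the Keras defaults `truncating = 'pre'` and `value = 0`, which the cell above does not override; the helper name `pad_or_truncate_pre` and the toy token IDs are hypothetical and only for illustration.

```python
def pad_or_truncate_pre(seq, maxlen, value=0):
    """Hypothetical helper that mimics pad_sequences for one sequence
    with padding='pre' and the default truncating='pre'."""
    seq = list(seq)
    if len(seq) >= maxlen:
        # Too long: drop tokens from the front, keep the last `maxlen` tokens.
        return seq[len(seq) - maxlen:]
    # Too short: fill the front with the padding value.
    return [value] * (maxlen - len(seq)) + seq


short_review = [11, 12, 13]            # toy token IDs
long_review = [1, 2, 3, 4, 5, 6, 7]    # toy token IDs

padded_short = pad_or_truncate_pre(short_review, maxlen=5)
padded_long = pad_or_truncate_pre(long_review, maxlen=5)

print(padded_short)  # [0, 0, 11, 12, 13]
print(padded_long)   # [3, 4, 5, 6, 7]
```

Under these assumptions, `maxlen = 50` means the model only ever sees the last 50 tokens of each review, and shorter reviews are left-padded with zeros.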


## 2. Creating CNN model and training

- Create and train a CNN for sentence classification with one convolutional layer and one max-over-time pooling layer
- The model architecture is adapted from [Kim 2014](https://www.aclweb.org/anthology/D14-1181)

![](https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2017/08/Example-of-a-CNN-Filter-and-Polling-Architecture-for-Natural-Language-Processing.png)

In [0]:
class imdbTrainDataset(torch.utils.data.Dataset):
  def __init__(self):
    self.X = X_train
    self.y = y_train
  
  def __getitem__(self, idx):
    return self.X[idx], self.y[idx]
  
  def __len__(self):
    return len(self.X)
  
class imdbTestDataset(torch.utils.data.Dataset):
  def __init__(self):
    self.X = X_test
    self.y = y_test
  
  def __getitem__(self, idx):
    return self.X[idx], self.y[idx]
  
  def __len__(self):
    return len(self.X)

In [0]:
# create dataset & dataloader instances
train_dataset = imdbTrainDataset()
test_dataset = imdbTestDataset()

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size = 128, shuffle = True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size = 128, shuffle = False)

In [0]:
# create CNN with one convolution/pooling layer
class net(nn.Module):
  def __init__(self, input_dim, num_words, embedding_dim, num_filters, kernel_size, stride):
    super(net, self).__init__()
    self.input_dim = input_dim
    self.embedding_dim = embedding_dim
    
    conv_output_size = int((input_dim - kernel_size)/stride) + 1   # first conv layer output size
        
    self.embedding = nn.Embedding(num_words, self.embedding_dim)
    self.conv = nn.Conv2d(1, num_filters, kernel_size = (kernel_size, self.embedding_dim), stride = stride)     
    self.pool = nn.MaxPool2d((conv_output_size, 1))                # Max-over-time pooling (FYI: avg pooling also works)
    self.relu = nn.ReLU()
    self.dense = nn.Linear(num_filters, 2)     
    
  def forward(self, x):
    x = self.embedding(x)                                   # project to word embedding space
    x = x.view(-1, 1, self.input_dim, self.embedding_dim)   # resize to fit into convolutional layer
    x = self.conv(x)
    x = self.relu(x)
    x = self.pool(x) 
    x = x.view(x.size(0), -1)   # resize to fit into final dense layer
    x = self.dense(x)
    return x
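
The shape bookkeeping in `forward` can be traced by hand. The sketch below is plain Python arithmetic, not a call into PyTorch: it applies the same output-size formula used for `conv_output_size` and lists the tensor shape expected after each step, assuming the hyperparameters used in this notebook (`maxlen = 50`, embedding size 50, 64 filters, kernel size 3, stride 1) and a full batch of 128. The helper name `conv_output_len` is hypothetical.

```python
def conv_output_len(input_len, kernel_size, stride):
    """Length of a valid (unpadded) convolution output along one axis."""
    return (input_len - kernel_size) // stride + 1


batch, maxlen, emb_dim = 128, 50, 50
num_filters, kernel_size, stride = 64, 3, 1

# The kernel spans the full embedding width, so the width axis collapses to 1.
out_len = conv_output_len(maxlen, kernel_size, stride)
out_width = conv_output_len(emb_dim, emb_dim, stride)

shapes = [
    ("input token ids", (batch, maxlen)),
    ("embedding",       (batch, maxlen, emb_dim)),
    ("view",            (batch, 1, maxlen, emb_dim)),
    ("conv + relu",     (batch, num_filters, out_len, out_width)),
    ("max pool",        (batch, num_filters, 1, 1)),
    ("flatten",         (batch, num_filters)),
    ("dense",           (batch, 2)),
]

for name, shape in shapes:
    print(f"{name:16s} {shape}")
```

Each of the 64 filters slides over 48 windows of three consecutive words, and max-over-time pooling keeps only the strongest response per filter, so the dense layer always receives 64 features per review.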

In [0]:
# hyperparameters
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')   # fall back to CPU when no GPU is available
INPUT_DIM = maxlen
NUM_FILTERS = 64
KERNEL_SIZE = 3
STRIDE = 1
EMBEDDING_DIM = 50
NUM_WORDS = num_words
LEARNING_RATE = 1e-3
NUM_EPOCHS = 30              

In [0]:
model = net(INPUT_DIM, NUM_WORDS, EMBEDDING_DIM, NUM_FILTERS, KERNEL_SIZE, STRIDE).to(DEVICE)
criterion = nn.CrossEntropyLoss()   # expects raw logits, so the model does not need a softmax layer
optimizer = torch.optim.Adam(model.parameters(), lr = LEARNING_RATE)
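
The reason no softmax layer is needed is that `nn.CrossEntropyLoss` applies log-softmax to the raw scores itself before taking the negative log-likelihood of the true class. The sketch below reproduces that computation for a single example in plain Python; the function name `cross_entropy_from_logits` and the example logits are made up for illustration.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Negative log-softmax of the target class, computed from raw logits.
    Subtracting the max first keeps the exponentials numerically stable."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A model that strongly favors the correct class (index 0) gets a small loss.
loss_confident = cross_entropy_from_logits([2.0, -1.0], target=0)

# A model that cannot tell the two classes apart gets log(2).
loss_uncertain = cross_entropy_from_logits([0.0, 0.0], target=0)

print(loss_confident)
print(loss_uncertain)
```

For two classes, an uninformed model scores log(2), roughly 0.693, which is in the same range as the loss reported for the first epoch in the training output.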

In [14]:
# training for NUM_EPOCHS
for i in range(NUM_EPOCHS):
  temp_loss = []
  for (x, y) in train_loader:
    x, y = x.long().to(DEVICE), y.to(DEVICE)  # beware that input to embedding should be type 'long'
    outputs = model(x)
    loss = criterion(outputs, y)
    temp_loss.append(loss.item())
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
  print("Loss at {}th epoch: {}".format(i, np.mean(temp_loss)))

Loss at 0th epoch: 0.6467448935216787
Loss at 1th epoch: 0.5299239850469998
Loss at 2th epoch: 0.4440507980025544
Loss at 3th epoch: 0.38025763858946005
Loss at 4th epoch: 0.3228620941541633
Loss at 5th epoch: 0.2733502811467161
Loss at 6th epoch: 0.22932577779402538
Loss at 7th epoch: 0.18939032276370088
Loss at 8th epoch: 0.1521880773029157
Loss at 9th epoch: 0.12011083876904176
Loss at 10th epoch: 0.09256724133251273
Loss at 11th epoch: 0.07036281743904158
Loss at 12th epoch: 0.0526158744469285
Loss at 13th epoch: 0.03983275468784327
Loss at 14th epoch: 0.029970756969509686
Loss at 15th epoch: 0.022237090422410746
Loss at 16th epoch: 0.016928160087946727
Loss at 17th epoch: 0.013268464244902134
Loss at 18th epoch: 0.01047601109627178
Loss at 19th epoch: 0.008334757876582444
Loss at 20th epoch: 0.00675514917606868
Loss at 21th epoch: 0.005520799474039932
Loss at 22th epoch: 0.004574792754209163
Loss at 23th epoch: 0.003809949959276662
Loss at 24th epoch: 0.0031995022270296302
Loss at

## 3. Evaluation
- Evaluate the trained CNN on the test set using the accuracy score
  - Store the predicted label of each instance in a list and compare it with the true label

In [9]:
y_pred, y_true = [], []
with torch.no_grad():
  for x, y in test_loader:
    x, y = x.long().to(DEVICE), y.to(DEVICE)       # beware that input to embedding should be type 'long'
    outputs = F.softmax(model(x), dim=1).max(1)[-1]  # predicted label: index of the highest class probability
    y_true += list(y.cpu().numpy())                # true label
    y_pred += list(outputs.cpu().numpy())   
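
As a side note on the prediction step: softmax preserves the ordering of the scores, so taking the argmax of the raw logits gives the same labels as taking the argmax of the softmax probabilities. The plain-Python sketch below illustrates this and the accuracy computation on a toy batch; the logits and true labels are invented for illustration and are not outputs of the trained model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


batch_logits = [[1.2, -0.3], [-2.0, 0.5], [0.1, 0.4]]   # toy scores for 3 reviews
toy_true = [0, 1, 0]                                     # toy ground-truth labels

labels_from_logits = [argmax(row) for row in batch_logits]
labels_from_probs = [argmax(softmax(row)) for row in batch_logits]

# Accuracy is the fraction of predictions that match the true labels.
accuracy = sum(p == t for p, t in zip(labels_from_probs, toy_true)) / len(toy_true)

print(labels_from_logits, labels_from_probs, accuracy)
```

The softmax call in the evaluation loop is therefore only needed if the class probabilities themselves are of interest.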



In [10]:
# evaluation result
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)

0.77464