<div class="alert alert-block alert-info" style="margin-top: 20px">

      
| Name | Description | Date
| :- |-------------: | :-:
|Reza Hashemi| Convolutional Neural Networks - 7th | Finalized on 23rd of August 2019 |
</div>

# Convolutional Neural Networks
- IMDB review sentiment classification with CNN
  - Last time, we started sentence classification with a CNN that had only one filter. Here, let's try a more complicated CNN architecture with several filter sizes to improve performance.

In [0]:
!pip3 install torch torchvision



In [0]:
#!pip install numpy==1.16.2
import numpy as np
print(np.__version__)

import pandas as pd
import torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
torch.__version__

1.16.2


'1.1.0'

## 1. Import & process dataset
- IMDB review dataset for sentiment analysis
  - [source](http://ai.stanford.edu/~amaas/data/sentiment/)
  - Let's take a shortcut and use the preprocessed dataset provided by Keras

In [0]:
from keras.datasets import imdb
from keras.preprocessing import sequence

num_words = 10000
maxlen = 50

(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words = num_words)

X_train = sequence.pad_sequences(X_train, maxlen = maxlen, padding = 'pre')
X_test = sequence.pad_sequences(X_test, maxlen = maxlen, padding = 'pre')
    
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

Using TensorFlow backend.


(25000, 50) (25000, 50) (25000,) (25000,)
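
To make the effect of `maxlen = 50` with `padding = 'pre'` concrete, here is a minimal pure-Python sketch of the same idea on toy sequences. The helper `pad_pre` is hypothetical and not part of Keras. It assumes the `pad_sequences` defaults: truncate from the front and pad with 0.

```python
def pad_pre(seq, maxlen, value=0):
    """Hypothetical helper: keep the last `maxlen` tokens, then left-pad with `value`."""
    seq = list(seq)[-maxlen:]                    # truncate from the front
    return [value] * (maxlen - len(seq)) + seq   # pad at the front

short_review = [11, 12, 13]          # made-up token ids
long_review = list(range(1, 9))      # 8 made-up token ids: 1..8

padded_short = pad_pre(short_review, maxlen=5)
padded_long = pad_pre(long_review, maxlen=5)
print(padded_short)  # [0, 0, 11, 12, 13]
print(padded_long)   # [4, 5, 6, 7, 8]
```

Under these assumptions, every review ends up as the last 50 token ids, so the beginning of a long review is discarded.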


## 2. Creating CNN model and training

- Create and train a CNN model for sentence classification, with three parallel convolution & max-pooling branches whose outputs are concatenated at the end.
- The model architecture is adapted from [Kim 2014](https://www.aclweb.org/anthology/D14-1181)

![](https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2017/08/Example-of-a-CNN-Filter-and-Polling-Architecture-for-Natural-Language-Processing.png)
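
As a rough illustration of the idea in the figure, the sketch below slides filters of several widths over a toy sequence and keeps only the maximum response of each one (max-over-time pooling). It is a simplification rather than the model used in this notebook: each "word" is a single made-up number instead of an embedding vector, there is no ReLU, and the helper names `conv1d_valid` and `max_over_time` are hypothetical.

```python
def conv1d_valid(sequence, kernel):
    """Slide `kernel` over `sequence` (stride 1, no padding) and return the feature map."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, sequence[i:i + k]))
            for i in range(len(sequence) - k + 1)]

def max_over_time(feature_map):
    """Reduce a variable-length feature map to a single number."""
    return max(feature_map)

sequence = [1.0, -2.0, 3.0, 0.5]                  # toy 1-D "sentence"
kernels = [[1.0], [1.0, 1.0], [1.0, 0.0, -1.0]]   # filter widths 1, 2 and 3

# one pooled feature per filter, concatenated into a single feature vector
features = [max_over_time(conv1d_valid(sequence, k)) for k in kernels]
print(features)  # [3.0, 3.5, -2.0]
```

Because pooling collapses each feature map to one value, filters of different widths can be concatenated even though their feature maps have different lengths.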

In [0]:
class imdbTrainDataset(torch.utils.data.Dataset):
  def __init__(self):
    self.X = X_train
    self.y = y_train
  
  def __getitem__(self, idx):
    return self.X[idx], self.y[idx]
  
  def __len__(self):
    return len(self.X)
  
class imdbTestDataset(torch.utils.data.Dataset):
  def __init__(self):
    self.X = X_test
    self.y = y_test
  
  def __getitem__(self, idx):
    return self.X[idx], self.y[idx]
  
  def __len__(self):
    return len(self.X)

In [0]:
# create dataset & dataloader instances
train_dataset = imdbTrainDataset()
test_dataset = imdbTestDataset()

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size = 128, shuffle = True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size = 128, shuffle = False)

In [0]:
# create CNN with three parallel convolution/pooling branches
class net(nn.Module):
  def __init__(self, input_dim, num_words, embedding_dim, num_filters, kernel_size, stride):
    super(net, self).__init__()
    self.input_dim = input_dim
    self.embedding_dim = embedding_dim
    
    conv_output_size1 = int((input_dim - kernel_size[0])/stride) + 1   # first conv branch output length
    conv_output_size2 = int((input_dim - kernel_size[1])/stride) + 1   # second conv branch output length
    conv_output_size3 = int((input_dim - kernel_size[2])/stride) + 1   # third conv branch output length
        
    self.embedding = nn.Embedding(num_words, self.embedding_dim)
    
    # three convolution & pooling layers
    self.conv1 = nn.Conv2d(1, num_filters[0], kernel_size = (kernel_size[0], self.embedding_dim), stride = stride)     
    self.pool1 = nn.MaxPool2d((conv_output_size1, 1))                # Max-over-time pooling
    self.conv2 = nn.Conv2d(1, num_filters[1], kernel_size = (kernel_size[1], self.embedding_dim), stride = stride)     
    self.pool2 = nn.MaxPool2d((conv_output_size2, 1))                # Max-over-time pooling
    self.conv3 = nn.Conv2d(1, num_filters[2], kernel_size = (kernel_size[2], self.embedding_dim), stride = stride)     
    self.pool3 = nn.MaxPool2d((conv_output_size3, 1))                # Max-over-time pooling
    
    self.relu = nn.ReLU()
    self.dense = nn.Linear(num_filters[0] + num_filters[1] + num_filters[2], 2)     
    
  def forward(self, x):
    x = self.embedding(x)                                   # project to word embedding space
    x = x.view(-1, 1, self.input_dim, self.embedding_dim)   # resize to fit into convolutional layer
    x1 = self.pool1(self.relu(self.conv1(x)))
    x2 = self.pool2(self.relu(self.conv2(x)))
    x3 = self.pool3(self.relu(self.conv3(x)))

    x = torch.cat((x1, x2, x3), dim = 1)   # concatenate three convolutional outputs
    x = x.view(x.size(0), -1)   # resize to fit into final dense layer
    x = self.dense(x)
    return x
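
The shape bookkeeping in the class above can be checked by hand. The sketch below applies the same output-size formula (no padding, no dilation) using the hyperparameter values from this notebook; the function name `conv_output_size` is only illustrative.

```python
def conv_output_size(input_dim, kernel_size, stride=1):
    # same formula as in the model: floor((L - k) / stride) + 1
    return (input_dim - kernel_size) // stride + 1

input_dim = 50                 # maxlen
kernel_sizes = (1, 2, 3)       # KERNEL_SIZE
num_filters = (16, 32, 64)     # NUM_FILTERS

# length of each feature map before max-over-time pooling
conv_lengths = [conv_output_size(input_dim, k) for k in kernel_sizes]
# pooling leaves one value per filter, so the dense layer sees the sum of the filter counts
dense_in_features = sum(num_filters)

print(conv_lengths)       # [50, 49, 48]
print(dense_in_features)  # 112
```

Each pooling window spans the full feature-map length, which is why the pooled branches can be flattened and fed to a single linear layer with 112 inputs.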

In [0]:
# hyperparameters
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')   # fall back to CPU when no GPU is available
INPUT_DIM = maxlen
NUM_FILTERS = (16, 32, 64) 
KERNEL_SIZE = (1, 2, 3)
STRIDE = 1
EMBEDDING_DIM = 50
NUM_WORDS = num_words
LEARNING_RATE = 1e-3
NUM_EPOCHS = 30         

In [0]:
model = net(INPUT_DIM, NUM_WORDS, EMBEDDING_DIM, NUM_FILTERS, KERNEL_SIZE, STRIDE).to(DEVICE)
criterion = nn.CrossEntropyLoss()   # expects raw logits, so the model needs no softmax layer
optimizer = torch.optim.Adam(model.parameters(), lr = LEARNING_RATE)
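
The reason the model needs no softmax layer is that `nn.CrossEntropyLoss` applies log-softmax internally before computing the negative log-likelihood. The pure-Python sketch below illustrates that equivalence for a single example with made-up logits. The helper functions are hypothetical and only mirror the math, not PyTorch's implementation.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one example computed directly from raw logits."""
    m = max(logits)                                    # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]                # equals -log_softmax(logits)[target]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, -1.0]                                   # made-up scores for classes 0 and 1
loss_from_logits = cross_entropy(logits, target=0)
loss_from_probs = -math.log(softmax(logits)[0])        # explicit softmax, then negative log

print(abs(loss_from_logits - loss_from_probs) < 1e-9)  # True
```

Adding an explicit softmax layer before this criterion would apply the normalization twice, which changes the loss and usually slows training.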

In [0]:
# training for NUM_EPOCHS
for i in range(NUM_EPOCHS):
  temp_loss = []
  for (x, y) in train_loader:
    x, y = x.long().to(DEVICE), y.to(DEVICE)  # beware that input to embedding should be type 'long'
    outputs = model(x)
    loss = criterion(outputs, y)
    temp_loss.append(loss.item())
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
  print("Loss at {}th epoch: {}".format(i, np.mean(temp_loss)))

Loss at 0th epoch: 0.6324634409072448
Loss at 1th epoch: 0.5018278180944676
Loss at 2th epoch: 0.4194177679565488
Loss at 3th epoch: 0.35011860849905985
Loss at 4th epoch: 0.29556686879724875
Loss at 5th epoch: 0.24471684887397047
Loss at 6th epoch: 0.19912726409277137
Loss at 7th epoch: 0.16028255231830538
Loss at 8th epoch: 0.12408019845583002
Loss at 9th epoch: 0.09528727776237897
Loss at 10th epoch: 0.07133282300997146
Loss at 11th epoch: 0.05265233642896827
Loss at 12th epoch: 0.038134651353620756
Loss at 13th epoch: 0.028006473964802464
Loss at 14th epoch: 0.020876914913747078
Loss at 15th epoch: 0.01609226120920966
Loss at 16th epoch: 0.0124357185012908
Loss at 17th epoch: 0.0097842464693917
Loss at 18th epoch: 0.00782318807406617
Loss at 19th epoch: 0.006367209493372665
Loss at 20th epoch: 0.005227200729696423
Loss at 21th epoch: 0.004337878479641311
Loss at 22th epoch: 0.003619923552127593
Loss at 23th epoch: 0.003041107853760525
Loss at 24th epoch: 0.002558622922634288
Loss a

## 3. Evaluation
- Evaluate the trained CNN model with accuracy score 
  - Store the predicted label of each instance in a list and compare it with the true label
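
As a small illustration of this evaluation step, the sketch below turns made-up logits into predicted labels and computes the accuracy by hand. Since softmax preserves the ordering of the scores, taking the argmax of the raw logits gives the same labels as taking it after softmax. The helpers `argmax` and `accuracy` are hypothetical stand-ins, not the scikit-learn implementation.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# made-up logits for four reviews, two classes each
batch_logits = [[0.2, 1.5], [3.0, -1.0], [0.1, 0.4], [-0.5, -2.0]]
toy_true = [1, 0, 0, 0]

toy_pred = [argmax(row) for row in batch_logits]
toy_acc = accuracy(toy_true, toy_pred)
print(toy_pred, toy_acc)  # [1, 0, 1, 0] 0.75
```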

In [0]:
y_pred, y_true = [], []
with torch.no_grad():
  for x, y in test_loader:
    x, y = x.long().to(DEVICE), y.to(DEVICE)       # beware that input to embedding should be type 'long'
    outputs = F.softmax(model(x), dim = 1).max(1)[-1]   # predicted label (softmax over the class dimension)
    y_true += list(y.cpu().numpy())                # true label
    y_pred += list(outputs.cpu().numpy())


In [0]:
# evaluation result
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)

0.7956