# Advanced Convolutional Neural Networks (CNN) - 2
- Objective: try different structures of CNNs
- Note: examples were run on an **i5 7600 + GTX 1060 6GB**

## CNN for Sentence Classification
- CNNs are best known for snapshot-like data such as images
- However, CNNs are effective for NLP tasks as well
- For more information, refer to:
    - Kim 2014 (http://emnlp2014.org/papers/pdf/EMNLP2014181.pdf)
    - Zhang et al 2015 (https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf)
    
<br>
- In this section, we perform sentence classification with CNNs (Kim 2014)
</br>
<img src="http://d3kbpzbmcynnmx.cloudfront.net/wp-content/uploads/2015/11/Screen-Shot-2015-11-06-at-8.03.47-AM.png" style="width: 800px"/>

<br>
- The "pixels" of the input are the embedding vectors of the words in a sentence (one row per word)
- Convolutions are performed at the word level: each filter slides over consecutive words
- Classify each sentence as positive (1) or negative (0)
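
To make the "sentence as image" idea concrete, here is a minimal sketch in plain Python. The helper names `build_embedding_table` and `embed_sentence`, the toy vocabulary size, and the random initialisation are assumptions made for illustration only; in the models below, the Keras `Embedding` layer plays this role and learns the values during training.

```python
import random

def build_embedding_table(vocab_size, dim, seed=0):
    """Hypothetical lookup table: one small random vector per word index."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size)]

def embed_sentence(word_ids, table):
    """Stack the embedding row of every word index, one row per word."""
    return [table[i] for i in word_ids]

table = build_embedding_table(vocab_size=10, dim=4)
sentence = [0, 0, 3, 7, 2]           # toy word indices; 0 acts as padding
matrix = embed_sentence(sentence, table)

# rows = words in the sentence, columns = embedding dimensions
print(len(matrix), len(matrix[0]))
```

The resulting matrix has one row per word and one column per embedding dimension, which is the "image" that the convolution filters slide over.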

<img src="http://d3kbpzbmcynnmx.cloudfront.net/wp-content/uploads/2015/11/Screen-Shot-2015-11-06-at-12.05.40-PM.png" style="width: 600px"/>

In [12]:
import numpy as np
import matplotlib.pyplot as plt

# from keras.datasets import imdb
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

## Load Dataset
- IMDb Movie reviews sentiment classification Dataset
- Doc: https://keras.io/datasets/
- Parameter description
    - num_features: vocabulary size (i.e., only the n most frequent words are kept)
    - sequence_length: number of words per review after padding (shorter reviews are padded with zeros, longer ones are truncated)
    - embedding_dimension: dimensionality of embedding space (i.e., dimensionality of vector representation for each word)
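
The effect of `sequence_length` can be sketched without Keras. The function below is a hypothetical re-implementation, assuming the default behaviour of `pad_sequences` (zeros are added at the front, and over-long sequences lose their first words); it is not the library code itself.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items."""
    if len(seq) >= maxlen:
        return list(seq[len(seq) - maxlen:])
    return [value] * (maxlen - len(seq)) + list(seq)

short_review = [14, 22, 16]
long_review = [1, 2, 3, 4, 5, 6, 7]

print(pad_or_truncate(short_review, 5))  # [0, 0, 14, 22, 16]
print(pad_or_truncate(long_review, 5))   # [3, 4, 5, 6, 7]
```

Every review ends up with exactly `maxlen` entries, so the whole dataset can be stored as one 2D array.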

In [10]:
num_features = 3000
sequence_length = 300
embedding_dimension = 100

In [15]:
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words = num_features)


TypeError: <lambda>() got multiple values for keyword argument 'allow_pickle'

In [None]:
X_train = pad_sequences(X_train, maxlen = sequence_length)
X_test = pad_sequences(X_test, maxlen = sequence_length)

In [None]:
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

## 0. Basic CNN sentence classification model
- Basic CNN using 1D convolution and pooling
- Known as "temporal convolution"
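
As a rough check of the layer sizes in this model, the sketch below traces the length of the word axis through a 'valid' convolution (kernel size 5, stride 1) and a max pooling of size 2, using the values from `imdb_cnn`. The helper `valid_output_length` is a hypothetical name, and the pooling stride is assumed to equal the pool size.

```python
def valid_output_length(input_length, window, stride=1):
    """Number of window positions that fit when no padding is added."""
    return (input_length - window) // stride + 1

sequence_length = 300
n_filters = 50

conv_steps = valid_output_length(sequence_length, window=5, stride=1)
pooled_steps = valid_output_length(conv_steps, window=2, stride=2)
flattened = pooled_steps * n_filters   # size of the vector fed to Dense

print(conv_steps, pooled_steps, flattened)  # 296 148 7400
```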

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Conv1D, MaxPooling1D, Embedding, Flatten
from keras import optimizers

In [None]:
def imdb_cnn():
    model = Sequential()
    
    # use Embedding layer to create a vector representation of each word => its weights are trained with the rest of the model
    model.add(Embedding(input_dim = num_features, output_dim = embedding_dimension, input_length = sequence_length))
    model.add(Conv1D(filters = 50, kernel_size = 5, strides = 1, padding = 'valid'))
    model.add(MaxPooling1D(2, padding = 'valid'))
    
    model.add(Flatten())
    
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    
    adam = optimizers.Adam(lr = 0.001)
    
    model.compile(loss='binary_crossentropy', optimizer=adam , metrics=['accuracy'])
    
    return model

In [None]:
model = imdb_cnn()

In [None]:
%%time
history = model.fit(X_train, y_train, batch_size = 50, epochs = 100, validation_split = 0.2, verbose = 0)

In [None]:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc = 'upper left')
plt.show()

In [None]:
results = model.evaluate(X_test, y_test)

In [None]:
print('Test accuracy: ', results[1])

## 1. Advanced CNN sentence classification model - 1
- Advanced CNN using 2D convolution and pooling
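    - The output of the Embedding layer is "reshaped" to 4D (batch, words, embedding, channel) to fit into the 2D convolutional layer
- Perform global max pooling over each filter's feature map, keeping only its largest activation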
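
The combination of a full-width kernel and global max pooling can be sketched in plain Python for a single filter. The function names and the toy numbers are assumptions for illustration; the Keras layers below do the same thing for 50 filters at once and also add a bias term, which is omitted here.

```python
def conv_full_width(sentence, kernel):
    """Slide a (window x dim) kernel down the sentence matrix.

    Because the kernel is as wide as the embedding, it can only move
    along the word axis and yields one scalar per position.
    """
    window = len(kernel)
    outputs = []
    for start in range(len(sentence) - window + 1):
        total = 0.0
        for r in range(window):
            for x, w in zip(sentence[start + r], kernel[r]):
                total += x * w
        outputs.append(total)
    return outputs

def global_max_pool(feature_map):
    """Keep only the strongest response of the filter."""
    return max(feature_map)

sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # 4 words, dim 2
kernel = [[1.0, -1.0], [0.5, 0.5]]                           # window of 2 words

feature_map = conv_full_width(sentence, kernel)
print(feature_map)                   # [1.5, 0.0, 0.0]
print(global_max_pool(feature_map))  # 1.5
```

Each filter therefore contributes exactly one number per sentence, regardless of sentence length.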

In [None]:
from keras.layers import Reshape, Conv2D, GlobalMaxPooling2D

In [None]:
def imdb_cnn_2():
    model = Sequential()

    model.add(Embedding(input_dim = num_features, output_dim = embedding_dimension, input_length = sequence_length))
    model.add(Reshape((sequence_length, embedding_dimension, 1), input_shape = (sequence_length, embedding_dimension)))
    model.add(Conv2D(filters = 50, kernel_size = (5, embedding_dimension), strides = (1,1), padding = 'valid'))
    model.add(GlobalMaxPooling2D())

    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dropout(0.3))
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dropout(0.3))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    adam = optimizers.Adam(lr = 0.001)

    model.compile(loss='binary_crossentropy', optimizer=adam , metrics=['accuracy'])
    
    return model

In [None]:
model = imdb_cnn_2()

In [None]:
%%time
history = model.fit(X_train, y_train, batch_size = 50, epochs = 100, validation_split = 0.2, verbose = 0)

In [None]:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc = 'upper left')
plt.show()

In [None]:
results = model.evaluate(X_test, y_test)

In [None]:
print('Test accuracy: ', results[1])

## 2. Advanced CNN sentence classification model - 2
- Structure more similar to that proposed in **Kim 2014**
    - Three convolution operations with different filter sizes are performed and their results are merged
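
The shapes produced by the three branches can be traced with simple arithmetic. The sketch below assumes the values used in this section (sequence length 300, filter sizes 3, 4 and 5, 100 filters per size) and a pooling window that covers the whole feature map; `merged_feature_count` is a hypothetical helper, not part of Keras.

```python
def merged_feature_count(sequence_length, filter_sizes, filters_per_size):
    """Trace each branch and return the size of the concatenated vector."""
    total = 0
    for fs in filter_sizes:
        conv_rows = sequence_length - fs + 1   # 'valid' convolution over words
        pool_rows = conv_rows                  # pooling window spans every row
        pooled_rows = conv_rows // pool_rows   # always 1: one value per filter
        print(fs,
              (conv_rows, 1, filters_per_size),
              (pooled_rows, 1, filters_per_size))
        total += pooled_rows * filters_per_size
    return total

features = merged_feature_count(300, [3, 4, 5], 100)
print(features)  # 300
```

Although the three convolutions produce feature maps of different lengths (298, 297 and 296 rows), pooling reduces each to one value per filter, so they can be concatenated along the filter axis.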

In [None]:
from keras.models import Model
from keras.layers import concatenate, Input, MaxPooling2D

In [None]:
filter_sizes = [3, 4, 5]

In [None]:
def convolution():
    inn = Input(shape = (sequence_length, embedding_dimension, 1))
    convolutions = []
    # we conduct three convolutions & poolings then concatenate them.
    for fs in filter_sizes:
        conv = Conv2D(filters = 100, kernel_size = (fs, embedding_dimension), strides = 1, padding = "valid")(inn)
        nonlinearity = Activation('relu')(conv)
        maxpool = MaxPooling2D(pool_size = (sequence_length - fs + 1, 1), padding = "valid")(nonlinearity)
        convolutions.append(maxpool)
        
    outt = concatenate(convolutions)
    model = Model(inputs = inn, outputs = outt)
        
    return model

In [None]:
def imdb_cnn_3():
    
    model = Sequential()
    model.add(Embedding(input_dim = num_features, output_dim = embedding_dimension, input_length = sequence_length))
    model.add(Reshape((sequence_length, embedding_dimension, 1), input_shape = (sequence_length, embedding_dimension)))
    
    # call convolution method defined above
    model.add(convolution())
    
    model.add(Flatten())
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dropout(0.3))
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dropout(0.3))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    adam = optimizers.Adam(lr = 0.001)

    model.compile(loss='binary_crossentropy', optimizer=adam , metrics=['accuracy'])
    
    return model

In [None]:
model = imdb_cnn_3()

In [None]:
%%time
history = model.fit(X_train, y_train, batch_size = 50, epochs = 100, validation_split = 0.2, verbose = 0)

In [None]:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc = 'upper left')
plt.show()

In [None]:
results = model.evaluate(X_test, y_test)

In [None]:
print('Test accuracy: ', results[1])

## 3. Advanced CNN sentence classification model - 3
- Structure more similar to that proposed in **Kim 2014**
    - Dropout and batch normalization are added to make training more stable
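
For intuition, the two regularization techniques used in this model can be sketched in plain Python. This is a simplified illustration under stated assumptions: `batch_normalize` leaves out the learned scale and shift parameters and the moving averages that a real batch normalization layer keeps, and `inverted_dropout` shows training-time behaviour only. Both function names are hypothetical.

```python
import math
import random

def batch_normalize(values, eps=1e-3):
    """Shift and scale one feature across a batch to zero mean, unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

normalized = batch_normalize([1.0, 2.0, 3.0, 4.0])
print(normalized)

dropped = inverted_dropout([1.0] * 8, rate=0.5, rng=random.Random(0))
print(dropped)
```

Rescaling the surviving activations by `1 / keep` keeps the expected sum unchanged, so no adjustment is needed at test time when dropout is switched off.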

In [None]:
from keras.layers import BatchNormalization

In [None]:
filter_sizes = [3, 4, 5]

In [None]:
def convolution():
    inn = Input(shape = (sequence_length, embedding_dimension, 1))
    convolutions = []
    # we conduct three convolutions & poolings then concatenate them.
    for fs in filter_sizes:
        conv = Conv2D(filters = 100, kernel_size = (fs, embedding_dimension), strides = 1, padding = "valid")(inn)
        nonlinearity = Activation('relu')(conv)
        maxpool = MaxPooling2D(pool_size = (sequence_length - fs + 1, 1), padding = "valid")(nonlinearity)
        convolutions.append(maxpool)
        
    outt = concatenate(convolutions)
    model = Model(inputs = inn, outputs = outt)
        
    return model

In [None]:
def imdb_cnn_4():
    
    model = Sequential()
    model.add(Embedding(input_dim = num_features, output_dim = embedding_dimension, input_length = sequence_length))
    model.add(Reshape((sequence_length, embedding_dimension, 1), input_shape = (sequence_length, embedding_dimension)))
    model.add(Dropout(0.5))
    # call convolution method defined above
    model.add(convolution())
    
    model.add(Flatten())
    model.add(Dense(10))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    adam = optimizers.Adam(lr = 0.001)

    model.compile(loss='binary_crossentropy', optimizer=adam , metrics=['accuracy'])
    
    return model

In [None]:
model = imdb_cnn_4()

In [None]:
%%time
history = model.fit(X_train, y_train, batch_size = 50, epochs = 100, validation_split = 0.2, verbose = 0)

In [None]:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc = 'upper left')
plt.show()

In [None]:
results = model.evaluate(X_test, y_test)

In [None]:
print('Test accuracy: ', results[1])