# Classification with Convolutional Neural Networks

## Douglas Rice

In this notebook, we'll introduce the Convolutional Neural Network (CNN) architecture. As before, building the models in Keras requires relatively straightforward modifications to our prior work.


## Set Everything Up

As always, we start by setting up our environment, loading the modules and functionality that we'll need to estimate the model.

In [None]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, models

max_features = 5000  # Keep only the 5,000 most frequent words in the vocabulary
# maxlen=None pads shorter reviews to the length of the longest review.
# Set maxlen=200 or 500 for less padding at the expense of truncating the reviews.
maxlen = None


## Load the IMDB movie review sentiment data

We'll stick with the IMDB movie review sentiment data that ships with Keras for this exercise. This also maintains our ability to make relatively straightforward comparisons across all of these different modeling approaches.

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
    num_words=max_features
)
# Hold out the first 10,000 training reviews for validation
x_val = x_train[:10000]
partial_x_train = x_train[10000:]
y_val = y_train[:10000]
partial_y_train = y_train[10000:]
print(len(x_train), "Training sequences")
print(len(x_test), "Test sequences")
# Pad each set so that every review within it has the same length
partial_x_train = keras.preprocessing.sequence.pad_sequences(partial_x_train, maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
25000 Training sequences
25000 Test sequences
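
To make the padding step concrete, here is a minimal NumPy sketch of what `pad_sequences` does under its default settings (zeros added at the front and, when `maxlen` is set, tokens dropped from the front). The helper `pad_pre` and the toy reviews are hypothetical stand-ins for illustration, not the Keras implementation.

```python
import numpy as np

def pad_pre(sequences, maxlen=None, value=0):
    """Hypothetical stand-in for pad_sequences with 'pre' padding and truncation."""
    if maxlen is None:
        maxlen = max(len(seq) for seq in sequences)   # length of the longest sequence
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for i, seq in enumerate(sequences):
        seq = list(seq)
        trimmed = seq[max(len(seq) - maxlen, 0):]     # keep at most the last maxlen tokens
        if trimmed:
            out[i, maxlen - len(trimmed):] = trimmed  # right-align, padding on the left
    return out

toy_reviews = [[11, 12, 13], [21, 22], [31, 32, 33, 34, 35]]
padded_full = pad_pre(toy_reviews)             # maxlen=None: pad to the longest review
padded_short = pad_pre(toy_reviews, maxlen=4)  # maxlen=4: the longest review is truncated
print(padded_full)
print(padded_short)
```

With `maxlen=None` nothing is truncated, but every review is padded out to the longest one, which is the trade-off noted in the setup cell.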


## Build a basic CNN model

Now we'll build a basic CNN. To do so, we incorporate two new types of layers in our Sequential model. The first is a 1D convolution layer (`Conv1D` in the code below). This layer learns a set of convolution kernels (here 128 filters, each spanning 3 consecutive tokens) that are convolved with the layer input over a single temporal dimension, the sequence of word embeddings. The second is a global pooling layer (`GlobalMaxPooling1D`) that downsamples the input representation by taking the maximum value of each filter over the time dimension, so every review is reduced to a fixed-length vector regardless of its length.
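
As a rough illustration of these two operations, the NumPy sketch below slides a bank of filters over a toy embedded review and then takes each filter's maximum over time. The function names, the random inputs, and the toy sequence length of 10 are assumptions made for the example; this is not how Keras implements the layers internally.

```python
import numpy as np

def conv1d_valid(x, kernels, bias):
    """x: (steps, channels); kernels: (width, channels, filters); bias: (filters,)."""
    width = kernels.shape[0]
    n_windows = x.shape[0] - width + 1        # no padding at the edges ("valid")
    out = np.empty((n_windows, kernels.shape[2]))
    for t in range(n_windows):
        window = x[t:t + width]               # (width, channels) slice of the sequence
        # Sum over the width and channel axes, leaving one value per filter.
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)               # ReLU activation

def global_max_pool(features):
    """Keep the largest activation of each filter across all time steps."""
    return features.max(axis=0)

rng = np.random.default_rng(0)
embedded = rng.normal(size=(10, 16))          # toy review: 10 tokens, 16-dim embeddings
kernels = rng.normal(size=(3, 16, 128))       # 128 filters of width 3
bias = np.zeros(128)

feature_map = conv1d_valid(embedded, kernels, bias)
pooled = global_max_pool(feature_map)
print(feature_map.shape, pooled.shape)
```

The pooled vector has one entry per filter no matter how many tokens the review contains, which is why the dense layers that follow can work with variable-length input.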

 

In [None]:
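# Shallow CNN
model = models.Sequential()
model.add(layers.Input(shape=(None,), dtype="int32"))  # variable-length sequences of word indices
model.add(layers.Embedding(max_features, 16))  # map each word index to a 16-dimensional vector
model.add(layers.Conv1D(128, 3, activation='relu'))  # 128 filters, each spanning 3 consecutive tokens
model.add(layers.GlobalMaxPooling1D())  # keep each filter's maximum over the whole sequence
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dropout(0.2))  # randomly zero 20% of the inputs during training
model.add(layers.Dense(1, activation='sigmoid'))  # probability for the binary sentiment label
model.summary()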


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, None, 16)          80000     
                                                                 
 conv1d (Conv1D)             (None, None, 128)         6272      
                                                                 
 global_max_pooling1d (Globa  (None, 128)              0         
 lMaxPooling1D)                                                  
                                                                 
 dense (Dense)               (None, 16)                2064      
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_1 (Dense)             (None, 1)                 17        
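
The parameter counts in the summary can be checked by hand. The short sketch below redoes the arithmetic from the layer sizes in the model above; the variable names are just labels chosen for this example.

```python
vocab_size, embed_dim = 5000, 16      # max_features and the embedding size
kernel_width, n_filters = 3, 128      # Conv1D(128, 3)
dense_units = 16

embedding_params = vocab_size * embed_dim                       # one vector per word
conv_params = kernel_width * embed_dim * n_filters + n_filters  # filter weights + biases
pooling_params = 0                                              # pooling has nothing to learn
dense_params = n_filters * dense_units + dense_units            # weights + biases
dropout_params = 0                                              # dropout has nothing to learn
output_params = dense_units * 1 + 1                             # weights + bias

total_params = (embedding_params + conv_params + pooling_params
                + dense_params + dropout_params + output_params)
print(embedding_params, conv_params, dense_params, output_params, total_params)
```

Most of the parameters sit in the embedding layer; the convolution itself is comparatively small.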
                                                        

## Train and evaluate the model

This takes about 15 minutes to run if you are not on a GPU.

In [None]:
model.compile("adam", "binary_crossentropy", metrics=["accuracy"])

In [None]:
model.fit(partial_x_train, partial_y_train, batch_size=512, epochs=12, validation_data=(x_val, y_val))

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.History at 0x7f5cd4d7abd0>

In [None]:
model.evaluate(x_test, y_test)



[0.3718074858188629, 0.8610000014305115]

We're at about 86% accuracy on the test set, similar to what we were seeing with the LSTM and Bi-LSTM.
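
The two numbers returned by `model.evaluate` are the loss and the accuracy, in that order. As a minimal sketch of how those quantities are defined, the NumPy code below computes binary cross-entropy and accuracy on a handful of made-up predictions; the toy labels and probabilities, the clipping constant `eps`, and the 0.5 threshold are assumptions for the example.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0])             # toy sentiment labels
y_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.7])   # toy sigmoid outputs

eps = 1e-7                                     # avoid log(0)
clipped = np.clip(y_prob, eps, 1 - eps)

# Binary cross-entropy: average negative log-probability assigned to the true label.
loss = -np.mean(y_true * np.log(clipped) + (1 - y_true) * np.log(1 - clipped))

# Accuracy: share of reviews whose thresholded prediction matches the label.
accuracy = np.mean((y_prob > 0.5).astype(int) == y_true)
print(loss, accuracy)
```

Because the loss depends on how confident the predictions are while accuracy only looks at which side of the threshold they fall on, the two can move somewhat independently.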

## Build a Deep CNN

Let's try a more complex setup, and build a deeper CNN with multiple convolution layers and multiple pooling layers. 

In [None]:
# Deeper CNN
model = models.Sequential()
model.add(layers.Input(shape=(None,), dtype="int32"))
model.add(layers.Embedding(max_features, 16))
model.add(layers.Conv1D(256, 3, activation='relu', padding='same'))  # 'same' padding keeps the sequence length
model.add(layers.MaxPooling1D())  # default pool_size=2 halves the sequence length
model.add(layers.Conv1D(128, 3, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()


Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, None, 16)          80000     
                                                                 
 conv1d_1 (Conv1D)           (None, None, 256)         12544     
                                                                 
 max_pooling1d (MaxPooling1D  (None, None, 256)        0         
 )                                                               
                                                                 
 conv1d_2 (Conv1D)           (None, None, 128)         98432     
                                                                 
 global_max_pooling1d_1 (Glo  (None, 128)              0         
 balMaxPooling1D)                                                
                                                                 
 dense_2 (Dense)             (None, 16)               
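
The summary reports `None` for the time dimension because the input length is left unspecified. To see how the sequence length changes through the convolution and pooling layers, the sketch below traces a hypothetical padded length of 500 tokens, assuming stride-1 convolutions and the default `MaxPooling1D` settings (pool size 2, stride equal to the pool size). The helper functions are illustrative and not part of Keras.

```python
def conv_output_length(steps, kernel_width, padding):
    """Number of output steps of a stride-1 1D convolution."""
    if padding == "same":
        return steps                   # input is zero-padded so the length is preserved
    return steps - kernel_width + 1    # "valid": only fully overlapping windows

def max_pool_output_length(steps, pool_size=2):
    """Number of output steps of non-overlapping max pooling (stride = pool_size)."""
    return (steps - pool_size) // pool_size + 1

steps = 500                                               # hypothetical review length
after_conv1 = conv_output_length(steps, 3, "same")        # Conv1D(256, 3, padding='same')
after_pool = max_pool_output_length(after_conv1)          # MaxPooling1D()
after_conv2 = conv_output_length(after_pool, 3, "valid")  # Conv1D(128, 3)
print(after_conv1, after_pool, after_conv2)
```

Whatever length remains after the second convolution, `GlobalMaxPooling1D` collapses it to a single 128-dimensional vector per review.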

In [None]:
model.compile("adam", "binary_crossentropy", metrics=["accuracy"])

In [None]:
model.fit(partial_x_train, partial_y_train, batch_size=512, epochs=12, validation_data=(x_val, y_val))

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.History at 0x7f5cd86f87d0>

In [None]:
model.evaluate(x_test, y_test)



[0.6400715708732605, 0.8174399733543396]


About 82%, down from roughly 86% with the shallow CNN. Going in the wrong direction!