# 1D CNN for Sequence Classification

Install the Keras library if it is not already installed.

Import the libraries and use the built-in Keras IMDB dataset.
The dataset contains movie reviews, each with a binary sentiment label (positive: 1, negative: 0).
The reference is [here](https://ai.stanford.edu/~amaas/data/sentiment/).

In [1]:
from keras.datasets import imdb
from keras.preprocessing import sequence

Preprocess the dataset: keep only the `max_features` most frequent words, and pad or truncate every review to `max_len` tokens.

In [2]:
max_features = 10000
max_len = 500
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)

print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
x_train shape: (25000, 500)
x_test shape: (25000, 500)


Here we build the layers of a one-dimensional convolutional network.
Each sliding window of 7 time steps is convolved with a kernel (a weight vector) into a single value per filter, and the network learns the kernel weights during training.
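
The sliding-window idea can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: it assumes a single input channel and a single filter, whereas the model below convolves over 128-dimensional embedding vectors with 32 filters and applies a ReLU afterwards. The function name `conv1d_valid` is hypothetical.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide the kernel over the signal; each window collapses to one value."""
    k = len(kernel)
    return [
        sum(s * w for s, w in zip(signal[i:i + k], kernel)) + bias
        for i in range(len(signal) - k + 1)   # no padding: output is shorter
    ]

signal = list(range(10))   # toy sequence 0..9
kernel = [1.0] * 7         # window of size 7 with fixed weights (learned in practice)
out = conv1d_valid(signal, kernel)
print(len(out))  # 4, i.e. 10 - 7 + 1
print(out)       # [21.0, 28.0, 35.0, 42.0]
```

A sequence of length n convolved with a window of size 7 therefore yields n - 7 + 1 output steps.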

In [3]:
from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop
model = Sequential()
model.add(layers.Embedding(max_features, 128, input_length=max_len))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
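# Sigmoid output so that binary_crossentropy receives a probability in [0, 1]
model.add(layers.Dense(1, activation='sigmoid'))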
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 500, 128)          1280000   
_________________________________________________________________
conv1d (Conv1D)              (None, 494, 32)           28704     
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 98, 32)            0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 92, 32)            7200      
_________________________________________________________________
global_max_pooling1d (Global (None, 32)                0         
_________________________________________________________________
dense (Dense)                (None, 1)                 33        
Total params: 1,315,937
Trainable params: 1,315,937
Non-trainable params: 0
______________________________________________
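
The output shapes and parameter counts in the summary above can be checked by hand. The sketch below reproduces the arithmetic, assuming unpadded convolutions with stride 1 and non-overlapping pooling windows (the Keras defaults for these layers); the helper names are hypothetical.

```python
def conv1d_len(steps, kernel_size):
    """Output length of an unpadded, stride-1 convolution."""
    return steps - kernel_size + 1

def pool_len(steps, pool_size):
    """Output length of non-overlapping max pooling (remainder dropped)."""
    return steps // pool_size

def conv1d_params(in_channels, filters, kernel_size):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

steps_conv1 = conv1d_len(500, 7)          # 494
steps_pool = pool_len(steps_conv1, 5)     # 98
steps_conv2 = conv1d_len(steps_pool, 7)   # 92

params = {
    'embedding': 10000 * 128,                 # one 128-dim vector per word
    'conv1d': conv1d_params(128, 32, 7),      # 28704
    'conv1d_1': conv1d_params(32, 32, 7),     # 7200
    'dense': 32 * 1 + 1,                      # 33
}
total = sum(params.values())
print(steps_conv1, steps_pool, steps_conv2)  # 494 98 92
print(total)                                 # 1315937
```

Almost all of the parameters sit in the embedding layer; the two convolutional layers together account for fewer than 36,000 weights.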

In [4]:
model.compile(optimizer=RMSprop(learning_rate=1e-4),  # `lr` is a deprecated alias
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit(x_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


This example shows how a 1D CNN can serve as a fast, cheap alternative to a recurrent network on a word-level sentiment-classification task.

Note, however, that my lab on binary sentiment classification ([link](https://github.com/Xiaoyi-ZHANG23/labs_ml_naive_bayes/blob/main/submit.ipynb)) obtained much better accuracy using a simple Naive Bayes classifier.

Adapted from the blog post "Convolutional Neural Networks for Sequence Processing: Part 1. Implementing a 1D CNN".
Web [link](https://froiland.medium.com/convolutional-neural-networks-for-sequence-processing-part-1-420dd9b500). Retrieved on Apr 25, 2023.