# Sentence classification
- http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
- https://github.com/bhaveshoswal/CNN-text-classification-keras

### Data
- Movie review data from Rotten Tomatoes (http://www.cs.cornell.edu/people/pabo/movie-review-data/)

In [2]:
from keras.layers import Input, Dense, Embedding, Conv2D, MaxPool2D
from keras.layers import Reshape, Flatten, Dropout, Concatenate
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
from keras.models import Model
from sklearn.model_selection import train_test_split
from data_helpers import load_data

Using TensorFlow backend.


### Read the data

In [3]:
x, y, vocabulary, vocabulary_inv = load_data()

# x.shape -> (10662, 56)
# y.shape -> (10662, 2)
# len(vocabulary) -> 18765
# len(vocabulary_inv) -> 18765

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# X_train.shape -> (8529, 56)
# y_train.shape -> (8529, 2)
# X_test.shape -> (2133, 56)
# y_test.shape -> (2133, 2)
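
The split sizes listed above follow from simple arithmetic. A minimal sketch, assuming the test count is the requested fraction rounded up and the remaining rows go to training (the names `n_samples`, `n_test`, and `n_train` are illustrative, not part of the notebook):

```python
import math

n_samples = 10662   # rows in x, taken from the shapes listed above
test_size = 0.2     # fraction passed to train_test_split

# Assumption: the test count is rounded up; everything else is training data.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

Both numbers should agree with the first dimension of `X_train` and `X_test`.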

In [4]:
X_train[0]

array([ 5101, 10576, 12028, 17798, 13159,  4196,   292,   474, 17948,
         475,  9570, 14759,  1059, 18393,  1080, 14902, 16433,  6957,
         480,  4963, 16900,  7972, 16683,  9766, 17792, 11488,   480,
        2171,  5138,   474,  6771, 14530,   475,   473,   473,   473,
         473,   473,   473,   473,   473,   473,   473,   473,   473,
         473,   473,   473,   473,   473,   473,   473,   473,   473,
         473,   473])

In [5]:
vocabulary_inv[473]

'<PAD/>'
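
The long run of `473` values in `X_train[0]` is padding: every sentence is extended to the same length before it is mapped to integer indices. The code in `data_helpers` is not shown here, so the following is only a sketch of the idea on toy data. The helpers `pad_sentences` and `build_vocab` and the alphabetical index ordering are assumptions made for illustration.

```python
from collections import Counter

PAD = "<PAD/>"  # the padding token shown above

def pad_sentences(sentences, pad=PAD):
    """Right-pad every tokenized sentence to the length of the longest one."""
    max_len = max(len(s) for s in sentences)
    return [s + [pad] * (max_len - len(s)) for s in sentences]

def build_vocab(sentences):
    """Return (word -> index, index -> word). The ordering is illustrative."""
    counts = Counter(word for s in sentences for word in s)
    vocabulary_inv = sorted(counts)  # assumed ordering: alphabetical
    vocabulary = {w: i for i, w in enumerate(vocabulary_inv)}
    return vocabulary, vocabulary_inv

toy = [["a", "fine", "film"], ["dull"]]
padded = pad_sentences(toy)
vocabulary, vocabulary_inv = build_vocab(padded)

# Encode words as indices, then decode again while skipping the padding index.
encoded = [[vocabulary[w] for w in s] for s in padded]
pad_id = vocabulary[PAD]
decoded = [" ".join(vocabulary_inv[i] for i in row if i != pad_id) for row in encoded]

print(encoded)
print(decoded)
```

Note that the padding index depends on how the vocabulary is ordered, so it is not necessarily 0.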

### Hyperparameters

In [6]:
sequence_length = x.shape[1] # 56
vocabulary_size = len(vocabulary_inv) # 18765
embedding_dim = 128
filter_sizes = [3,4,5]
num_filters = 64
drop = 0.5

epochs = 10
batch_size = 30
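
These values fix the sizes of the feature maps in the model defined later in this notebook. A minimal sketch of the arithmetic, assuming `'valid'` convolutions whose kernels span the full embedding width and max pooling over every position (`conv_lengths` and `feature_dim` are illustrative names):

```python
sequence_length = 56
filter_sizes = [3, 4, 5]
num_filters = 64

# 'valid' convolution: a window of height h fits in (sequence_length - h + 1) positions.
conv_lengths = [sequence_length - h + 1 for h in filter_sizes]

# Pooling over all positions leaves one value per filter, for each filter size.
feature_dim = num_filters * len(filter_sizes)

print(conv_lengths)  # [54, 53, 52]
print(feature_dim)   # 192
```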

### Model design
Keras offers two ways to build a model:

1. Sequential Models
2. Functional Models

The **Sequential model API** provides a very simple interface for building deep learning models, but it can only stack layers in a single linear direction. It is therefore hard to use in the following cases:

1. Models with multiple input sources
2. Models with multiple output layers
3. Models that share a layer across several branches, and so on.

The alternative is the **Functional model API**, which lets you design deep learning models more flexibly.
It is not hard to use: build the model with `keras.models.Model`, and you only need to define the **Input** and **Output** correctly.

For a detailed guide to the **Functional model API**, see the official Keras documentation (https://keras.io/getting-started/functional-api-guide/).

The model below is built with the **Functional model API**.

In [7]:
inputs = Input(shape=(sequence_length,), dtype='int32')
embedding = Embedding(input_dim=vocabulary_size, output_dim=embedding_dim, input_length=sequence_length)(inputs)
reshape = Reshape((sequence_length,embedding_dim,1))(embedding)

conv_0 = Conv2D(num_filters, kernel_size=(filter_sizes[0], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_1 = Conv2D(num_filters, kernel_size=(filter_sizes[1], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_2 = Conv2D(num_filters, kernel_size=(filter_sizes[2], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)

maxpool_0 = MaxPool2D(pool_size=(sequence_length - filter_sizes[0] + 1, 1), strides=(1,1), padding='valid')(conv_0)
maxpool_1 = MaxPool2D(pool_size=(sequence_length - filter_sizes[1] + 1, 1), strides=(1,1), padding='valid')(conv_1)
maxpool_2 = MaxPool2D(pool_size=(sequence_length - filter_sizes[2] + 1, 1), strides=(1,1), padding='valid')(conv_2)

concatenated_tensor = Concatenate(axis=1)([maxpool_0, maxpool_1, maxpool_2])
flatten = Flatten()(concatenated_tensor)
dropout = Dropout(drop)(flatten)
output = Dense(units=2, activation='softmax')(dropout)

# Build the model from the input and output tensors;
# Keras includes every layer on the path between them.
model = Model(inputs=inputs, outputs=output)
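
As a sanity check on the architecture above, the number of trainable parameters can be worked out by hand. This is a sketch under the assumption that every layer uses a bias term and that the embedding matrix is trainable. The variable names are illustrative, and the result should be compared against `model.summary()` rather than taken as given.

```python
vocabulary_size = 18765
embedding_dim = 128
filter_sizes = [3, 4, 5]
num_filters = 64
num_classes = 2

# Embedding: one vector of length embedding_dim per vocabulary entry.
embedding_params = vocabulary_size * embedding_dim

# Each Conv2D filter spans (h, embedding_dim) over 1 input channel, plus one bias per filter.
conv_params = [h * embedding_dim * 1 * num_filters + num_filters for h in filter_sizes]

# Dense: one weight per (pooled feature, class) pair, plus one bias per class.
dense_params = num_filters * len(filter_sizes) * num_classes + num_classes

total_params = embedding_params + sum(conv_params) + dense_params
print(embedding_params, conv_params, dense_params, total_params)
```

Most of the parameters sit in the embedding layer, which is typical for a model of this shape with a vocabulary this large.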

In [10]:
print(inputs)
print(embedding)
print(reshape)

Tensor("input_1:0", shape=(?, 56), dtype=int32)
Tensor("embedding_1/GatherV2:0", shape=(?, 56, 128), dtype=float32)
Tensor("reshape_1/Reshape:0", shape=(?, 56, 128, 1), dtype=float32)
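
The shapes printed above, and what one convolution branch does with them, can be reproduced in plain NumPy. This is only an illustration with small stand-in sizes and random weights, not the Keras implementation. All array names are assumptions chosen to mirror the layer names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-in sizes (the notebook uses 56, 18765, 128 and 64).
sequence_length, vocabulary_size, embedding_dim, num_filters = 8, 20, 4, 3
filter_size = 3

token_ids = rng.integers(0, vocabulary_size, size=(1, sequence_length))  # like `inputs`
embedding_matrix = rng.normal(size=(vocabulary_size, embedding_dim))

embedded = embedding_matrix[token_ids]   # (1, 8, 4), like `embedding`
reshaped = embedded[..., np.newaxis]     # (1, 8, 4, 1), like `reshape`

# One branch: each filter covers filter_size words and the full embedding width.
kernels = rng.normal(size=(filter_size, embedding_dim, num_filters))
bias = np.zeros(num_filters)
n_positions = sequence_length - filter_size + 1  # 'valid' padding
conv = np.empty((1, n_positions, num_filters))
for t in range(n_positions):
    window = embedded[:, t:t + filter_size, :]  # (1, 3, 4)
    conv[:, t, :] = np.tensordot(window, kernels, axes=([1, 2], [0, 1])) + bias
conv = np.maximum(conv, 0.0)  # ReLU

# Max pooling over all positions keeps the strongest response of each filter.
pooled = conv.max(axis=1)  # (1, 3)

print(embedded.shape, reshaped.shape, conv.shape, pooled.shape)
```

Because each kernel is as wide as the embedding, it slides only along the word axis, so the 2D convolution behaves like a 1D convolution over the sentence.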


In [8]:
# Save the weights whenever validation accuracy improves.
checkpoint = ModelCheckpoint('weights.{epoch:03d}-{val_acc:.4f}.hdf5', monitor='val_acc', verbose=1, save_best_only=True, mode='auto')
adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

# With a 2-unit softmax and one-hot labels, binary_crossentropy yields the same value as categorical_crossentropy.
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])
print("Training Model...")
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=[checkpoint], validation_data=(X_test, y_test))  # starts training

Training Model...
Train on 8529 samples, validate on 2133 samples
Epoch 1/10

Epoch 00001: val_acc improved from -inf to 0.57806, saving model to weights.001-0.5781.hdf5
Epoch 2/10

Epoch 00002: val_acc improved from 0.57806 to 0.58556, saving model to weights.002-0.5856.hdf5
Epoch 3/10

Epoch 00003: val_acc improved from 0.58556 to 0.64088, saving model to weights.003-0.6409.hdf5
Epoch 4/10

Epoch 00004: val_acc improved from 0.64088 to 0.69573, saving model to weights.004-0.6957.hdf5
Epoch 5/10

Epoch 00005: val_acc improved from 0.69573 to 0.70558, saving model to weights.005-0.7056.hdf5
Epoch 6/10

Epoch 00006: val_acc improved from 0.70558 to 0.72480, saving model to weights.006-0.7248.hdf5
Epoch 7/10

Epoch 00007: val_acc improved from 0.72480 to 0.73230, saving model to weights.007-0.7323.hdf5
Epoch 8/10

Epoch 00008: val_acc improved from 0.73230 to 0.74777, saving model to weights.008-0.7478.hdf5
Epoch 9/10

Epoch 00009: val_acc improved from 0.74777 to 0.75246, saving model to

<keras.callbacks.History at 0x2321869438>
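
The checkpoint file names in the log come from the template passed to `ModelCheckpoint`, which is filled in with the epoch number and the logged metrics. The same formatting can be checked with plain Python string formatting, using the values from the first epoch of the log above:

```python
template = 'weights.{epoch:03d}-{val_acc:.4f}.hdf5'

# Epoch number and validation accuracy taken from the first epoch of the log.
filename = template.format(epoch=1, val_acc=0.57806)
print(filename)  # weights.001-0.5781.hdf5
```

The epoch is zero-padded to three digits and the accuracy is rounded to four decimals, which is why the logged `0.57806` appears as `0.5781` in the file name.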

In [9]:
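import numpy as np

# Print prediction, label, and sentence for five random test samples.
for i in range(5):
    idx = np.random.randint(len(X_test))
    x_test = X_test[idx].reshape(1, sequence_length)
    y_label = y_test[idx][0]              # column 0 of the one-hot label
    y_pred = model.predict(x_test)[0][0]  # predicted probability for column 0
    # Padding is index 473 here, not 0, so '<PAD/>' tokens stay in the printed sentence.
    # To hide them, skip tokens for which vocabulary_inv[t] == '<PAD/>'.
    sent = " ".join(vocabulary_inv[t] for t in x_test[0].tolist())
    print("%.0f\t%d\t%s" % (y_pred, y_label, sent))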

1	1	more of a career curio than a major work <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/>
1	1	this movie seems to have been written using mad libs there can be no other explanation hilariously inept and ridiculous <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/>
0	0	as surreal as a dream and as detailed as a photograph , as visually dexterous as it is at times imaginatively overwhelming <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD/> <PAD