# CNN-for-Sentence-Classification-in-Chainer

Implementation of Yoon Kim's [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882) with Chainer.

> Abstract (from Cornell University Library)
>
> We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.

![](./img/structure.png)
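The architecture in the figure can be sketched in plain NumPy to make the data flow concrete: word ids are looked up in an embedding matrix, filters of several window sizes slide over the sentence, each filter keeps only its maximum response (max-over-time pooling), and the pooled features feed a softmax layer. This is only an illustrative sketch, not the code in `models.py`: the function names, the random weights, and the sizes below are assumptions, and biases and dropout are left out for brevity.

```python
import numpy as np


def conv_max_over_time(embedded, filters):
    """Slide each filter over the sentence and keep its maximum response.

    embedded: (sentence_length, embed_dim) matrix of word vectors.
    filters:  (n_filters, window, embed_dim); each filter spans `window`
              consecutive words and the full embedding width.
    Returns a vector of length n_filters.
    """
    n_filters, window, _ = filters.shape
    n_positions = embedded.shape[0] - window + 1
    responses = np.empty((n_filters, n_positions))
    for t in range(n_positions):
        patch = embedded[t:t + window]  # (window, embed_dim)
        # Dot product of every filter with the current window of words.
        responses[:, t] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    responses = np.maximum(responses, 0.0)  # ReLU non-linearity
    return responses.max(axis=1)            # max-over-time pooling


def forward(word_ids, embedding, filter_banks, w_out, b_out):
    """Return class probabilities for one sentence of word ids."""
    embedded = embedding[word_ids]
    # One pooled feature per filter, concatenated across window sizes.
    features = np.concatenate(
        [conv_max_over_time(embedded, bank) for bank in filter_banks])
    logits = features @ w_out + b_out
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
vocab_size, embed_dim, n_filters, n_classes = 50, 8, 4, 2
windows = (3, 4, 5)

embedding = rng.normal(size=(vocab_size, embed_dim))
filter_banks = [rng.normal(size=(n_filters, h, embed_dim)) for h in windows]
w_out = rng.normal(size=(n_filters * len(windows), n_classes))
b_out = np.zeros(n_classes)

word_ids = rng.integers(0, vocab_size, size=20)
probs = forward(word_ids, embedding, filter_banks, w_out, b_out)
print(probs.shape, probs.sum())
```

Because pooling takes one value per filter regardless of sentence length, the feature vector has a fixed size (`n_filters * len(windows)`), which is what lets a single output layer handle sentences of any length.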


In [232]:
import numpy as np
import importlib
import cnn_sentence
import data_builder
import models
importlib.reload(cnn_sentence)
importlib.reload(data_builder)
importlib.reload(models)

<module 'models' from '/Users/atsuya/Documents/CNN-for-Sentence-Classification-in-Chainer/models.py'>

In [227]:
# load imdb data
data = data_builder.load_imdb_data()
print(data.get_info())

Read Files..: 100%|██████████| 2000/2000 [00:04<00:00, 404.22it/s]
Padding: 100%|██████████| 2000/2000 [00:00<00:00, 27393.16it/s]


Data Info imdb
------------------------------
Vocab: 40666
Sentences: 2000
------------------------------
x_train: (1000, 1, 2460)
x_test: (1000, 1, 2460)
y_train: (1000,)
y_test: (1000,)



In [228]:
data.sentences

array([['assume', 'nothing', 'the', ..., '<PAD/>', '<PAD/>', '<PAD/>'],
       ['plot', 'derek', 'zoolander', ..., '<PAD/>', '<PAD/>', '<PAD/>'],
       ['i', 'actually', 'am', ..., '<PAD/>', '<PAD/>', '<PAD/>'],
       ...,
       ['coinciding', 'with', 'the', ..., '<PAD/>', '<PAD/>', '<PAD/>'],
       ['and', 'now', 'the', ..., '<PAD/>', '<PAD/>', '<PAD/>'],
       ['battlefield', 'long', 'boring', ..., '<PAD/>', '<PAD/>',
        '<PAD/>']], dtype='<U25')

In [229]:
data.x_train

array([[[   20,     5,  2576, ...,     0,     0,     0]],

       [[33285,   209,  2054, ...,     0,     0,     0]],

       [[   64,   300,  1090, ...,     0,     0,     0]],

       ...,

       [[ 2828,    30, 13853, ...,     0,     0,     0]],

       [[ 3126,     5,   672, ...,     0,     0,     0]],

       [[  403,     5,    37, ...,     0,     0,     0]]])
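
The two outputs above show the same data twice: first as padded token arrays, then as integer ids with shape `(n_sentences, 1, max_len)`. The sketch below shows one way such a conversion could work. It is not the code in `data_builder.py`: the helper names are hypothetical, and the choice to give `<PAD/>` index 0 and to number words in order of first appearance is an assumption made for illustration.

```python
import numpy as np

PAD = "<PAD/>"


def pad_sentences(sentences, pad_token=PAD):
    """Right-pad every token list to the length of the longest one."""
    max_len = max(len(s) for s in sentences)
    return [s + [pad_token] * (max_len - len(s)) for s in sentences]


def build_vocab(padded, pad_token=PAD):
    """Map each word to an integer id; the padding token gets id 0."""
    vocab = {pad_token: 0}
    for sentence in padded:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab


def to_array(padded, vocab):
    """Convert padded sentences to an int32 array with a channel axis."""
    ids = np.array([[vocab[w] for w in s] for s in padded], dtype=np.int32)
    return ids[:, np.newaxis, :]  # (n_sentences, 1, max_len)


sentences = [["assume", "nothing"], ["i", "actually", "am"]]
padded = pad_sentences(sentences)
vocab = build_vocab(padded)
x = to_array(padded, vocab)
print(x.shape)  # (2, 1, 3)
```

The extra axis of size 1 plays the role of an input channel, matching the `(1000, 1, 2460)` shape reported for `x_train`.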

In [None]:
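# imports needed by the training code below
import chainer
import chainer.links as L
from chainer import training
from chainer.training import extensions

# build cnn model
model = L.Classifier(models.cnn["CNN_rand"]([3, 8, 12], data.n_vocab))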

train, test = data.get_chainer_dataset()
train_iter = chainer.iterators.SerialIterator(train, 64)
test_iter = chainer.iterators.SerialIterator(test, 64, repeat=False, shuffle=False)
optimizer = chainer.optimizers.Adam().setup(model)
updater = training.StandardUpdater(train_iter, optimizer, device=-1)

# build trainer
trainer = training.Trainer(updater, (5, 'epoch'), out='result')
trainer.extend(extensions.Evaluator(test_iter, model, device=-1))
# with a 5-epoch run, this 20-epoch trigger never fires, so no snapshot is written
trainer.extend(extensions.snapshot(), trigger=(20, 'epoch'))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(
['epoch', 'elapsed_time', 'main/loss', 'validation/main/loss',
 'main/accuracy', 'validation/main/accuracy']))

chainer.config.train = True
trainer.run()

epoch       elapsed_time  main/loss   validation/main/loss  main/accuracy  validation/main/accuracy
1           273.554       0.850728    0.694556              0.506836       0.501758
