# 9. DCNN - A Convolutional Neural Network for Modelling Sentences
This is another use of CNNs for sentence classification. It differs a little from the CNN of Kim (2014), but the basic idea is similar. I was not able to reproduce the DCNN results with TensorFlow. If you're interested, you can still take a look at my code.

### References
- [A Convolutional Neural Network for Modelling Sentences - Kalchbrenner et al. 2014](https://arxiv.org/abs/1404.2188)

## Data Preprocessing
The preprocessing code and the processed data are borrowed from [harvardnlp/sent-conv-torch](https://github.com/harvardnlp/sent-conv-torch).

It's getting harder and harder to preprocess data inside our model class, so we will do as much preprocessing as we can before calling the `fit_to_corpus()` method.

Choose one of these datasets: `MR/SST1/SST2/Subj/TREC/CR/MPQA`.

Note that we use the phrases in SST1 for training, although Kalchbrenner et al. did not use phrases in their paper, so our results will differ somewhat from the paper's. It does, however, make it easier to compare the results from DCNN with those from CNN (Kim, 2014).

In [1]:
import data.sentiment_datasets.preprocess as preprocess
from models import DCNN

import random
import numpy as np

In [2]:
random.seed(1004)

We will not use pretrained word2vec here.

In [3]:
train, train_label, test, test_label, dev, dev_label, word_to_idx = preprocess.build_dataset("SST2", use_w2v=False)

loading data..
Vocab size: 16189
train size: (6920, 53)
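
The shape `(6920, 53)` suggests that each of the 6920 training sentences is stored as a fixed-length row of 53 word indices. As a rough sketch of that idea (not the actual code in `preprocess`), the snippet below pads toy sentences into an index matrix. The padding index `0`, the helper `pad_to_matrix`, and the toy vocabulary are all assumptions made for illustration.

```python
import numpy as np

PAD_IDX = 0  # hypothetical padding index; the real preprocessing may differ


def pad_to_matrix(sentences, vocab, max_len=None, pad_idx=PAD_IDX):
    """Turn tokenized sentences into a (num_sentences, max_len) index matrix."""
    if max_len is None:
        max_len = max(len(tokens) for tokens in sentences)
    matrix = np.full((len(sentences), max_len), pad_idx, dtype=np.int64)
    for row, tokens in enumerate(sentences):
        ids = [vocab[token] for token in tokens][:max_len]  # truncate if too long
        matrix[row, :len(ids)] = ids                        # the rest stays padded
    return matrix


# Toy data, made up for this example.
toy_vocab = {"a": 1, "good": 2, "movie": 3, "bad": 4}
toy_sentences = [["a", "good", "movie"], ["bad"]]
toy_matrix = pad_to_matrix(toy_sentences, toy_vocab)
print(toy_matrix.shape)
```

With the real data, the first dimension would be the number of sentences and the second the padded sentence length.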


In [4]:
train, test, dev, train_label, test_label, dev_label = \
    preprocess.train_test_dev_split(train, test, dev, train_label, test_label, dev_label)

In [5]:
train_data = [train, train_label, dev, dev_label, word_to_idx]
test_data = [test, test_label]

## Training!
I tried to reproduce the result from the paper but failed, because the paper does not give the specific parameters used to produce it. If you can read Theano code, take a look at [FredericGodin/DynamicCNN](https://github.com/FredericGodin/DynamicCNN).

In [6]:
model = DCNN.DCNN(batch_size = 20,
                  word_embedding_size = 48,
                  learning_rate = 0.001,
                  filter_windows = [7,5],
                  k_top = 4,
                  feature_maps = [6,14],
                  dropout_keep_prob=0.5)
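
To make `filter_windows` and `k_top` more concrete, here is a minimal NumPy sketch of the three operations described by Kalchbrenner et al. (2014): wide convolution, k-max pooling, and the dynamic choice of k. It is an illustration under assumptions, not the code in `models/DCNN.py`; the function names `wide_conv1d`, `k_max_pooling` and `dynamic_k` are made up for this example.

```python
import numpy as np


def wide_conv1d(x, w):
    """Wide (full) 1-D convolution: output length is len(x) + len(w) - 1."""
    return np.convolve(x, w, mode="full")


def k_max_pooling(x, k):
    """Keep the k largest values along the last axis, in their original order."""
    # Indices of the k largest values per row (not yet in sequence order).
    idx = np.argpartition(x, -k, axis=-1)[..., -k:]
    idx = np.sort(idx, axis=-1)  # restore the original sequence order
    return np.take_along_axis(x, idx, axis=-1)


def dynamic_k(layer, total_layers, sentence_len, k_top):
    """k_l = max(k_top, ceil((L - l) / L * s)), as defined in the paper."""
    remaining = (total_layers - layer) * sentence_len
    return max(k_top, -(-remaining // total_layers))  # integer ceiling division


feature_map = np.array([[1.0, 5.0, 2.0, 8.0, 3.0, 7.0]])
pooled = k_max_pooling(feature_map, k=4)
print(pooled)  # [[5. 8. 3. 7.]]

conv_out = wide_conv1d(np.ones(53), np.ones(7))
print(conv_out.shape)  # (59,)

print(dynamic_k(layer=1, total_layers=3, sentence_len=18, k_top=3))  # 12
```

Unlike ordinary max pooling, k-max pooling keeps several activations per feature map and preserves their relative order. In the paper, the last convolutional layer always pools to `k_top` values, so the fully connected layer sees a fixed-size input regardless of sentence length.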

In [7]:
model.fit_to_corpus(train_data)

Instructions for updating:
Use the retry module or similar alternatives.


In [8]:
model.train(4, save_dir="save/09_dcnn/sst2", log_dir="log/09_dcnn/sst2", print_every=300)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 778916
--------------------------------------------------------------------------------
000300: 1 [00300/00346], train_loss = 0.49299794, accuracy = 0.85000002, secs/batch = 0.0220
Epoch training time: 8.033319234848022

Finished Epoch 1
train_loss = 0.60728546, train_accruacy = 0.64595376
valid_loss = 0.45201360, valid_accuracy = 0.79302325

000646: 2 [00300/00346], train_loss = 0.42926663, accuracy = 0.80000001, secs/batch = 0.0215
Epoch training time: 6.976386547088623

Finished Epoch 2
train_loss = 0.28037934, train_accruacy = 0.89205202
valid_loss = 0.45281738, valid_accuracy = 0.80813954

000992: 3 [00300/00346], train_loss = 0.15770006, accuracy = 0.94999999, secs/batch = 0.0222
Epoch training time: 6.928498983383179

Finished Epoch 3
train_loss = 0.08518868, train_accruacy = 0.98049132
valid_loss = 0.66393911, valid_accuracy = 0.79418605

decaying learning

In [9]:
model.test(test_data, load_dir="save/09_dcnn/sst2")

INFO:tensorflow:Restoring parameters from save/09_dcnn/sst2/epoch004_0.7589.model
--------------------------------------------------------------------------------
Restored model from checkpoint for testing. Size: 778916
--------------------------------------------------------------------------------
test loss = 0.67550432, test accuracy = 0.81208791
test samples: 001820, time elapsed: 1.2287, time per one batch: 0.0135
