# Dynamic Convolutional Neural Networks for Theano/Lasagne
## Introduction

This is a Theano implementation of the paper [A Convolutional Neural Network for Modelling Sentences](https://arxiv.org/abs/1404.2188) by Kalchbrenner et al. The included example is binary movie review sentiment classification on the Stanford Sentiment Treebank. I was able to achieve a test set accuracy of 85-86%, just below the reported accuracy of 86.8%.

## Using it

To train the model, simply run `trainDCNN.py`. If you are only interested in individual layers, such as dynamic k-max pooling or the 1D convolution, you can use the DCNN package on its own, as sketched below.
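
A minimal sketch of reusing the layers in your own Lasagne stack. The import path and the layer's class name and arguments are assumptions for illustration; see `networks.py` for how the real layers are wired together.

```python
import theano.tensor as T
import lasagne
from DCNN.layers import DynamicKMaxPoolLayer  # hypothetical import path/name

sents = T.imatrix('sents')  # batch of padded word-id sequences
l_in = lasagne.layers.InputLayer(shape=(None, 60), input_var=sents)
l_emb = lasagne.layers.EmbeddingLayer(l_in, input_size=15000, output_size=48)
l_pool = DynamicKMaxPoolLayer(l_emb, k_top=4)  # hypothetical layer/arguments
output = lasagne.layers.get_output(l_pool)
```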

## Paper/implementation issues

There are some discrepancies between the paper and the Matlab code the authors provide, which makes it difficult to rely on the code for details the paper omits. For example: (1) the code uses a different number of layers and filters than the paper; (2) the L2 regularization is not specified in the paper but is spelled out in detail in the code, with a different coefficient for each weight matrix. Those values would be hard to guess from the paper alone.
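
For reference, per-matrix L2 penalties of this kind can be expressed directly in Lasagne. The sketch below uses the real `regularize_layer_params_weighted` API, but the layers and coefficients are placeholders, not the values from the authors' Matlab code or this repo.

```python
import lasagne
from lasagne.layers import InputLayer, DenseLayer
from lasagne.regularization import regularize_layer_params_weighted, l2

l_in = InputLayer(shape=(None, 48))
l_hid = DenseLayer(l_in, num_units=100)
l_out = DenseLayer(l_hid, num_units=2,
                   nonlinearity=lasagne.nonlinearities.softmax)

# One coefficient per weight matrix, as in the authors' Matlab code;
# the numbers here are placeholders, not the values from that code.
penalties = {l_hid: 3e-5, l_out: 1e-4}
reg_loss = regularize_layer_params_weighted(penalties, l2)
# total_loss = data_loss + reg_loss
```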

## Implementation details

The layers of the network are wrapped as Lasagne layers and can easily be reused. The paper introduces some layer types that are not trivial to express in Theano: (1) 1D convolution layers that convolve each row of the sentence matrix independently rather than all rows at once; (2) dynamic k-max pooling, which is currently implemented with an argsort operation that Theano executes on the CPU. A full argsort is heavier than necessary when only the k largest values per row need to be selected; a sketch of the approach follows.
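
A simplified sketch of the argsort-based k-max pooling, under the assumption of a 4D input tensor; the repo's actual implementation lives in the DCNN package and may differ in details.

```python
import theano.tensor as T

def kmax_pool(x, k):
    """k-max pooling over the last axis of a 4D tensor
    (batch, channels, rows, cols), keeping the k largest values
    per row in their original order."""
    # argsort the whole axis just to locate the k largest entries;
    # Theano runs this op on the CPU, which is the bottleneck noted above
    idx = T.argsort(x, axis=3)
    topk = T.sort(idx[:, :, :, -k:], axis=3)  # re-sort to keep word order
    # NumPy-style advanced indexing to gather the selected columns
    d0 = T.arange(x.shape[0]).dimshuffle(0, 'x', 'x', 'x')
    d1 = T.arange(x.shape[1]).dimshuffle('x', 0, 'x', 'x')
    d2 = T.arange(x.shape[2]).dimshuffle('x', 'x', 0, 'x')
    return x[d0, d1, d2, topk]
```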

Because of these implementation issues, Theano automatically inserts a very heavy GpuContiguous operation somewhere in the graph. If you have a solution or comments, I'm happy to accept pull requests ;)