
A repository of POCs related to Deep Learning and NLP


nitinvwaran/NLP-Deep

NLP-Deep

A repository of POCs related to Natural Language Processing using Deep Learning Frameworks

1. Sentence Classification: A TensorFlow Implementation of 'Text Understanding From Scratch' by Xiang Zhang and Yann LeCun
(paper: https://arxiv.org/abs/1502.01710)

The POC is in the Jupyter notebook 'Zhang-Text-Understanding-Scratch.ipynb'.

This POC is a TensorFlow implementation of the paper 'Text Understanding From Scratch' by Xiang Zhang and Yann LeCun, which uses Deep Convolutional Networks to classify sentences using only the Character-Level features present in the sentence. No Word-level features are used.
The main building blocks of the network are the 1-D Convolution layer and the 1-D Temporal Max Pooling layer.
There are six 1-D Convolution layers and three 1-D Max Pooling layers, followed by three Fully Connected layers.
Batch Normalization was applied in all of these layers except the last Fully Connected layer, to mitigate the vanishing gradient problem that can arise in Deep ConvNets.
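
To make the layer stack above more concrete, here is a minimal sketch in plain Python that traces how the sequence length shrinks through six convolution layers and three pooling layers. The input length (1014), kernel sizes, pooling positions, pooling size, and filter count are assumptions based on the small-feature configuration commonly cited from the paper. They are not stated in this README, and the notebook may use different values. Batch Normalization is left out because it does not change the sequence length.

```python
# Assumed hyperparameters (NOT taken from this README; the notebook may differ).
INPUT_LENGTH = 1014                 # characters per input sample
CONV_KERNELS = [7, 7, 3, 3, 3, 3]   # kernel size of each of the six conv layers
POOL_AFTER = {0, 1, 5}              # indices of conv layers followed by max pooling
POOL_SIZE = 3                       # non-overlapping pooling window
NUM_FILTERS = 256                   # feature maps per conv layer


def conv1d_length(length, kernel):
    """Output length of a stride-1 1-D convolution without padding."""
    return length - kernel + 1


def pool1d_length(length, size):
    """Output length of non-overlapping 1-D max pooling."""
    return length // size


def trace_lengths(input_length=INPUT_LENGTH):
    """Return the sequence length after each conv (plus optional pool) stage."""
    lengths = []
    length = input_length
    for i, kernel in enumerate(CONV_KERNELS):
        length = conv1d_length(length, kernel)
        if i in POOL_AFTER:
            length = pool1d_length(length, POOL_SIZE)
        lengths.append(length)
    return lengths


lengths = trace_lengths()
# Size of the flattened vector fed to the first fully connected layer.
flattened_size = lengths[-1] * NUM_FILTERS
print(lengths)
print(flattened_size)
```

Under these assumed settings, the last convolution stage leaves 34 time steps, so the first Fully Connected layer would see 34 x 256 = 8704 inputs.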

The diagram of the model, taken from the original paper, is shown below:
[Figure: model architecture diagram from the original paper]

The dataset used is the Full Amazon Product Reviews Dataset, which contains three columns:

  1. The Review Number, on a scale of 1-5 (five labels)
  2. The Review Title
  3. The Review Description

Only the Review Number and the Review Description are used.
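
The data preparation described above can be sketched roughly as follows: read the three columns, keep only the Review Number and the Review Description, and turn each description into one-hot character vectors. This is an illustrative sketch, not the notebook's code. The header-less comma-separated layout, the `ALPHABET` string, the maximum length of 1014, and the helper names `load_reviews` and `quantize` are all assumptions.

```python
import csv
import io

# Hypothetical alphabet; the notebook may use a different character set.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+=<>()[]{}"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def load_reviews(csv_file):
    """Yield (label, description) pairs, ignoring the title column.

    Assumes rows of the form: review number, title, description.
    """
    for rating, _title, description in csv.reader(csv_file):
        yield int(rating) - 1, description  # map ratings 1..5 to labels 0..4


def quantize(text, max_len=1014):
    """One-hot encode text character by character.

    Returns max_len rows of len(ALPHABET) zeros and ones. Characters outside
    the alphabet and padding positions stay all-zero.
    """
    rows = [[0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            rows[pos][idx] = 1
    return rows


# Usage with a tiny in-memory sample standing in for the real dataset file.
sample = io.StringIO('5,"Great","abc"\n1,"Bad","z!"\n')
pairs = list(load_reviews(sample))
encoded = quantize(pairs[0][1], max_len=4)
```

Each sample then becomes a fixed-size grid of `max_len` positions by `len(ALPHABET)` channels, which is the kind of input a 1-D Convolution layer consumes.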

For comparison, the accuracy reported by the authors (using Torch) for this dataset is:
Training: 62.96% (Sample size: 3,000,000)
Test: 58.69% (Sample size: 650,000)

The accuracy obtained by this POC is:
Training: 63.45% (Sample size: 624,000)
Dev: 51.34% (Sample size: 20,000)
Test: 46.39% (Sample size: 135,340)

[Figure: training accuracy over time]
[Figure: average cross-entropy training loss over time]
[Figure: validation / dev accuracy over time]
