# Machine comprehension with LSTM

A basic but robust linguistic embedding with part-of-speech (POS) tagging that captures semantic and syntactic information on top of a Long Short-Term Memory (LSTM) model. The validation of my solution consists of showing that adding an unsupervised linguistic embedding representation, such as part-of-speech tags, results in an improvement when applied to the LSTM.
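As a rough illustration of the idea (not the repository's actual pipeline), the sketch below uses NLTK to POS-tag a sentence and turns each tag into a one-hot feature vector that could be concatenated with a pretrained word embedding (e.g. 300-d GloVe) before the sequence is fed to the LSTM. The tag subset and feature layout are assumptions made for illustration.

```python
# Minimal sketch (assumed feature layout, not this repo's exact code): POS-tag a
# sentence with NLTK and build one-hot tag vectors that could be concatenated
# with word embeddings as extra linguistic features for an LSTM.
import nltk

TAGS = ['NN', 'NNS', 'NNP', 'VB', 'VBD', 'VBZ', 'JJ', 'RB', 'IN', 'DT']  # assumed tag subset

def pos_features(sentence):
    tokens = nltk.word_tokenize(sentence)  # needs the 'punkt' tokenizer data
    tagged = nltk.pos_tag(tokens)          # needs an NLTK tagger model (e.g. 'averaged_perceptron_tagger')
    feats = [[1.0 if tag == t else 0.0 for t in TAGS] for _, tag in tagged]
    return tokens, feats

tokens, feats = pos_features("The model reads the passage and predicts the answer span.")
for tok, vec in zip(tokens, feats):
    print('%-10s %s' % (tok, TAGS[vec.index(1.0)] if 1.0 in vec else 'OTHER'))
```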

### Requirements
- [Torch7](https://github.com/torch/torch7)
- [nn](https://github.com/torch/nn)
- [nngraph](https://github.com/torch/nngraph)
- [optim](https://github.com/torch/optim)
- [parallel](https://github.com/clementfarabet/lua---parallel)
- Python 2.7
- Python Packages: [NLTK](http://www.nltk.org/install.html), collections, json, argparse
- [NLTK Data](http://www.nltk.org/data.html): punkt
- [Gensim](https://radimrehurek.com/gensim/install) (a one-time setup sketch for the Python dependencies follows this list)
- Multi-core CPU
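The NLTK data and the Gensim install can be sanity-checked from a Python interpreter; this is generic setup, not a script shipped with the repository:

```python
# Generic one-time setup check for the Python dependencies (not a repo script).
import nltk
nltk.download('punkt')  # tokenizer models listed under NLTK Data above

import gensim
print(gensim.__version__)  # confirm Gensim is importable
```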


### Datasets
- [Stanford Question Answering Dataset (SQuAD)](https://rajpurkar.github.io/SQuAD-explorer/)
- [GloVe: Global Vectors for Word Representation](http://nlp.stanford.edu/data/glove.840B.300d.zip)
- word2vec Wikipedia dump: enwiki-latest-pages-articles.xml.bz2
- word2vec Google News: GoogleNews-vectors-negative300.bin.gz (see the loading sketch after this list)
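The pretrained Google News vectors can be loaded with Gensim roughly as below; the local file path is an assumption, and the GloVe vectors would need Gensim's glove2word2vec conversion first (not shown). This is a sketch of how the embeddings could be accessed, not the repository's loading code.

```python
# Sketch: load the pretrained Google News word2vec vectors with Gensim.
# The path is an assumption; point it at the downloaded archive.
from gensim.models import KeyedVectors

w2v = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin.gz', binary=True)

print(w2v['question'].shape)               # each word maps to a 300-d vector
print(w2v.most_similar('answer', topn=3))  # nearest neighbours in embedding space
```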

### Usage

    cd main
    th mainDt.lua
