# Machine comprehension with LSTM

A basic but robust linguistic embedding with part-of-speech (POS) tagging that captures semantic and syntactic information on top of a Long Short-Term Memory (LSTM) model. The validation of my solution consists of showing that adding an unsupervised linguistic embedding representation, such as part-of-speech tags, results in an improvement when applied to the LSTM.
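As a rough illustration of the idea (not the repository's actual pipeline), the sketch below uses NLTK to POS-tag a sentence and turns each tag into a one-hot feature vector that could be concatenated with a pretrained word embedding (e.g. 300-d GloVe) before the sequence is fed to the LSTM. The tag subset and feature layout are assumptions made for illustration.

```python
# Minimal sketch (assumed feature layout, not this repo's exact code): POS-tag a
# sentence with NLTK and build one-hot tag vectors that could be concatenated
# with word embeddings as extra linguistic features for an LSTM.
import nltk

TAGS = ['NN', 'NNS', 'NNP', 'VB', 'VBD', 'VBZ', 'JJ', 'RB', 'IN', 'DT']  # assumed tag subset

def pos_features(sentence):
    tokens = nltk.word_tokenize(sentence)  # needs the 'punkt' tokenizer data
    tagged = nltk.pos_tag(tokens)          # needs an NLTK tagger model (e.g. 'averaged_perceptron_tagger')
    feats = [[1.0 if tag == t else 0.0 for t in TAGS] for _, tag in tagged]
    return tokens, feats

tokens, feats = pos_features("The model reads the passage and predicts the answer span.")
for tok, vec in zip(tokens, feats):
    print('%-10s %s' % (tok, TAGS[vec.index(1.0)] if 1.0 in vec else 'OTHER'))
```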

### Requirements
- [Torch7](https://github.com/torch/torch7)
- [nn](https://github.com/torch/nn)
- [nngraph](https://github.com/torch/nngraph)
- [optim](https://github.com/torch/optim)
- [parallel](https://github.com/clementfarabet/lua---parallel)
- Python 2.7
- Python Packages: [NLTK](http://www.nltk.org/install.html), collections, json, argparse
- [NLTK Data](http://www.nltk.org/data.html): punkt
- [Gensim](https://radimrehurek.com/gensim/install) (a one-time setup sketch for the Python dependencies follows this list)
- Multi-core CPU
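The NLTK data and the Gensim install can be sanity-checked from a Python interpreter; this is generic setup, not a script shipped with the repository:

```python
# Generic one-time setup check for the Python dependencies (not a repo script).
import nltk
nltk.download('punkt')  # tokenizer models listed under NLTK Data above

import gensim
print(gensim.__version__)  # confirm Gensim is importable
```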


### Datasets
- [Stanford Question Answering Dataset (SQuAD)](https://rajpurkar.github.io/SQuAD-explorer/)
- [GloVe: Global Vectors for Word Representation](http://nlp.stanford.edu/data/glove.840B.300d.zip)
- word2vec Wikipedia dump: enwiki-latest-pages-articles.xml.bz2
- word2vec Google News: GoogleNews-vectors-negative300.bin.gz (see the loading sketch after this list)
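The pretrained Google News vectors can be loaded with Gensim roughly as below; the local file path is an assumption, and the GloVe vectors would need Gensim's glove2word2vec conversion first (not shown). This is a sketch of how the embeddings could be accessed, not the repository's loading code.

```python
# Sketch: load the pretrained Google News word2vec vectors with Gensim.
# The path is an assumption; point it at the downloaded archive.
from gensim.models import KeyedVectors

w2v = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin.gz', binary=True)

print(w2v['question'].shape)               # each word maps to a 300-d vector
print(w2v.most_similar('answer', topn=3))  # nearest neighbours in embedding space
```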

### Usage

    cd main
    th mainDt.lua
