Deep learning architectures to embed multi-word units into a vector space maximizing similarity between units of different sizes.
Type Name Latest commit message Commit time
README.md updated results Jun 29, 2017
data_pair_neg initial S dataset Jun 20, 2017
data_pair_pos initial S dataset Jun 20, 2017
evals.py added evals Jul 14, 2017
preprocess_data.py preprocessing Jun 20, 2017
preprocess_data_lstm.py added preprocessing Jul 5, 2017
results.txt nearbly words n analogies Jul 5, 2017
siamese_lstm.py commented unused line L#73 May 14, 2019
siamese_mlp.py siamese mlp Jun 20, 2017
word2VecTF.py TF impl, Mikolov Jun 20, 2017

README.md

IIITH NLP Lab Summer Research Project

Beyond Word2Vec

  • Vijay Prakash Dwivedi
  • Dr Manish Shrivastava

Description

Building deep learning architectures to embed multi-word units into a vector space maximizing similarity between units of different sizes.
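To make the goal concrete, here is a minimal sketch of comparing a single word with a two-word unit in the same vector space. The vectors in `toy_vectors` are made up for illustration, and averaging word vectors in `embed_unit` is only an assumed baseline composition, not the method used in this repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def embed_unit(tokens, table):
    """Embed a unit of any length by averaging its word vectors.

    Hypothetical baseline: the learned models are meant to replace this step.
    """
    vectors = [table[t] for t in tokens]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Made-up 2-dimensional vectors, for illustration only.
toy_vectors = {
    "very": [0.2, 0.1],
    "big": [1.0, 0.0],
    "huge": [0.9, 0.1],
    "small": [-1.0, 0.0],
}

phrase = embed_unit(["very", "big"], toy_vectors)   # two-word unit
sim_huge = cosine(phrase, toy_vectors["huge"])      # one-word unit, similar meaning
sim_small = cosine(phrase, toy_vectors["small"])    # one-word unit, opposite meaning
print(sim_huge > sim_small)
```

A useful embedding should place "very big" closer to "huge" than to "small", even though the units differ in length.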

Work

Implemented a Siamese MLP architecture, with the following best results.
Figure: Siamese MLP Architecture

* Trained on 263,000 samples; tested on 113,000 samples
* Accuracy on training set: 91.06%
* Accuracy on test set: 74.93%
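The core idea of a Siamese MLP can be sketched as one encoder whose weights are shared by both inputs of a pair, followed by a similarity score that is thresholded into a same/different prediction. This is a forward-pass illustration only, with untrained random weights; the class `SiameseMLP`, the layer sizes, the cosine score, and the 0.5 threshold are all assumptions and do not reproduce `siamese_mlp.py` or the accuracies reported above.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

class SiameseMLP:
    """Hypothetical two-layer MLP encoder applied to both sides of a pair."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
                   for _ in range(out_dim)]
        self.b2 = [0.0] * out_dim

    def encode(self, x):
        # Hidden layer with tanh, then a linear output layer.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]

    def similarity(self, a, b):
        # The same weights encode both inputs: this is what makes it Siamese.
        return cosine(self.encode(a), self.encode(b))

def pair_accuracy(model, pairs, labels, threshold=0.5):
    """Fraction of pairs whose thresholded similarity matches the 0/1 label."""
    correct = 0
    for (a, b), label in zip(pairs, labels):
        predicted = 1 if model.similarity(a, b) >= threshold else 0
        correct += int(predicted == label)
    return correct / len(pairs)

model = SiameseMLP(in_dim=4, hidden_dim=8, out_dim=3)
x = [0.1, -0.3, 0.7, 0.2]
y = [0.5, 0.4, -0.6, 0.0]
print(model.similarity(x, x), model.similarity(x, y))
```

Because both branches share weights, the score is symmetric in its two arguments and an input always scores 1 against itself; training would then push positive pairs toward high scores and negative pairs toward low ones.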

Also implemented a Siamese LSTM architecture, with the following best results.
Figure: Siamese LSTM Architecture

* Trained on 263,000 samples; tested on 113,000 samples
* Accuracy on training set: 93.20%
* Accuracy on test set: 76.65%
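An LSTM encoder suits units of different sizes because it reads a variable-length sequence of word vectors and returns a fixed-size final state. The sketch below is a forward pass with untrained random weights; `LSTMEncoder`, the dimensions, and the `exp(-L1 distance)` score (one common choice for Siamese recurrent networks) are assumptions and may differ from what `siamese_lstm.py` does.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LSTMEncoder:
    """Hypothetical single-layer LSTM returning its final hidden state."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(in_dim + hidden_dim)]
                      for _ in range(hidden_dim)] for g in "ifog"}
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}

    def _gate(self, name, xh, activation):
        return [activation(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.W[name], self.b[name])]

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            xh = list(x) + h  # concatenate current input with previous hidden state
            i = self._gate("i", xh, sigmoid)
            f = self._gate("f", xh, sigmoid)
            o = self._gate("o", xh, sigmoid)
            g = self._gate("g", xh, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h

def siamese_similarity(encoder, seq_a, seq_b):
    """Score in (0, 1]: exp of the negative L1 distance between final states."""
    ha, hb = encoder.encode(seq_a), encoder.encode(seq_b)
    return math.exp(-sum(abs(a - b) for a, b in zip(ha, hb)))

encoder = LSTMEncoder(in_dim=3, hidden_dim=5)
one_word = [[0.2, -0.1, 0.4]]
three_words = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.5, 0.1]]
print(siamese_similarity(encoder, one_word, three_words))
```

Both sequences pass through the same `encoder`, so a one-word unit and a three-word unit end up as vectors of the same size and can be compared directly.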