BinarySentEmb

Code for the ACL 2019 paper: Learning Compressed Sentence Representations for On-Device Text Processing.

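This repository contains the source code needed to reproduce the results presented in the following paper:

Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, and Lawrence Carin. Learning Compressed Sentence Representations for On-Device Text Processing. ACL 2019 (arXiv:1906.08340).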

This project is maintained by Pengyu Cheng. Feel free to contact pengyu.cheng@duke.edu for any relevant issues.

Dependencies:

This code is written in Python. The dependencies are:

  • Python 3.6
  • PyTorch>=0.4 (0.4.1 is recommended)
  • NLTK>=3
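
As a quick sanity check of the environment, the dependency list above can be tested with a few lines of Python. This is only a sketch: the module names `torch` and `nltk` are the usual import names for PyTorch and NLTK, and the helper names `missing_modules` and `python_is_at_least` are made up for illustration. It checks that the packages are present, not which versions are installed.

```python
import importlib.util
import sys

# Usual import names for the listed dependencies (PyTorch -> "torch", NLTK -> "nltk").
REQUIRED_MODULES = ("torch", "nltk")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_at_least(major, minor, version_info=None):
    """Return True if the given (or running) interpreter is at least major.minor."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= (major, minor)


if __name__ == "__main__":
    if not python_is_at_least(3, 6):
        print("Python 3.6 or newer is expected")
    for name in missing_modules():
        print("Missing dependency:", name)
```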

Download pretrained models:

First, download GloVe pretrained word embeddings:

mkdir -p dataset/GloVe
curl -Lo dataset/GloVe/glove.840B.300d.zip http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip dataset/GloVe/glove.840B.300d.zip -d dataset/GloVe/
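
For readers who want to inspect the downloaded embeddings, here is a minimal sketch of reading GloVe's plain-text format (one token per line, followed by its vector components, separated by spaces). It is not the loader used by this repository or by InferSent. The function names are hypothetical, and the path `dataset/GloVe/glove.840B.300d.txt` assumes that is the file the archive unpacks to.

```python
def load_glove_vectors(lines, vocabulary, dim=300):
    """Collect vectors for the words in `vocabulary` from GloVe-format text lines."""
    vectors = {}
    for line in lines:
        # Split from the right so a token that itself contains spaces stays intact.
        parts = line.rstrip("\n").rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip lines that do not have exactly `dim` components
        word = parts[0]
        if word in vocabulary:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors


def load_glove_file(path, vocabulary, dim=300):
    """Read a GloVe text file lazily, keeping only the requested words."""
    with open(path, encoding="utf-8") as handle:
        return load_glove_vectors(handle, vocabulary, dim)
```

A call such as `load_glove_file("dataset/GloVe/glove.840B.300d.txt", {"sentence", "encoder"})` would return a dictionary with at most those two keys. Restricting the load to a vocabulary keeps memory use small, since the full file is several gigabytes.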

Then, follow the InferSent instructions to download the pretrained universal sentence encoder:

mkdir -p encoder
curl -Lo encoder/infersent1.pkl https://dl.fbaipublicfiles.com/infersent/infersent1.pkl

Furthermore, download our pretrained binary sentence encoder from here. Make sure the binary encoder is also in the ./encoder/ folder.
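
Before moving on, it may help to confirm that the downloads landed where the commands above put them. The sketch below assumes the GloVe archive unpacks to `glove.840B.300d.txt`; the helper name `missing_files` is hypothetical. The binary encoder is left out of the list because its file name is not stated here, so add it to `EXPECTED_FILES` once known.

```python
import os

# Paths implied by the download commands in this README.
# The GloVe .txt name is an assumption about the contents of the zip archive.
EXPECTED_FILES = (
    os.path.join("dataset", "GloVe", "glove.840B.300d.txt"),
    os.path.join("encoder", "infersent1.pkl"),
)


def missing_files(root=".", relative_paths=EXPECTED_FILES):
    """Return the expected files that do not exist under `root`."""
    return [
        path for path in relative_paths
        if not os.path.isfile(os.path.join(root, path))
    ]


if __name__ == "__main__":
    for path in missing_files():
        print("Not found:", path)
```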

Train a binary encoder

To train a binary sentence encoder, first download data.py, mutils.py, and models.py from InferSent.

Then, run the command:

python train.py
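
For readers new to the term, the sketch below illustrates what a binary sentence embedding is and why it is compact: each dimension is a single bit, and two codes can be compared by Hamming distance. It uses a plain threshold to binarize a vector, which is chosen here only for illustration and is not a description of what train.py learns. All function names are hypothetical.

```python
def binarize(embedding, threshold=0.0):
    """Map each real-valued component to 1 if it exceeds `threshold`, else 0."""
    return [1 if value > threshold else 0 for value in embedding]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def pack_bits(code):
    """Pack a list of bits into one integer, most significant bit first."""
    value = 0
    for bit in code:
        value = (value << 1) | bit
    return value
```

For example, `binarize([0.3, -1.2, 0.0, 2.5])` yields `[1, 0, 0, 1]`, which `pack_bits` stores as the integer 9.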

Evaluate the binary encoder on transfer tasks

Follow the SentEval instructions to download the sentence embedding evaluation toolkit and datasets.

Download the original InferSent encoder model from here.

To reproduce results of our pretrained binary sentence encoder, run the command:

python evaluate.py

Citation

Please cite our ACL paper if you find the code useful.

@article{shen2019learning,
  title={Learning Compressed Sentence Representations for On-Device Text Processing},
  author={Shen, Dinghan and Cheng, Pengyu and Sundararaman, Dhanasekar and Zhang, Xinyuan and Yang, Qian and Tang, Meng and Celikyilmaz, Asli and Carin, Lawrence},
  journal={arXiv preprint arXiv:1906.08340},
  year={2019}
}