This repository contains several popular deep learning models for sentence representation (they also apply to document-level text), implemented in PyTorch. It is intended for learning PyTorch and should be understandable to anyone with basic Python and deep learning knowledge. Links to the relevant papers are also given.
- python 2.7
- pytorch 0.2
- torchtext 0.2
python train.py -conf [config file]
Choose the config file that sets the dataset and model, e.g. trec/trec.conf.
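To illustrate how a `-conf` argument could be turned into settings, here is a minimal sketch using only the Python standard library. It is not taken from train.py: the INI-style layout, the section names (`data`, `model`) and the keys (`dataset`, `name`, `hidden_size`) are all assumptions made for illustration, and the real config files in this repo may be structured differently.

```python
import argparse
import configparser


def parse_args(argv=None):
    # Mirrors the documented invocation: python train.py -conf [config file]
    parser = argparse.ArgumentParser(description="Sketch of a train.py-style entry point")
    parser.add_argument("-conf", required=True, help="path to a config file")
    return parser.parse_args(argv)


def load_config(text):
    # Assumption: the config is INI-style; the actual format may differ.
    config = configparser.ConfigParser()
    config.read_string(text)
    return config


# Hypothetical config contents (section and key names are illustrative only).
EXAMPLE_CONF = """
[data]
dataset = trec

[model]
name = lstm
hidden_size = 128
"""

args = parse_args(["-conf", "trec/trec.conf"])
config = load_config(EXAMPLE_CONF)  # a real script would read the file at args.conf
model_name = config.get("model", "name")
hidden_size = config.getint("model", "hidden_size")
print(args.conf, model_name, hidden_size)
```

Keeping dataset and model choices in one file means switching experiments only requires pointing `-conf` at a different file.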
- model file: model/model.py, contains the deep models for sentence representation.
- training framework: train.py, preprocesses the data and trains the model.
- configuration files: e.g. trec/trec.conf, the config file used to set the dataset and model.
- helper functions: utils/utils.py, contains some helper functions.
For now, the models listed below have been added to this repo. Some benchmarks for these models are also given (the hyper-parameters are far from optimal, so the performance of these models can be improved with careful tuning).
Model | TREC6-valid [1] | TREC6-test | SST2-valid [2] | SST2-test |
---|---|---|---|---|
LSTM | - | 94.6 | 84.98 | 85.45 |
Bi-LSTM | - | 94.4 | 85.21 | 86.44 |
CNN | - | 95.2 | 84.63 | 84.73 |
SelfAttn | - | 96.0 | 85.44 | 86.66 |
BCN+CoVe | - | 95.0 | 87.55 | 87.84 |

[1] The best accuracy on the test set is reported, since TREC6 has no development set.

[2] Only the sentence-level training samples are used.
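
The reporting rule in the first footnote can be sketched as follows. This is only an illustration: the function name is hypothetical, the accuracy values are made up, and the assumption that test accuracy is taken at the epoch with the best validation accuracy when a development set exists is mine, not something this repo states.

```python
def select_reported_accuracy(test_acc, valid_acc=None):
    """Pick the test accuracy to report from per-epoch accuracy lists."""
    if valid_acc is None:
        # No development set (as with TREC6): report the best test accuracy.
        return max(test_acc)
    # Assumption: with a development set, report test accuracy at the
    # epoch where validation accuracy is highest.
    best_epoch = max(range(len(valid_acc)), key=lambda i: valid_acc[i])
    return test_acc[best_epoch]


# Made-up per-epoch accuracies, for illustration only.
test_acc = [0.90, 0.94, 0.93]
valid_acc = [0.80, 0.83, 0.85]

without_dev = select_reported_accuracy(test_acc)
with_dev = select_reported_accuracy(test_acc, valid_acc)
print(without_dev, with_dev)
```

Selecting on the test set gives an optimistic estimate, which is worth keeping in mind when comparing the TREC6 numbers with results reported elsewhere.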
- Borrows some code from cnn-text-classification-pytorch
- Convolutional Neural Networks for Sentence Classification
- Borrows some code from Structured-Self-Attentive-Sentence-Embedding
- A Structured Self-Attentive Sentence Embedding
- Borrows some code from salesforce/cove
- Learned in Translation: Contextualized Word Vectors