Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

This repository contains the code and poster for the NeurIPS 2019 paper Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods.

If you find our work useful for your research, please consider citing:

@article{kernels2rnns,
  title={{Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods}},
  author={Liang, Kevin J and Wang, Guoyin and Li, Yitong and Henao, Ricardo and Carin, Lawrence},
  journal={Advances in Neural Information Processing Systems},
  year={2019}
}

Document Classification

This repository contains a reimplementation of the code used for document classification. The reimplementation is clearer than the original code, but its results may differ slightly from those reported in the paper.

Pre-requisites

Software: The document classification code requires Python 3 and TensorFlow 1.[x] (much of the development was done with TF 1.9). See here for example installation instructions.

Hardware: While not technically required, you'll probably want to use a CUDA-enabled GPU. We used an NVIDIA Titan X for our experiments.

Datasets

We consider the following datasets: AGnews, DBPedia, Yahoo!, and Yelp Full. For convenience, we provide pre-processed versions of all datasets. Data are prepared in pickle format. Each .p file has the same fields in the same order: train text, val text, test text, train label, val label, test label, dictionary, and reverse dictionary.

Datasets can be downloaded here. Place the downloaded data in a directory named [$ROOT]/Data/. Each dataset has two files: the tokenized data and the corresponding pre-trained GloVe embeddings.
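
To inspect a dataset outside the training script, the tokenized .p file can be unpacked by position. The following is a minimal sketch, not the loader used in main.py. It assumes each .p file unpickles to a single sequence of eight entries in the order listed above. The helper name `load_dataset` and the keys in `FIELD_NAMES` are hypothetical and chosen only for illustration.

```python
import pickle

# Field order as described in the Datasets section. The key names themselves
# are illustrative assumptions, not identifiers taken from the repository.
FIELD_NAMES = (
    "train_text", "val_text", "test_text",
    "train_label", "val_label", "test_label",
    "dictionary", "reverse_dictionary",
)


def load_dataset(path):
    """Load one pre-processed .p file into a dict keyed by field name.

    Assumes the file holds a single pickled sequence with eight entries.
    """
    with open(path, "rb") as f:
        fields = pickle.load(f)
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(
            "expected %d fields, found %d" % (len(FIELD_NAMES), len(fields))
        )
    return dict(zip(FIELD_NAMES, fields))
```

Calling `load_dataset` on a file under [$ROOT]/Data/ returns a dict, so `data["train_text"]` and `data["dictionary"]` can be checked before training. If a file was pickled under Python 2, `pickle.load` may need an `encoding` argument; whether that applies to these files is not stated here.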

Training a Classifier

We provide an example script that trains a classifier on each of the 4 datasets considered in the paper. For example,

    bash run_model.sh 0 rkm_lstm 1

will train a 1-gram RKM-LSTM classifier on AGnews, DBPedia, Yahoo!, and Yelp Full, using the first CUDA-visible GPU. The full list of flags can be found in main.py.
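
When launching several runs from Python, the same command can be assembled programmatically. This is a sketch under the assumption, inferred only from the example above, that the three positional arguments are the GPU index, the model name, and the n-gram size, in that order. The helper `build_command` is hypothetical and is not part of the repository.

```python
import shlex


def build_command(gpu_index, model, ngram, script="run_model.sh"):
    """Return the argument list for one run of the example script.

    Argument order (GPU index, model name, n-gram size) is inferred from the
    README example and should be checked against run_model.sh.
    """
    return ["bash", script, str(gpu_index), model, str(ngram)]


# Reproduces the command shown above as a single shell-quoted string.
command = build_command(0, "rkm_lstm", 1)
command_line = " ".join(shlex.quote(part) for part in command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root to start training.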
