Contrastive Learning for Neural Topic Model

This repository contains the implementation of the paper Contrastive Learning for Neural Topic Model.

Thong Nguyen, Luu Anh Tuan (NeurIPS 2021)

In this work, we target the problem of capturing meaningful representations through modeling the relations among samples from a mathematical perspective and propose a novel contrastive objective to train the neural topic model, along with the optimization of the variational lower bound. In our contrastive learning framework, we introduce a novel sampling strategy that is motivated by human behavior when comparing numerous documents. Our results show that capturing mutual information between the prototype and its positive sample provides a strong foundation for constructing coherent topics, while differentiating the prototype from the negative samples plays a less fundamental role.
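To make the idea of a contrastive objective over a prototype, a positive sample, and a negative sample concrete, here is a minimal NumPy sketch. It is an illustration only, not the objective implemented in this repository: the function name `contrastive_loss`, the dot-product similarity, and the weight `beta` on the negative term are all assumptions made for the example.

```python
import numpy as np

def contrastive_loss(z, z_pos, z_neg, beta=1.0):
    """Illustrative per-document contrastive term (hypothetical, not the repo's code).

    Computes -log( exp(z.z_pos) / (exp(z.z_pos) + beta * exp(z.z_neg)) )
    for each row, where z, z_pos, z_neg are (batch, dim) arrays holding the
    prototype, positive, and negative representations.
    """
    pos = np.sum(z * z_pos, axis=-1)  # similarity to the positive sample
    neg = np.sum(z * z_neg, axis=-1)  # similarity to the negative sample
    # Algebraically equal to log(1 + beta * exp(neg - pos)); logaddexp keeps it stable.
    return np.logaddexp(0.0, np.log(beta) + neg - pos)

# Toy vectors: the prototype matches the positive and is orthogonal to the negative.
z = np.array([[1.0, 0.0]])
z_pos = np.array([[1.0, 0.0]])
z_neg = np.array([[0.0, 1.0]])

loss_aligned = contrastive_loss(z, z_pos, z_neg)
loss_swapped = contrastive_loss(z, z_neg, z_pos)  # positive and negative exchanged
```

In this toy case the loss is lower when the prototype is closer to its positive sample than to its negative sample, which is the behavior the contrastive term is meant to encourage.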

@inproceedings{
nguyen2021contrastive,
title={Contrastive Learning for Neural Topic Model},
author={Thong Thanh Nguyen and Anh Tuan Luu},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=NEgqO9yB7e}
}

Requirements

python3
pandas
gensim
numpy
torchvision
pytorch 1.7.0
scipy

How to Run

Download and put the dataset in the data folder: https://bit.ly/44mUEUv
Train the model by running ./scripts/train_models/run_{dataset}_{topk}.sh
Evaluate the model by running ./scripts/evaluate/run_{dataset}_npmi_{topk}.sh
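The script names above follow a fixed pattern, so the two paths can be built from the chosen dataset and top-k values. The sketch below only shows how the placeholders are filled in; the helper `script_paths` is hypothetical, and the values `"20ng"` and `10` are example inputs, not confirmed script names in this repository.

```python
from pathlib import Path

def script_paths(dataset, topk, root="."):
    """Fill in the {dataset} and {topk} placeholders of the run scripts.

    Hypothetical helper for illustration; it does not check that the files exist.
    """
    root = Path(root)
    train = root / "scripts" / "train_models" / f"run_{dataset}_{topk}.sh"
    evaluate = root / "scripts" / "evaluate" / f"run_{dataset}_npmi_{topk}.sh"
    return train, evaluate

# Example values only: check the scripts folder for the names that actually exist.
train_script, eval_script = script_paths("20ng", 10)
print(train_script.as_posix())
print(eval_script.as_posix())

# To launch a script from Python instead of the shell, one option is:
#   import subprocess
#   subprocess.run(["bash", str(train_script)], check=True)
```

Running the training script before the evaluation script matches the order of the steps above.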

Acknowledgement

Our implementation is based on the official code of SCHOLAR.

