Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text (EMNLP2018)

DP-GAN

This repository contains the code for the paper "DP-GAN: Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text", available at http://arxiv.org/abs/1802.01345

Requirements

The code is written in TensorFlow and requires the following:

Python 3

TensorFlow 1.3
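
The requirements above can be checked before running anything. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it assumes TensorFlow was installed under the distribution name `tensorflow` (a GPU build installed under another name would be reported as missing).

```python
import sys
from importlib import metadata


def check_environment(required_tf=("1", "3")):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: compares the running interpreter and the installed
    TensorFlow distribution against the versions named in this README.
    """
    problems = []

    # The README asks for Python 3.
    if sys.version_info.major < 3:
        problems.append("Python 3 is required")

    # Read the installed version from package metadata, so TensorFlow
    # itself does not have to be imported.
    try:
        tf_version = metadata.version("tensorflow")  # assumed distribution name
    except metadata.PackageNotFoundError:
        problems.append("tensorflow is not installed")
    else:
        # Compare only the major and minor components, e.g. "1.3.0" -> ("1", "3").
        if tuple(tf_version.split(".")[:2]) != tuple(required_tf):
            problems.append(
                "found TensorFlow %s, README names %s"
                % (tf_version, ".".join(required_tf))
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only means the versions match the ones listed here; it does not guarantee that the code runs on a given machine.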

Prepare the data

python review_generation_dataset/generate_review.py

Sample data is provided in review_generation_dataset/train (and test). The whole Yelp dataset is available at https://drive.google.com/open?id=1xCt04xWrVhbrSA7T5feV2WSukjmD4SnK

How to run

bash run.sh

The default options can be edited in main.py.

Output Folder Description

"discriminator_train" stores training data for our discriminator. Under this folder, the "positive" folder stores the real-data text, and the "negative" folder stores the generated text.

"discriminator_test" stores testing data for our discriminator.

"discriminator_result" stores the reward scores calculated by our discriminator at different training steps.

"MLE" stores the text generated by a pre-trained generator on the test set. Under this folder, "MLE_sample_negative" stores the data generated by a sampling mechanism, and "MLE_max_temp_negative" stores the data generated by a maximum probability mechanism, which always chooses the word with the highest probability. To show what high-quality reviews look like, the real-data text is also provided in the folder "MLE_sample_positive".

"train_sample_generated" stores the data generated by DP-GAN using a sampling mechanism on training data.

"test_sample_generated" stores the data generated by DP-GAN using a sampling mechanism on testing data.

"test_max_generated" stores the data generated by DP-GAN using a maximum probability mechanism on testing data.
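
The folder descriptions above distinguish a sampling mechanism from a maximum probability mechanism. The sketch below illustrates the difference on a toy next-word distribution. It is an illustration only, not the code in model.py: the vocabulary, the probabilities, and the function names `pick_max` and `pick_sample` are all assumptions made for the example.

```python
import random


def pick_max(probs):
    """Maximum probability mechanism: index of the most probable word."""
    return max(range(len(probs)), key=lambda i: probs[i])


def pick_sample(probs, rng):
    """Sampling mechanism: draw an index in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy vocabulary and next-word distribution (hypothetical values).
vocab = ["good", "great", "tasty", "bland"]
probs = [0.4, 0.3, 0.2, 0.1]

rng = random.Random(0)  # fixed seed so repeated runs draw the same words

# Maximum probability always returns the same word for the same distribution.
greedy_words = [vocab[pick_max(probs)] for _ in range(5)]

# Sampling can return any word with non-zero probability, so it may vary.
sampled_words = [vocab[pick_sample(probs, rng)] for _ in range(5)]

print("max:   ", greedy_words)
print("sample:", sampled_words)
```

Because the maximum probability choice is deterministic for a fixed distribution, the "max" folders show the single most likely output, while the "sample" folders show one random draw from the generator.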

Cite

If you use this code, please cite the following paper:

@article{dp-gan,
  author = {Jingjing Xu and Xu Sun and Xuancheng Ren and Junyang Lin and Binzhen Wei and Wei Li},
  title = {DP-GAN: Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text},
  journal = {CoRR},
  volume = {abs/1802.01345},
  year = {2018},
  url = {http://arxiv.org/abs/1802.01345}
}
