This repository contains the dataset used in our paper: Investigating Memorization of Conspiracy Theories in Text Generation.
When using our dataset, please cite our paper:
@article{conspiracymem,
author = {Sharon Levy and
Michael Saxon and
William Yang Wang},
title = {Investigating Memorization of Conspiracy Theories in Text Generation},
journal = {CoRR},
volume = {abs/2101.00379},
year = {2021},
url = {https://arxiv.org/abs/2101.00379},
}
This dataset consists of popular conspiracy theory topics from Wikipedia's category list in Wikipedia_topics.txt
. Our process in selecting these topics is described in our paper.
The other three files consist of conspiracy theories generated by GPT-2 Large with the text prompt "The conspiracy theory is that". Each file represents the temperature setting used when generating the text (0.4, 0.7, 1).