Repository for the ACL 2020 paper:
Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter
Link to the paper: Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter
Please use the following citation:
@inproceedings{conforti2020wtwt,
title={Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter},
author={Conforti, Costanza and Berndt, Jakob and Pilehvar, Mohammad Taher and Giannitsarou, Chryssi and Toxvaerd, Flavio and Collier, Nigel},
booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL2020)},
year={2020}
}
Will-They-Won't-They (WT-WT) is a large dataset of English tweets for stance detection in the rumor verification task. The dataset is built from tweets that discuss five recent merger and acquisition (M&A) operations involving US companies, mainly from the healthcare sector.
All annotations were carried out by domain experts, so the dataset is intended as a high-quality, reliable benchmark for future research in stance detection.
Operation | Buyer | Target | Industry |
---|---|---|---|
CVS_AET | CVS Health | Aetna | Healthcare |
CI_ESRX | Cigna | Express Scripts | Healthcare |
ANTM_CI | Anthem | Cigna | Healthcare |
AET_HUM | Aetna | Humana | Healthcare |
DIS_FOXA | Disney | 21st Century Fox | Entertainment |
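
When working with the operation codes in the table above, it can help to keep them in a small lookup structure. The following is a minimal sketch only: the names `Operation`, `OPERATIONS`, and `describe` are hypothetical and not part of this repository, and the values are copied directly from the table.

```python
from typing import NamedTuple


class Operation(NamedTuple):
    """One M&A operation, as listed in the table above."""
    buyer: str
    target: str
    industry: str


# Values copied from the table; the dict and helper names are illustrative only.
OPERATIONS = {
    "CVS_AET": Operation("CVS Health", "Aetna", "Healthcare"),
    "CI_ESRX": Operation("Cigna", "Express Scripts", "Healthcare"),
    "ANTM_CI": Operation("Anthem", "Cigna", "Healthcare"),
    "AET_HUM": Operation("Aetna", "Humana", "Healthcare"),
    "DIS_FOXA": Operation("Disney", "21st Century Fox", "Entertainment"),
}


def describe(code: str) -> str:
    """Return a readable description for an operation code from the table."""
    op = OPERATIONS[code]
    return f"{op.buyer} acquiring {op.target} ({op.industry})"


if __name__ == "__main__":
    for code in OPERATIONS:
        print(code, "->", describe(code))
```

A lookup like this makes it easy to group or filter examples by operation or industry, for instance to hold out the single entertainment operation (`DIS_FOXA`) from the four healthcare ones.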
- {cc918, jb2088, mp792, cg349, fmot2, nhc30}@cam.ac.uk
- Cambridge Language Technology Lab
- Cambridge Faculty of Economics