Version: 1.0. Updated Sept 2016.
This page was migrated from https://nlds.soe.ucsc.edu/ in Sept 2023.
The Sarcasm Corpus V1 is a subset of the Internet Argument Corpus. It consists of response texts from quote-response pairs that were annotated for sarcasm:
- 997 response texts labeled as not sarcastic (in notsarc/)
- 998 response texts labeled as sarcastic (in sarc/)
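
The two-directory layout above can be read into labeled examples with a short script. This is a minimal sketch, not an official loader: the directory names `notsarc/` and `sarc/` come from the description above, but the one-file-per-response layout, the UTF-8 encoding, the function name `load_corpus`, and the 0/1 label values are all assumptions made for illustration.

```python
from pathlib import Path

# Directory names are taken from the corpus description; mapping them to
# 0/1 labels is an illustrative choice, not part of the corpus itself.
LABEL_DIRS = {"notsarc": 0, "sarc": 1}


def load_corpus(root):
    """Return a list of (text, label) pairs, where label 1 means sarcastic.

    Assumes each response is stored as its own plain-text file directly
    inside root/notsarc/ or root/sarc/ (an assumption, not a documented fact).
    """
    examples = []
    for dirname, label in LABEL_DIRS.items():
        # Sort for a deterministic order across platforms.
        for path in sorted(Path(root, dirname).iterdir()):
            if path.is_file():
                # Replace undecodable bytes rather than failing on them.
                text = path.read_text(encoding="utf-8", errors="replace")
                examples.append((text.strip(), label))
    return examples


# Example call (the path is hypothetical):
# examples = load_corpus("sarcasm_v1")
```

If the files are laid out as assumed, the number of pairs per label should match the counts listed above, which makes for a quick sanity check after download.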
If you use this data in your research, please refer to and cite:
- Stephanie Lukin and Marilyn Walker. "Really? Well. Apparently Bootstrapping Improves the Performance of Sarcasm and Nastiness Classifiers for Online Dialogue." In The Workshop on Language Analysis in Social Media (LASM), at The Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Atlanta, Georgia, USA, 2013.
- Marilyn A. Walker, Pranav Anand, Jean E. Fox Tree, Rob Abbott, Joseph King. "A Corpus for Research on Deliberation and Debate." In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.
- Raquel Justo, Thomas Corcoran, Stephanie M Lukin, Marilyn Walker, and M Ines Torres. "Extracting Relevant Knowledge for the Detection of Sarcasm and Nastiness in the Social Web." In Knowledge-Based Systems, 69:124–133, 2014.