PodSarc is a large-scale bimodal sarcasm dataset consisting of podcast speech segments paired with transcripts and sarcasm annotations, designed to facilitate research in speech-based sarcasm detection.
Sarcasm plays a crucial role in human communication by conveying meanings that contradict literal interpretations. Detecting sarcasm in speech remains challenging due to the scarcity of annotated datasets and the complexity of prosodic and contextual cues.
PodSarc addresses this challenge by providing a large-scale speech dataset specifically designed for sarcasm detection in audio-only environments, such as podcasts, radio broadcasts, and conversational AI systems.
The dataset is collected from the Overly Sarcastic Podcast (OSPod) and annotated using a hybrid pipeline combining LLM-based annotation and human verification.
- 29.42 hours of speech data
- 11,024 utterances
- Bimodal annotations (Speech + Text)
- 8 speakers
- Human-verified annotations
- Sarcasm detection in speech
- Multimodal sarcasm detection
- Prosody and pragmatic meaning
- Conversational AI
- Speech understanding
| Property | Value |
|---|---|
| Total utterances | 11,024 |
| Sarcastic | 4,026 (36.5%) |
| Non-sarcastic | 6,998 (63.5%) |
| Total duration | 29.42 hours |
| Avg. utterance duration | 9.61 seconds |
| Avg. transcript length | 31.18 words |
| Number of speakers | 8 |
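The derived figures in the table can be checked against the raw counts. The sketch below is only a sanity check: the constants are copied from the table, and the helper names (`share`, `mean_duration_seconds`) are illustrative, not part of any PodSarc tooling.

```python
# Figures copied from the statistics table above.
TOTAL_UTTERANCES = 11_024
SARCASTIC = 4_026
NON_SARCASTIC = 6_998
TOTAL_HOURS = 29.42


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


def mean_duration_seconds(total_hours: float, n_utterances: int) -> float:
    """Return the average utterance duration in seconds, rounded to two decimals."""
    return round(total_hours * 3600 / n_utterances, 2)


# The two classes should account for every utterance.
assert SARCASTIC + NON_SARCASTIC == TOTAL_UTTERANCES

print(share(SARCASTIC, TOTAL_UTTERANCES))        # 36.5
print(share(NON_SARCASTIC, TOTAL_UTTERANCES))    # 63.5
print(mean_duration_seconds(TOTAL_HOURS, TOTAL_UTTERANCES))  # 9.61
```

The roughly 36/64 class split is worth keeping in mind when choosing evaluation metrics, since plain accuracy can look good on a majority-class predictor.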
{
"text": "They are citizens. Why do you think the population is so big?",
"gpt4o_sarcasm": true,
"gpt4o_emotion": "sarcasm",
"comment": "The speaker sarcastically refers to pigeons as 'citizens' to humorously imply they contribute to the city population, enhancing the exaggerated tone.",
"index": 355,
"nid": "66_355",
"llama3_sarcasm": false,
"human_check": "sarcasm"
}
You can find the full dataset here.
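A minimal sketch of how a record like the one above might be read in Python, using only the standard `json` module. The field names come from the example record; the helper functions are hypothetical, and treating any `human_check` value other than `"sarcasm"` as non-sarcastic is an assumption, since only that one value appears in the example.

```python
import json

# The example record from this README, embedded as a string for illustration.
RECORD_JSON = r"""
{
  "text": "They are citizens. Why do you think the population is so big?",
  "gpt4o_sarcasm": true,
  "gpt4o_emotion": "sarcasm",
  "comment": "The speaker sarcastically refers to pigeons as 'citizens' to humorously imply they contribute to the city population, enhancing the exaggerated tone.",
  "index": 355,
  "nid": "66_355",
  "llama3_sarcasm": false,
  "human_check": "sarcasm"
}
"""


def llm_votes(record: dict) -> dict:
    """Collect the boolean sarcasm labels produced by the two LLM annotators."""
    return {
        "gpt4o": record["gpt4o_sarcasm"],
        "llama3": record["llama3_sarcasm"],
    }


def llms_disagree(record: dict) -> bool:
    """True when the LLM annotators assigned different labels."""
    return len(set(llm_votes(record).values())) > 1


def human_says_sarcastic(record: dict) -> bool:
    """Assumption: 'sarcasm' marks a sarcastic utterance; anything else does not."""
    return record["human_check"] == "sarcasm"


record = json.loads(RECORD_JSON)
print(llm_votes(record))             # {'gpt4o': True, 'llama3': False}
print(llms_disagree(record))         # True
print(human_says_sarcastic(record))  # True
```

In this record the two LLM labels conflict and the human check sides with the sarcastic reading, which is the kind of case the human verification step is meant to settle.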
If you use PodSarc in your research, please cite:
@inproceedings{li2025leveraging,
title={Leveraging Large Language Models for Sarcastic Speech Annotation in Sarcasm Detection},
author={Li, Zhu and Zhang, Yuqing and Gao, Xiyuan and Nayak, Shekhar and Coler, Matt},
booktitle={Proc. Interspeech 2025},
pages={3973--3977},
year={2025}
}
This dataset is released under the CC BY-NC 4.0 License.
| Permission | Status |
|---|---|
| Academic research | Allowed |
| Modification & distribution | Allowed |
| Commercial use | Not allowed |