Lotemp/SarcasmSIGN

Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation

Lotem Peled, Roi Reichart (pdf)

Overview

This repository contains the Sarcasm SIGN dataset, a parallel corpus of sarcastic tweets and their non-sarcastic interpretations, written by human experts. The corpus was created as part of our paper Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation, which will be presented at ACL 2017. The repository contains two folders: "corpus", which contains the data files and the instructions given to our human experts; and "preprocess", which contains code for preprocessing the data and preparing it for an MT system (see the ReadMe in the preprocess folder).

Characteristics

The Sarcasm SIGN dataset comprises 3000 sarcastic tweets (tweets marked with #sarcasm) that are written in English, are not retweets, and do not contain URLs or images. Each sarcastic tweet has five different non-sarcastic interpretations. The average sarcastic tweet length is 13.87 words, the average interpretation length is 12.10 words, and the vocabulary size is 8788 unique words. Following are two examples from our dataset:

Screenshot

Further information regarding the dataset and the instructions given to the human experts can be found in the "corpus" folder.
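For readers who want to check similar criteria and statistics on their own data, here is a minimal Python sketch. It is not the code from the "preprocess" folder. The function names `is_candidate_tweet` and `corpus_stats`, the whitespace tokenization, the lowercased vocabulary, and the `rt @` retweet heuristic are all assumptions made for illustration, so the numbers it produces may differ from those reported above. The English-language and image checks are omitted because they cannot be made from the tweet text alone.

```python
import re

# Assumption: a URL is anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
HASHTAG_PATTERN = re.compile(r"#sarcasm\b", re.IGNORECASE)


def is_candidate_tweet(text):
    """Hypothetical filter: tagged #sarcasm, not a retweet, no URL."""
    has_tag = HASHTAG_PATTERN.search(text) is not None
    # Assumption: retweets are recognized by a leading "RT @".
    is_retweet = text.strip().lower().startswith("rt @")
    has_url = URL_PATTERN.search(text) is not None
    return has_tag and not is_retweet and not has_url


def corpus_stats(pairs):
    """Compute simple statistics over (tweet, [interpretations]) pairs.

    Tokens are split on whitespace and lowercased for the vocabulary;
    the tokenization used for the published figures may differ.
    """
    tweet_lengths = []
    interpretation_lengths = []
    vocabulary = set()
    for tweet, interpretations in pairs:
        tokens = tweet.split()
        tweet_lengths.append(len(tokens))
        vocabulary.update(token.lower() for token in tokens)
        for interpretation in interpretations:
            tokens = interpretation.split()
            interpretation_lengths.append(len(tokens))
            vocabulary.update(token.lower() for token in tokens)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "avg_tweet_len": mean(tweet_lengths),
        "avg_interpretation_len": mean(interpretation_lengths),
        "vocab_size": len(vocabulary),
    }


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the dataset.
    toy_pairs = [("Great day #sarcasm", ["Terrible day", "Bad day"])]
    print(is_candidate_tweet(toy_pairs[0][0]))
    print(corpus_stats(toy_pairs))
```

The sketch takes in-memory pairs rather than reading files, since the layout of the data files is described in the "corpus" folder and not here.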

Future Research

We encourage researchers to send us their algorithms and results, and we will present them here.

Citation

If you use the Sarcasm SIGN dataset and/or algorithm, please cite the following:

Peled, Lotem, and Roi Reichart. "Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation." (ACL 2017).

Contact

For any questions, inquiries or interesting ideas, feel free to contact us.

Lotem: lotemi.peled@gmail.com || https://sites.google.com/view/lotempeled/

Roi: roiri@ie.technion.ac.il || https://ie.technion.ac.il/~roiri/
