-
This repository contains the data and classifier used in the paper "Mawqif: A Multi-label Arabic Dataset for Target-specific Stance Detection", accepted to appear at WANLP, EMNLP 2022. Link to the paper: https://aclanthology.org/2022.wanlp-1.16
-
Mawqif is the first Arabic dataset that can be used for target-specific stance detection.
-
This is a multi-label dataset: each data point is annotated for stance, sentiment, and sarcasm, so it can serve as a benchmark for all three tasks.
-
We benchmark the Mawqif dataset on the stance detection task and evaluate the performance of four BERT-based models. Our best model achieves a macro-F1 of 78.89%, which shows that there is ample room for improvement on this challenging task.
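For readers unfamiliar with the metric, the following is a minimal sketch of how a macro-F1 score can be computed: per-class F1 is calculated independently and then averaged with equal weight per class. The function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions, not taken from the released code, and the paper may average over a different set of classes than this sketch does by default.

```python
def macro_f1(gold, pred, labels=None):
    """Unweighted mean of per-class F1 scores.

    `labels` restricts the classes that are averaged; by default every
    class seen in `gold` or `pred` is included.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if labels is None:
        labels = sorted(set(gold) | set(pred))

    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (hypothetical labels, not real Mawqif annotations).
gold = ["favor", "against", "none", "favor"]
pred = ["favor", "against", "favor", "favor"]

score_all = macro_f1(gold, pred)
score_two = macro_f1(gold, pred, labels=["favor", "against"])
```

Passing `labels` explicitly is useful because some stance detection benchmarks average F1 over only the "favor" and "against" classes; check the paper for the exact convention before comparing numbers.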
-
In addition to the annotated tweets, we also release the annotation guidelines and the code used to build a standard pipeline under the PyTorch Lightning framework to fine-tune BERT-based models for stance detection.
-
Dataset on HuggingFace 🤗: https://huggingface.co/datasets/NoraAlt/Mawqif_Stance-Detection
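As a rough sketch, the dataset can probably be loaded with the Hugging Face `datasets` library using the repository ID from the URL above. This assumes `datasets` is installed and that the repository loads without extra configuration; the helper name `load_mawqif` is hypothetical, and the split and column names are not assumed here, so inspect the returned object before relying on them.

```python
REPO_ID = "NoraAlt/Mawqif_Stance-Detection"  # taken from the URL above


def load_mawqif(repo_id=REPO_ID):
    """Download the dataset from the Hugging Face Hub (needs network access).

    Assumption: the repository can be loaded with default settings.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id)
```

Calling `load_mawqif()` and printing the result should list the available splits and columns, which is the safest way to discover the actual schema.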
-
This dataset consists of 4,121 tweets in multi-dialectal Arabic. Each tweet is annotated with a stance toward one of three targets: “COVID-19 vaccine,” “digital transformation,” and “women empowerment.” In addition, it is annotated with sentiment and sarcasm polarities.
-
The following figure illustrates the labels’ distribution across all targets, and the distribution per target.
To browse an interactive visualization of the Mawqif dataset, please click here
- You can click on visualization components to filter the data by target and by class. For example, you can click on "women empowerment" and "against" to see the tweets that express a stance against women empowerment.
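The same kind of filtering can be sketched offline in a few lines of Python. The `Tweet` record, its field names, the label strings, and the placeholder rows below are all assumptions made for illustration; they are not the dataset's actual schema or contents.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    # Hypothetical field names mirroring the three annotation layers.
    text: str
    target: str
    stance: str
    sentiment: str
    sarcasm: str


def filter_tweets(tweets, target=None, stance=None):
    """Keep tweets matching the given target and/or stance (None = no filter)."""
    return [
        t for t in tweets
        if (target is None or t.target == target)
        and (stance is None or t.stance == stance)
    ]


def stance_counts(tweets):
    """Count how many tweets carry each stance label."""
    return Counter(t.stance for t in tweets)


# Placeholder rows only; these are NOT real Mawqif tweets.
tweets = [
    Tweet("placeholder 1", "women empowerment", "against", "negative", "no"),
    Tweet("placeholder 2", "women empowerment", "favor", "positive", "no"),
    Tweet("placeholder 3", "COVID-19 vaccine", "against", "negative", "yes"),
]

selected = filter_tweets(tweets, target="women empowerment", stance="against")
per_stance = stance_counts(filter_tweets(tweets, target="women empowerment"))
```

Here `selected` holds the single placeholder row that is both about "women empowerment" and labeled "against", and `per_stance` gives the stance distribution for that target.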
If you find our paper and resources useful, please consider citing our work!
@inproceedings{alturayeif-etal-2022-mawqif,
title = "Mawqif: A Multi-label {A}rabic Dataset for Target-specific Stance Detection",
author = "Alturayeif, Nora Saleh and
Luqman, Hamzah Abdullah and
Ahmed, Moataz Aly Kamaleldin",
booktitle = "Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates (Hybrid)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wanlp-1.16",
pages = "174--184",
}