Mawqif: A Multi-label Arabic Dataset for Target-specific Stance Detection

This repository contains the data and classifier used in the paper "Mawqif: A Multi-label Arabic Dataset for Target-specific Stance Detection", accepted to appear at WANLP, EMNLP 2022. Link to the paper: https://aclanthology.org/2022.wanlp-1.16
Mawqif is the first Arabic dataset that can be used for target-specific stance detection.
It is a multi-label dataset: each data point is annotated for stance, sentiment, and sarcasm, so it can serve as a benchmark for all three tasks.
We benchmark the Mawqif dataset on the stance detection task and evaluate the performance of four BERT-based models. Our best model achieves a macro-F1 of 78.89%, which shows that there is ample room for improvement on this challenging task.
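For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch in plain Python, not the evaluation code from the paper: the label strings and the toy predictions are assumptions for illustration, and the paper may average over a different set of classes.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical label names (not taken from the dataset files)
y_true = ["favor", "favor", "against", "against", "none", "none"]
y_pred = ["favor", "against", "against", "against", "none", "favor"]
score = macro_f1(y_true, y_pred)
print(f"macro-F1 = {score:.4f}")
```

Because every class contributes equally, a model that ignores a minority stance class is penalized even if its overall accuracy is high.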
In addition to the annotated tweets, we also release the annotation guidelines and the code used to build a standard PyTorch Lightning pipeline for fine-tuning BERT-based models for stance detection.
Dataset on HuggingFace 🤗: https://huggingface.co/datasets/NoraAlt/Mawqif_Stance-Detection
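As a starting point, the dataset can probably be pulled with the Hugging Face `datasets` library. This is a hedged sketch: it assumes `datasets` is installed and that the repository loads with its default configuration. The helper name `load_mawqif` is hypothetical, and the split and column names are not assumed here, so inspect the returned object before relying on them.

```python
DATASET_ID = "NoraAlt/Mawqif_Stance-Detection"  # repository ID from the link above


def load_mawqif():
    """Download the dataset from the Hugging Face Hub (requires network access)."""
    # Imported lazily so this module can be imported without `datasets` installed
    from datasets import load_dataset

    return load_dataset(DATASET_ID)
```

Calling `load_mawqif()` and printing the result should list the available splits and columns, which is the safest way to discover the actual field names.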

Mawqif Statistics

This dataset consists of 4,121 tweets in multi-dialectal Arabic. Each tweet is annotated with a stance toward one of three targets: “COVID-19 vaccine,” “digital transformation,” and “women empowerment.” In addition, it is annotated with sentiment and sarcasm polarities.
The following figure illustrates the label distribution across all targets and the distribution per target.
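The same kind of distribution can be tabulated directly from the annotations. The sketch below uses a handful of made-up rows; the field names `target` and `stance` and the label strings are assumptions for illustration, not the actual column names of the released files.

```python
from collections import Counter

# Toy rows with hypothetical field names and labels (not real dataset entries)
rows = [
    {"target": "COVID-19 vaccine", "stance": "favor"},
    {"target": "COVID-19 vaccine", "stance": "against"},
    {"target": "digital transformation", "stance": "favor"},
    {"target": "women empowerment", "stance": "favor"},
    {"target": "women empowerment", "stance": "against"},
    {"target": "women empowerment", "stance": "favor"},
]

# Label counts over all targets
overall = Counter(row["stance"] for row in rows)

# Label counts per (target, stance) pair
per_target = Counter((row["target"], row["stance"]) for row in rows)

for (target, stance), count in sorted(per_target.items()):
    print(f"{target:25s} {stance:8s} {count}")
```

Replacing `stance` with the sentiment or sarcasm field would give the corresponding distributions for the other two tasks.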

Interactive Visualization

To browse an interactive visualization of the Mawqif dataset, please click here

You can click on visualization components to filter the data by target and by class. For example, you can click on "women empowerment" and "against" to see the tweets that express a stance against women empowerment.
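The same filter can be reproduced offline once the data is loaded. This is a small sketch under assumed field names (`text`, `target`, `stance`) with placeholder rows; the function `filter_tweets` is hypothetical, so adapt the keys to the actual columns.

```python
def filter_tweets(rows, target, stance):
    """Return the rows annotated with the given target and stance label."""
    return [r for r in rows if r["target"] == target and r["stance"] == stance]


# Placeholder rows; the texts are stand-ins, not real tweets from the dataset
sample = [
    {"text": "tweet A", "target": "women empowerment", "stance": "against"},
    {"text": "tweet B", "target": "women empowerment", "stance": "favor"},
    {"text": "tweet C", "target": "COVID-19 vaccine", "stance": "against"},
]

selected = filter_tweets(sample, "women empowerment", "against")
print([r["text"] for r in selected])
```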

Citation

If you find our paper and resources useful, please consider citing our work!

@inproceedings{alturayeif-etal-2022-mawqif,
    title = "Mawqif: A Multi-label {A}rabic Dataset for Target-specific Stance Detection",
    author = "Alturayeif, Nora Saleh  and
      Luqman, Hamzah Abdullah  and
      Ahmed, Moataz Aly Kamaleldin",
    booktitle = "Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.wanlp-1.16",
    pages = "174--184",
}
