ZotovaElena/Multilingual-Stance-Detection

Multilingual-Stance-Detection

Catalonia Independence Corpus

  • Two versions of the datasets in Spanish (CIC-ES and CIC-Random-ES) and Catalan (CIC-CA and CIC-Random-CA), consisting of annotated Twitter messages for automatic stance detection. The data were collected over 12 days in February and March 2019 from tweets posted in Barcelona, and during September 2018 from tweets posted in the town of Terrassa, Catalonia. The corpus is annotated with three classes, AGAINST, FAVOR and NONE, which express the stance towards the target: the independence of Catalonia. Each dataset is split into train, validation and test sets in a 60/20/20 ratio.

  • LM Models trained on:

    • IberEval 2018 dataset
    • SemEval Task 6A dataset
    • CIC Corpus
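The datasets are distributed already split, so the following is only a minimal sketch of what a 60/20/20 train/validation/test partition over the three stance labels looks like. The function name `split_60_20_20`, the seed, and the toy tweets are hypothetical and not taken from the repository; the authors' actual splitting procedure may differ.

```python
import random

# The three stance classes used in the corpus annotation.
LABELS = ("AGAINST", "FAVOR", "NONE")


def split_60_20_20(examples, seed=0):
    """Shuffle examples and split them into train/validation/test (60/20/20).

    Hypothetical helper for illustration; not the authors' code.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)  # deterministic shuffle for a given seed
    n_train = len(items) * 60 // 100    # integer arithmetic avoids float rounding
    n_val = len(items) * 20 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test


# Toy (text, label) pairs standing in for annotated tweets.
toy = [(f"tweet {i}", LABELS[i % 3]) for i in range(100)]
train, val, test = split_60_20_20(toy)
print(len(train), len(val), len(test))  # 60 20 20
```

A plain random split like this does not guarantee identical class proportions in each subset; a stratified split would be needed for that.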

Cite

@inproceedings{zotova-etal-2020-multilingual,
    title = "Multilingual Stance Detection in Tweets: The {C}atalonia Independence Corpus",
    author = "Zotova, Elena and Agerri, Rodrigo and Nu{\~n}ez, Manuel and Rigau, German",
    booktitle = "Proceedings of The 12th Language Resources and Evaluation Conference",
    month = may,
    year = "2020",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://www.aclweb.org/anthology/2020.lrec-1.171",
    pages = "1368--1375",
}

About

Catalonia Independence Corpus and ML models
