Dataset from 《A Novel Framework for Detecting Cantonese Rumors Using Deep Neural Networks with Feature Fusion》

cxyccc/CR-Dataset

CR-Dataset

A Cantonese rumor dataset for the paper 《Identifying Cantonese Rumors with Discriminative Feature Integration in online Social Networks》

We developed a web crawler to collect Cantonese tweets from Twitter and annotated the collected tweets to construct this dataset. The dataset contains 27,328 tweets: 13,883 rumors and 13,445 non-rumors.

Collection

CR-Dataset-example.csv: This file provides the source tweets and their labels, one tweet per row, in the format 'text, label'.

Example:

Text Label
今日警察已經光明正大喬裝示威者到處破壞而唔需要隱藏!光明正大拿槍對準市民 #929globalantitotalitarianism #hkpolicebrutality https://t.co/OURXJZyfag 1
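A minimal sketch of reading a file in the 'text, label' format described above, using only the Python standard library. It assumes the CSV has a header row named as in the example (Text, Label), that tweets containing commas are quoted, and that labels are integers. The function names `load_examples`, `load_file`, and `label_counts` are hypothetical helpers, not part of this repository.

```python
import csv
from collections import Counter
from typing import Iterable, List, Tuple


def load_examples(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse CSV lines with a header row into (text, label) pairs."""
    reader = csv.DictReader(lines)
    examples = []
    for row in reader:
        # Header capitalization is an assumption, so normalize the keys.
        normalized = {key.strip().lower(): value for key, value in row.items() if key}
        examples.append((normalized["text"], int(normalized["label"].strip())))
    return examples


def load_file(path: str) -> List[Tuple[str, int]]:
    """Read a UTF-8 CSV file such as CR-Dataset-example.csv."""
    with open(path, newline="", encoding="utf-8") as handle:
        return load_examples(handle)


def label_counts(examples: List[Tuple[str, int]]) -> Counter:
    """Count how many tweets carry each label value."""
    return Counter(label for _, label in examples)
```

Calling `label_counts(load_file("CR-Dataset-example.csv"))` would then give the number of tweets per label value in the example file, provided the assumptions above hold.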

Data collection periods:

  • February 2020 - April 2020
  • November 2021 - December 2021

Annotation

Reports from three sources are treated as the factual basis for annotation:

Data annotation strictly follows the reports from the above three sources and does not involve personal political positions. This dataset is for academic research only.

Citation

If you find this dataset useful, please cite our paper.
