MickeysClubhouse/COVID-19-rumor-dataset

File Structure

├── Data
│   ├── en_dup.csv
│   ├── news
│   └── twitter
├── Data Analysis
│   └── PowerLaw Analysis.pdf
├── Data Collecting
│   └── snopes.py
├── LICENSE
└── README.md

Data Collecting

  • snopes.py by Tianqi
    • Collects data from the websites www.snopes.com and qc.wa.news.cn (deprecated).

Data Analysis

  • PowerLaw Analysis.pdf

Data

  • news

    • news.csv (4129 records) and a subfolder for each news item.
    • Number of subfolder records: 3936.
  • twitter

    • Twitter.csv (2705 records) and a subfolder for each tweet.
    • Number of subfolder records: 1383.
  • en_dup.csv

    • Unprocessed data containing both news and twitter records.
    • Number of records: 7179 (including duplicates).
    • Part of the data was collected manually by keyword search from sources such as twitter.com.
    • Data from www.snopes.com and qc.wa.news.cn was collected with snopes.py.
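
Because en_dup.csv is unprocessed and contains duplicates, a quick sanity check after downloading can be useful. The following is a minimal sketch, not part of this repository: the helper names (`read_rows`, `unique_rows`, `count_subfolders`) are hypothetical, UTF-8 encoding is an assumption, and "duplicate" here means an exactly identical row, which may differ from how duplication was defined when the dataset was built. No column names are assumed.

```python
import csv
from pathlib import Path


def read_rows(csv_path, encoding="utf-8"):
    """Read every row of a CSV file as a tuple of strings.

    Assumption: the file is UTF-8 encoded. Any header row is returned
    like any other row, since the column layout is not assumed here.
    """
    with open(csv_path, newline="", encoding=encoding) as handle:
        return [tuple(row) for row in csv.reader(handle)]


def unique_rows(rows):
    """Drop exactly identical rows, keeping first occurrences in order."""
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


def count_subfolders(directory):
    """Count the immediate subdirectories of a folder (e.g. Data/news)."""
    return sum(1 for entry in Path(directory).iterdir() if entry.is_dir())
```

For example, from the repository root, `len(read_rows("Data/en_dup.csv"))` and `len(unique_rows(read_rows("Data/en_dup.csv")))` can be compared against the record count listed above, and `count_subfolders("Data/news")` against the subfolder count for news.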

Acknowledgement

  • We thank Tianqi, Wenshuo, Jianni, Xiaofeng, and Hanlong for rumor data collection and labeling.

Cite Us

Cheng, Mingxi, et al. "A COVID-19 Rumor Dataset." Frontiers in Psychology 12 (2021): 1566.
@article{cheng2021covid,
  title={A COVID-19 Rumor Dataset},
  author={Cheng, Mingxi and Wang, Songli and Yan, Xiaofeng and Yang, Tianqi and Wang, Wenshuo and Huang, Zehao and Xiao, Xiongye and Nazarian, Shahin and Bogdan, Paul},
  journal={Frontiers in Psychology},
  volume={12},
  pages={1566},
  year={2021},
  publisher={Frontiers}
}
Link to paper: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.644801/full
