dpgmedia/partisan-news2019

DpgMedia2019: A Dutch News Dataset for Partisanship Detection

Citation

If you use this dataset, please cite the following paper:

@misc{1908.02322,
  Author = {Chia-Lun Yeh and Babak Loni and Mariëlle Hendriks and Henrike Reinhardt and Anne Schuth},
  Title = {DpgMedia2019: A Dutch News Dataset for Partisanship Detection},
  Year = {2019},
  Eprint = {arXiv:1908.02322},
}

Description

This dataset contains Dutch news articles from DPG Media, annotated for partisanship at two levels. The collection and annotation process is described in detail at URL

Short description of the files:

The first part was annotated at the publisher level:

The second part was annotated at the article level:

  • dpgMedia2019-articles-byarticle.jsonl: news articles (id, title, text, published_at)
  • dpgMedia2019-labels-byarticle.jsonl: labels derived from survey (id, partisan, publisher)
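
The article-level files can be combined through their shared id field. The following is a minimal sketch using only the Python standard library; the helper names read_jsonl and join_articles_with_labels are hypothetical, and the sketch assumes that each line of the .jsonl files is one JSON object carrying the fields listed above and that id values have the same type in both files.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def join_articles_with_labels(articles, labels):
    """Attach `partisan` and `publisher` to each article via the shared `id`.

    Assumption: `id` identifies the same article in both files.
    Articles without a matching label are skipped in this sketch.
    """
    labels_by_id = {label["id"]: label for label in labels}
    joined = []
    for article in articles:
        label = labels_by_id.get(article["id"])
        if label is None:
            continue
        joined.append(
            {
                **article,
                "partisan": label["partisan"],
                "publisher": label["publisher"],
            }
        )
    return joined
```

A typical call would pass the result of read_jsonl("dpgMedia2019-articles-byarticle.jsonl") and read_jsonl("dpgMedia2019-labels-byarticle.jsonl") to join_articles_with_labels. The value type of partisan is not specified here, so the sketch copies it through unchanged.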

In addition, the raw survey data is included:

  • annotations.csv:
    • RespondentId: unique to an annotator
    • ArticleId: unique to an article but only used internally in the survey
    • ExternalId: unique to an article; maps to the id in the jsonl files. Some articles have ExternalId 0, meaning that we were ultimately unable to find the article.
    • Partisanship: please refer to the paper for detail
    • Polarity of partisanship: please refer to the paper for detail
    • Pro-, anti-: please refer to the paper for detail
  • annotatorData.csv:
    • RespondentId: unique to an annotator
    • Gender
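    • Political_standpoint: answer to the question "Hoe zou u uw eigen politieke standpunt bepalen?" ("How would you define your own political standpoint?")
    • Vote_for_party: answer to the question "Stel dat u binnenkort gaat stemmen, op welke partij gaat u dan stemmen?" ("Suppose you were to vote soon, which party would you vote for?")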
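
The two survey files share the RespondentId column, so annotations can be enriched with annotator attributes. The sketch below is one possible way to do this with the standard csv module. It assumes the CSV header names match the column names listed above exactly and that the files are comma-delimited; neither is confirmed here. All function names are hypothetical.

```python
import csv


def read_csv_rows(path, delimiter=","):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def usable_annotations(annotation_rows):
    """Keep only rows whose ExternalId maps to a real article.

    Rows with ExternalId 0 (article not found) or an empty value are dropped.
    """
    return [
        row
        for row in annotation_rows
        if str(row["ExternalId"]).strip() not in ("", "0")
    ]


def attach_annotator_info(annotation_rows, annotator_rows):
    """Merge annotator attributes into each annotation via RespondentId.

    Annotations whose respondent is missing from annotatorData.csv are kept
    without the extra columns.
    """
    annotators = {row["RespondentId"]: row for row in annotator_rows}
    merged = []
    for row in annotation_rows:
        info = annotators.get(row["RespondentId"], {})
        merged.append({**info, **row})
    return merged
```

For example, the output of read_csv_rows("annotations.csv") can be filtered with usable_annotations and then passed, together with read_csv_rows("annotatorData.csv"), to attach_annotator_info.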

Finally,

  • dpgMedia2019-articles-byarticle-all.jsonl: all articles that appeared in the survey, so that the article texts are available if you want to compute new labels from the raw annotations
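
As an illustration of computing new labels from the raw survey data, the sketch below takes a simple majority vote per article over the Partisanship column. This is not the labeling procedure used for the released labels (see the paper for that); it is only one possible aggregation. The function name majority_labels is hypothetical, the answer values are treated as opaque strings, and the rows are assumed to be dicts keyed by the column names of annotations.csv.

```python
from collections import Counter, defaultdict


def majority_labels(annotation_rows, column="Partisanship"):
    """Return {ExternalId: most frequent answer}, or None on a tie.

    Rows with ExternalId 0 or an empty ExternalId are ignored because they
    cannot be mapped to an article in the jsonl files.
    """
    votes = defaultdict(Counter)
    for row in annotation_rows:
        external_id = str(row["ExternalId"]).strip()
        if external_id in ("", "0"):
            continue
        votes[external_id][row[column]] += 1

    labels = {}
    for external_id, counter in votes.items():
        top_two = counter.most_common(2)
        if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
            labels[external_id] = None  # tie between the leading answers
        else:
            labels[external_id] = top_two[0][0]
    return labels
```

The resulting keys are strings, so they may need converting before being matched against the id field of dpgMedia2019-articles-byarticle-all.jsonl, depending on how ids are stored there.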
