Data set name: PolStance
Citation (if available): Lehmann, R., & Derczynski, L. (2019). Political Stance in Danish. In Proceedings of the 22nd Nordic Conference on Computational Linguistics (pp. 197-207).
Data set developer(s): Rasmus Lehmann
Data statement author(s): Leon Derczynski
Others who contributed to this document:
This dataset contains quotes by Danish politicians curated to capture their opinions on various political issues. The goal is to build data-driven systems for automatic analysis of political sentiment.
- BCP-47 language tag: da-DK
- Language variety description: Standard Danish
- Description: Danish politicians sitting in parliament (Folketinget)
- Age: 25-70
- Gender: Mixed. 52 males and 38 females (58%/42%)
- Race/ethnicity (according to locally appropriate categories): Mixed, mostly white with Scandinavian background.
- First language(s): Danish
- Socioeconomic status: Privileged; minimum 56,494.17 DKK per month (about 8,470 USD).
- Number of different speakers represented: 63
- Presence of disordered speech: Quotes are mostly curated, so not prevalent.
- Description: One L1 speaker annotator, supervised by one L2 speaker
- Age: 25-35
- Gender: Male
- Race/ethnicity (according to locally appropriate categories): White northern European
- First language(s): primary annotator L1 da-DK
- Training in linguistics/other relevant discipline: master's student in NLP; bachelor's degree in Communications
- Description: Public statements made by politicians in the Danish parliament during debate or discussion, during verbal interviews or in writing, transcribed and then published in edited newswire
- Time: 2018
- Place: Denmark
- Modality (spoken/signed, written): Spoken
- Scripted/edited vs. spontaneous: Mixture
- Synchronous vs. asynchronous interaction: Mixture
- Intended audience: Danish voters
Politicians talking about politics and policy while doing their job, addressing the public.
Quotes are expected to be verbatim, but may have passed through several editorial hands before publication.
Originally taken from Ritzau; quotes are short enough to constitute "reasonable use" (see Ophavsret).
Based on the worksheets distributed at the 2020 LREC workshop on Data Statements, by Emily M. Bender, Batya Friedman, and Angelina McMillan-Major. Adapted to Markdown by Leon Derczynski.