Summary

UD Gheg Pear Stories (GPS) contains re-narrations of Wallace Chafe's Pear Stories video (pearstories.org) by heritage speakers of Gheg Albanian living in Switzerland and by speakers from Prishtina.

Introduction

UD Gheg GPS contains 966 sentences from 64 recordings of Gheg speakers re-narrating the Pear Stories video. Data collection was part of a larger project that took place from May 2019 to July 2022 in Zurich, Prishtina and Munich. Only recordings from Prishtina and Zurich were included in the treebank. Speakers from three generations were interviewed, ranging in age from 10 to 67. Sentence ids encode the location (P for Prishtina, Z for Zurich), the generation (G1, G2, G3) and a unique speaker id, all separated by hyphens, followed by an underscore and the sentence number, which starts at 1 for each interview. Due to the multilingual setting, the treebank contains many instances of code-switching (mostly Swiss German). It also exhibits characteristics of (semi-)spontaneous speech, such as disfluencies and corrections.
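
The sentence id scheme described above can be unpacked with a small helper. The following is a minimal sketch, not part of the treebank tooling: the function name `parse_sent_id`, the `SentenceId` type and the example id `Z-G2-07_12` are hypothetical, and the exact shape of the speaker id is an assumption (here: any run of characters without an underscore).

```python
import re
from typing import NamedTuple


class SentenceId(NamedTuple):
    location: str    # "Prishtina" or "Zurich"
    generation: str  # "G1", "G2" or "G3"
    speaker: str     # unique speaker id, kept as a string
    sentence: int    # sentence number within the interview, starting at 1


# Assumed layout: <P|Z>-<G1|G2|G3>-<speaker>_<number>.
# The speaker part is a guess: anything that does not contain an underscore.
_SENT_ID = re.compile(
    r"^(?P<loc>[PZ])-(?P<gen>G[123])-(?P<speaker>[^_]+)_(?P<num>\d+)$"
)

LOCATIONS = {"P": "Prishtina", "Z": "Zurich"}


def parse_sent_id(sent_id: str) -> SentenceId:
    """Split a sentence id into location, generation, speaker and number."""
    m = _SENT_ID.match(sent_id)
    if m is None:
        raise ValueError(f"unexpected sentence id: {sent_id!r}")
    return SentenceId(
        location=LOCATIONS[m["loc"]],
        generation=m["gen"],
        speaker=m["speaker"],
        sentence=int(m["num"]),
    )


# Hypothetical id, used only to illustrate the layout.
print(parse_sent_id("Z-G2-07_12"))
```

Grouping sentences by the parsed `location` or `generation` fields is one way to compare, for example, heritage speakers in Zurich with speakers in Prishtina.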

The treebank contains 16k tokens and was not split into training and test sets.

Acknowledgments

  • Artan Islamaj, Adrian Kuqi: Annotation
  • Christian Ebert: Treebank construction, validation and annotation supervision
  • Barbara Sonnenhauser, Paul Widmer: Project supervision
  • Magdalena Plamada: Technical support

The project was funded by the SNSF grant No. 100015L_182126/1.

References

  • (citation)

Changelog

  • 2022-11-15 v2.11
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.11
License: CC BY-SA 4.0
Includes text: yes
Genre: spoken
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: manual native
Relations: manual native
Contributors: Ebert, Christian; Islamaj, Artan; Kuqi, Adrian; Sonnenhauser, Barbara; Widmer, Paul; Plamada, Magdalena
Contributing: here
Contact: christiangeorg.ebert@uzh.ch, barbara.sonnenhauser@uzh.ch, paul.widmer@uzh.ch
===============================================================================