Summary

UD_Russian-Poetry contains samples of Russian poetry written from the 19th to the early 21st century. The treebank is based on the Poetry Corpus of the Russian National Corpus.

Introduction

UD_Russian-Poetry contains samples of Russian poetry written from the 19th to the early 21st century. The treebank is based on the Poetry Corpus of the Russian National Corpus (RNC; https://ruscorpora.ru/s/elRGl). The initial annotation, following the RNC/UD-ext morphological schema and the UD dependency schema, was created using the Rubic BERT-based transformer (Lyashevskaya et al. 2023) and then manually corrected. The annotations were converted into UD 2.0 format and additionally checked. The treebank also contains the original versological annotation of the RNC Poetry Corpus, covering rhyme zones and metrical properties of the verse (see the MISC column).
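For readers who want to inspect the versological annotation, the following is a minimal Python sketch. It assumes only the standard CoNLL-U layout: ten tab-separated columns per token line, with MISC as the last column holding `Key=Value` pairs separated by `|`. The function names and the sample line are illustrative, and `ExampleKey` is a placeholder, not an actual attribute name from this treebank; check the data files for the real keys.

```python
def parse_misc(misc: str) -> dict[str, str]:
    """Split a CoNLL-U MISC field into a key/value dictionary."""
    if misc == "_":  # "_" marks an empty field in CoNLL-U
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        # Items without "=" are kept with an empty value.
        pairs[key] = value if sep else ""
    return pairs


def read_misc(conllu_text: str):
    """Yield (token ID, form, MISC dict) for every token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        yield cols[0], cols[1], parse_misc(cols[9])


# Invented sample line; "ExampleKey" is a hypothetical attribute name.
sample = "1\tслово\tслово\tNOUN\t_\t_\t0\troot\t_\tExampleKey=value|SpaceAfter=No\n"
rows = list(read_misc(sample))
```

Each element of `rows` pairs a token with its MISC attributes, so the rhyme and meter attributes can be filtered by key once their actual names are known.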

Acknowledgments

We wish to thank all of the contributors to the RNC Poetry Corpus collection and annotation effort, and especially Vladimir Plungian and Kirill Korchagin.

References

  • Grišina, E. A., K. M. Korčagin, V. A. Plungjan, and D. V. Sičinava. Poetičeskij korpus v ramkax Nacional’nogo korpusa russkogo jazyka: obščaja struktura i perspektivy ispol’zovanija. In V. A. Plungjan et al. (eds.), Nacional’nyj korpus russkogo jazyka: 2006–2008. Novye rezul’taty i perspektivy (pp. 71–113). Sankt Peterburg, 2009.

  • Korčagin, K. M. Začem nužen poetičeskij korpus i kak ego ispol'zovat'. Russkaja rech, 6 (2019), pp. 113–127.

  • Lyashevskaya, O., I. Afanasev, S. Rebrikov, Ya. Shishkina, E. Suleymanova, I. Trofimov, and N. Vlasova. Disambiguation in context in the Russian National Corpus: 20 years later. In Proceedings of the International Conference “Dialogue”, vol. 22, 2023.

Changelog

  • 2023-11-15 v2.13
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.13
License: CC BY-SA 4.0
Includes text: yes
Genre: poetry
Lemmas: manual native
UPOS: manual native
XPOS: automatic
Features: manual native
Relations: manual native
Contributors: Lyashevskaya, Olga; Vlasova, Natalia; Sitchinava, Dmitri
Contributing: elsewhere
Contact: olesar@yandex.ru
===============================================================================