UD_Old_Russian-TOROT is a conversion of a selection of the Old East Slavonic and Middle Russian data in the Tromsø Old Russian and OCS Treebank (TOROT), which was originally annotated in PROIEL dependency format.
UD_Old_Russian-TOROT is a conversion of a selection of the Old East Slavonic and Middle Russian data in the Tromsø Old Russian and OCS Treebank (TOROT), which is maintained at UiT The Arctic University of Norway. The treebank is manually annotated, with some automatic preprocessing, in the PROIEL dependency format. New texts are still being added. The treebank contains texts from a variety of mediaeval and early modern genres, such as chronicles, legal documents, lives of saints and correspondence. Treebank releases are available from https://github.com/torottreebank/treebank-releases. This conversion is based on the 20190505 release.
The test set consists of Afanasij Nikitin 5 and 19; birchbark letters 497, 502 and 902; Russkaja pravda 1-6; Novgorod First Chronicle (Synodal ms.), entries for 6642-6646; Primary Chronicle (Laurentian ms.), Introduction, the entry for 6463 and the Instruction of Vladimir Monomakh; Suzdal Chronicle (Laurentian ms.), entry for 6657; Zadonshchina; Tale of the Fall of Constantinople, chapters 1-2; Uspenskij sbornik, Life of Feodosij Pečerskij 25-26.
The development set consists of Afanasij Nikitin 6 and 20; birchbark letters 644 and 682; Russkaja pravda 8-9; Novgorod First Chronicle (Synodal ms.), entries for 6675-6681 and 6717; Primary Chronicle (Laurentian ms.), the two entries for 6453 and the entry for 6494; Suzdal Chronicle (Laurentian ms.), entries for 6659 and 662; The Tale of Dracula; Tale of the Fall of Constantinople, chapter 3; Uspenskij sbornik, Life of Feodosij Pečerskij 27-28.
The training set consists of larger, continuous excerpts from Afanasij Nikitin, Russkaja pravda, Novgorod First Chronicle (Synodal ms.), Primary Chronicle (Laurentian ms.), Suzdal Chronicle (Laurentian ms.), Tale of the Fall of Constantinople and Uspenskij sbornik, as well as a further selection of birchbark letters and The Tale of Igor's Campaign.
The conversion was performed using a script written by Dag Haug and modified by Hanne Eckhoff. The texts were annotated and reviewed by a brilliant team of annotators, who are acknowledged in the original treebank files and who deserve to be thanked profusely.
Hanne Martine Eckhoff and Aleksandrs Berdičevskis. 2015. 'Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank'. Scripta & e-scripta 14–15, pp. 9-25.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.4
License: CC BY-NC-SA 3.0
Includes text: yes
Genre: nonfiction legal
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: converted from manual
Relations: converted from manual
Contributors: Eckhoff, Hanne
Contributing: elsewhere
Contact: firstname.lastname@example.org
===============================================================================