UniversalDependencies/UD_Gothic-PROIEL

Summary

The UD Gothic treebank is based on the Gothic data from the PROIEL treebank, and consists of Wulfila's Bible translation.

Introduction

The UD Gothic treebank is based on the Gothic data in the PROIEL treebank, which is maintained at the Department of Philosophy, Classics, History of Arts and Ideas at the University of Oslo. The conversion is based on the 20180408 release of the PROIEL treebank, available from https://github.com/proiel/proiel-treebank/releases. The original annotators are acknowledged in the files available there. The conversion code is released as part of the PROIEL command-line interface, the Ruby gem proiel-cli.

The treebank contains the text of Wulfila's Bible (New Testament) translation. The original annotation guidelines are available at http://folk.uio.no/daghaug/syntactic_guidelines.pdf. The text and tokenization come from the Wulfila project.

Acknowledgements

The data have been automatically converted to the UD scheme by Dag Haug. Thanks to all the original annotators!

Data splits

The development set consists of Matthew 5 and 6, Mark 5 and 6, Luke 4, 7, 8 and 18, John 10, 11, 17 and 19, Romans 7 and 8, 1 Corinthians 10, 2 Corinthians 5, Galatians 1, Ephesians 1, 1 Timothy 2, Philemon 1 and Colossians 3. The test set consists of Matthew 7 and 8, Mark 7, 8 and 15, Luke 5, 9, 10 and 19, John 12, 13 and 18, Romans 10 and 11, 1 Corinthians 11, 2 Corinthians 8, Galatians 2, Ephesians 2, 1 Timothy 3, Philemon 2 and Colossians 2.
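
For readers who want to check which split a chapter belongs to, the chapter lists above can be written down as a small lookup table. This is only a sketch: the names `DEV`, `TEST` and `split_of` are hypothetical, the book names are spelled as in this README rather than as they appear in the data files, and treating every unlisted chapter as "other" (presumably the remaining data) is an assumption.

```python
# Chapter lists copied from the "Data splits" paragraph of this README.
# Book names follow the README spelling, not any identifier used in the data.
DEV = {
    "Matthew": {5, 6}, "Mark": {5, 6}, "Luke": {4, 7, 8, 18},
    "John": {10, 11, 17, 19}, "Romans": {7, 8}, "1 Corinthians": {10},
    "2 Corinthians": {5}, "Galatians": {1}, "Ephesians": {1},
    "1 Timothy": {2}, "Philemon": {1}, "Colossians": {3},
}
TEST = {
    "Matthew": {7, 8}, "Mark": {7, 8, 15}, "Luke": {5, 9, 10, 19},
    "John": {12, 13, 18}, "Romans": {10, 11}, "1 Corinthians": {11},
    "2 Corinthians": {8}, "Galatians": {2}, "Ephesians": {2},
    "1 Timothy": {3}, "Philemon": {2}, "Colossians": {2},
}


def split_of(book: str, chapter: int) -> str:
    """Return 'dev', 'test' or 'other' for a (book, chapter) pair."""
    if chapter in DEV.get(book, set()):
        return "dev"
    if chapter in TEST.get(book, set()):
        return "test"
    # Assumption: chapters not listed above belong to the remaining data.
    return "other"
```

For example, `split_of("Mark", 15)` returns `"test"` under these lists.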

References

Dag T. T. Haug and Marius L. Jøhndal. 2008. 'Creating a Parallel Treebank of the Old Indo-European Bible Translations'. In Caroline Sporleder and Kiril Ribarov (eds.), Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008), pp. 27-34.

Changelog

  • 2022-05-15 UD 2.10
    • MISC ref= changed to Ref= to match other UD treebanks.

  • V2.2
    • Repository renamed from UD_Gothic to UD_Gothic-PROIEL.
  • V2.0
    • The treebank was converted to UDv2 and the data splits were changed.
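
The `Ref=` attribute mentioned in the changelog lives in the MISC column, which is the tenth tab-separated column of a CoNLL-U token line and holds `Key=Value` pairs separated by `|`. A minimal sketch of reading it follows; the helper names `parse_misc` and `token_ref` are hypothetical, and the fallback to lowercase `ref` is only there as an assumption about how files from before UD 2.10 might be handled.

```python
from typing import Dict, Optional


def parse_misc(misc: str) -> Dict[str, str]:
    """Split a CoNLL-U MISC value into a dict; '_' means the column is empty."""
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""
    return pairs


def token_ref(conllu_line: str) -> Optional[str]:
    """Return the Ref value of a token line, or None if there is none."""
    cols = conllu_line.rstrip("\n").split("\t")
    if len(cols) != 10:
        # Comment lines ("# ...") and blank lines are not token lines.
        return None
    misc = parse_misc(cols[9])
    # Assumption: accept the pre-2.10 lowercase key as a fallback.
    return misc.get("Ref", misc.get("ref"))
```

The value format after `Ref=` is not described in this README, so the sketch returns it as an opaque string.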

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v1.2
License: CC BY-NC-SA 3.0
Includes text: yes
Genre: bible
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: converted from manual
Relations: converted from manual
Contributors: Haug, Dag
Contributing: elsewhere
Contact: daghaug@ifikk.uio.no
