
Summary

The UD Turkish Treebank, also called the IMST-UD Treebank, is a semi-automatic conversion of the IMST Treebank (Sulubacak et al., 2016).

Introduction

The UD Turkish Treebank, also called the IMST-UD Treebank, is a semi-automatic conversion of the IMST Treebank (Sulubacak et al., 2016), which is itself a reannotated version of the METU-Sabancı Turkish Treebank (Oflazer et al., 2003). All three treebanks share the same raw data: a set of 5,635 sentences collected from daily news reports and novels.

Acknowledgments

This treebank follows a set of morphosyntactic annotation guidelines based on those established by Çağrı Çöltekin, and later revised and restructured by Memduh Gökırmak, Francis Tyers, and Umut Sulubacak. The conversion from the IMST Treebank was done by Umut Sulubacak. The contributors would also like to thank Birsel Karakoç, Hüner Kaşıkara, and Tuğba Pamay for their discussions and insights.

Changelog

  • UD 2.4
    • Moved a few sentences between the data splits so that the dev and test sets each contain over 10K words again.
  • UD 2.2
    • Repository renamed from UD_Turkish to UD_Turkish-IMST.
  • UD 2.1
    • No change.
  • UD 2.0
    • Conversion to UD v2 guidelines.
  • UD 1.4
    • Fixed annotation and spelling mistakes in generated forms of multiword tokens.
  • UD 1.3
    • First release in UD.
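
The sentence and word counts mentioned above can be checked with a short script. The following is a minimal sketch, not part of the official UD tooling: it assumes the data is in standard CoNLL-U format (tab-separated columns, `#` comment lines, blank lines between sentences) and counts only syntactic words, skipping multiword token ranges such as `1-2` and empty nodes such as `1.1`. The function name `count_conllu` and the sample lines are made up for illustration.

```python
def count_conllu(lines):
    """Return (sentences, words) for an iterable of CoNLL-U lines.

    Only lines whose ID column is a plain integer are counted as words;
    multiword token ranges ("1-2") and empty nodes ("1.1") are skipped.
    """
    sentences = 0
    words = 0
    in_sentence = False
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment (sent_id, text, ...).
            continue
        in_sentence = True
        token_id = line.split("\t", 1)[0]
        if token_id.isdigit():
            words += 1
    if in_sentence:
        # The file did not end with a blank line.
        sentences += 1
    return sentences, words


# Illustrative input only; these are not sentences from the treebank.
sample = [
    "# sent_id = s1",
    "1-2\tab\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\ta\ta\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tb\tb\tAUX\t_\t_\t1\tcop\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
    "# sent_id = s2",
    "1\tx\tx\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
]
print(count_conllu(sample))
```

To run it on a real file, pass an open file handle, for example `count_conllu(open("path/to/dev.conllu", encoding="utf-8"))`, where the path is a placeholder for wherever the split is stored locally.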

=== Machine-readable metadata =================================================
Data available since: UD v1.3
License: CC BY-NC-SA
Includes text: yes
Genre: nonfiction news
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: converted from manual
Relations: converted from manual
Contributors: Çöltekin, Çağrı; Cebiroğlu Eryiğit, Gülşen; Gökırmak, Memduh; Kaşıkara, Hüner; Sulubacak, Umut; Tyers, Francis
Contributing: elsewhere
Contact: memduhg@gmail.com
