# Summary
The UD Kazakh treebank is a combination of text from various sources including Wikipedia, some folk tales,
sentences from the UDHR, news and phrasebook sentences. Sentence IDs include partial document identifiers.
# Introduction
Tokenisation and morphological processing in the Kazakh UD treebank follow the principles of [Turkic lexica in Apertium](http://wiki.apertium.org/wiki/Turkic_lexicon).
The treebank was randomly split into training (80%), testing (10%), and development (10%) sets.
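A random 80/10/10 split of this kind can be sketched as follows. This is only an illustration: the procedure actually used for the treebank is not documented here, and the function name `split_sentences`, its parameters, and the fixed seed are assumptions made for the example.

```python
import random


def split_sentences(sentences, train_pct=80, test_pct=10, seed=0):
    """Shuffle sentences and split them into train/test/dev portions.

    Hypothetical helper: the percentages mirror the 80/10/10 split
    described above; the seed only makes the example reproducible.
    """
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)

    # Integer arithmetic avoids floating-point rounding in the sizes.
    n_train = len(shuffled) * train_pct // 100
    n_test = len(shuffled) * test_pct // 100

    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        # Whatever remains after train and test goes to development.
        "dev": shuffled[n_train + n_test:],
    }


if __name__ == "__main__":
    parts = split_sentences([f"sent-{i}" for i in range(100)])
    print({name: len(items) for name, items in parts.items()})
```

Assigning the remainder to the development set ensures that every sentence lands in exactly one portion, even when the total is not divisible by ten.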
# Acknowledgements
Please cite the following papers if you use the Kazakh UD treebank:
@inproceedings{tyers_tl2015,
author = {Tyers, Francis M. and Washington, Jonathan N.},
title = {Towards a Free/Open-source Universal-dependency Treebank for Kazakh},
booktitle = {3rd International Conference on Turkic Languages Processing,
(TurkLang 2015)},
pages = {276--289},
year = {2015},
}
@inproceedings{makazhan_tl2015,
author = {Makazhanov, Aibek and
Sultangazina, Aitolkyn and
Makhambetov, Olzhas and
Yessenbayev, Zhandos},
title = {Syntactic Annotation of Kazakh: Following the Universal Dependencies Guidelines. A report},
booktitle = {3rd International Conference on Turkic Languages Processing,
(TurkLang 2015)},
pages = {338--350},
year = {2015},
}
# Changelog
2018-04-15 v2.2
* Repository renamed from UD_Kazakh to UD_Kazakh-KTB.
2016-11-15 v1.4
* A first feature set has been developed.
* Added 150 more trees annotated for morpho-lexical features (in addition to POS, lemmata, and syntax).
* Several annotation errors have been fixed.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v1.3
License: CC BY-SA 4.0
Includes text: yes
Genre: wiki fiction news
Lemmas: manual native
UPOS: converted from manual
XPOS: manual native
Features: converted from manual
Relations: manual native
Contributors: Makazhanov, Aibek; Washington, Jonathan North; Tyers, Francis
Contributing: elsewhere
Contact: aibek.makazhanov@nu.edu.kz, jonathan.north.washington@gmail.com, ftyers@prompsit.com
===============================================================================