UniversalDependencies/UD_Korean-Kaist

Summary

The KAIST Korean Universal Dependency Treebank was generated by Chun et al. (2018) from the constituency trees in the KAIST Tree-Tagging Corpus.

Acknowledgments

This treebank is a collaborative work by (in alphabetical order):

  • Jinho Choi, Emory University
  • Jayeol Chun, Emory University
  • Na-Rae Han, University of Pittsburgh
  • Jena D. Hwang, Institute for Human & Machine Cognition

The project repository: https://github.com/emorynlp/ud-korean

Citation

  • Building Universal Dependency Treebanks in Korean, Jayeol Chun, Na-Rae Han, Jena D. Hwang, and Jinho D. Choi. In Proceedings of the 11th International Conference on Language Resources and Evaluation, LREC'18, Miyazaki, Japan, 2018.

Changelog

  • 2022-11-15 v2.11
    • Fixed right-headed apposition and non-projective punctuation.
    • Symbols after numbers are units, not punctuation.
    • Nouns cannot be attached as mark.
    • Fixed: adverbially used nominals are obl, not advmod.
    • Fixed: adverbially used verbs are advcl, not advmod.
    • The positive copula is always lemmatized as 이 so that the validator recognizes it.
    • Fixed auxiliaries.
    • Fixed: function words should be leaves.
  • 2018-04-15 v2.2
    • First release by Chun et al., 2018.
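
Two of the v2.11 changelog rules ("Nouns cannot be attached as mark" and "function words should be leaves") can be illustrated with a small check over CoNLL-U data. The following is a minimal sketch, not the official UD validator: the helper names `parse_conllu` and `check_sentence`, the `FUNCTION_RELS` set, and the placeholder sentence are all assumptions made for illustration, and the leaf check is stricter than the real validator, which permits some dependents of function words.

```python
# Illustrative sketch only; not part of the treebank or the UD tooling.
# CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC

# Assumed subset of relations that mark function words.
FUNCTION_RELS = {"aux", "cop", "mark", "case", "cc"}


def parse_conllu(text):
    """Yield sentences as lists of token dicts.

    Comment lines, multiword-token ranges and empty nodes are skipped.
    """
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    if sentence:
        yield sentence


def check_sentence(tokens):
    """Return (token id, message) pairs for the two simplified rules."""
    problems = []
    heads = {t["head"] for t in tokens}  # ids that have at least one dependent
    for t in tokens:
        rel = t["deprel"].split(":")[0]  # drop any subtype such as "aux:pass"
        if rel == "mark" and t["upos"] in {"NOUN", "PROPN"}:
            problems.append((t["id"], "noun attached as mark"))
        if rel in FUNCTION_RELS and t["id"] in heads:
            problems.append((t["id"], "function word has dependents"))
    return problems


# Made-up placeholder sentence (w1..w3 are not real treebank tokens).
SAMPLE_ROWS = [
    ["1", "w1", "w1", "NOUN", "_", "_", "3", "mark", "_", "_"],
    ["2", "w2", "w2", "ADV", "_", "_", "1", "advmod", "_", "_"],
    ["3", "w3", "w3", "VERB", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS) + "\n"

problems = [p for sent in parse_conllu(SAMPLE) for p in check_sentence(sent)]
print(problems)
```

In the placeholder sentence, token 1 is flagged twice: it is a noun attached as mark, and it has a dependent despite carrying a function-word relation. For real validation, the official UD validation tooling remains the reference.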
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.2
License: CC BY-SA 4.0
Includes text: yes
Genre: news fiction academic
Lemmas: converted from manual
UPOS: converted from manual
XPOS: converted from manual
Features: not available
Relations: converted from manual
Contributors: Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol
Contributing: here
Contact: jinho.choi@emory.edu
===============================================================================
(Original treebank contributors: Choi, Key-Sun)