Summary

UD Korean-LittlePrince is a UD adaptation of the k-SNACS dataset (Hwang et al. 2020).

Introduction

UD Korean-LittlePrince is a UD adaptation of the k-SNACS dataset (Hwang et al. 2020), a Korean version of the wider SNACS effort (Schneider et al. 2018) that annotates case and adposition supersenses. Lemmas, POS tags, and dependency relations were predicted by Stanza (Qi et al. 2020) models trained on UD Korean-KAIST (Chun et al. 2018) and then manually adjusted to pass UD validation.

  • Title: 어린 왕자 (erin wangca) "The Little Prince"
  • Author: Antoine de Saint-Exupéry
  • Original Language: French (Le Petit Prince)
  • Genre: Fiction
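
Like other UD treebanks, the lemmas, POS tags, and dependency relations described above are distributed in the 10-column CoNLL-U format. The following is a minimal sketch of reading those layers with the Python standard library only. The `Token` type, the `read_conllu` helper, and the two-token `SAMPLE` fragment are hypothetical and written for illustration; the sample values are not copied from the treebank.

```python
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    """Subset of the CoNLL-U columns (hypothetical helper type)."""
    id: str
    form: str
    lemma: str
    upos: str
    head: str
    deprel: str


def read_conllu(text: str) -> Iterator[List[Token]]:
    """Yield one list of Token per sentence from CoNLL-U formatted text."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        sentence.append(Token(cols[0], cols[1], cols[2], cols[3], cols[6], cols[7]))
    if sentence:
        yield sentence


# Made-up fragment for illustration only; values are assumptions, not treebank data.
SAMPLE = "\n".join([
    "# text = 어린 왕자",
    "\t".join(["1", "어린", "어리+ㄴ", "ADJ", "_", "_", "2", "amod", "_", "_"]),
    "\t".join(["2", "왕자", "왕자", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for sent in read_conllu(SAMPLE):
        for tok in sent:
            print(tok.id, tok.form, tok.lemma, tok.upos, tok.head, tok.deprel)
```

To read an actual treebank file, pass its contents (opened with `encoding="utf-8"`) to `read_conllu` in place of `SAMPLE`.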

Tokens

Acknowledgments

Contributors are as follows:

  • Junghyun Min (Georgetown University)
  • Jena D. Hwang (AI2)
  • Nathan Schneider (Georgetown University)

Project repository: https://github.com/Aatlantise/k-snacs-ud

References

  • Jayeol Chun, Na-Rae Han, Jena D. Hwang, and Jinho D. Choi. 2018. Building Universal Dependency Treebanks in Korean. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
  • Jena D. Hwang, Hanwool Choe, Na-Rae Han, and Nathan Schneider. 2020. K-SNACS: Annotating Korean Adposition Semantics. In Proceedings of the Second International Workshop on Designing Meaning Representations, pages 53–66, Barcelona, Spain (online). Association for Computational Linguistics.
  • Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton, and Christopher D. Manning. 2020. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 101–108, Online. Association for Computational Linguistics.
  • Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, and Omri Abend. 2018. Comprehensive Supersense Disambiguation of English Prepositions and Possessives. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 185–196, Melbourne, Australia. Association for Computational Linguistics.

Citing this work

When using this data, please cite the following as appropriate:

Original k-SNACS annotations (Hwang et al., 2020):

Hwang, Jena D., Hanwool Choe, Na-Rae Han, and Nathan Schneider. "K-SNACS: Annotating Korean adposition semantics." In Proceedings of the Second International Workshop on Designing Meaning Representations. 2020.

Universal Dependencies adaptation (Min et al., 2025):

Junghyun Min, Jena D. Hwang, and Nathan Schneider. "UD-Korean-LittlePrince." 2025.

Changelog

  • 2025-05-15 v2.16
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.16
License: CC BY-SA 4.0
Includes text: yes
Parallel: no
Genre: fiction
Lemmas: automatic
UPOS: automatic
XPOS: automatic
Features: automatic
Relations: automatic
Contributors: Min, Junghyun; Hwang, Jena; Schneider, Nathan
Contributing: here
Contact: jm3743@georgetown.edu
===============================================================================