# Summary
This Universal Dependencies (UD) Japanese treebank follows the UD Japanese
conventions described in the UD documentation.
The original sentences are from the 'Corpus of Historical Japanese' (CHJ) [1].
# Introduction
This Japanese UD treebank contains sentences from the CHJ Meiji Era / Taishō Era
Series I: Magazines - Meiroku Zasshi samples
(http://pj.ninjal.ac.jp/corpus_center/chj/meiji_taisho-en.html)
with BCCWJ-DepPara [2] compatible annotation [5].
We prepared rules to convert the BCCWJ-DepPara annotation to the UD_Japanese v2.1 guidelines [3][4].
## Splitting
All data in UD_Japanese-Modern is test data.
test: all
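
Since the treebank ships as a single test split, a quick sanity check is to count its sentences and words. The following is a minimal sketch that assumes the data is in the standard CoNLL-U format (tab-separated token lines, `#` comment lines, blank lines between sentences). The function name `count_conllu` is hypothetical, and the path to the test file must be supplied by the user, because this README does not state the file name.

```python
import sys


def count_conllu(text):
    """Count sentences and syntactic words in CoNLL-U formatted text."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# sent_id = ..." or "# text = ...".
            continue
        token_id = line.split("\t", 1)[0]
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1").
        if "-" in token_id or "." in token_id:
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
    return sentences, words


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the path is supplied by the user): python count.py <test-file.conllu>
    with open(sys.argv[1], encoding="utf-8") as handle:
        n_sentences, n_words = count_conllu(handle.read())
    print(f"sentences: {n_sentences}, words: {n_words}")
```

Taking the text as a string rather than a file handle keeps the counting logic easy to test on small hand-written samples.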
## Citation
You are encouraged to cite the following paper when you refer to the
Universal Dependencies Japanese Treebank.
Omura, M., Takahashi, Y., & Asahara, M. (2017).
Universal Dependency for Japanese Modern.
In JADH-2017.
Asahara, M., Kanayama, H., Tanaka, T., Miyao, Y., Uematsu, S., Mori, S.,
Matsumoto, Y., Omura, M., & Murawaki, Y. (2018).
Universal Dependencies Version 2 for Japanese.
In LREC-2018.
# Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers JP15K12888 and
JP17H00917 and is a project of the Center for Corpus Development, NINJAL.
The original treebank was provided by:
- National Institute for Japanese Language and Linguistics, Japan
The corpus was converted by:
- Mai Omura
- Masayuki Asahara
through discussion and validation with
- Yuta Takahashi
# License
See file LICENSE.txt
# References
[1] National Institute for Japanese Language and Linguistics, Center for Corpus Development (Kondō, Asuko; Mabuchi, Yōko; Hattori, Noriko, et al.) (eds.) (2017). Corpus of Historical Japanese, Meiji Era / Taishō Era Series I: Magazines (Short Unit Word Data Version 1.1). http://pj.ninjal.ac.jp/corpus_center/chj/meiji_taisho.html (accessed March 27, 2018)
[2] Asahara, M., & Matsumoto, Y. (2016). BCCWJ-DepPara: A syntactic annotation treebank on the 'Balanced Corpus of Contemporary Written Japanese'. In Proceedings of the 12th Workshop on Asian Language Resources (ALR12) (pp. 49-58).
[3] Tanaka, T., Miyao, Y., Asahara, M., Uematsu, S., Kanayama, H., Mori, S., &
Matsumoto, Y. (2016). Universal Dependencies for Japanese. In LREC-2016.
[4] Asahara, M., Kanayama, H., Tanaka, T., Miyao, Y., Uematsu, S., Mori, S.,
Matsumoto, Y., Omura, M., & Murawaki, Y. (2018). Universal Dependencies Version 2 for Japanese. In LREC-2018.
[5] Omura, M., Takahashi, Y., & Asahara, M. (2017). Universal Dependency for Japanese Modern. In JADH-2017.
# Changelog
2018-11-01 v2.3
* Update v2.2 to v2.3
2018-03-28 v2.2
* Initial release in Universal Dependencies.
=== Machine-readable metadata =================================================
Data available since: UD v2.2
License: CC BY-NC-ND 3.0
Includes text: yes
Genre: nonfiction
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: not available
Relations: converted from manual
Contributors: Omura, Mai; Asahara, Masayuki; Takahashi, Yuta
Contributing: elsewhere
Contact: masayu-a@ninjal.ac.jp
===============================================================================
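
The metadata block above is a plain list of `Key: value` lines between two delimiter lines that start with `===`. As an illustration only, the following sketch reads that block into a dictionary. The function name `parse_ud_metadata` is hypothetical and is not part of any official UD tooling.

```python
def parse_ud_metadata(readme_text):
    """Return the 'Key: value' pairs found between the two '===' delimiter lines."""
    metadata = {}
    inside = False
    for line in readme_text.splitlines():
        if line.startswith("==="):
            if inside:
                # Second delimiter line: the block is finished.
                break
            inside = True
            continue
        if inside and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    return metadata
```

For example, `parse_ud_metadata(text)["Contributors"].split("; ")` would give the individual contributor names, since that field separates entries with semicolons.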