Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
46 lines (31 sloc) 2.02 KB

Summary

UD EWT treebank is a converted version of the Estonian Web Treebank (EWT), originally annotated in the Constraint Grammar (CG) annotation scheme, and consisting of different genres of new media. The treebank contains 1,662 trees, 27,246 tokens.

Introduction

The Estonian Web Treebank UD v2.4 is based on the subpart of Estonian Dependency Treebank (EDT), created at the University of Tartu. The treebank has been automatically converted and then manually reviewed and reannotated. The treebank covers unedited new media texts.

Acknowledgments

We wish to thank developers of udapi and ud annotatrix tools.

This work was financed by the National Programme for Estonian Language Technology and Estonian Ministery of Education and Research (grant 20-56 IUT20-56 "Computational models for Estonian").

References

  • Kadri Muischnek, Kaili Müürisep, Tiina Puolakainen, Eleri Aedmaa, Riin Kirt, Dage Särg. 2014. Estonian Dependency Treebank and its annotation scheme. In: Proceedings of the 13th Workshop on Treebanks and Linguistic Theories (TLT13), pp. 285–291, ISBN 978-3-9809183-9-8, Tübingen, Germany.

Changelog

  • UD v2.4: automatic conversion from CG, manual reannotation.
=== Machine-readable metadata =================================================
Documentation status: stub
Data source: semi-automatic
Data available since: UD v2.4
License: CC BY-NC-SA 4.0
Includes text: yes
Genre: blog web social
Lemmas: converted from manual
UPOS: converted from manual
XPOS: converted from manual
Features: converted from manual
Relations: converted from manual
Contributing: here
Contributors: Muischnek, Kadri; Müürisep, Kaili; Puolakainen, Tiina; Särg, Dage
Contact: kadri.muischnek@ut.ee, kaili.muurisep@ut.ee
===============================================================================
You can’t perform that action at this time.