vadno/emmorph2conll

emmorph2conll

The script converts each output tag of the emMorph morphological analyzer to the corresponding tag of a version of the Szeged Treebank.

What's in this repo?

  • the main script of the converter: converter.py
  • auxiliary files in folder converterdata
  • license
  • this readme

The tagsets 🇭🇺

A detailed description of the tagsets is available here.

emMorph

emMorph is the current morphological analyzer for Hungarian, and it is integrated into the e-magyar language processing toolchain. The list of emMorph tags is taken from here.

CoNLL

What we call CoNLL here is a modified version of the MULTEXT morphosyntactic tagset, transformed into a feature-value pair structure. This modified tagset is the annotation scheme of a version of the Szeged Treebank, the largest fully manually annotated corpus of Hungarian.

How to use the converter?

  • standard input: one token per line, as token, lemma and emMorph tag separated by tabs
  • standard output: the corresponding CoNLL tag
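
The input format above can be produced from Python. The following is a minimal sketch, not part of the repository: the helper names `build_input` and `run_converter` are hypothetical, the sample tag `[/N][Acc]` is only illustrative, and it is an assumption that the script is started as `python3 converter.py` from the repository root with no extra arguments.

```python
import subprocess
import sys


def build_input(rows):
    """Join (token, lemma, emMorph tag) triples into tab-separated lines."""
    lines = ["\t".join((token, lemma, tag)) for token, lemma, tag in rows]
    return "\n".join(lines) + "\n"


def run_converter(rows, script="converter.py"):
    """Pipe the rows to the converter on stdin and return its stdout lines.

    Assumption: the script reads stdin and writes stdout, as the list
    above states, and needs no command-line arguments.
    """
    result = subprocess.run(
        [sys.executable, script],
        input=build_input(rows),
        capture_output=True,
        encoding="utf-8",  # Hungarian text; do not rely on the locale default
        check=True,
    )
    return result.stdout.splitlines()


# Example call (requires converter.py in the working directory):
# for conll_tag in run_converter([("almát", "alma", "[/N][Acc]")]):
#     print(conll_tag)
```

The same thing can be done from a shell by redirecting a tab-separated file into the script's standard input.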

Dependencies

Python 3

License

GNU General Public License v3.0

Our converters
