IWNLP.Lemmatizer is a dictionary-based lemmatizer for the German language

Liebeck/IWNLP.Lemmatizer

IWNLP.Lemmatizer

IWNLP is a dictionary-based lemmatizer for the German language. It is based on the German edition of Wiktionary. IWNLP consists of two parts:

  • IWNLP: A parser for the German edition of Wiktionary
  • IWNLP.Lemmatizer: A German lemmatizer that uses the output from IWNLP to produce a mapping from an inflected form to a lemma.
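As a rough illustration of what a dictionary-based lemmatizer does, the following Python sketch looks up an inflected form in a mapping and returns its candidate lemmas. The mapping entries, the `lemmatize` function, and the lowercase normalization are assumptions made for this example; they are not taken from IWNLP's output or code.

```python
from typing import Dict, List

# Toy mapping from a lowercased inflected form to its candidate lemmas.
# These entries are hand-written for illustration, not real IWNLP output.
FORM_TO_LEMMAS: Dict[str, List[str]] = {
    "häuser": ["Haus"],
    "ging": ["gehen"],
    "gegangen": ["gehen"],
}


def lemmatize(word: str, mapping: Dict[str, List[str]] = FORM_TO_LEMMAS) -> List[str]:
    """Return the candidate lemmas for `word`, or an empty list if the form is unknown."""
    # Lowercasing the key is a simplification chosen for this sketch.
    return mapping.get(word.lower(), [])


if __name__ == "__main__":
    print(lemmatize("Häuser"))
    print(lemmatize("unbekannt"))
```

A pure dictionary lookup cannot handle forms that are missing from the mapping, which is why the sketch returns an empty list instead of guessing.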

More details can be found at www.iwnlp.com.
A Python implementation of the lemmatizer is also available: IWNLP-py.

How to run IWNLP.Lemmatizer

  • Make sure you have followed the steps in IWNLP to create a parsed XML Wiktionary file.
  • Clone IWNLP.Lemmatizer and build it.
  • Start IWNLP.Lemmatizer.exe with three parameters, in this order: the path to the parsed Wiktionary dump, the path to the XML export file, and the path to the JSON export file. For instance:
IWNLP.Lemmatizer.exe "c:\\parsedIWNLP_latest.xml" "c:\\IWNLP.Lemmatizer_latest.xml" "c:\\IWNLP.Lemmatizer_latest.json"
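
If the export should be triggered from a script, the call above could be wrapped roughly as follows. This is a minimal Python sketch, not part of IWNLP: the helper names `build_command` and `run_lemmatizer` are hypothetical, and the only thing taken from the instructions above is the order of the three parameters.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(exe: str, parsed_dump: str, xml_out: str, json_out: str) -> List[str]:
    """Assemble the argument list: executable, parsed dump, XML export, JSON export."""
    return [exe, parsed_dump, xml_out, json_out]


def run_lemmatizer(exe: str, parsed_dump: str, xml_out: str, json_out: str) -> None:
    """Run the lemmatizer executable (hypothetical wrapper, not part of IWNLP)."""
    # Fail early if the parsed Wiktionary dump is missing.
    if not Path(parsed_dump).is_file():
        raise FileNotFoundError(parsed_dump)
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(build_command(exe, parsed_dump, xml_out, json_out), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_lemmatizer(...) to actually execute it.
    print(build_command(
        "IWNLP.Lemmatizer.exe",
        "c:\\parsedIWNLP_latest.xml",
        "c:\\IWNLP.Lemmatizer_latest.xml",
        "c:\\IWNLP.Lemmatizer_latest.json",
    ))
```

Passing the arguments as a list avoids shell quoting issues with Windows paths.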

Citation

Please include the following BibTeX if you use IWNLP in your work:

@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
  author    = {Liebeck, Matthias  and  Conrad, Stefan},
  title     = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  pages     = {414--418},
  url       = {http://www.aclweb.org/anthology/P15-2068}
}
