IWNLP.Lemmatizer is a dictionary-based lemmatizer for the German language
IWNLP.Lemmatizer

IWNLP is a dictionary-based lemmatizer for the German language. It is based on the German edition of Wiktionary. IWNLP consists of two parts:

  • IWNLP: A parser for the German edition of Wiktionary
  • IWNLP.Lemmatizer: A German lemmatizer that uses the output from IWNLP to produce a mapping from an inflected form to a lemma.
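To illustrate what a mapping from an inflected form to a lemma looks like in practice, here is a minimal Python sketch of a dictionary-based lookup. It is not the IWNLP.Lemmatizer implementation: the word pairs are hand-picked examples rather than entries from the export file, the names `build_mapping` and `lemmatize` are hypothetical, and lowercasing the lookup key is an assumption made for this sketch.

```python
from collections import defaultdict

# Hypothetical (inflected form, lemma) pairs for illustration only.
# A real mapping would be derived from the parsed Wiktionary data.
PAIRS = [
    ("Häuser", "Haus"),
    ("Hauses", "Haus"),
    ("ging", "gehen"),
    ("gegangen", "gehen"),
    ("Spielen", "Spiel"),    # dative plural of the noun
    ("spielen", "spielen"),  # infinitive of the verb
]


def build_mapping(pairs):
    """Group lemmas by lowercased inflected form, keeping insertion order."""
    mapping = defaultdict(list)
    for form, lemma in pairs:
        key = form.lower()  # assumption: lookups ignore case
        if lemma not in mapping[key]:
            mapping[key].append(lemma)
    return dict(mapping)


def lemmatize(mapping, word):
    """Return all candidate lemmas for a word, or an empty list if unknown."""
    return mapping.get(word.lower(), [])


if __name__ == "__main__":
    mapping = build_mapping(PAIRS)
    for word in ["Häuser", "ging", "spielen", "unbekannt"]:
        print(word, lemmatize(mapping, word))
```

A pure dictionary lookup can return several candidate lemmas for one form (here "spielen") and nothing for forms missing from the dictionary, so callers need to handle both cases.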

More details can be found at www.iwnlp.com
We also provide a Python implementation for the lemmatizer: IWNLP-py

How to run IWNLP.Lemmatizer

  • Make sure that you have followed the steps in IWNLP for creating a parsed XML Wiktionary file.
  • Clone IWNLP.Lemmatizer and build it.
  • Start IWNLP.Lemmatizer.exe with two parameters: the path to the parsed Wiktionary dump and the path to the export file. For instance:
IWNLP.Lemmatizer.exe "c:\\parsedIWNLP_latest.xml" "c:\\IWNLP.Lemmatizer_latest.xml"

Citation

Please include the following BibTeX if you use IWNLP in your work:

@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
  author    = {Liebeck, Matthias  and  Conrad, Stefan},
  title     = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  pages     = {414--418},
  url       = {http://www.aclweb.org/anthology/P15-2068}
}