A plugin for the GATE language technology framework for finding lemmata of words.

GateNLP/gateplugin-dict-lemmatizer

gateplugin-Lemmatizer

A plugin for the GATE language technology framework for finding the lemmata of words.

To find lemmata for tokens, this plugin combines word lists from Wiktionary with, where available, morphological transducers built for the Helsinki Finite-State Technology (HFST) software.
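
The following is a minimal sketch of how a word list and an optional transducer could be combined, not the plugin's actual implementation. The class name `LemmaLookupSketch`, its methods, the lookup order (word list first, transducer second, surface form last) and the lowercasing of keys are all assumptions made for illustration; the transducer is represented by a plain Java function rather than a real HFST binding.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: none of these names are taken from the plugin's code.
class LemmaLookupSketch {
    // Word list keyed by lowercased form + TAB + POS tag (assumed key layout).
    private final Map<String, String> wordList = new HashMap<>();
    // Stand-in for a morphological transducer; null when none is available.
    private final Function<String, Optional<String>> transducer;

    LemmaLookupSketch(Function<String, Optional<String>> transducer) {
        this.transducer = transducer;
    }

    void addEntry(String form, String pos, String lemma) {
        wordList.put(key(form, pos), lemma);
    }

    private static String key(String form, String pos) {
        return form.toLowerCase(Locale.ROOT) + "\t" + pos;
    }

    // Assumed order: word list, then transducer, then the unchanged form.
    String lemmatize(String form, String pos) {
        String fromList = wordList.get(key(form, pos));
        if (fromList != null) {
            return fromList;
        }
        if (transducer != null) {
            Optional<String> analysed = transducer.apply(form);
            if (analysed.isPresent()) {
                return analysed.get();
            }
        }
        return form;
    }

    public static void main(String[] args) {
        // Toy "transducer" that only strips a final "s"; purely illustrative.
        LemmaLookupSketch sketch = new LemmaLookupSketch(
            w -> w.endsWith("s")
                ? Optional.of(w.substring(0, w.length() - 1))
                : Optional.empty());
        sketch.addEntry("mice", "NOUN", "mouse");
        System.out.println(sketch.lemmatize("Mice", "NOUN"));   // mouse
        System.out.println(sketch.lemmatize("cats", "NOUN"));   // cat
        System.out.println(sketch.lemmatize("quickly", "ADV")); // quickly
    }
}
```

Keying the word list on both the form and the POS tag is one way to keep homographs apart, which is presumably why the plugin needs POS-tagged input.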

Currently, the following languages are supported:

  • en (English)
  • de (German)
  • fr (French)
  • it (Italian)
  • nl (Dutch)
  • es (Spanish)

The input for the processing resource (PR) must already be tokenised, and every token must carry a Universal Dependencies POS tag as a feature.
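
As a rough illustration of this precondition, the sketch below checks that every token carries one of the 17 Universal Dependencies universal POS tags. It does not use the GATE API: tokens are modelled as plain feature maps, and the class name `UposCheckSketch` and the feature name `upos` are hypothetical, so substitute whatever feature name your tagger actually writes.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; tokens are plain feature maps, not GATE annotations.
class UposCheckSketch {
    // The 17 universal POS tags defined by Universal Dependencies.
    static final Set<String> UPOS = Set.of(
        "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
        "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X");

    // Returns the index of the first token whose POS feature is missing
    // or not a universal POS tag, or -1 if every token is usable.
    static int firstInvalid(List<Map<String, String>> tokens, String posFeature) {
        for (int i = 0; i < tokens.size(); i++) {
            String tag = tokens.get(i).get(posFeature);
            if (tag == null || !UPOS.contains(tag)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Map<String, String>> tokens = List.of(
            Map.of("string", "cats", "upos", "NOUN"),
            Map.of("string", "sleep")); // no POS feature on this token
        System.out.println(firstInvalid(tokens, "upos")); // 1
    }
}
```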

This plugin is partly based on the code developed by Ahmet Aker for POS tagging and lemmatization in several languages.
