German adjectives ending on -e are not lemmatized using the lookup lemmatizer #4622

@SuzanaK

How to reproduce the behaviour

import spacy

nlp = spacy.load('de')

# Adjective ending in -e: the lemma is returned unchanged
s1 = 'Der schöne Garten'
doc = nlp(s1)
print([(t, t.lemma_) for t in doc])
# >> [(Der, 'der'), (schöne, 'schöne'), (Garten, 'Garten')]

# Adjective ending in -er: lemmatized as expected
s2 = 'Ein schöner Garten'
doc = nlp(s2)
print([(t, t.lemma_) for t in doc])
# >> [(Ein, 'Ein'), (schöner, 'schön'), (Garten, 'Garten')]

My Environment

  • spaCy version: 2.2.2
  • Platform: Linux-5.0.0-25-generic-x86_64-with-LinuxMint-19.2-tina
  • Python version: 3.6.7
  • Models: de

Reason

As far as I can see, all forms of German adjectives ending in -e in spacy-lookups-data/spacy_lookups_data/data/de_lemma_lookup.json are capitalized, so a lowercase form such as schöne has no matching entry. For example:

"Dekorative": "dekorativ",
"Weiße": "Weiß",
"Schöne": "Schönes",
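
The effect of capitalized-only keys can be sketched with a plain dictionary. This is a minimal illustration, not spaCy's actual implementation: it assumes the lookup lemmatizer returns the table entry on an exact string match and otherwise falls back to the token text, which is consistent with the output above. The function name `lookup_lemma` is hypothetical, and the table holds only the three entries quoted above.

```python
# Three entries quoted from de_lemma_lookup.json; the real table is much larger.
LOOKUP = {
    "Dekorative": "dekorativ",
    "Weiße": "Weiß",
    "Schöne": "Schönes",
}


def lookup_lemma(table, word):
    """Hypothetical exact-match lookup: return the table entry, else the word itself."""
    return table.get(word, word)


# Lowercase forms are not keys in the table, so they come back unchanged,
# while the capitalized forms are mapped to whatever the table contains.
for word in ["schöne", "Schöne", "dekorative", "Dekorative"]:
    print(word, "->", lookup_lemma(LOOKUP, word))
```

Under that assumption, fixing the behaviour would mean adding lowercase keys to the table (or correcting the existing entries) rather than changing the lookup itself.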

Labels

  • enhancement: Feature requests and improvements
  • feat / lemmatizer: Feature: Rule-based and lookup lemmatization
  • help wanted (easy): Contributions welcome! (also suited for spaCy beginners)
  • lang / de: German language data and models
