Open
Labels
- enhancement: Feature requests and improvements
- feat / lemmatizer: Feature: Rule-based and lookup lemmatization
- help wanted (easy): Contributions welcome! (also suited for spaCy beginners)
- lang / de: German language data and models
Description
How to reproduce the behaviour
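import spacy
nlp = spacy.load('de')
# Weak inflection after the definite article: 'schöne' comes back unchanged
s1 = 'Der schöne Garten'
doc = nlp(s1)
[(t, t.lemma_) for t in doc]
# >> [(Der, 'der'), (schöne, 'schöne'), (Garten, 'Garten')]
# Strong inflection after the indefinite article: 'schöner' is lemmatized to 'schön'
s2 = 'Ein schöner Garten'
doc = nlp(s2)
[(t, t.lemma_) for t in doc]
# >> [(Ein, 'Ein'), (schöner, 'schön'), (Garten, 'Garten')]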
My Environment
- spaCy version: 2.2.2
- Platform: Linux-5.0.0-25-generic-x86_64-with-LinuxMint-19.2-tina
- Python version: 3.6.7
- Models: de
Reason
As far as I can see, all forms of German adjectives ending in -e are capitalized in spacy-lookups-data/spacy_lookups_data/data/de_lemma_lookup.json, so a lowercase form such as schöne presumably has no matching entry. For example:
"Dekorative": "dekorativ",
"Weiße": "Weiß",
"Schöne": "Schönes",
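The effect of these capitalized-only keys can be illustrated with a minimal sketch. It assumes the lookup is an exact, case-sensitive match that falls back to the word itself when no entry exists. That is an assumption about the lemmatizer's behaviour, not a copy of spaCy's code. The helpers `lookup_lemma` and `keys_missing_lowercase` are hypothetical names introduced here, and the table holds only the three entries quoted above.

```python
# Three entries copied from the issue; the real table is far larger.
lemma_lookup = {
    "Dekorative": "dekorativ",
    "Weiße": "Weiß",
    "Schöne": "Schönes",
}


def lookup_lemma(table, word):
    """Hypothetical helper: exact-match lookup, falling back to the word itself."""
    return table.get(word, word)


def keys_missing_lowercase(table):
    """Return capitalized keys ending in 'e' whose lowercase form is not a key."""
    return sorted(
        key
        for key in table
        if key[:1].isupper() and key.endswith("e") and key.lower() not in table
    )


# The capitalized form hits an entry; the lowercase form falls through unchanged.
print(lookup_lemma(lemma_lookup, "Schöne"))
print(lookup_lemma(lemma_lookup, "schöne"))

# Candidate keys that would need a lowercase counterpart.
print(keys_missing_lowercase(lemma_lookup))
```

Running a check like `keys_missing_lowercase` against the full JSON file could give a first list of entries to review. Whether each lowercase form should map to the same lemma as its capitalized key would still need to be verified by hand.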