Lemma exception tables not reflected in lookup or rule-based lemmatizers

Issues related to the implementation of lemmatizer exceptions have appeared on the spaCy GitHub since at least 2016. From what I can tell, none of them explicitly addresses the fact that even exceptions from the [default lookup tables](https://github.com/explosion/spacy-lookups-data/blob/master/spacy_lookups_data/data/en_lemma_exc.json) (or their predecessor data structures) do not seem to be consistently applied by a given lemmatizer. If a similar question _has_ been addressed before, then perhaps the new lemmatizer-as-pipe arrangement merits a renewed explanation.

Is the behavior shown below expected? If so, how does one ensure that a given lemmatizer uses the exceptions specified in the `lemma_exc` lookup table?

## How to reproduce the behaviour

After extracting the lemmatizer pipe, we can check in the associated lookup tables that _wunderkinder_ (noun) is keyed to the lemma _wunderkind_ and _forbore_ (verb) to the lemma _forbear_. But even when the POS is tagged correctly, the pipeline returns both surface forms unchanged instead of the lemmas from the table.

```
>>> import spacy
>>> nlp = spacy.load("en_core_web_lg")
>>> lemmatizer = nlp.get_pipe("lemmatizer")
>>> lemmatizer.lookups.get_table("lemma_exc")["noun"]["wunderkinder"]
['wunderkind']
>>> lemmatizer.lookups.get_table("lemma_exc")["verb"]["forbore"]
['forbear']
>>> doc_1 = nlp("I've known several wunderkinder in my life.")
>>> doc_2 = nlp("He never forbore a smile.")
>>> (doc_1[4].lemma_, doc_1[4].pos_)
('wunderkinder', 'NOUN')
>>> (doc_2[2].lemma_, doc_2[2].pos_)
('forbore', 'VERB')
```

Furthermore, neither the rule-based nor the lookup-based lemmatizer takes the table entries for _wunderkinder_ and _forbore_ into account.

```
>>> lemmatizer.lookup_lemmatize(doc_1[4])
['wunderkinder']
>>> lemmatizer.lookup_lemmatize(doc_2[2])
['forbore']
>>> lemmatizer.rule_lemmatize(doc_1[4])
['wunderkinder']
>>> lemmatizer.rule_lemmatize(doc_2[2])
['forbore']
```

The above examples are just for illustration purposes. I am asking this question because, in order to rig up a custom lemmatization pipe for my applications, I would like to understand how and when lemma exceptions are applied.
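To make the expectation concrete, here is a minimal pure-Python sketch of the behaviour I assumed: consult the exception table for the token's POS first, and only fall back to the surface form when no entry exists. This is not spaCy's implementation. `LEMMA_EXC` and `lemmatize_with_exceptions` are hypothetical names, and the table contains only the two entries quoted above.

```python
from typing import Dict, List

# Hypothetical miniature of the "lemma_exc" table: POS -> form -> lemmas.
# Only the two entries quoted in this issue are included.
LEMMA_EXC: Dict[str, Dict[str, List[str]]] = {
    "noun": {"wunderkinder": ["wunderkind"]},
    "verb": {"forbore": ["forbear"]},
}


def lemmatize_with_exceptions(
    text: str,
    pos: str,
    exc_table: Dict[str, Dict[str, List[str]]] = LEMMA_EXC,
) -> List[str]:
    """Return the exception lemmas for (text, pos) if the table has an entry.

    Falls back to the lowercased surface form otherwise. A real lemmatizer
    would apply suffix rules or a lookup table at that point instead.
    """
    form = text.lower()
    # The table is keyed by lowercase POS names ("noun", "verb", ...).
    exceptions = exc_table.get(pos.lower(), {})
    return list(exceptions.get(form, [form]))


print(lemmatize_with_exceptions("wunderkinder", "NOUN"))
print(lemmatize_with_exceptions("forbore", "VERB"))
print(lemmatize_with_exceptions("smile", "NOUN"))
```

With this ordering, an exception entry always wins over whatever the fallback would produce, which is what I expected from both `lookup_lemmatize` and `rule_lemmatize`.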

## Your Environment

- **spaCy version:** 3.0.1
- **Platform:** macOS-10.15.7-x86_64-i386-64bit
- **Python version:** 3.8.0
- **Pipelines:** en_core_web_lg (3.0.0), en_core_web_sm (3.0.0)

