Add support for replacing listeners #7

shadeMe · 2023-07-04T14:49:48Z

Description

Multiple changes were required to facilitate this:

* All transformer model entrypoints now have an optional `wrapped_listener` parameter. This parameter is only meant to be used by the machinery that performs the listener replacement.
* Listeners are no longer subclasses of the `TransformerListener` class. Previously, the `TransformerListener` class subclassed `Model` and stored some state as instance attributes. To perform the replacement, the original listener instance in the downstream component needs to be deep-copied. However, the implementation of `Model.copy` doesn't support classes that subclass `Model`: it merely deep-copies the `Model` instance attributes and initializes a new `Model` instance with them. As a result, any state stored directly on the listener instance, such as `upstream_name` and `name`, is lost. To work around this limitation, all listener state is now stored directly in `Model.attrs`. This ensures that no persistent state is lost between copies.
* A new `WrappedTransformerAndListener` class has been introduced to serve as the replacement model for the downstream component's original listener. It wraps the upstream transformer pipe's model and the original listener. During training and prediction, it calls the wrapped transformer and passes the outputs directly to the wrapped listener. During training, gradients are additionally allocated. During prediction, the wrapped listener is instructed to ignore any transformer annotations present on the `Doc`s and use the ones stored directly in the listener.
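To illustrate the copy problem described above, here is a minimal, standard-library-only sketch. `ToyModel`, `SubclassListener` and `make_attrs_listener` are hypothetical names made up for this example; `ToyModel.copy` only imitates the behaviour described in this PR (deep-copy known attributes, construct a plain base-class instance) and is not Thinc's actual implementation.

```python
import copy
from typing import Any, Dict, Optional


class ToyModel:
    """Toy stand-in for a Thinc-style Model (assumption, not the real class)."""

    def __init__(self, name: str, attrs: Optional[Dict[str, Any]] = None):
        self.name = name
        self.attrs = dict(attrs or {})

    def copy(self) -> "ToyModel":
        # Imitates the described behaviour: deep-copy the known attributes
        # and build a plain ToyModel, regardless of the caller's subclass.
        return ToyModel(self.name, attrs=copy.deepcopy(self.attrs))


class SubclassListener(ToyModel):
    """Old approach: extra state is stored directly on the instance."""

    def __init__(self, upstream_name: str):
        super().__init__("listener")
        self.upstream_name = upstream_name


def make_attrs_listener(upstream_name: str) -> ToyModel:
    """New approach: state is stored in attrs, which copy() carries over."""
    return ToyModel("listener", attrs={"upstream_name": upstream_name})


old_copy = SubclassListener("transformer").copy()
new_copy = make_attrs_listener("transformer").copy()

# The copy of the subclass is a plain ToyModel without upstream_name.
print(type(old_copy).__name__, hasattr(old_copy, "upstream_name"))
# The attrs-based listener keeps its state across the copy.
print(new_copy.attrs["upstream_name"])
```

Under these assumptions, the subclass-based listener comes back as a plain base-class instance with its extra attribute gone, while the attrs-based listener still carries `upstream_name` after the copy.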

This PR depends on the following:

Types of change

(Cursed) enhancement

Checklist

I confirm that I have the right to submit this contribution under the project's MIT license.
I ran the tests, and all new and existing tests passed.
My changes don't require a change to the documentation, or if they do, I've added all required information.


danieldk

A first batch of comments; I probably need to go over this another time.


shadeMe added the enhancement (New feature or request) label Jul 4, 2023

shadeMe mentioned this pull request Jul 6, 2023

Backport fixes to spacy-3.x explosion/curated-tokenizers#46

Merged

3 tasks

shadeMe added 4 commits July 6, 2023 18:07

Increment curated-tokenizers lowerbound to 0.0.8

4d8c3dc

Increment spacy lowerbound to 3.7.0.dev0

d661771

Restrict spacy requirement upperbound

cfdad49

Fix listener test

89c493b

shadeMe marked this pull request as ready for review July 7, 2023 09:10

danieldk reviewed Jul 10, 2023


shadeMe added 5 commits July 10, 2023 13:44

Fix impossible condition in verify_inputs

f0a8426

Raise exception when attempting to register a non-listener model with the pipe

4dd2106

Don't allocate gradients during inference

d75eb0a

Allow 3rd party listeners to use all transformer layer outputs

9027fee

Rename TransformerListener to ListenerStateUtils

b3210f1

danieldk reviewed Jul 10, 2023


Switch to free functions for listener forward functions

dbf1dcb

Inline listener construction code and remove classes

danieldk approved these changes Jul 11, 2023


danieldk merged commit 9fd8acf into explosion:main Jul 11, 2023
6 checks passed

shadeMe deleted the feature/replace-listeners-support branch July 11, 2023 16:52
