
Make resizable layer work with textcat and transformers #820

Merged

svlandeg merged 4 commits into explosion:master on Feb 22, 2023

Conversation

@polm polm (Contributor) commented Dec 13, 2022

This PR is an attempted fix for the issue outlined in explosion/spaCy#11968. I don't completely understand how the resizable textcat is supposed to work yet, so I'm not sure this is the right approach, but locally it got rid of the errors and I was able to train a model with reasonable performance.

There are also no tests yet.

@polm polm added the bug (Bugs and behaviour differing from documentation), feat / layers (Weights layers, transforms, combinators, wrappers) and feat / transformer labels on Dec 13, 2022
@svlandeg svlandeg (Member) commented Dec 13, 2022

Can you provide a minimal (non-)working example that we can use as a test case for spaCy?

@polm polm (Contributor, Author) commented Dec 14, 2022

Made a failing test: explosion/spacy-transformers#357. For testing with a realistic pipeline, you can use the goemotions sample project with transformers and just change the TextCatCNN architecture to v2.
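
For reference, a minimal sketch of that kind of failing setup. This is not the actual test from explosion/spacy-transformers#357; the component wiring and config values here are assumptions based on the usual transformer listener + textcat pattern:

# Hedged sketch: pair TextCatCNN.v2 with a transformer listener, so that
# nO is known from the labels while nI (the transformer width) stays
# unset until initialization. Requires spacy and spacy-transformers.
import spacy

textcat_config = {
    "model": {
        "@architectures": "spacy.TextCatCNN.v2",
        "exclusive_classes": True,
        "nO": None,
        "tok2vec": {
            "@architectures": "spacy-transformers.TransformerListener.v1",
            "grad_factor": 1.0,
            "pooling": {"@layers": "reduce_mean.v1"},
        },
    }
}

nlp = spacy.blank("en")
nlp.add_pipe("transformer")  # default spacy-transformers settings
textcat = nlp.add_pipe("textcat", config=textcat_config)
textcat.add_label("POSITIVE")
textcat.add_label("NEGATIVE")
# Before this PR, getting this pipeline ready for training errored out in
# the resizable layer, because nO was already set while nI was not.
nlp.initialize()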

Commit: This avoids setting nO if it doesn't need to be changed in the first place.
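
In code, a hedged reading of that commit message (a paraphrase, not the actual diff):

# Hypothetical guard matching the commit message above: leave the layer
# untouched when nO is already set to the requested value.
if layer.has_dim("nO") and layer.get_dim("nO") == new_nO:
    return layer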
@polm polm (Contributor, Author) commented Dec 29, 2022

I would like someone more familiar with the resizable layer to look at this, but I believe the current fix is correct. To clarify what's going on:

Before this PR, the layer is assumed to be initialized if nO is set. This causes an error in the TextCatCNN when used with transformers, because the number of classes (nO) is known (or at least temporarily assumed) when the textcat component is created, but the size of the transformer embedding (nI) is not known until later.

After this PR, the resizable layer is considered uninitialized if either nO or nI is unset. Because nO may already be set in that case, the new nO is set using force=True.
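
To make that concrete, here is a sketch of the resize logic after this PR, paraphrasing thinc's resizable-layer helper (the helper name is from thinc's resizable layer, but treat the body as illustrative; the branch comments are mine and the weight-copying branch is elided; note that Model.has_dim returns None when a dimension is declared but not yet set):

from thinc.api import Model

def resize_linear_weighted(layer: Model, new_nO: int) -> Model:
    # Sketch only: the real helper also handles filling defaults and
    # copying trained weights into the resized layer.
    if layer.has_dim("nO") is None:
        # nO was never set: the layer is uninitialized, just set it
        layer.set_dim("nO", new_nO)
        return layer
    elif new_nO == layer.get_dim("nO"):
        # nothing changed, nothing to do
        return layer
    elif layer.has_dim("nI") is None:
        # New in this PR: nO is set but nI is still unknown (e.g. the
        # transformer width isn't available yet), so the layer has no
        # weights to copy. Overwrite nO directly; force=True is needed
        # because nO already holds a value.
        layer.set_dim("nO", new_nO, force=True)
        return layer
    # Fully initialized: build a resized layer and copy weights (omitted).
    ...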

@svlandeg svlandeg linked an issue Jan 5, 2023 that may be closed by this pull request
@polm polm marked this pull request as ready for review January 6, 2023 05:34
Comment on lines +62 to +64
elif layer.has_dim("nI") is None:
    # nO is set but nI is still unknown: the layer has no weights to
    # copy yet, so just overwrite nO (force=True, since nO has a value)
    layer.set_dim("nO", new_nO, force=True)
    return layer
A Member commented:
Nice find, Paul. Seems correct to me!

@svlandeg svlandeg (Member) left a comment

This should be good to merge and will make the test in explosion/spacy-transformers#357 green.

@svlandeg svlandeg merged commit 997731a into explosion:master on Feb 22, 2023
Development

Successfully merging this pull request may close these issues:

TextCatCNN.v2 doesn't work with transformers