TextCatCNN.v2 doesn't work with transformers #11968
Labels
- bug: Bugs and behaviour differing from documentation
- feat / textcat: Feature: Text Classifier
- feat / transformer: Feature: Transformer
- 🔮 thinc: spaCy's machine learning library Thinc
How to reproduce the behaviour
As brought up in #11925, using TextCatCNN.v2 together with transformers fails with an initialization error.
The issue is that when the textcat component is initialized, the linear layer in the model is resized once for each label added. Resizing marks the linear layer as initialized, so it is later re-initialized. However, at that point the layer is not actually initialized and is still missing dimension information (because of how transformer initialization works), so initialization fails.
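The failure mode above can be illustrated with a toy sketch (this is deliberately not Thinc code; the class and flag names are made up for illustration): a layer whose "initialized" flag is flipped by resizing, so the later initialization pass takes the re-initialization path and trips over the still-unknown input dimension.

```python
class ToyLinear:
    """Toy stand-in for a resizable linear layer (illustrative only)."""

    def __init__(self):
        self.nI = None           # input dim: unknown until real init (as with transformers)
        self.nO = 0              # output dim: grows as labels are added
        self.initialized = False

    def resize(self, new_nO):
        self.nO = new_nO
        self.initialized = True  # the bug: resizing marks the layer as initialized

    def initialize(self, nI):
        if self.initialized:
            # re-initialization path: assumes dimensions are already known
            if self.nI is None:
                raise ValueError("cannot re-initialize: missing input dimension nI")
        else:
            self.nI = nI
            self.initialized = True


layer = ToyLinear()
for _label in ("POSITIVE", "NEGATIVE"):   # one resize per label added
    layer.resize(layer.nO + 1)

try:
    layer.initialize(nI=768)
except ValueError as err:
    print(err)   # initialization fails, as described above
```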
For the time being, the workaround is to use TextCatCNN.v1 or another architecture. The main difference in v2 is that it is resizable, so if you aren't using that particular feature, performance shouldn't differ significantly.
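A config sketch of the workaround, swapping in TextCatCNN.v1 with a transformer listener as the tok2vec. The component name and parameter values here are illustrative; adjust them to your pipeline.

```ini
[components.textcat.model]
@architectures = "spacy.TextCatCNN.v1"
exclusive_classes = true
nO = null

[components.textcat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
upstream = "*"

[components.textcat.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```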