ValueError while trying to finetune BERT on multi-label text classification #6483
-
At this point I am not sure if this is a bug or a feature that I have not fully understood how to use. Apologies if it is the latter and the correct avenue is Stack Overflow. I am trying to connect textcat with the transformer component to fine-tune BERT on a multi-label text classification problem. I am using the TextCatCNN architecture for textcat, even though it might be overkill (a simple dense layer with sigmoid activations should do), but I thought I'd start with a pre-registered architecture. I have not managed to work out whether the error I am getting is because I am doing something wrong or because there is a bug, hence the issue.

How to reproduce the behaviour

Running my setup throws: …

Your Environment
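For background on the "simple dense layer with sigmoid activations" remark: multi-label textcat differs from exclusive-class textcat in that each label gets an independent sigmoid probability rather than a softmax over all labels, so several labels can be active at once. A minimal stdlib sketch of that scoring step (the label names and threshold are illustrative, not from the thread):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_scores(logits: dict, threshold: float = 0.5) -> dict:
    """Score each label independently, so any number of labels can be active."""
    return {label: sigmoid(z) >= threshold for label, z in logits.items()}

# Hypothetical raw scores from a dense output layer
print(multilabel_scores({"POSITIVE": 2.0, "TECH": 0.5, "SPORT": -3.0}))
# → {'POSITIVE': True, 'TECH': True, 'SPORT': False}
```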
Replies: 5 comments
-
There's a few things here:

While it's true that you need to use …

Are you actually interested in any of the other components from …?

With spaCy 3, we really recommend using the new config system to train your custom pipelines. The config can source components from existing models if you want to build on top of the pretrained weights of …

If you want to create an entirely new model with a transformer and a textcat, you can get a basic config with this command: …

In the resulting file, where it reads …, you can change the name to any other model from the HF library. Once you start training your …

That said - we do need to make sure that error is a little more user-friendly ;-)
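For anyone landing here later, this is a sketch of the relevant pieces such a config typically contains, assuming spacy-transformers is installed. The exact `@architectures` version suffix depends on your installed versions, and `bert-base-uncased` is just one example of a model name:

```ini
[nlp]
lang = "en"
pipeline = ["transformer","textcat_multilabel"]

[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v1"
name = "bert-base-uncased"

[components.textcat_multilabel]
factory = "textcat_multilabel"
```

The `name` value is the line referred to above: you can change it to any other model on the Hugging Face hub.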
-
Thanks for the quick response Sofie, and for the explanation and guidance. I am also experimenting with the config files, which seem to work, but I am trying to port a class that wraps a spaCy v2 custom training loop inside it. I made some progress with your suggestion, but I am still missing something to make it work. My current implementation throws yet another ValueError, which is less cryptic but also hard to resolve, as there is no obvious parameter …

This is where I am following your suggestion: …

This now throws: …
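For anyone else porting a v2-style training loop: in v3 the loop is built around `Example` objects, and `nlp.initialize` deduces the label set from them. A minimal sketch with `textcat_multilabel` (the toy data and label names are mine, not from the thread):

```python
import spacy
from spacy.training import Example

# Toy multi-label data (hypothetical): several labels can be 1.0 at once,
# which is what distinguishes multi-label from exclusive-class textcat.
TRAIN_DATA = [
    ("great and very useful library", {"cats": {"POSITIVE": 1.0, "TECH": 1.0}}),
    ("terrible weather today", {"cats": {"POSITIVE": 0.0, "TECH": 0.0}}),
]

nlp = spacy.blank("en")
nlp.add_pipe("textcat_multilabel")

# Build Example objects up front; initialize() deduces the labels from them,
# so there is no need to call add_label by hand.
examples = [
    Example.from_dict(nlp.make_doc(text), annots) for text, annots in TRAIN_DATA
]
optimizer = nlp.initialize(get_examples=lambda: examples)

for _ in range(5):
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)
print(sorted(losses))  # ['textcat_multilabel']
```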
-
Could you try changing

…

to

…

The `textcat` labels are included in your examples, so they should be deduced automatically; you don't have to call `add_label` specifically or define the `labels` variable.

Also - make sure that your transformer is BEFORE the `textcat` in the pipeline, because the `TransformerListener` assumes that the transformer has already processed that batch of text.
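On the ordering point: `add_pipe` lets you control position with arguments such as `first` and `before`. A tiny illustration using `tok2vec` as a stand-in for the transformer, since its listener pattern is analogous (this sketch only shows pipe ordering, not the actual transformer setup):

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("textcat_multilabel")
# The feature producer must run before any listener-style consumer, just as
# the transformer must come before a TransformerListener-based textcat.
nlp.add_pipe("tok2vec", first=True)
print(nlp.pipe_names)  # ['tok2vec', 'textcat_multilabel']
```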
-
Aha, so that did the trick 🙏 Is there any way I can help with documentation to make that journey easier for people in the future? Or do you want to disincentivise the custom training loop use case?
-
Happy to hear it worked! It's just that there are a lot of small details to get right when writing your own custom training loop. I do think the documentation is already extensive, but if you find specific spots where rephrasing or adding some explanation would help, feel free to submit a PR!

And yes, we really do want to encourage people to use the config system more. You get easy support for disabling components, sourcing them from a different model, keeping pretraining and training steps aligned, making sure your experiments are reproducible, etc. So we do focus more of the v3 documentation around that.