The size of tensor a (100) must match the size of tensor b (17) at non-singleton dimension 3 #8
Comments
Would you please describe each element of the batch and print their shapes, like this?
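A minimal sketch of such a check (assuming each batch is a tuple of tensors; `train_loader` stands in for the user's DataLoader):

```python
# Illustrative only: print the shape of every element in one batch
batch = next(iter(train_loader))
for i, item in enumerate(batch):
    print(f"element {i}: {item.shape}")
```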
It is weird to see 17 (the number of labels) appear in the attention calculation.
Thanks for your reply. I printed the shape of each batch in test_loader and train_loader; they both show torch.Size([16, 100]) and torch.Size([16, 17]). But it works fine in training and evaluating... I am also confused why the 17 in the attention works for training and evaluating but fails in pruning.
It looks like the model has wrongly treated the second tensor ([16, 17]) as the attention mask: when a batch is a tuple or list rather than a dict, its elements are passed to the model positionally, and attention_mask is the second positional argument of BERT's forward.
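A hedged reproduction of the resulting shape clash (the head count of 12 is an assumption for bert-base; the other shapes come from the printouts above). BERT expands a [16, 17] mask to [16, 1, 1, 17] and adds it to attention scores of shape [batch, heads, seq_len, seq_len], which cannot broadcast:

```python
import torch

scores = torch.randn(16, 12, 100, 100)  # attention scores [batch, heads, seq, seq]
bad_mask = torch.zeros(16, 17)          # label tensor mistaken for attention_mask
extended = bad_mask[:, None, None, :]   # BERT-style expansion -> [16, 1, 1, 17]
scores + extended                       # RuntimeError: The size of tensor a (100)
                                        # must match the size of tensor b (17)
                                        # at non-singleton dimension 3
```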
To solve this, you can either modify your Dataset so that each item is a dict whose keys match the model's forward-argument names (input_ids, attention_mask, labels), or convert each batch into such a dict before it reaches the model, and then call the pruner as sketched below:
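A sketch under those assumptions (this Dataset is hypothetical, not the maintainer's exact snippet; the `prune` call follows the pattern in the TextPruner README):

```python
from torch.utils.data import Dataset

class ClassificationDataset(Dataset):
    """Hypothetical Dataset returning dicts, so tensors reach the model
    as keyword arguments instead of mis-ordered positional ones."""
    def __init__(self, encodings, labels):
        self.encodings = encodings   # dict of tensors from the tokenizer
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return {
            "input_ids": self.encodings["input_ids"][idx],
            "attention_mask": self.encodings["attention_mask"][idx],
            "labels": self.labels[idx],
        }

# With dict batches, the pruner can be called as in the README:
pruner.prune(dataloader=dataloader, save_model=True)
```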
Great! I think you'd better mention this in your official docs.
Thanks for your suggestion. I modified my Dataset as you suggested, and it works. Debugging showed that my labels were being recognized as the attention_mask, because I didn't return the batch as a dict.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hello, thanks for your excellent library!
When I intend to prune a pre-trained BERT for 17-class text classification, my code is:
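What follows is a reconstruction, not the original snippet: the model name and the tuple-yielding DataLoader are assumptions based on the shapes reported in the thread, and the pruner configuration follows the TextPruner README.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertForSequenceClassification
from textpruner import TransformerPruner, TransformerPruningConfig

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=17)

# Batches come out as (input_ids, labels) tuples with shapes
# [16, 100] and [16, 17], per the printouts in this thread
input_ids = torch.randint(0, 30522, (64, 100))
labels = torch.zeros(64, 17)
dataloader = DataLoader(TensorDataset(input_ids, labels), batch_size=16)

pruning_config = TransformerPruningConfig(
    target_ffn_size=2048, target_num_of_heads=8,
    pruning_method="iterative", n_iters=4)
pruner = TransformerPruner(model, transformer_pruning_config=pruning_config)
pruner.prune(dataloader=dataloader, save_model=True)  # raises the shape error
```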
But this error occurs: RuntimeError: The size of tensor a (100) must match the size of tensor b (17) at non-singleton dimension 3.
I found few materials or tutorials about TextPruner; maybe that is because the library is quite new.
Please have a look at this bug when you are free. Thanks in advance!