Longformer convert error #6465
Error(s) in loading state_dict for RobertaLongForMaskedLM:
Hey @Maybewuss, this is a community notebook, so we don't really plan on keeping it up to date with current library changes. Before that, it would be nice if you could create a notebook that reproduces your error (replacing RoBERTa with BERT in the notebook above).
@patrickvonplaten Is there a way of converting existing "short" models to Longformer? The notebook above (from allennlp) seems not to be useful, since you can't automatically convert their "long" model to Hugging Face's Longformer class. The only way I see is to manually remap nodes.
Yeah, it is not straightforward to convert any HF model to its "long" version. You will need to write some special code for this yourself, I think. The notebook should serve more as an example of how it can be done with a model like RoBERTa.
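For reference, the core of such a conversion is extending the learned position embeddings, typically by tiling the original 512 learned positions up to the new maximum length while preserving RoBERTa's reserved leading positions. A minimal, model-independent sketch of that tiling logic (all names here are illustrative, not from the notebook):

```python
def extend_position_embeddings(old, new_len, offset=2):
    """Tile learned position rows up to new_len.

    RoBERTa reserves the first `offset` positions (padding-related),
    so we keep those and repeat only the genuinely learned rows.
    Works on any sequence of rows (lists, tensors converted to lists, ...).
    """
    head, body = list(old[:offset]), list(old[offset:])
    new = head
    while len(new) < new_len:
        # Copy as many learned rows as still fit.
        new.extend(body[:new_len - len(new)])
    return new

# Toy example: 2 reserved rows + 4 learned rows, extended to length 10.
old = ["p0", "p1"] + [f"e{i}" for i in range(4)]
new = extend_position_embeddings(old, 10)
```

In a real conversion, each "row" would be one row of `position_embeddings.weight`, and the result would be wrapped back into an `nn.Embedding` of the new size.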
I faced the same error with RoBERTa. The size mismatch was in the position embeddings and position ids. Adding the following lines to
For some reason, the number of embeddings didn't change after adding the new weight tensor, so we fix it and also add new position ids.
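The actual lines are missing above, but the fix being described has this shape: when you reassign `nn.Embedding.weight` directly, the module's `num_embeddings` attribute is not updated automatically, so it must be set by hand along with a fresh range of position ids. A hedged sketch (the helper name and shapes are mine, not from the comment):

```python
import torch
import torch.nn as nn

def fix_position_embeddings(embedding: nn.Embedding, new_weight: torch.Tensor):
    # Replacing .weight does NOT update num_embeddings, so sync it manually.
    embedding.weight = nn.Parameter(new_weight)
    embedding.num_embeddings = new_weight.size(0)
    # Fresh position ids covering the extended range, shaped (1, max_len)
    # like the buffer RoBERTa's embedding module registers.
    return torch.arange(new_weight.size(0)).unsqueeze(0)

# Toy example: grow a 512-position embedding to 4096 positions.
emb = nn.Embedding(512, 8, padding_idx=1)
ids = fix_position_embeddings(emb, torch.zeros(4096, 8))
```

In the real model you would also assign the returned ids to the embedding module's `position_ids` buffer.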
@NadiaRom I've been trying this implementation, but the forward pass in

```python
class RobertaLongSelfAttention(LongformerSelfAttention):
    def forward(
        self,
        hidden_states,
        attention_mask=None,
        head_mask=None,
        encoder_hidden_states=None,
        encoder_attention_mask=None,
        output_attentions=False,
    ):
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=output_attentions,
        )
```

doesn't work with the current implementation of the forward pass in the transformers library. Any thoughts on how to solve this and use the conversion script with the current transformers release (3.5.1)?
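The failure here is a signature mismatch: newer transformers releases changed the arguments `LongformerSelfAttention.forward` expects, so a subclass written against the old signature breaks. One general-purpose fix is to accept whatever the encoder layer passes and forward only the arguments the parent understands. A self-contained sketch of that pattern, using a stand-in parent class (the real fix must match the `LongformerSelfAttention.forward` signature of your installed transformers version):

```python
class ParentSelfAttention:
    """Stand-in for LongformerSelfAttention with a newer call signature."""
    def forward(self, hidden_states, attention_mask=None, output_attentions=False):
        return ("parent-called", hidden_states)

class RobertaLongSelfAttention(ParentSelfAttention):
    # Accept the arguments the surrounding encoder layer still passes,
    # but forward only the ones the parent class actually understands.
    def forward(self, hidden_states, attention_mask=None, head_mask=None,
                encoder_hidden_states=None, encoder_attention_mask=None,
                output_attentions=False):
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=output_attentions,
        )

out = RobertaLongSelfAttention().forward(
    "h", attention_mask="m", encoder_hidden_states="ignored"
)
```

Check the installed library's source for the exact parameter list before adapting this; in some 4.x releases the method also takes `is_index_masked`/`is_global_attn`-style arguments that the wrapper would need to pass through.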
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions. If you think this still needs to be addressed, please comment on this thread.
@MarkusSagen, were you able to solve the
@versae I only looked at it for a couple of hours and decided it was easier to roll back to an earlier version of transformers. If anyone implements a fix, I would be very interested to hear 😊👌 |
@MarkusSagen, this PR makes it work for 4.2.0, and with a couple of changes it also works for 4.9.0. |
When I install transformers from source and convert BERT to the "long version", it fails.