
Longformer convert error #6465

Closed
Maybewuss opened this issue Aug 13, 2020 · 11 comments
@Maybewuss

When I install transformers from source and convert BERT to a "long version", it fails.

@Maybewuss Maybewuss changed the title Longformer conver error Longformer convert error Aug 13, 2020
@Maybewuss
Author

Error(s) in loading state_dict for RobertaLongForMaskedLM:
size mismatch for embeddings.position_ids: copying a param with shape torch.Size([1, 512]) from checkpoint, the shape in current model is torch.Size([1, 4096]).

@patrickvonplaten
Contributor

Hey @Maybewuss,

This is a community notebook, so we don't really plan on keeping it up to date with current library changes.
Regarding your question, I would suggest posting it on https://discuss.huggingface.co/ and/or contacting the author @ibeltagy - maybe he can help you.

Before that, it would be nice if you could create a notebook that reproduces your error (replacing RoBERTa with BERT in the above notebook).

@alexyalunin

alexyalunin commented Oct 2, 2020

@patrickvonplaten Is there a way of converting existing 'short' models to Longformer? The notebook above (from allenai) seems not to be useful, since you can't automatically convert their 'long' model to Hugging Face's Longformer class. The only way I see is to manually remap the parameters.

@patrickvonplaten
Contributor

Yeah, it is not straightforward to convert an arbitrary HF model to its "long" version. You will need to write some special code for this yourself, I think. The notebook serves more as an example of how it can be done for a model like RoBERTa.
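
Roughly, the recipe in the notebook has two parts: extend RoBERTa's learned position embeddings beyond 512, and replace every layer's self-attention with a LongformerSelfAttention that reuses the pretrained query/key/value projections. A compressed sketch of the second part, assuming transformers 3.x import paths and a roberta-base checkpoint; the names follow the notebook, but this is not its exact code:

import copy
from transformers import RobertaForMaskedLM
# transformers 3.x path; in 4.x this class lives under
# transformers.models.longformer.modeling_longformer
from transformers.modeling_longformer import LongformerSelfAttention

model = RobertaForMaskedLM.from_pretrained("roberta-base")
config = model.config

# one local attention window size per layer
config.attention_window = [512] * config.num_hidden_layers

for i, layer in enumerate(model.roberta.encoder.layer):
    long_attn = LongformerSelfAttention(config, layer_id=i)
    # reuse the pretrained local projections
    long_attn.query = layer.attention.self.query
    long_attn.key = layer.attention.self.key
    long_attn.value = layer.attention.self.value
    # global attention starts out as a copy of the local projections
    long_attn.query_global = copy.deepcopy(layer.attention.self.query)
    long_attn.key_global = copy.deepcopy(layer.attention.self.key)
    long_attn.value_global = copy.deepcopy(layer.attention.self.value)
    layer.attention.self = long_attn

# In practice the notebook wraps LongformerSelfAttention in a small
# RobertaLongSelfAttention subclass (shown further down in this thread)
# so the forward() signatures line up with what RobertaAttention passes in.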

@NadiaRom

NadiaRom commented Oct 29, 2020

I faced the same error with RoBERTa. The size mismatch was in the position embeddings and position ids. Adding the following lines to create_long_model helped:

model.roberta.embeddings.position_embeddings.weight.data = new_pos_embed    # add after this line
model.roberta.embeddings.position_embeddings.num_embeddings = len(new_pos_embed.data)
# first, check that model.roberta.embeddings.position_embeddings.weight.data.shape is correct: it has to be 4096 (the default) or your desired length
model.roberta.embeddings.position_ids = torch.arange(
    0, model.roberta.embeddings.position_embeddings.num_embeddings
)[None]

For some reason the number of embeddings didn't change after assigning the new weight tensor, so we fix it and also add new position ids.
I use torch==1.6.0 and transformers==3.4.0.
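
For context, new_pos_embed itself is built by tiling the original 512 trained positions until the longer table is full, keeping RoBERTa's two reserved positions at the start. A sketch, assuming model is a roberta-base RobertaForMaskedLM and the target length is 4096 tokens; it is not the notebook's exact code:

import torch

max_pos = 4096 + 2  # RoBERTa reserves positions 0 and 1 (padding offset)
current_pos_embed = model.roberta.embeddings.position_embeddings.weight.data
current_max_pos, embed_size = current_pos_embed.shape  # 514 x 768 for roberta-base

new_pos_embed = current_pos_embed.new_empty(max_pos, embed_size)
new_pos_embed[:2] = current_pos_embed[:2]  # keep the two reserved positions

# tile the 512 trained positions until the 4096 usable slots are filled
k = 2
step = current_max_pos - 2
while k < max_pos:
    chunk = min(step, max_pos - k)
    new_pos_embed[k:k + chunk] = current_pos_embed[2:2 + chunk]
    k += chunk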

@MarkusSagen
Contributor

MarkusSagen commented Nov 24, 2020

@NadiaRom I've been trying this implementation, but the forward method of RobertaLongSelfAttention receives too many inputs.

class RobertaLongSelfAttention(LongformerSelfAttention):
    def forward(
        self,
        hidden_states,
        attention_mask=None,
        head_mask=None,
        encoder_hidden_states=None,
        encoder_attention_mask=None,
        output_attentions=False,
    ):
        return super().forward(hidden_states, attention_mask=attention_mask, output_attentions=output_attentions)

And it doesn't work with the forward-pass implementation in the current transformers library.

Any thoughts on how to solve this and use the conversion script with the current transformers release (3.5.1)?
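
One workaround for this kind of signature drift (untested here, and only addressing the "too many inputs" error): make the override accept and discard whatever extra arguments the encoder passes, and hand LongformerSelfAttention only the arguments it is known to take. Depending on the version, it may still expect additional inputs such as the global-attention index masks.

# transformers 3.x path; moved under transformers.models.longformer in 4.x
from transformers.modeling_longformer import LongformerSelfAttention

class RobertaLongSelfAttention(LongformerSelfAttention):
    # Accept and ignore whatever extra arguments the encoder passes
    # (head_mask, encoder_hidden_states, past_key_value, ... depending on
    # the transformers version). If output_attentions arrives positionally
    # it is dropped and the default False is used.
    def forward(self, hidden_states, attention_mask=None, *args, **kwargs):
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=kwargs.get("output_attentions", False),
        )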

@stale

stale bot commented Jan 24, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 24, 2021
@github-actions

github-actions bot commented Mar 6, 2021

This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions bot closed this as completed Mar 6, 2021
@versae
Contributor

versae commented Aug 9, 2021

@MarkusSagen, were you able to solve the forward() issue?

@MarkusSagen
Contributor

@versae I only looked at it for a couple of hours and decided it was easier to roll back to an earlier version of transformers. If anyone implements a fix, I would be very interested to hear 😊👌

@versae
Contributor

versae commented Aug 10, 2021

@MarkusSagen, this PR makes it work for 4.2.0, and with a couple of changes it also works for 4.9.0.
