Reproduction

PyTorch->Flax and Flax->PyTorch equivalence tests were failing. At the moment they are skipped by #23040.
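For context, a cross-framework equivalence check is roughly the following. This is a minimal sketch, assuming the `google/bigbird-roberta-base` checkpoint, an input long enough to trigger block-sparse attention, and a loose tolerance; the actual tests live in the transformers test suite:

```python
import numpy as np
import torch
from transformers import AutoTokenizer, BigBirdModel, FlaxBigBirdModel

model_id = "google/bigbird-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Block-sparse (and hence random) attention only kicks in for long inputs.
inputs = tokenizer("hello world " * 512, return_tensors="np")

pt_model = BigBirdModel.from_pretrained(model_id)
pt_model.eval()
fx_model = FlaxBigBirdModel.from_pretrained(model_id, from_pt=True)

with torch.no_grad():
    pt_out = pt_model(
        **{k: torch.from_numpy(v) for k, v in inputs.items()}
    ).last_hidden_state
fx_out = fx_model(**inputs).last_hidden_state

# If the PyTorch model samples random attention blocks even in eval mode,
# the two outputs diverge and this assertion fails.
np.testing.assert_allclose(np.asarray(fx_out), pt_out.numpy(), atol=4e-2)
```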
Expected behavior
While working on #21023 I found a bug in the PyTorch implementation of BigBird. Namely, random attention is used no matter whether we are in training or eval mode. The correct behaviour is that during inference (eval) we should not introduce any randomness, hence random attention should not be used.
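A minimal sketch of the intended gating (illustrative names, not the actual transformers source): in PyTorch, `nn.Module.training` already tracks `model.train()`/`model.eval()`, so the random block selection can simply be conditioned on it:

```python
import torch
from torch import nn


class RandomBlockAttentionSketch(nn.Module):
    """Toy module showing how random block selection can be gated on mode."""

    def __init__(self, num_blocks: int, num_rand_blocks: int):
        super().__init__()
        self.num_blocks = num_blocks
        self.num_rand_blocks = num_rand_blocks

    def _rand_block_indices(self) -> torch.Tensor:
        # Sample which blocks each query block attends to at random.
        return torch.stack(
            [torch.randperm(self.num_blocks)[: self.num_rand_blocks]
             for _ in range(self.num_blocks)]
        )

    def _deterministic_block_indices(self) -> torch.Tensor:
        # A fixed fallback for eval, e.g. the first num_rand_blocks blocks.
        return torch.arange(self.num_rand_blocks).expand(self.num_blocks, -1)

    def forward(self) -> torch.Tensor:
        # The point of the fix: only introduce randomness while training.
        if self.training:
            return self._rand_block_indices()
        return self._deterministic_block_indices()


attn = RandomBlockAttentionSketch(num_blocks=8, num_rand_blocks=3)
attn.eval()                          # inference mode: no randomness
assert torch.equal(attn(), attn())   # deterministic across calls
attn.train()                         # training mode: random blocks each call
```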
Hi @sanchit-gandhi @ydshieh! I have opened a PR that fixes the failing tests. I am wondering whether the changes in the PR are okay (using random attention based on the current mode), or whether we want more control over the use of random attention, e.g. adding a deterministic argument to __call__ of BigBirdPreTrainedModel. Secondly, I was wondering what the advantage is of marking _bigbird_block_rand_mask as a staticmethod and then calling it as self._bigbird_block_rand_mask, passing it arguments from self such as self.max_seqlen, instead of treating it as a regular method. It looks kind of weird to me. Am I missing something?
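To make the staticmethod question concrete, the pattern being asked about is the following (a toy example with hypothetical names, not the BigBird source): a staticmethod has no access to self, so every attribute must be passed in explicitly, yet calling it through self still resolves.

```python
class Example:
    def __init__(self):
        self.max_seqlen = 1024

    @staticmethod
    def as_static(max_seqlen):
        # No access to self; every attribute must be passed explicitly.
        return max_seqlen // 64

    def as_regular(self):
        # A regular method reads the attribute itself.
        return self.max_seqlen // 64


ex = Example()
# Both calls work; ex.as_static(...) resolves via the instance but self is
# never passed in, which is why the pattern in the issue reads oddly.
assert ex.as_static(ex.max_seqlen) == ex.as_regular()
# The staticmethod is also callable without an instance:
assert Example.as_static(1024) == 16
```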