masked self-attention #6
Oops, completely forgot to apply the mask to the MHA. I'm currently focused on other tasks, so I can't deal with it right away, but if you need it, you can use the following code to improve the forward method of `EncoderModule`:

```python
import torch
import numpy as np


def subsequent_mask(size: int) -> torch.Tensor:
    # Ones strictly above the diagonal mark the "future" positions.
    attn_shape = (size, size)
    mask = np.triu(np.ones(attn_shape, dtype=np.float32), k=1)
    # Invert: 1.0 where attending is allowed (on/below the diagonal), 0.0 on future positions.
    mask = torch.from_numpy(mask) == 0
    return mask.float()


class EncoderModule:
    ...

    def forward(self, x):
        # x.shape == [N, L, E]
        mask = subsequent_mask(x.shape[1])  # mask.shape == [L, L]
        x = self.attention(x, attn_mask=mask)
        ...
```

It should work fine. Maybe.
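To make the convention concrete (assuming the `subsequent_mask` helper above is in scope): 1.0 marks positions a query may attend to, 0.0 marks future positions. Whether that matches what `attn_mask` expects depends on this repo's attention implementation; note that `torch.nn.MultiheadAttention` uses the opposite convention for boolean masks (`True` = do not attend), so you may need to invert it there.

```python
# Assuming subsequent_mask from the snippet above is in scope.
mask = subsequent_mask(4)
print(mask)
# tensor([[1., 0., 0., 0.],
#         [1., 1., 0., 0.],
#         [1., 1., 1., 0.],
#         [1., 1., 1., 1.]])

# For torch.nn.MultiheadAttention, a boolean mask with True = "blocked"
# would instead be:
bool_mask = subsequent_mask(4) == 0  # True strictly above the diagonal
```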
I'll try it, thanks (maybe)!!! ;-)
Hi, I tried it, but it gave this error:

I'm a novice in PyTorch, and also in dealing with/programming complex models such as this, so maybe I'm making some silly mistake or not seeing the easy solution. This is the code I used in the

Did it work for you, @LuisMoralesAlonso? If so, what have you done differently?
@pedromingues It's because you wrapped MultiheadAttention with ResidualBlock.
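A sketch of what is probably going wrong (the actual `ResidualBlock` in this repo may differ): if the wrapper's `forward` only accepts `x`, the extra `attn_mask` keyword never reaches the wrapped attention, so calling the block with it raises a `TypeError`.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical wrapper that does NOT forward extra keyword arguments."""

    def __init__(self, module: nn.Module):
        super().__init__()
        self.module = module

    def forward(self, x):
        return self.module(x) + x


block = ResidualBlock(nn.Linear(8, 8))  # placeholder sub-module
x = torch.randn(2, 4, 8)
block(x)                    # works
# block(x, attn_mask=...)  # TypeError: forward() got an unexpected keyword argument
```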
I used the

But this also gave an error, this time:

What should I change, specifically, for it to work? Can you provide the code for what you think the solution is? Sorry to take up your time, and thank you very much for your patience.
I'm sorry, I was wrong. If you want to add an attention mask while using ResidualBlock, fix it here:

Line 45 in b579717
Sorry for the delay in writing back. So instead of using the code above, the change should go in this line?

Line 45 in b579717
Yes, like this.
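The snippet that followed wasn't captured here, but a plausible shape for the line-45 fix, assuming `ResidualBlock` wraps a single sub-module, is to forward keyword arguments through to it:

```python
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical fix: pass **kwargs through so attn_mask reaches the wrapped attention."""

    def __init__(self, module: nn.Module):
        super().__init__()
        self.module = module

    def forward(self, x, **kwargs):
        # Extra arguments (e.g. attn_mask) are now routed to the wrapped module.
        return self.module(x, **kwargs) + x
```

With that change, `EncoderModule.forward` can call `self.attention(x, attn_mask=mask)` even when `self.attention` is a `ResidualBlock`-wrapped MHA.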
Again, a question about the implementation vs. the paper: I haven't found in the code any implementation of what the paper calls "masked self-attention".

I look forward to your comments,
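For reference, "masked self-attention" in the paper just means the attention scores of future positions are set to -inf before the softmax, so each token attends only to itself and earlier tokens. A minimal sketch (not this repo's code):

```python
import torch


def masked_self_attention(q, k, v):
    # q, k, v: [N, L, E]
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # [N, L, L]
    L = scores.size(-1)
    # True strictly above the diagonal = future positions.
    future = torch.triu(torch.ones(L, L, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```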