
Hi, question about sliced-up version self-attention #28

Open
Faith-Uchiha opened this issue May 23, 2022 · 1 comment

Comments

@Faith-Uchiha

In the blog you say there is a more efficient way of implementing this ("see lecture at the top"). Do you mean the YouTube video at the top?

But there is no code explanation in the video. Do I have to watch the video and implement it myself, or is there a blog about the more efficient way of doing self-attention?

Thanks a lot!

@pbloem
Owner

pbloem commented May 23, 2022

This refers to slides 25-26 in this lecture: https://dlvu.github.io/slides/dlvu.lecture12.pdf. Slide 25 shows the basic idea of multi-head self-attention, and slide 26 shows how to implement it efficiently.

This is implemented in the default self-attention here:

```python
class SelfAttention(nn.Module):
```
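For anyone reading along, here is a minimal self-contained sketch of the "folded heads" trick those slides describe, assuming the narrow variant in which each head works on a slice of size `emb/heads`. The class name `SelfAttentionSketch` and its hyperparameters are illustrative, not the exact code in this repository:

```python
# A sketch of efficient multi-head self-attention: all heads are computed
# with single linear projections and one batched matrix multiply, instead
# of a Python loop over heads. Illustrative only, not the repo's exact code.
import torch
import torch.nn.functional as F
from torch import nn

class SelfAttentionSketch(nn.Module):
    def __init__(self, emb, heads=8):
        super().__init__()
        assert emb % heads == 0, 'emb must be divisible by heads'
        self.emb, self.heads = emb, heads

        # One projection each for keys, queries, and values, covering all heads at once.
        self.tokeys    = nn.Linear(emb, emb, bias=False)
        self.toqueries = nn.Linear(emb, emb, bias=False)
        self.tovalues  = nn.Linear(emb, emb, bias=False)

        # Final linear layer that recombines the head outputs.
        self.unifyheads = nn.Linear(emb, emb)

    def forward(self, x):
        b, t, e = x.size()          # batch, sequence length, embedding dim
        h = self.heads
        s = e // h                  # dimension per head

        # Project, then cut each embedding into h slices (one per head).
        keys    = self.tokeys(x).view(b, t, h, s)
        queries = self.toqueries(x).view(b, t, h, s)
        values  = self.tovalues(x).view(b, t, h, s)

        # Fold the heads into the batch dimension so torch.bmm
        # computes all heads in parallel.
        keys    = keys.transpose(1, 2).contiguous().view(b * h, t, s)
        queries = queries.transpose(1, 2).contiguous().view(b * h, t, s)
        values  = values.transpose(1, 2).contiguous().view(b * h, t, s)

        # Scaled dot-product attention for every head at once.
        dot = torch.bmm(queries, keys.transpose(1, 2)) / (s ** 0.5)
        dot = F.softmax(dot, dim=2)

        # Apply the attention weights to the values, then unfold the heads.
        out = torch.bmm(dot, values).view(b, h, t, s)
        out = out.transpose(1, 2).contiguous().view(b, t, h * s)

        return self.unifyheads(out)

# Example usage: a batch of 4 sequences of length 10 with embedding dim 256.
# y = SelfAttentionSketch(emb=256, heads=8)(torch.randn(4, 10, 256))
```

Folding the heads into the batch dimension is the efficiency gain the slides point at: a single `torch.bmm` handles all heads rather than looping over them in Python.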
