
How to reconstruct the full attention matrix? #33

Open
FarzanT opened this issue Nov 30, 2022 · 2 comments
FarzanT commented Nov 30, 2022

Hello,

The implementation of the Reformer model allows for reconstruction of the full attention matrix (https://github.com/lucidrains/reformer-pytorch#research). There, the Recorder class can expand the attention matrix to its original form, as in the snippet below.
How can one get the full attention matrix for the Routing Transformer? The Recorder class is only compatible with the Reformer.
The full attention matrix is needed for Transformer interpretability/explanation methods, such as the one described here: https://github.com/hila-chefer/Transformer-Explainability
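
For reference, the Recorder usage from the reformer-pytorch README looks roughly like this (abridged from the linked section; exact details may have changed since):

```python
import torch
from reformer_pytorch import Reformer, Recorder

model = Reformer(dim = 512, depth = 12, max_seq_len = 8192)
model = Recorder(model)

x = torch.randn(1, 8192, 512)
y = model(x)

model.recordings[0]  # attention weights and bucket assignments for the first forward pass
```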

I believe it would involve these lines:

```python
# gather the queries / keys / values routed to each cluster
q = batched_index_select(q, indices)
k = batched_index_select(k, kv_indices)
v = batched_index_select(v, kv_indices)
# reshape to (batch, heads, num_clusters, window_size, head_dim)
reshape_with_window = lambda x: x.reshape(b, h, nc, -1, d)
q, k, v = map(reshape_with_window, (q, k, v))
# prepend the learned memory key / value slots to every cluster window
m_k, m_v = map(lambda x: expand_dim(x, 0, b).to(q), (self.mem_key, self.mem_value))
k, v = map(lambda x: torch.cat(x, dim=3), ((m_k, k), (m_v, v)))
# per-cluster attention logits: (batch, heads, num_clusters, window, window + mem)
dots = torch.einsum('bhnid,bhnjd->bhnij', q, k) * (d ** -0.5)
```
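
For illustration, here is a rough sketch of how the per-cluster weights could be scattered back into a dense matrix. It assumes `attn` is the post-softmax `dots` with the memory-slot columns already sliced off, and that `indices`/`kv_indices` have been reshaped to hold the original sequence positions per cluster; the helper name and exact shapes are my guesses, not part of the library:

```python
import torch

def reconstruct_full_attention(attn, q_idx, kv_idx, n):
    """Scatter per-cluster attention weights into a dense (n x n) matrix.

    attn:   (b, h, nc, w_q, w_k) post-softmax weights, memory slots removed
    q_idx:  (b, h, nc, w_q) original sequence positions of the routed queries
    kv_idx: (b, h, nc, w_k) original sequence positions of the routed keys
    """
    b, h, nc, w_q, w_k = attn.shape
    full = torch.zeros(b, h, n * n, device=attn.device, dtype=attn.dtype)

    # (row, col) coordinates for every per-cluster entry, flattened to row * n + col
    rows = q_idx.unsqueeze(-1).expand(b, h, nc, w_q, w_k)
    cols = kv_idx.unsqueeze(-2).expand(b, h, nc, w_q, w_k)
    flat_idx = (rows * n + cols).reshape(b, h, -1)

    # scatter_add_ accumulates in case a (query, key) pair is routed to several clusters
    full.scatter_add_(-1, flat_idx, attn.reshape(b, h, -1))
    return full.view(b, h, n, n)
```

Pairs that were never routed into the same cluster simply stay zero, which matches the sparsity pattern the model actually used.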

@KatarinaYuan

Hi, have you solved this problem?


FarzanT commented Aug 26, 2023

@KatarinaYuan Hi, unfortunately not; I don't think it's trivial. I decided to use the full attention matrix, but with more efficient implementations such as those in PyTorch 2.0 and DeepSpeed. A sketch of that trade-off is below. Hope it helps!
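
A minimal sketch of the trade-off (shapes are illustrative): the PyTorch 2.0 fused kernel is fast but never materializes the attention matrix, so explainability methods need the explicit path.

```python
import torch
import torch.nn.functional as F

b, h, n, d = 2, 8, 1024, 64
q, k, v = (torch.randn(b, h, n, d) for _ in range(3))

# fused kernel (PyTorch 2.0+): fast and memory-efficient, but the full
# attention matrix is never materialized, so there is nothing to inspect
out_fast = F.scaled_dot_product_attention(q, k, v)

# explicit path: materializes the full attention matrix, which is what
# gradient-based explainability methods need to hook into
scores = torch.einsum('bhid,bhjd->bhij', q, k) * (d ** -0.5)
attn = scores.softmax(dim=-1)                    # (b, h, n, n)
out = torch.einsum('bhij,bhjd->bhid', attn, v)
```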
