Skip to content
This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

About Nonlocality #23

Open
yangbang18 opened this issue May 24, 2022 · 0 comments
Open

About Nonlocality #23

yangbang18 opened this issue May 24, 2022 · 0 comments

Comments

@yangbang18
Copy link

yangbang18 commented May 24, 2022

Thanks for your great work and codes.

I am a little bit confused about your implementations of nonlocality (in main.py (L346-351))

Here is the code:

batch = next(iter(data_loader_val))[0]
batch = batch.to(device)
batch = model_without_ddp.patch_embed(batch)
for l in range(len(model_without_ddp.blocks)):
    attn =  model_without_ddp.blocks[l].attn
    nonlocality[l] = attn.get_attention_map(batch).detach().cpu().numpy().tolist()

It seems that you always feed the original patch embeddings to all 12 blocks.
Shouldn't the inputs of attn.get_attention_map be [original patch embeddings, outputs of the block 1, ..., outputs of the block 11]?

If I understand it wrong, please correct me.

Sincerely, looking forward to your reply.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant