Thanks for your great work and code.
I am a little confused about your implementation of nonlocality in main.py (L346-351).
Here is the code:
batch = next(iter(data_loader_val))[0]
batch = batch.to(device)
batch = model_without_ddp.patch_embed(batch)
for l in range(len(model_without_ddp.blocks)):
    attn = model_without_ddp.blocks[l].attn
    nonlocality[l] = attn.get_attention_map(batch).detach().cpu().numpy().tolist()
It seems that you always feed the original patch embeddings to all 12 blocks.
Shouldn't the inputs to attn.get_attention_map be [original patch embeddings, output of block 1, ..., output of block 11]?
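To make the question concrete, here is a rough sketch of what I would expect (just an illustration on my side, assuming each block can be called as x = block(x) as in timm-style ViTs, and omitting any cls-token/positional-embedding handling to stay close to your snippet):

nonlocality = {}
batch = next(iter(data_loader_val))[0]
batch = batch.to(device)
x = model_without_ddp.patch_embed(batch)
for l, block in enumerate(model_without_ddp.blocks):
    # Block l's attention map would be computed on the input it actually sees:
    # the patch embeddings for l = 0, otherwise the output of block l - 1.
    nonlocality[l] = block.attn.get_attention_map(x).detach().cpu().numpy().tolist()
    # Advance the representation so the next block receives the correct input.
    x = block(x)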
If I have misunderstood something, please correct me.
Looking forward to your reply.