Hi, thanks for the amazing paper.

My question is about which patches are dropped from the image with the DINO model. In the code, evaluate.py line 132 sets head_number = 1. I want to understand why this particular head was chosen (the other parameters used to index the attention maps seem to make sense). Wouldn't averaging the attention maps across heads give better segmentation?

Thanks,
Ravi
Thanks for your interest and positive comments. Sorry for the delayed response.

We follow the approach used in the DINO code, and the selected head seemed to give the best segmentations on our data (judged qualitatively on randomly chosen visualizations). We saw the same pattern on a different, quantitative evaluation: mIoU on PASCAL.

Averaging the maps reduces performance. I think averaging the attention maps degrades foreground segmentation because some heads focus on parts of the background (e.g. grass in some images), and the average mixes that background attention into the foreground map.
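For concreteness, here is a minimal sketch of the difference between indexing a single head's CLS-to-patch attention and averaging across heads. This is not the repo's actual code: the random attention tensor, the token count (ViT-S, 14×14 patches), and the keep fraction are all illustrative stand-ins.

```python
import torch

# Stand-in attention tensor: (batch, heads, tokens, tokens),
# with tokens = 1 CLS token + 14*14 = 197 patch tokens.
batch, heads, tokens = 1, 6, 197
attn = torch.rand(batch, heads, tokens, tokens).softmax(dim=-1)

head_number = 1  # a single head, as in the repo's evaluate.py

# CLS-token attention over the 196 patch tokens, for one head...
cls_attn_head = attn[0, head_number, 0, 1:]   # shape (196,)
# ...versus the head-averaged alternative discussed above.
cls_attn_mean = attn[0].mean(dim=0)[0, 1:]    # shape (196,)

# Drop the patches the chosen head attends to least, keeping the
# top 60% as "foreground" (the fraction here is arbitrary).
keep_fraction = 0.6
k = int(keep_fraction * cls_attn_head.numel())
keep_mask = torch.zeros_like(cls_attn_head, dtype=torch.bool)
keep_mask[cls_attn_head.topk(k).indices] = True
print(keep_mask.sum().item())  # number of patches kept
```

The point of the single-head choice is visible here: cls_attn_head reflects only that head's notion of saliency, while cls_attn_mean blends in heads that may attend to background, flattening the foreground/background contrast before thresholding.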