Attention-Aware Generalized Mean Pooling for Image Retrieval #157

Open

chullhwan-song opened this issue Jun 28, 2019 · 1 comment

Labels

Attention Deep Image Feature Image Retrieval

Owner

chullhwan-song commented Jun 28, 2019

https://arxiv.org/abs/1811.00202

chullhwan-song added Deep Image Feature Image Retrieval labels

chullhwan-song closed this as completed

Owner Author

chullhwan-song commented Oct 8, 2019 (edited)

PROPOSED METHOD

Network and Pooling

GeM
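
For reference, GeM pools each channel of a conv feature map by taking the mean of the activations raised to a power p over all spatial positions, then taking the 1/p-th root. Below is a minimal NumPy sketch of that idea, assuming a `(C, H, W)` array and a fixed exponent. The name `gem_pool`, the default `p = 3.0` and the `eps` clamp are illustrative choices of mine, not values taken from the paper.

```python
import numpy as np


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over the spatial dimensions.

    feature_map: array of shape (C, H, W), e.g. a conv feature map.
    p: pooling exponent. p = 1 is average pooling; large p approaches max pooling.
    Returns a vector of shape (C,).
    """
    # Clamp to a small positive value so the fractional power is well defined.
    x = np.clip(feature_map, eps, None)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)


# Toy feature map with 4 channels on a 5 x 5 grid (values are made up).
rng = np.random.default_rng(0)
fmap = rng.random((4, 5, 5)) + 0.1

avg_like = gem_pool(fmap, p=1.0)    # equals per-channel average pooling
max_like = gem_pool(fmap, p=200.0)  # close to per-channel max pooling
```

With p between these two extremes the pooled value weights strong activations more heavily than plain averaging does, without discarding the rest as max pooling would.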

Attention-Aware GeM

attention-aware GeM (AGeM) descriptor
A combination of ResNet-101 residual blocks and attention units
- Three attention units: Att1, Att2_1, Att2_2
  - Att1: four conv layers
    - kernel sizes 3 × 3, 3 × 3, 1 × 1, and 1 × 1
      - only the first has stride 2; the rest have stride 1
    - output dims are 1024, 512, 512, and 2048
    - BN and ReLU activation are applied after every layer except the last
    - the last layer is followed by a sigmoid function
  - Att2_1 & Att2_2
    - Each is simply a single conv layer with kernel size 1 × 1 and stride 1, with equal input and output dims, followed by a sigmoid
- Combining the attention units with the conv feature maps
  - ⊗: Hadamard product, i.e. element-wise product
  - See Fig. 1
- Final output
  - At the last layer, the attention unit output and the conv feature map are combined by addition (+), not by element-wise product
- After that, GeM > l2 normalization > a final 2048-dimensional descriptor
- Side note,
  - It would have been nice if the paper had shown visually in what sense these are attention-aware features (there is nothing of the sort..)
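
To make the layer list above concrete, here is a rough NumPy sketch. It does three things: traces a tensor shape through the four Att1 conv layers, implements an Att2-style unit (a 1 × 1 conv is just a per-pixel linear map over channels, followed by a sigmoid), and shows the two ways of combining a mask with a feature map before GeM and l2 normalization. Several things here are assumptions, not facts from the paper: the padding (k // 2), the 1024-channel Att1 input, the toy sizes, p = 3, and all function names. The actual wiring between blocks follows Fig. 1 of the paper and is not reproduced here.

```python
import numpy as np

# Att1 as listed in the notes: (kernel, stride, out_channels).
# Padding is NOT given in the notes; padding = kernel // 2 is assumed.
ATT1_LAYERS = [(3, 2, 1024), (3, 1, 512), (1, 1, 512), (1, 1, 2048)]


def trace_att1_shape(channels, height, width, layers=ATT1_LAYERS):
    """Follow a (C, H, W) shape through the Att1 conv stack."""
    for kernel, stride, out_channels in layers:
        pad = kernel // 2  # assumption, see above
        height = (height + 2 * pad - kernel) // stride + 1
        width = (width + 2 * pad - kernel) // stride + 1
        channels = out_channels
    return channels, height, width


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def att2_unit(x, weight, bias):
    """Att2-style unit: one 1x1 conv (stride 1, C -> C) followed by a sigmoid.

    x: (C, H, W), weight: (C, C), bias: (C,).
    """
    logits = np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]
    return sigmoid(logits)


def gem_l2_descriptor(x, p=3.0, eps=1e-6):
    """GeM pooling over (H, W) followed by l2 normalization."""
    pooled = np.mean(np.clip(x, eps, None) ** p, axis=(1, 2)) ** (1.0 / p)
    return pooled / (np.linalg.norm(pooled) + eps)


# Hypothetical 1024-channel, 32 x 32 input to Att1.
att1_out_shape = trace_att1_shape(1024, 32, 32)

# Toy stand-in for a conv feature map (8 channels instead of 2048).
rng = np.random.default_rng(0)
feat = rng.random((8, 4, 4))
mask = att2_unit(feat, rng.normal(size=(8, 8)), np.zeros(8))

gated = feat * mask   # Hadamard (element-wise) product, the ⊗ in Fig. 1
fused = feat + mask   # additive combination used at the final stage
descriptor = gem_l2_descriptor(fused)
```

Under the assumed padding, the stride-2 first layer halves the spatial size and the last layer produces 2048 channels, so the mask has the same channel count as the final 2048-dimensional descriptor.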

Loss Function and Whitening

ImageNet pre-trained model > fine-tuning: classification training on labelled landmark images
After that, triplet loss and contrastive loss
PCA
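
As a reminder of what these pieces usually look like, here is a small NumPy sketch of a contrastive loss, a triplet loss, and PCA whitening. These are the common textbook forms; the paper's exact formulation, margins and whitening procedure may differ. The margin values and all function names are placeholders of mine.

```python
import numpy as np


def contrastive_loss(a, b, is_match, margin=0.7):
    """Pull matching pairs together; push non-matching pairs apart up to `margin`."""
    d = float(np.linalg.norm(a - b))
    if is_match:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Require the positive to be closer to the anchor than the negative by `margin`."""
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    return max(0.0, d_pos - d_neg + margin)


def fit_pca_whitening(descriptors, eps=1e-8):
    """Return (mean, projection) so that (x - mean) @ projection is whitened.

    descriptors: array of shape (N, D), one descriptor per row.
    """
    mean = descriptors.mean(axis=0)
    cov = np.cov(descriptors - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # largest variance first
    projection = eigvecs[:, order] / np.sqrt(eigvals[order] + eps)
    return mean, projection


# Toy descriptors (made-up values, 4-D instead of 2048-D).
rng = np.random.default_rng(0)
data = rng.random((200, 4))
mean, projection = fit_pca_whitening(data)
whitened = (data - mean) @ projection
```

After whitening, descriptors are typically l2-normalized again before computing similarities.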

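Experiments

I don't understand the numbers. They don't match the numbers reported in DIR. (I can't find them with ctrl+f.)
- "† denotes results from the original papers" > but when I look at DIR (the very first entry)..
SP: I think it stands for spatial verification.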

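Conclusion

The paper introduces the concept of attention-aware features, but I think it needs a more substantive explanation of what that means.
- The abstract does say "which aims at enhancing more relevant features that correspond to important keypoints in the input image.", but...
There is no helpful explanation of why these features are better (beyond the experiments) or why they work well.
How is this different from the existing notion of attention? Or is it the same? I'm not sure. > It seems we are expected to understand it from Fig. 1 alone.. a bit more explanation would help.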

chullhwan-song reopened this

chullhwan-song added the Attention label
