Stand-Alone Self-Attention in Vision Models #154

chullhwan-song · 2019-06-26T02:09:54Z

chullhwan-song · 2019-10-16T09:07:48Z

Abstract

Brings the Transformer concept used in BERT to the vision domain.
CNN
- content-based interactions such as self-attention and non-local means
In particular, it starts from the question of whether self-attention can replace CNNs.
- Just as the Transformer replaced RNNs.
self-attention > an effective stand-alone layer
- Outperforms the convolutional baseline even with 12% fewer FLOPS and 29% fewer parameters.
- Self-attention is especially impactful when used in later layers.
  - This seems to mean that keeping the usual CNN layers at the start and adding self-attention in the last layers performed best.
A study that projects the Transformer concept onto CNN architectures.

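Takes inspiration from the Transformer used in NLU.
- It is not a CNN architecture.
So it takes Q, K, and V as inputs, as in the Transformer.
- To understand these concepts, first read the BERT review or the papers and posts linked here.
It is expressed as follows:
- As in BERT, each of these is a linear transform, i.e. a separate NN layer is used for each of Q, K, and V.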
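A reconstruction of the formula referred to above, assuming the paper's notation: x_ij is the input at pixel (i, j), N_k(i, j) is its k × k neighborhood, and W_Q, W_K, W_V are the learned linear transforms. Treat it as a sketch of Eq. (2), not a verbatim copy.

```latex
\begin{align}
q_{ij} &= W_Q x_{ij}, \qquad k_{ab} = W_K x_{ab}, \qquad v_{ab} = W_V x_{ab} \\
y_{ij} &= \sum_{(a,b) \in \mathcal{N}_k(i,j)}
          \operatorname{softmax}_{ab}\!\left(q_{ij}^{\top} k_{ab}\right) v_{ab} \tag{2}
\end{align}
```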
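Illustrating this formula as a figure:
- learned transform > the input first passes through an NN layer as in the formula above > these weights must be learned.
  - Similar to BERT > except there is no scale & mask step.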
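The local attention step described above can be sketched in NumPy as follows. This is a minimal single-head sketch, not the paper's implementation: the function name `local_self_attention`, the weight shapes, and zero padding at the image border are all assumptions made for illustration. As in the notes, there is no scaling and no mask.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_self_attention(x, w_q, w_k, w_v, k=3):
    """Single-head local self-attention over a k x k window (k odd).

    x: (H, W, d_in) feature map; w_q, w_k, w_v: (d_in, d_out) learned transforms.
    Border handling (zero padding) is an assumption of this sketch.
    """
    h, w, _ = x.shape
    pad = k // 2
    q = x @ w_q                                        # queries, (H, W, d_out)
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))   # zero-pad spatial dims
    keys = xp @ w_k                                    # (H + 2p, W + 2p, d_out)
    vals = xp @ w_v
    out = np.zeros_like(q)
    for i in range(h):
        for j in range(w):
            kk = keys[i:i + k, j:j + k].reshape(k * k, -1)   # window keys
            vv = vals[i:i + k, j:j + k].reshape(k * k, -1)   # window values
            attn = softmax(kk @ q[i, j])   # no scaling, no mask
            out[i, j] = attn @ vv
    return out
```

With `k=1` each pixel attends only to itself, so the output reduces to `x @ w_v`, which is a quick sanity check on the indexing.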
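As in Attention Is All You Need, positional information must be given to the input (here an image, so for each pixel (i, j)). That work used sinusoidal embeddings, but here relative positional embeddings are applied.
A relative positional embedding, as in the figure below, gives each position's offset relative to the current pixel (taken as the origin (0, 0)).
So, once the relative positional embedding is included, Eq. (2) is expressed as follows:
spatial-relative attention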
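A reconstruction of the spatial-relative form, assuming the paper's notation, where r_{a−i, b−j} is the learned embedding for the offset of neighbor (a, b) from the current pixel (i, j). Treat it as a sketch of the equation, not a verbatim copy.

```latex
\begin{equation}
y_{ij} = \sum_{(a,b) \in \mathcal{N}_k(i,j)}
         \operatorname{softmax}_{ab}\!\left(q_{ij}^{\top} k_{ab}
         + q_{ij}^{\top} r_{a-i,\,b-j}\right) v_{ab} \tag{3}
\end{equation}
```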
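The actual content of the relative positional embedding is the relative distance of each neighbor ab in Eq. (2) from the current position (i, j).
- relative distance: row and column offsets > added in the form a − i, b − j
- This seems to add spatial-relative information.
- As the paper also notes, this seems conceptually almost the same as a CNN's convolution operation.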
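The row/column offset idea above can be sketched for a single pixel as follows. This is a minimal sketch under assumptions: the names `relative_logits` and `relative_attention_pixel` are hypothetical, and the offset embedding is taken to be the concatenation of a row-offset embedding and a column-offset embedding, each half the channel width.

```python
import numpy as np

def relative_logits(q_ij, rel_rows, rel_cols):
    """Position term q^T r for every offset in a k x k window.

    q_ij: (d,) query at the current pixel.
    rel_rows, rel_cols: (k, d // 2) embeddings; index 0..k-1 stands for
    offsets -(k // 2)..(k // 2) in the row and column direction.
    """
    k, half = rel_rows.shape
    # r[da, db] = concat(rel_rows[da], rel_cols[db]), shape (k, k, d)
    r = np.concatenate(
        [np.broadcast_to(rel_rows[:, None, :], (k, k, half)),
         np.broadcast_to(rel_cols[None, :, :], (k, k, half))],
        axis=-1,
    )
    return r @ q_ij                                    # (k, k)

def relative_attention_pixel(q_ij, keys, vals, rel_rows, rel_cols):
    """Output at one pixel; keys and vals are its (k, k, d) window."""
    # Content term q^T k plus position term q^T r.
    logits = keys @ q_ij + relative_logits(q_ij, rel_rows, rel_cols)
    w = np.exp(logits - logits.max())                  # stable softmax over the window
    w /= w.sum()
    return np.tensordot(w, vals, axes=([0, 1], [0, 1]))  # (d,)
```

Because the position term depends only on the offset (a − i, b − j) and not on the absolute location, the same embeddings are reused at every pixel, which is where the resemblance to convolution comes from.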

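As mentioned earlier, the main goal and contribution of this work is showing that CNN layers can be replaced with self-attention modules alone.
So it borrows the ResNet architecture and replaces its convolution layers with self-attention for the experiments.
However, the paper explains that a conventional CNN base (= Convolutional Stem) with self-attention attached at the end gave a better synergy.
Still, with this work as a starting point, the authors expect that experiments such as NAS could find even better architectures using self-attention alone.
- Then wouldn't combining the two structures and searching with NAS find an even better architecture?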
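The two variants above (all attention vs. a convolutional front with attention later) can be sketched as a layer plan. This is only an illustration: `LayerSpec`, `to_attention`, and the `keep_first` parameter are hypothetical names, and leaving 1 × 1 convolutions untouched is an assumption of this sketch, not something stated in the notes.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LayerSpec:
    kind: str      # "conv" or "attention"
    kernel: int    # spatial extent k
    channels: int

def to_attention(layers, keep_first=0):
    """Swap spatial convs (kernel > 1) for attention layers.

    The first `keep_first` layers are left as convolutions, which
    models the conv-first, attention-later variant.
    """
    out = []
    for idx, layer in enumerate(layers):
        if idx >= keep_first and layer.kind == "conv" and layer.kernel > 1:
            out.append(replace(layer, kind="attention"))
        else:
            out.append(layer)
    return out

# A toy ResNet-like stack (sizes are illustrative only).
layers = [
    LayerSpec("conv", 7, 64),
    LayerSpec("conv", 1, 64),
    LayerSpec("conv", 3, 64),
    LayerSpec("conv", 3, 128),
]
full_attention = to_attention(layers)           # every spatial conv replaced
hybrid = to_attention(layers, keep_first=3)     # conv front, attention at the end
```

A NAS-style search over these choices would amount to searching over which indices get swapped, instead of fixing a single `keep_first` cut.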

chullhwan-song added Attention Deep Learning CNN labels Jun 26, 2019

chullhwan-song closed this as completed Jun 26, 2019

chullhwan-song reopened this Oct 17, 2019

chullhwan-song added the Visual Transformers label Jun 12, 2020