Residual Attention Networks for Image Classification #35

chullhwan-song opened this issue Aug 3, 2018 · 1 comment

chullhwan-song commented Aug 3, 2018

https://arxiv.org/abs/1704.06904
http://cs231n.stanford.edu/reports/2017/pdfs/939.pdf


chullhwan-song commented Aug 3, 2018

What?

  • An early, well-known paper on attention for CNNs, focused on the spatial attention side.
  • Image classification task.
  • The attention block is plugged into an existing CNN (e.g., ResNet) like a module.

Residual Attention Network

  • Claims it can be combined easily, as a module, with Residual Units, ResNeXt, and Inception.
    • But what about the CNN module I actually need?
    • That is the claim, but in practice I don't think it holds (it was difficult when I tried).
  • Broadly two branches, joined together into one module -> mimics the highway-network approach (see the sketch after this list).
    • mask branch M(x) = soft mask branch
      • bottom-up and top-down structure
      • a combination of residual units and max pooling
      • similar in structure to segmentation networks built on de-convolution
        • down-sampling followed by up-sampling
        • up-sampling is done with linear interpolation
          • the number of interpolations equals the number of max poolings, so the mask returns to the input resolution
        • then two 1x1 convolutions. Why? Presumably to adjust the channel dimension.
        • finally, a sigmoid layer normalizes the output to [0, 1]
    • trunk branch T(x)
      • carries the original features; the bottom-up/top-down mask M(x) modulates T(x), with skip connections between the bottom-up and top-down parts
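A minimal PyTorch-style sketch of the soft mask branch described above. The single pooling/interpolation level, the plain conv standing in for a residual unit, and the layer choices are my own simplifications for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMaskBranch(nn.Module):
    """Bottom-up/top-down soft mask: max-pool down, conv, interpolate back up,
    two 1x1 convs, then a sigmoid so the mask M(x) lies in [0, 1]."""
    def __init__(self, channels):
        super().__init__()
        self.down = nn.Sequential(                        # bottom-up: shrink resolution
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(channels, channels, 3, padding=1),  # stand-in for a residual unit
            nn.ReLU(inplace=True),
        )
        self.out = nn.Sequential(                         # the two 1x1 convs before the sigmoid
            nn.Conv2d(channels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, x):
        h = self.down(x)
        # top-down: one interpolation per pooling, back to the trunk's spatial size
        h = F.interpolate(h, size=x.shape[2:], mode="bilinear", align_corners=False)
        return torch.sigmoid(self.out(h))                 # M(x) in [0, 1]
```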

H_{i,c}(x) = M_{i,c}(x) * T_{i,c}(x)    (1)
* i ranges over all spatial positions
* c is the channel index
* Equation (1) is differentiable, so the mask can be trained jointly via back-propagation:
  ∂(M(x, θ) T(x, φ)) / ∂φ = M(x, θ) · ∂T(x, φ) / ∂φ
* θ are the mask branch parameters and φ are the trunk branch parameters
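A quick autograd check of that gradient (a toy PyTorch sketch of my own; plain tensors stand in for M(x, θ) and T(x, φ)): the mask acts as an element-wise gradient filter on the trunk features.

```python
import torch

M = torch.rand(5)                        # mask values in [0, 1], treated as a constant gate
T = torch.randn(5, requires_grad=True)   # trunk output T(x, phi)

H = M * T                                # equation (1), element-wise
H.sum().backward()

# dH/dT equals M: the mask scales the gradient flowing into the trunk branch
print(torch.allclose(T.grad, M))         # True
```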

  • Attention Residual Learning = combining the attention with a ResNet-style identity mapping (a toy comparison follows right after this item):
    H_{i,c}(x) = (1 + M_{i,c}(x)) * F_{i,c}(x)
    • F(x) is the original feature produced by the trunk branch
    • a network with this structure is robust to noisy labels
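A toy comparison of naive masking versus attention residual learning, matching the formula above (my own sketch, not the paper's code):

```python
import torch

def naive_attention(mask, feat):
    # plain masking H = M * F: stacking many of these keeps shrinking the features
    return mask * feat

def attention_residual(mask, feat):
    # H = (1 + M) * F: the identity path preserves the original feature,
    # and the mask only adds soft-selected detail on top of it
    return (1.0 + mask) * feat

feat = torch.randn(1, 8, 7, 7)
mask = torch.rand(1, 8, 7, 7)            # sigmoid output, values in [0, 1]
print(naive_attention(mask, feat).abs().mean().item())     # below the input scale
print(attention_residual(mask, feat).abs().mean().item())  # at or above the input scale
```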
  • Spatial Attention and Channel Attention
    • Constraints can be imposed on the attention by changing the normalization step inside the activation function applied just before the soft mask output; in other words, this part is about the choice of that activation function.
      • (this part was not entirely clear to me at first)
    • Three types of activation functions > constraints on the attention can still be added to the mask branch by changing the normalization step in the activation function before the soft mask output (a sketch of all three follows this list):
      f1(x_{i,c}) = 1 / (1 + exp(-x_{i,c}))
      f2(x_{i,c}) = x_{i,c} / ||x_i||
      f3(x_{i,c}) = 1 / (1 + exp(-(x_{i,c} - mean_c) / std_c))
    • Mixed attention f1: no additional restriction, just a simple sigmoid for each channel and spatial position.
    • Channel attention f2 performs L2 normalization over all channels at each spatial position to remove spatial information.
    • Spatial attention f3 normalizes the feature map of each channel, then applies a sigmoid.
    • i ranges over all spatial positions
    • c ranges over all channels
    • mean_c and std_c denote the mean and standard deviation of the feature map from the c-th channel
    • x_i denotes the feature vector at the i-th spatial position
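A small sketch of the three activation functions written out above, assuming NCHW tensors; the eps terms are my own addition for numerical safety:

```python
import torch

def f1_mixed(x):
    # simple sigmoid per channel and per spatial position
    return torch.sigmoid(x)

def f2_channel(x, eps=1e-6):
    # L2-normalize across channels at each spatial position (removes spatial information)
    return x / (x.norm(p=2, dim=1, keepdim=True) + eps)

def f3_spatial(x, eps=1e-6):
    # standardize each channel's feature map over its spatial positions, then sigmoid
    mean_c = x.mean(dim=(2, 3), keepdim=True)
    std_c = x.std(dim=(2, 3), keepdim=True)
    return torch.sigmoid((x - mean_c) / (std_c + eps))

x = torch.randn(2, 16, 8, 8)   # (batch, channels, height, width)
for f in (f1_mixed, f2_channel, f3_spatial):
    print(f.__name__, tuple(f(x).shape))
```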
  • Overall network architecture (a composition sketch follows below)
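Putting the pieces together: a hedged sketch of a single attention module (trunk features plus the soft mask, combined with attention residual learning), reusing the SoftMaskBranch sketch from earlier. The real Attention-56/92 networks stack several such modules per stage with additional residual units, which is not reproduced here.

```python
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Trunk features F(x) modulated by the soft mask M(x) via
    attention residual learning: H(x) = (1 + M(x)) * F(x)."""
    def __init__(self, channels):
        super().__init__()
        self.trunk = nn.Sequential(                       # stand-in for stacked residual units
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.mask = SoftMaskBranch(channels)              # sketch defined earlier in these notes

    def forward(self, x):
        f = self.trunk(x)
        m = self.mask(x)
        return (1.0 + m) * f

# shape check
y = AttentionModule(16)(torch.randn(1, 16, 32, 32))
print(y.shape)   # torch.Size([1, 16, 32, 32])
```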

Experiments

(experimental result figures from the paper)
