# Pooling Layers

Pooling layers: the pool size determines the window size, which is the same role the kernel (filter) size plays in a convolutional layer

---
Padding: lets the output keep the same shape as the input, which makes a neural network easier to design
- Without padding, a value at the edge of the input (for example the top-right corner) can never sit at the center of the filter; with padding it can
  - e.g. with filter size 3, padding of 1 makes the input and output shapes equal
  - e.g. with filter size 5, padding of 2 makes the input and output shapes equal
- Without padding (stride 1): nH' = nH - f + 1
- With padding (stride 1): <b>nH' = nH + 2p - f + 1</b>
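
As a quick check of the two formulas above, here is a minimal sketch in plain Python. The helper name `out_size_stride1` and the input height of 28 are assumptions chosen for illustration; the sketch assumes stride 1 and an odd filter size, so that p = (f - 1) / 2 is an integer.

```python
def out_size_stride1(n_h: int, f: int, p: int = 0) -> int:
    """Output height for stride 1: n_h + 2p - f + 1."""
    return n_h + 2 * p - f + 1


n_h = 28  # arbitrary input height, chosen only for illustration
for f in (3, 5, 7):
    p = (f - 1) // 2  # padding that preserves the shape for an odd filter size
    no_pad = out_size_stride1(n_h, f)
    with_pad = out_size_stride1(n_h, f, p)
    print(f"f={f}, p={p}: without padding {no_pad}, with padding {with_pad}")
```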
---
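stride: how far the window jumps each time it moves
- nH' = \[ (nH - f) / s + 1 \]
- \[ \]: the floor function (Gauss bracket)
  - e.g. 2.5 becomes 2
  - because the window jumps, it may not fit evenly at the end of the input; the floor drops that leftover partial step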
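
To see why the floor is needed, the sketch below lists the valid window start positions for one illustrative setting and compares their count with the formula. The values `n_h = 10`, `f = 3`, `s = 2` are assumptions picked so that the division is not exact (it gives 4.5).

```python
import math

n_h, f, s = 10, 3, 2  # illustrative values, not taken from the notes above

# A window starting at `start` covers [start, start + f), so it must end inside the input.
starts = list(range(0, n_h - f + 1, s))
print(starts)       # [0, 2, 4, 6]
print(len(starts))  # 4

# (10 - 3) / 2 + 1 = 4.5, and the floor gives 4: the last input element is never covered.
print(math.floor((n_h - f) / s + 1))  # 4
```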
---
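### Final output-size formula: nH' = \[ (nH + 2p - f) / s + 1 \]
- ★ A convolutional layer and a pooling layer differ only in the operation applied to each window; the way the window moves over the input to produce the output is the same
- So this formula applies to both convolutional layers and pooling layers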
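
The formula can be written as a small helper. This is a sketch only: the function name `output_size` is hypothetical, and it assumes the same padding `p` on both sides of the axis.

```python
def output_size(n: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output length along one axis: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


print(output_size(28, f=3, p=0, s=2))  # 13
print(output_size(28, f=3, p=1, s=1))  # 28
print(output_size(5, f=2, p=0, s=2))   # 2
```

Since the window size and stride are integers, integer floor division (`//`) gives the same result as applying the floor to the real-valued quotient.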

# 1. Max/Avg Pooling

## 1.1 Max Pooling

Output of the code below
```
x: (1, 10, 1)
[-0.0547942   0.40751046  0.85179204 -1.1093632   0.09332124 -2.0045714
 -0.1958938   1.6280512  -1.3979955  -0.46666163]
pooled_max(Tensorflow): (1, 9, 1)
[ 0.40751046  0.85179204  0.85179204  0.09332124  0.09332124 -0.1958938
  1.6280512   1.6280512  -0.46666163]
```

Interpretation
- First window: -0.0547942, 0.40751046 -> the larger of the two values is returned as the pool_max result
- Second window: 0.40751046, 0.85179204 -> the larger of the two values is returned as the pool_max result

In [1]:
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import MaxPooling1D

# length, pool size, strides
L, f, s = 10, 2, 1

# (number of samples, length, channels)
# the layer expects a 3D input
x = tf.random.normal(shape=(1, L, 1))

pool_max = MaxPooling1D(pool_size=f, strides=s)
pooled_max = pool_max(x)

print('x: {}\n{}'.format(x.shape, x.numpy().flatten()))
print('pooled_max (Tensorflow): {}\n{}'.format(pooled_max.shape, pooled_max.numpy().flatten()))

x: (1, 10, 1)
[-0.0547942   0.40751046  0.85179204 -1.1093632   0.09332124 -2.0045714
 -0.1958938   1.6280512  -1.3979955  -0.46666163]
pooled_max(Tensorflow): (1, 9, 1)
[ 0.40751046  0.85179204  0.85179204  0.09332124  0.09332124 -0.1958938
  1.6280512   1.6280512  -0.46666163]


### Manual implementation

In [2]:
x = x.numpy().flatten()
pooled_max_man = np.zeros((L - f + 1, )) # create the output as a vector

for i in range(L-f+1):
    window = x[i:i+f]
    pooled_max_man[i] = np.max(window)

print('pooled_max (Manual): {}\n{}'.format(pooled_max_man.shape, pooled_max_man))    

pooled_max (Manual): (9,)
[ 0.40751046  0.85179204  0.85179204  0.09332124  0.09332124 -0.19589379
  1.62805116  1.62805116 -0.46666163]
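
The loop above can also be written without an explicit Python loop. The sketch below uses `np.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) and checks the result against a loop; the seeded random input is an assumption, so the values differ from the run shown above.

```python
import numpy as np

L, f, s = 10, 2, 1
rng = np.random.default_rng(0)  # fixed seed for repeatability; not the data shown above
x = rng.normal(size=L)

# windows has shape (L - f + 1, f); keeping every s-th row applies the stride
windows = np.lib.stride_tricks.sliding_window_view(x, f)[::s]
pooled_max_vec = windows.max(axis=1)

# Reference result from an explicit loop
pooled_max_loop = np.array([x[i:i + f].max() for i in range(0, L - f + 1, s)])

print(pooled_max_vec.shape)
print(np.allclose(pooled_max_vec, pooled_max_loop))
```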


## 1.2 Average Pooling

Output of the code below
```
x: (1, 10, 1)
[-0.93035865  0.25884935 -0.10767653  0.78458476  0.54243124  0.03321408
  1.5537585   1.4353794  -1.1747631   2.4723911 ]
pooled_avg (Tensorflow): (1, 9, 1)
[-0.33575463  0.07558641  0.33845413  0.663508    0.28782266  0.7934863
  1.494569    0.13030815  0.648814  ]
```

Interpretation
- First window: -0.93035865, 0.25884935 -> the mean of the two values is returned as the pool_avg result
- Second window: 0.25884935, -0.10767653 -> the mean of the two values is returned as the pool_avg result

In [3]:
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import AveragePooling1D

# length, pool size, strides
L, f, s = 10, 2, 1

# (number of samples, length, channels)
# the layer expects a 3D input
x = tf.random.normal(shape=(1, L, 1))

pool_avg = AveragePooling1D(pool_size=f, strides=s)
pooled_avg = pool_avg(x)

print('x: {}\n{}'.format(x.shape, x.numpy().flatten()))
print('pooled_avg (Tensorflow): {}\n{}'.format(pooled_avg.shape, pooled_avg.numpy().flatten()))

x: (1, 10, 1)
[-0.93035865  0.25884935 -0.10767653  0.78458476  0.54243124  0.03321408
  1.5537585   1.4353794  -1.1747631   2.4723911 ]
pooled_avg (Tensorflow): (1, 9, 1)
[-0.33575463  0.07558641  0.33845413  0.663508    0.28782266  0.7934863
  1.494569    0.13030815  0.648814  ]


### Manual implementation

In [4]:
x = x.numpy().flatten()
pooled_avg_man = np.zeros((L - f + 1, )) # create the output as a vector

for i in range(L-f+1):
    window = x[i:i+f]
    pooled_avg_man[i] = np.mean(window)

print('pooled_avg (Manual): {}\n{}'.format(pooled_avg_man.shape, pooled_avg_man))

pooled_avg (Manual): (9,)
[-0.33575463  0.07558641  0.33845413  0.663508    0.28782266  0.7934863
  1.49456894  0.13030815  0.64881402]
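
For stride 1, average pooling is the same as a convolution with a uniform kernel whose weights are all 1 / f. The sketch below checks this with `np.convolve`; the seeded random input is an assumption, and the equivalence as written covers only stride 1.

```python
import numpy as np

L, f = 10, 2
rng = np.random.default_rng(0)  # fixed seed for repeatability; not the data shown above
x = rng.normal(size=L)

# Uniform kernel: every position in the window gets weight 1 / f
kernel = np.full(f, 1.0 / f)
pooled_avg_conv = np.convolve(x, kernel, mode='valid')  # 'valid' keeps only full windows

# Reference result from an explicit loop
pooled_avg_loop = np.array([x[i:i + f].mean() for i in range(L - f + 1)])

print(pooled_avg_conv.shape)
print(np.allclose(pooled_avg_conv, pooled_avg_loop))
```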


# 2. 2D Max/Avg Pooling

## 2.1 2D Max Pooling

In [11]:
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import MaxPooling2D

N, n_H, n_W, n_C = 1, 5, 5, 1
f, s = 2, 1     # window size : 2 * 2

x = tf.random.normal(shape=(N, n_H, n_W, n_C))
pool_max = MaxPooling2D(pool_size=f, strides=s)
pooled_max = pool_max(x)

print('x: {}\n{}'.format(x.shape, x.numpy().flatten()))
print('pooled_max (Tensorflow): {}\n{}'.format(pooled_max.shape, pooled_max.numpy().squeeze()))

x: (1, 5, 5, 1)
[-0.40581006 -0.7755887   0.4536614  -0.60249895  1.1758922   1.3404404
 -0.08737396 -0.21251349 -0.7482148   0.14563105  1.1568428  -0.6773185
  0.3421037   1.1991739   0.17702095  1.1102656   0.5061705  -0.09806207
 -0.39703697  0.59936136 -0.7069311   0.6208182   0.3359118   0.64189774
  0.5050034 ]
pooled_max (Tensorflow): (1, 4, 4, 1)
[[1.3404404  0.4536614  0.4536614  1.1758922 ]
 [1.3404404  0.3421037  1.1991739  1.1991739 ]
 [1.1568428  0.5061705  1.1991739  1.1991739 ]
 [1.1102656  0.6208182  0.64189774 0.64189774]]


### Manual implementation

In [12]:
x = x.numpy().squeeze()
pooled_max_man = np.zeros(shape=(n_H - f + 1, n_W - f + 1))
for h in range(n_H - f + 1):
    for w in range(n_W - f + 1):
        window = x[h:h+f, w:w+f]
        pooled_max_man[h, w] = np.max(window)
        
print('pooled_max (Manual): {}\n{}'.format(pooled_max_man.shape, pooled_max_man))

pooled_max (Manual): (4, 4)
[[1.34044039 0.45366141 0.45366141 1.17589223]
 [1.34044039 0.34210369 1.19917393 1.19917393]
 [1.15684283 0.50617051 1.19917393 1.19917393]
 [1.11026561 0.6208182  0.64189774 0.64189774]]


## 2.2 2D Average Pooling

In [13]:
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import AveragePooling2D

N, n_H, n_W, n_C = 1, 5, 5, 1
f, s = 2, 1     # window size : 2 * 2

x = tf.random.normal(shape=(N, n_H, n_W, n_C))
pool_avg = AveragePooling2D(pool_size=f, strides=s)
pooled_avg = pool_avg(x)

print('x: {}\n{}'.format(x.shape, x.numpy().flatten()))
print('pooled_avg (Tensorflow): {}\n{}'.format(pooled_avg.shape, pooled_avg.numpy().squeeze()))

x: (1, 5, 5, 1)
[-5.9679133e-01 -1.1769327e+00 -6.7112714e-01  2.1431227e+00
 -7.9815680e-01  3.4782854e-01 -4.7258869e-01  1.3070526e+00
  1.4428805e-02  2.2732444e+00  1.2540437e+00 -6.9099069e-01
  3.5259512e-01  1.2975372e-03  2.3868905e-01  8.6324334e-01
  4.5573133e-01  1.3543911e+00 -1.0269698e+00 -1.1458447e+00
 -5.4248653e-02  1.1842662e-01  1.0843079e+00 -3.5333827e-01
 -5.5647647e-01]
pooled_avg (Tensorflow): (1, 4, 4, 1)
[[-0.47462106 -0.253399    0.69836926  0.9081598 ]
 [ 0.10957322  0.1240171   0.41884354  0.6319149 ]
 [ 0.4705069   0.36793172  0.1703285  -0.483207  ]
 [ 0.34578818  0.75321424  0.26459774 -0.7706573 ]]


### Manual implementation

In [14]:
x = x.numpy().squeeze()
pooled_avg_man = np.zeros(shape=(n_H - f + 1, n_W - f + 1))
for h in range(n_H - f + 1):
    for w in range(n_W - f + 1):
        window = x[h:h+f, w:w+f]
        pooled_avg_man[h, w] = np.mean(window)
        
print('pooled_avg (Manual): {}\n{}'.format(pooled_avg_man.shape, pooled_avg_man))

pooled_avg (Manual): (4, 4)
[[-0.47462106 -0.25339901  0.69836926  0.90815979]
 [ 0.10957322  0.1240171   0.41884354  0.63191491]
 [ 0.47050691  0.36793172  0.1703285  -0.48320699]
 [ 0.34578818  0.75321424  0.26459774 -0.7706573 ]]
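
The max and average loops differ only in the reduction applied to each window, so they can share one function. The sketch below is one way to do that; the name `pool2d` and its signature are hypothetical, and it handles a single-channel 2D array with no padding.

```python
import numpy as np


def pool2d(image, f, s, reduce_fn):
    """Pool a 2D array with an f x f window and stride s, using reduce_fn on each window."""
    n_h, n_w = image.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Output index (i, j) maps to the input window starting at (i * s, j * s)
            window = image[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce_fn(window)
    return out


image = np.arange(25, dtype=float).reshape(5, 5)  # 0..24, easy to check by hand
print(pool2d(image, f=2, s=1, reduce_fn=np.max))
print(pool2d(image, f=2, s=1, reduce_fn=np.mean))
```

Passing `np.max` or `np.mean` as `reduce_fn` selects max or average pooling without duplicating the loop.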


# 3. 3D Max/Avg Pooling

## 3.1 3D Max Pooling

In [29]:
import math
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import MaxPooling2D

N, n_H, n_W, n_C = 1, 5, 5, 3
f, s = 2, 2

x = tf.random.normal(shape=(N, n_H, n_W, n_C))

# to view the data channel by channel, move the channel axis to the front with np.transpose
print("x: {}\n{}".format(x.shape, np.transpose(x.numpy().squeeze(), (2, 0, 1))))  # three 5 * 5 images

x: (1, 5, 5, 3)
[[[ 1.6771410e+00 -3.9007729e-01  2.0364550e-01  6.4003956e-01
   -2.6975399e-01]
  [-9.2723799e-01  7.7468532e-01 -2.2106960e+00 -1.1502150e+00
    1.0415221e+00]
  [-5.9262133e-01 -1.1357396e+00 -1.8268001e+00  5.4935777e-01
   -4.4481173e-02]
  [-2.2444602e-03 -8.9262640e-01 -4.2600691e-01  3.9651516e-01
    1.6725726e+00]
  [ 7.5347102e-01 -3.4644222e-01 -5.8876038e-01 -5.9783936e-01
    1.1872431e+00]]

 [[ 5.2150571e-01 -1.4701381e+00  1.6722034e+00  3.8269854e-01
    6.4469194e-01]
  [ 8.3342826e-01  7.1976370e-01 -2.1430194e-01  2.6388904e-01
   -1.3066710e+00]
  [-6.3475721e-02 -6.1799431e-01  5.9652930e-01 -6.4617723e-01
   -1.2043085e+00]
  [ 1.4801111e+00  8.5217714e-02  6.9329768e-02 -1.8431808e+00
   -4.1608950e-01]
  [ 6.9367445e-01 -1.4717045e+00  5.5328530e-01 -1.0335166e+00
    9.2701870e-01]]

 [[-8.5099608e-01 -8.6899781e-01 -1.4776031e+00 -2.9475066e-01
    1.6979058e-01]
  [-9.0862882e-01  2.7692108e+00  6.9684660e-01 -1.7487504e+00
   -1.3322046e-

In [30]:
pool_max = MaxPooling2D(pool_size=f, strides=s)
pooled_max = pool_max(x)

pooled_max_t = np.transpose(pooled_max.numpy().squeeze(), (2, 0, 1))

print('pooled_max (Tensorflow): {}\n{}'.format(pooled_max.shape, pooled_max_t))

pooled_max (Tensorflow): (1, 2, 2, 3)
[[[ 1.6771410e+00  6.4003956e-01]
  [-2.2444602e-03  5.4935777e-01]]

 [[ 8.3342826e-01  1.6722034e+00]
  [ 1.4801111e+00  5.9652930e-01]]

 [[ 2.7692108e+00  6.9684660e-01]
  [ 5.4457647e-01  1.9193697e+00]]]


### Manual implementation

In [31]:
x = x.numpy().squeeze()
n_H_ = math.floor((n_H - f) / s + 1)
n_W_ = math.floor((n_W - f) / s + 1)
print(n_H_, n_W_)

2 2


In [32]:
pooled_max_man = np.zeros(shape=(n_H_, n_W_, n_C))
print(pooled_max_man.shape)

(2, 2, 3)


In [33]:
for c in range(n_C):
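    c_image = x[:, :, c]  # take the three channels one at a time, starting with the first
    
    # Scan the input from 0 up to (but excluding) n_H - f + 1, jumping by 'stride' each time
    # The input index jumps by the stride (2 positions), but the output index must advance by 1 per result
    # = the output is smaller than the input (input and output are indexed differently)
    # → keep separate counters w_, h_ for the output indices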
    h_ = 0
    for h in range(0, n_H - f + 1, s):
        w_ = 0
        for w in range(0, n_W - f + 1, s):
            window = c_image[h:h+f, w:w+f]
            pooled_max_man[h_, w_, c] = np.max(window)
            w_ += 1
        h_ += 1

pooled_max_t = np.transpose(pooled_max_man, (2, 0, 1))
print('pooled_max (Manual): {}\n{}'.format(pooled_max_man.shape, pooled_max_t))

pooled_max (Manual): (2, 2, 3)
[[[ 1.67714095e+00  6.40039563e-01]
  [-2.24446016e-03  5.49357772e-01]]

 [[ 8.33428264e-01  1.67220342e+00]
  [ 1.48011112e+00  5.96529305e-01]]

 [[ 2.76921082e+00  6.96846604e-01]
  [ 5.44576466e-01  1.91936970e+00]]]
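
When the window size equals the stride (as with f = s = 2 here), the windows do not overlap, and the pooling can be done with a reshape instead of nested loops. This is a sketch under that assumption only; the seeded random input differs from the data shown above.

```python
import numpy as np

n_H, n_W, n_C = 5, 5, 3
f = s = 2  # the reshape trick below requires f == s (non-overlapping windows)
rng = np.random.default_rng(0)
x = rng.normal(size=(n_H, n_W, n_C))

out_h = (n_H - f) // s + 1
out_w = (n_W - f) // s + 1

# Drop the trailing row/column that no full window covers, then split each
# spatial axis into (output index, position inside the window).
cropped = x[:out_h * f, :out_w * f, :]
blocks = cropped.reshape(out_h, f, out_w, f, n_C)
pooled_max_vec = blocks.max(axis=(1, 3))

# Reference result from explicit loops
pooled_max_loop = np.zeros((out_h, out_w, n_C))
for c in range(n_C):
    for i in range(out_h):
        for j in range(out_w):
            pooled_max_loop[i, j, c] = x[i * s:i * s + f, j * s:j * s + f, c].max()

print(pooled_max_vec.shape)
print(np.allclose(pooled_max_vec, pooled_max_loop))
```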


# 4. Padding

## 4.1 ZeroPadding2D Layer

In [35]:
import numpy as np
import tensorflow as tf

from tensorflow.keras.layers import ZeroPadding2D

images = tf.random.normal(shape=(1,3,3,3))
print(np.transpose(images.numpy().squeeze(), (2,0,1)))

[[[-1.0773331   1.2158711  -1.6318449 ]
  [-1.0978109   2.3492126   0.9505234 ]
  [-1.2878962   1.2512544   0.6651746 ]]

 [[-0.84229577  1.2163715  -0.45278594]
  [-0.67890775  0.41621312  0.32558224]
  [-1.2423313  -2.8177445   0.3392142 ]]

 [[-1.5042193   0.682055   -1.76488   ]
  [-2.5587578   0.65947354  1.6253126 ]
  [-0.769456    1.980295   -0.73028654]]]


<b>Padding the images above by 1: a border of zeros, one element wide, surrounds the top, bottom, left, and right</b>

In [36]:
zero_padding = ZeroPadding2D(padding=1)
y = zero_padding(images)
print(np.transpose(y.numpy().squeeze(), (2,0,1)))

[[[ 0.          0.          0.          0.          0.        ]
  [ 0.         -1.0773331   1.2158711  -1.6318449   0.        ]
  [ 0.         -1.0978109   2.3492126   0.9505234   0.        ]
  [ 0.         -1.2878962   1.2512544   0.6651746   0.        ]
  [ 0.          0.          0.          0.          0.        ]]

 [[ 0.          0.          0.          0.          0.        ]
  [ 0.         -0.84229577  1.2163715  -0.45278594  0.        ]
  [ 0.         -0.67890775  0.41621312  0.32558224  0.        ]
  [ 0.         -1.2423313  -2.8177445   0.3392142   0.        ]
  [ 0.          0.          0.          0.          0.        ]]

 [[ 0.          0.          0.          0.          0.        ]
  [ 0.         -1.5042193   0.682055   -1.76488     0.        ]
  [ 0.         -2.5587578   0.65947354  1.6253126   0.        ]
  [ 0.         -0.769456    1.980295   -0.73028654  0.        ]
  [ 0.          0.          0.          0.          0.        ]]]


The zeros are added around the border, so the height and width each grow by 2

In [37]:
print(images.shape)
print(y.shape)

(1, 3, 3, 3)
(1, 5, 5, 3)
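
The same zero padding can be reproduced in NumPy with `np.pad`, padding only the height and width axes. This is a sketch on a small made-up array, not the random tensor shown above.

```python
import numpy as np

# (N, n_H, n_W, n_C) array filled with 1..27 so the padded zeros are easy to spot
images = np.arange(1, 28, dtype=np.float32).reshape(1, 3, 3, 3)
p = 1

# Pad only the two spatial axes; the batch and channel axes get (0, 0)
padded = np.pad(images, pad_width=((0, 0), (p, p), (p, p), (0, 0)),
                mode='constant', constant_values=0)

print(images.shape, padded.shape)
print(np.transpose(padded.squeeze(), (2, 0, 1))[0])  # first channel, with its zero border
```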


## 4.2 Zero Padding with Conv2D Layers

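Zero padding can also be applied through Conv2D itself, without a separate ZeroPadding2D layer
- The padding argument is usually set to 'same' or 'valid'
- padding='valid': no padding is applied
- padding='same': the padding needed at the edges and corners is computed automatically so that <b>the output keeps the same height and width as the input</b> (with strides=1) ★
  - With kernel_size=3, padding by (3 - 1) / 2 = 1 keeps the input and output height and width unchanged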

In [38]:
import tensorflow as tf

from tensorflow.keras.layers import Conv2D

images = tf.random.normal(shape=(1,28,28,3))
conv = Conv2D(filters=1, kernel_size=3, padding='same')
y = conv(images)
print(y.shape)

(1, 28, 28, 1)
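
The amount of padding that 'same' adds can be worked out by hand. The sketch below assumes TensorFlow's documented 'SAME' rule as I understand it: the output size is ceil(n / s), and when the total padding is odd the extra element goes at the end (bottom/right). The helper name `same_padding_1d` is hypothetical.

```python
import math


def same_padding_1d(n: int, f: int, s: int = 1):
    """Return (output size, pad before, pad after) along one spatial axis."""
    out = math.ceil(n / s)
    pad_total = max((out - 1) * s + f - n, 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return out, pad_before, pad_after


print(same_padding_1d(28, f=3, s=1))  # (28, 1, 1)
print(same_padding_1d(28, f=5, s=1))  # (28, 2, 2)
print(same_padding_1d(28, f=3, s=2))  # (14, 0, 1)
```

With stride 1 this reduces to (f - 1) / 2 on each side for an odd kernel size. With a larger stride the output is no longer the same size as the input.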


# 5. Strides

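★ Conv2D vs Pooling: with the same stride (and the same window size and padding)
- In common: the output height and width come out the same
- Difference: the number of channels
  - Conv2D: the number of output channels is set by the number of filters
  - MaxPooling: pooling is applied to each channel separately, so the number of channels is unchanged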

## 5.1 Strides in Conv2D Layers

In [39]:
import tensorflow as tf

from tensorflow.keras.layers import Conv2D

images = tf.random.normal(shape=(1,28,28,3))
conv = Conv2D(filters=1, kernel_size=3, padding='valid', strides=2)  # padding='valid': no padding
y = conv(images)

print(images.shape)
print(y.shape)  # the output-size formula from earlier applies as-is

(1, 28, 28, 3)
(1, 13, 13, 1)


## 5.2 Strides in Pooling Layers

In [41]:
import tensorflow as tf

from tensorflow.keras.layers import MaxPooling2D

images = tf.random.normal(shape=(1,28,28,3))
pool = MaxPooling2D(pool_size=3, strides=2)  # pool_size: set to the same value as kernel_size above
y = pool(images)

print(images.shape)
print(y.shape)

(1, 28, 28, 3)
(1, 13, 13, 3)
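
The channel difference between the two layers can be reproduced with a naive NumPy loop. This is a sketch, not how Keras computes it: the random kernel stands in for learned weights, the bias is omitted, and a single image without a batch axis is used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_H, n_W, n_C = 28, 28, 3
f, s, n_filters = 3, 2, 1

image = rng.normal(size=(n_H, n_W, n_C))
kernels = rng.normal(size=(f, f, n_C, n_filters))  # stand-in for learned Conv2D weights

out_h = (n_H - f) // s + 1
out_w = (n_W - f) // s + 1
conv_out = np.zeros((out_h, out_w, n_filters))
pool_out = np.zeros((out_h, out_w, n_C))

for i in range(out_h):
    for j in range(out_w):
        window = image[i * s:i * s + f, j * s:j * s + f, :]  # shape (f, f, n_C)
        # Convolution: weighted sum over the window AND all input channels, once per filter
        conv_out[i, j, :] = np.tensordot(window, kernels, axes=([0, 1, 2], [0, 1, 2]))
        # Pooling: max inside the window, taken separately for each channel
        pool_out[i, j, :] = window.max(axis=(0, 1))

print(conv_out.shape)  # (13, 13, 1)
print(pool_out.shape)  # (13, 13, 3)
```

Both outputs share the same height and width because the window size and stride match; only the last axis differs.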
