A tool for checking where in the input a particular pattern (feature) appears

        Convolution

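Filters
1. A filter set is a collection of **weights arranged in three dimensions** (height, width, input channels).
2. Each filter set looks at the **entire stack of feature maps** produced by the previous layer.
3. The layer produces **as many feature maps as there are filter sets**.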
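To make the idea of "checking where a pattern appears" concrete, here is a minimal pure-Python sketch of a single-channel convolution with no padding and stride 1 (computed as cross-correlation, without flipping the filter, which is how deep learning libraries usually do it). The function name `conv2d_valid`, the toy image, and the toy filter are illustrative assumptions and are not part of the notebook.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            # Weighted sum of the image patch under the filter.
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image with a bright vertical bar in column 2.
image = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
# Hypothetical 2x2 filter that responds to "dark on the left, bright on the right".
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The largest values in the feature map sit where the left edge of the bar is, which is the sense in which a feature map records where a pattern occurs. A real Conv2D filter set additionally sums over all input channels and adds a bias.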

In [19]:
# Import libraries
import tensorflow as tf
import pandas as pd

In [20]:
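# Prepare the data
(inde, de), _ = tf.keras.datasets.mnist.load_data()
# Conv2D expects each sample to be 3-D (height, width, channels), so the input is reshaped.
# An easy way to think about it: a color image is 3-D, and TensorFlow uses that layout for every image,
# so a grayscale image is also turned into a 3-D array with a single channel.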
inde = inde.reshape(60000, 28, 28, 1)
de = pd.get_dummies(de)
print(inde.shape, de.shape)

(60000, 28, 28, 1) (60000, 10)
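The label shape (60000, 10) comes from one-hot encoding: each digit label becomes a row of ten 0/1 values. The sketch below shows the same transformation in plain Python. The helper `one_hot` is a hypothetical name used only for illustration; the notebook itself relies on `pd.get_dummies`, which builds one column per distinct value found in the data.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a 1 in the column matching the label."""
    return [[1 if k == label else 0 for k in range(num_classes)] for label in labels]


# The first five MNIST training labels, as shown in the answer table later in the notebook.
labels = [5, 0, 4, 1, 9]
encoded = one_hot(labels, 10)
for row in encoded:
    print(row)
```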


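What you need to decide for a convolution layer
1. How many filter sets to use
- In the code below, 3 and 6 are the numbers of filter sets.
2. How large each filter set should be
- In the code below, kernel_size=5 sets the filter size to 5*5.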

In [21]:
# Build the model
X = tf.keras.layers.Input(shape=[28, 28, 1]) # Each input sample must be 3-D.
H = tf.keras.layers.Conv2D(3, kernel_size=5, activation='swish')(X) # 3 feature maps (a 3-channel feature map)
H = tf.keras.layers.Conv2D(6, kernel_size=5, activation='swish')(H) # 6 feature maps (a 6-channel feature map)
H = tf.keras.layers.Flatten()(H) # Unroll the pixels into a single row (one table row per image).
H = tf.keras.layers.Dense(84, activation='swish')(H)
Y = tf.keras.layers.Dense(10, activation='softmax')(H)
model = tf.keras.models.Model(X, Y)
# Passing metrics as a list works across Keras versions.
model.compile(loss='categorical_crossentropy', metrics=['accuracy'])

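What the computer learns is the convolution filters.

With the model above, training searches for the 6 feature maps that are most useful for deciding which digit an image shows.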

In [22]:
# Train the model
model.fit(inde, de, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f98242b6518>

In [23]:
# Use the model
pred = model.predict(inde[0:5])
pd.DataFrame(pred).round(2)

     0    1    2    3    4    5    6    7    8    9
0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0
1  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0
3  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
4  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0


In [24]:
# Check the correct answers
de[0:5]

   0  1  2  3  4  5  6  7  8  9
0  0  0  0  0  0  1  0  0  0  0
1  1  0  0  0  0  0  0  0  0  0
2  0  0  0  0  1  0  0  0  0  0
3  0  1  0  0  0  0  0  0  0  0
4  0  0  0  0  0  0  0  0  0  1
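Comparing the two tables by eye works for five rows, but the same check can be done by taking the position of the largest value in each row. The sketch below copies the rounded predictions and the one-hot answers from the two outputs above. The helper `argmax` is a hypothetical name for illustration, not something the notebook defines.

```python
def argmax(row):
    """Return the index of the largest value in `row`."""
    return max(range(len(row)), key=row.__getitem__)


# Rounded predictions for the first five images, copied from the output above.
pred_rows = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
# One-hot answers for the same five images, copied from the output above.
true_rows = [
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]

pred_digits = [argmax(row) for row in pred_rows]
true_digits = [argmax(row) for row in true_rows]
print(pred_digits, true_digits, pred_digits == true_digits)
```

Note that these five images are part of the training data, so matching them says little about how the model handles unseen digits.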


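Looking at the shape of the model:

The first convolution layer uses a 5*5 filter,

so each side of the output feature map shrinks by (5-1) = 4, from 28 to 24.

The number of output channels equals the number of filter sets, which is 3 (the single input channel is absorbed into the depth of each filter set).

The output therefore has shape (24, 24, 3). By the same rule the second layer outputs (20, 20, 6), which Flatten unrolls into 20*20*6 = 2400 values.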
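The shape arithmetic can be written as a small function. This is a minimal sketch that assumes no padding and a stride of 1, which are the Conv2D defaults used in this notebook. The name `conv_output_shape` is hypothetical.

```python
def conv_output_shape(input_shape, num_filters, kernel_size):
    """Output shape of a convolution with no padding and stride 1."""
    height, width, _in_channels = input_shape
    shrink = kernel_size - 1
    # The output channel count is the number of filter sets, whatever the input depth is.
    return (height - shrink, width - shrink, num_filters)


input_shape = (28, 28, 1)
conv1_shape = conv_output_shape(input_shape, num_filters=3, kernel_size=5)
conv2_shape = conv_output_shape(conv1_shape, num_filters=6, kernel_size=5)
flat_size = conv2_shape[0] * conv2_shape[1] * conv2_shape[2]
print(conv1_shape, conv2_shape, flat_size)
```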


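The filters are the weights, so the number of parameters is (filter height * filter width * input channels * number of filter sets) plus one bias per filter set.

First convolution layer: 5*5*1*3 + 3 = 78

Second convolution layer: 5*5*3*6 + 6 = 456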
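The same counting rule extends to the Dense layers (inputs * outputs, plus one bias per output), which makes it possible to check every layer's parameter count by hand. The helper names `conv_params` and `dense_params` below are hypothetical and used only for this sketch.

```python
def conv_params(kernel_size, in_channels, num_filters):
    """Weights of all filter sets plus one bias per filter set."""
    return kernel_size * kernel_size * in_channels * num_filters + num_filters


def dense_params(in_units, out_units):
    """One weight per input-output pair plus one bias per output."""
    return in_units * out_units + out_units


conv1 = conv_params(kernel_size=5, in_channels=1, num_filters=3)
conv2 = conv_params(kernel_size=5, in_channels=3, num_filters=6)
dense1 = dense_params(20 * 20 * 6, 84)  # Flatten feeds 2400 values into Dense(84)
dense2 = dense_params(84, 10)
total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)
```

Almost all of the parameters sit in the first Dense layer, not in the convolution layers.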

In [25]:
# Inspect the model
model.summary()

Model: "model_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 24, 24, 3)         78        
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 20, 20, 6)         456       
_________________________________________________________________
flatten_3 (Flatten)          (None, 2400)              0         
_________________________________________________________________
dense_6 (Dense)              (None, 84)                201684    
_________________________________________________________________
dense_7 (Dense)              (None, 10)                850       
Total params: 203,068
Trainable params: 203,068
Non-trainable params: 0
_____________________________________________________

Reference explaining how the convolution layer computation works

https://excelsior-cjh.tistory.com/180