# Cats vs Dogs Classification

* A classification model built on a CNN (Convolutional Neural Network)
* Data preprocessing with tensorflow-datasets

## < Contents >
### 1. Import modules
### 2. Load the dataset
### 3. Preprocess the data
### 4. Define the model (Sequential)
### 5. Compile the model
### 6. Create a model checkpoint
### 7. Train the model (fit)
### 8. model.load_weights( ): load the checkpointed weights


## 1. Import modules

In [None]:
import tensorflow_datasets as tfds
import tensorflow as tf

from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.applications import VGG16

## 2. Load the dataset
The data is loaded with **tensorflow-datasets**.

In [None]:
dataset_name = 'cats_vs_dogs'

# Use only the first 80% of the data for training
train_dataset = tfds.load(name=dataset_name, split='train[:80%]')

# Use the last 20% of the data for validation
valid_dataset = tfds.load(name=dataset_name, split='train[80%:]')

Downloading and preparing dataset cats_vs_dogs/4.0.0 (download: 786.68 MiB, generated: Unknown size, total: 786.68 MiB) to /root/tensorflow_datasets/cats_vs_dogs/4.0.0...

Shuffling and writing examples to /root/tensorflow_datasets/cats_vs_dogs/4.0.0.incompleteN1RY3B/cats_vs_dogs-train.tfrecord

Dataset cats_vs_dogs downloaded and prepared to /root/tensorflow_datasets/cats_vs_dogs/4.0.0. Subsequent calls will reuse this data.


## 3. Preprocess the data

1. Normalize the images (Normalization)
2. Resize the images to (224 x 224)
3. Split each example into image (x) and label (y)

**[Practice code]**

In [None]:
def preprocess(data):
    # Pull the image (x) and the label (y) out of the example dict.
    x = data['image']
    y = data['label']
    # Normalize pixel values from [0, 255] to [0, 1].
    x = x / 255
    # Resize the image to (224, 224).
    x = tf.image.resize(x, size=(224, 224))
    # Return the (x, y) pair.
    return x, y

**Map** the preprocessing function (`preprocess`) over each dataset and **set the batch size**.

In [None]:
batch_size=32

In [None]:
train_data = train_dataset.map(preprocess).batch(batch_size)
valid_data = valid_dataset.map(preprocess).batch(batch_size)
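
As a rough check on what `.batch(batch_size)` produces, the sketch below estimates how many batches each split yields per epoch. It assumes the 23,262 examples reported while the dataset was prepared, and it assumes the percent boundary in `train[:80%]` is rounded to the nearest example. Both are assumptions for illustration, and the helper name `batches_per_epoch` is hypothetical.

```python
import math

TOTAL_EXAMPLES = 23262  # assumption: count reported during dataset preparation
BATCH_SIZE = 32         # same value as batch_size above


def batches_per_epoch(num_examples, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_examples / batch_size)


# Assumption: the 80% boundary is rounded to the nearest example.
boundary = round(TOTAL_EXAMPLES * 0.8)
num_train = boundary
num_valid = TOTAL_EXAMPLES - boundary

print(num_train, batches_per_epoch(num_train, BATCH_SIZE))
print(num_valid, batches_per_epoch(num_valid, BATCH_SIZE))
```

Under these assumptions the final batch of each split is smaller than 32, because `.batch()` keeps the remainder unless told to drop it.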

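## 4. Define the model (Sequential)

Build a transfer-learning model on top of a VGG16 base pretrained on ImageNet. The base is frozen, so only the new dense layers are trained.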

In [None]:
transfer_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
transfer_model.trainable=False

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


In [None]:
model = Sequential([
    transfer_model,
    Flatten(),
    Dropout(0.5),
    Dense(512, activation='relu'),
    Dense(128, activation='relu'),
    Dense(2, activation='softmax'),
])

In [None]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
vgg16 (Functional)           (None, 7, 7, 512)         14714688  
_________________________________________________________________
flatten_1 (Flatten)          (None, 25088)             0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 25088)             0         
_________________________________________________________________
dense_3 (Dense)              (None, 512)               12845568  
_________________________________________________________________
dense_4 (Dense)              (None, 128)               65664     
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 258       
Total params: 27,626,178
Trainable params: 12,911,490
Non-trainable params: 14,714,688
_________________________________
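
The parameter counts in the summary can be reproduced by hand. The sketch below is plain arithmetic, not a Keras call: it derives the 7 x 7 feature map from the five pooling stages in VGG16 and then counts the weights and biases of each dense layer. The helper name `dense_params` is hypothetical.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


# Five 2x2 max-pooling stages in VGG16 each halve the spatial size.
side = 224 // 2 ** 5            # 7
flat = side * side * 512        # 25088 features after Flatten

head = [dense_params(flat, 512), dense_params(512, 128), dense_params(128, 2)]
trainable = sum(head)
frozen = 14714688               # VGG16 base, taken from the summary above

print(head)                     # [12845568, 65664, 258]
print(trainable, trainable + frozen)
```

Almost all trainable parameters sit in the first dense layer, because it connects every one of the 25,088 flattened features to 512 units.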

## 5. Compile (compile)

In [None]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])
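
The labels here are integers (0 or 1) and the last layer is a two-unit softmax, which is why the sparse variant of categorical cross-entropy fits. As a minimal sketch of what that loss computes for a single example, ignoring the numerical clipping and batch averaging that Keras applies, the function below takes the negative log of the probability assigned to the true class. The probability values are hypothetical.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one example: negative log of the true class probability."""
    return -math.log(probs[label])


# Hypothetical softmax outputs for a two-class model.
confident = [0.9, 0.1]
unsure = [0.5, 0.5]

print(sparse_categorical_crossentropy(0, confident))  # small loss
print(sparse_categorical_crossentropy(1, confident))  # large loss
print(sparse_categorical_crossentropy(0, unsure))     # log(2)
```

A confident correct prediction gives a loss near zero, while a confident wrong one is penalized heavily.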

## 6. ModelCheckpoint: create a checkpoint

In [None]:
checkpoint_path = "my_checkpoint.ckpt"
checkpoint = ModelCheckpoint(filepath=checkpoint_path, 
                             save_weights_only=True, 
                             save_best_only=True, 
                             monitor='val_loss', 
                             verbose=1)

## 7. Train (fit)

In [None]:
model.fit(train_data,
          validation_data=(valid_data),
          epochs=20,
          callbacks=[checkpoint],
          )

Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.21050, saving model to my_checkpoint.ckpt
Epoch 2/20

Epoch 00002: val_loss improved from 0.21050 to 0.16380, saving model to my_checkpoint.ckpt
Epoch 3/20

Epoch 00003: val_loss did not improve from 0.16380
Epoch 4/20

Epoch 00004: val_loss did not improve from 0.16380
Epoch 5/20

Epoch 00005: val_loss improved from 0.16380 to 0.15976, saving model to my_checkpoint.ckpt
Epoch 6/20

Epoch 00006: val_loss did not improve from 0.15976
Epoch 7/20

Epoch 00007: val_loss did not improve from 0.15976
Epoch 8/20

Epoch 00008: val_loss did not improve from 0.15976
Epoch 9/20

Epoch 00009: val_loss did not improve from 0.15976
Epoch 10/20

Epoch 00010: val_loss did not improve from 0.15976
Epoch 11/20

Epoch 00011: val_loss did not improve from 0.15976
Epoch 12/20

Epoch 00012: val_loss did not improve from 0.15976
Epoch 13/20

Epoch 00013: val_loss did not improve from 0.15976
Epoch 14/20

Epoch 00014: val_loss did not improve from 0.159

<tensorflow.python.keras.callbacks.History at 0x7fa735023290>
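
The log above shows the effect of `save_best_only=True` with `monitor='val_loss'`: weights are written only when the validation loss drops below the best value seen so far. The sketch below mimics that rule in plain Python. It is not the Keras implementation, and the function name `epochs_saved` is hypothetical. The losses for epochs 1, 2 and 5 come from the log, while the remaining values are placeholders, since the log does not report losses for epochs that did not improve.

```python
def epochs_saved(val_losses):
    """Return the 1-based epochs at which a save-best-only rule would save."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved


# Epochs 1, 2 and 5 use the values from the log; the others are
# hypothetical placeholders, since the log does not report them.
val_losses = [0.21050, 0.16380, 0.17, 0.18, 0.15976, 0.16]

print(epochs_saved(val_losses))
```

As far as the log shows, the checkpoint file therefore holds the weights from epoch 5, not those from the last epoch that ran.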

## 8. Load weights after training (ModelCheckpoint)

In [None]:
# Pass the path the checkpoint was saved to.
model.load_weights(checkpoint_path)

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7fa7203df990>