
# Ensemble

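- Using the same model architecture, deliberately overfitting each copy in a different direction and then averaging their predictions is also a reasonable strategy.

- Combine generalizing models and overfitting models in a suitable balance.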
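
The averaging idea in the bullets above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions: the sine signal, the noise level, and the names `truth` and `member_preds` are hypothetical and are not part of this notebook's CIFAR-10 experiment. Each "model" is simulated as the true signal plus its own independent noise, standing in for a model that has overfit in its own way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one true signal and five "models" that each overfit
# to their own independent noise around that signal.
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 200))
member_preds = np.stack(
    [truth + rng.normal(0.0, 0.5, truth.shape) for _ in range(5)]
)  # shape (5, 200)

# Mean squared error of each individual member.
member_mse = ((member_preds - truth) ** 2).mean(axis=1)

# Average across members (axis=0), then score the averaged prediction.
ensemble_pred = member_preds.mean(axis=0)
ensemble_mse = ((ensemble_pred - truth) ** 2).mean()

print("member MSEs :", np.round(member_mse, 3))
print("ensemble MSE:", round(float(ensemble_mse), 3))
```

For squared error, the error of the averaged prediction is never larger than the average of the members' errors, which is why averaging differently overfit models can help.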

In [25]:
import tensorflow as tf

In [26]:
from tensorflow.keras.applications.resnet50 import ResNet50

In [27]:
from tensorflow.keras import datasets, layers, models

from tensorflow.keras.layers import Dense, Flatten, MaxPooling2D
from tensorflow.keras import Input
from tensorflow.keras.layers import Dropout, BatchNormalization

import matplotlib.pyplot as plt

In [28]:
# Download the Keras CIFAR-10 dataset and assign each split to its own variable.
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

In [29]:
train_images.shape, train_labels.shape, test_images.shape, test_labels.shape

((50000, 32, 32, 3), (50000, 1), (10000, 32, 32, 3), (10000, 1))

- sparse categorical crossentropy : Use Sparse Categorical Cross Entropy when the labels are integers. Datasets are often shipped with integer labels, and this is the loss function to use in that case.

- categorical crossentropy : Use Categorical Cross Entropy when the labels are one-hot encoded. This is most likely why `to_categorical` is used so often.
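
As a rough illustration of the two bullets above, the sketch below computes both losses by hand with NumPy on made-up probabilities. The arrays `probs` and `int_labels` are hypothetical; the point is only that the two forms give the same per-sample loss once the labels are encoded the way each one expects.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 4 classes (rows sum to 1).
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])
int_labels = np.array([0, 1, 3])   # integer labels
one_hot = np.eye(4)[int_labels]    # the same labels, one-hot encoded

# Sparse form: pick the probability of the true class by index.
sparse_ce = -np.log(probs[np.arange(len(int_labels)), int_labels])

# Categorical form: sum over classes, weighted by the one-hot vector.
categorical_ce = -(one_hot * np.log(probs)).sum(axis=1)

print(sparse_ce)
print(categorical_ce)
```

The choice between the two is therefore about label format, not about the quantity being minimized.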

In [30]:
# sparse categorical crossentropy VS categorical crossentropy + one_hot
y_train = tf.keras.utils.to_categorical(train_labels, 10)
y_test = tf.keras.utils.to_categorical(test_labels, 10)
y_train.shape, y_test.shape

((50000, 10), (10000, 10))

In [31]:
# Class names for the ten CIFAR-10 labels
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

### build cnn_model

In [32]:
def cnn_model(n_hidden_node, dropout_prob):
    # Pretrained ResNet50 backbone without its classification head; weights stay frozen.
    base_model = ResNet50(include_top=False, input_shape=(32, 32, 3), weights='imagenet')
    base_model.trainable = False

    inputs = tf.keras.Input(shape=(32, 32, 3))

    # training=False keeps the BatchNormalization layers in inference mode.
    x = base_model(inputs, training=False)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(n_hidden_node, activation='relu')(x)
    x = tf.keras.layers.Dropout(dropout_prob)(x)
    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)

    model = tf.keras.Model(inputs, outputs)

    # categorical_crossentropy expects one-hot labels (y_train / y_test above).
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

### train five cnn models

Build five models that differ only in their dropout rate.

In [9]:
cnn_v1 = cnn_model(1024, 0.5)
cnn_v2 = cnn_model(1024, 0.6)
cnn_v3 = cnn_model(1024, 0.7)
cnn_v4 = cnn_model(1024, 0.8)
cnn_v5 = cnn_model(1024, 0.9)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5


Train each of the five models separately.

In [10]:
cnn_v1.fit(train_images, y_train, epochs = 1, validation_data=(test_images, y_test), batch_size=128)



<keras.src.callbacks.History at 0x7c3320222050>

In [11]:
cnn_v2.fit(train_images, y_train, epochs = 1, validation_data=(test_images, y_test), batch_size=128)



<keras.src.callbacks.History at 0x7c330f247cd0>

In [12]:
cnn_v3.fit(train_images, y_train, epochs = 1, validation_data=(test_images, y_test), batch_size=128)



<keras.src.callbacks.History at 0x7c332177b490>

In [13]:
cnn_v4.fit(train_images, y_train, epochs = 1, validation_data=(test_images, y_test), batch_size=128)



<keras.src.callbacks.History at 0x7c3310eeb8b0>

In [14]:
cnn_v5.fit(train_images, y_train, epochs = 1, validation_data=(test_images, y_test), batch_size=128)



<keras.src.callbacks.History at 0x7c330f9e78e0>

### evaluate five cnn models

In [15]:
pred_v1 = cnn_v1.predict(test_images)
pred_v2 = cnn_v2.predict(test_images)
pred_v3 = cnn_v3.predict(test_images)
pred_v4 = cnn_v4.predict(test_images)
pred_v5 = cnn_v5.predict(test_images)



In [16]:
import numpy as np
from sklearn.metrics import accuracy_score

y_test = np.argmax(y_test, axis=1)   # Index of the max in each row: one-hot -> integer class labels (this overwrites the one-hot y_test).

score_v1 = accuracy_score(np.argmax(pred_v1, axis=1), y_test)
score_v2 = accuracy_score(np.argmax(pred_v2, axis=1), y_test)
score_v3 = accuracy_score(np.argmax(pred_v3, axis=1), y_test)
score_v4 = accuracy_score(np.argmax(pred_v4, axis=1), y_test)
score_v5 = accuracy_score(np.argmax(pred_v5, axis=1), y_test)

print(f'score_v1 : {score_v1}')
print(f'score_v2 : {score_v2}')
print(f'score_v3 : {score_v3}')
print(f'score_v4 : {score_v4}')
print(f'score_v5 : {score_v5}')

score_v1 : 0.5967
score_v2 : 0.595
score_v3 : 0.5794
score_v4 : 0.564
score_v5 : 0.5086


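### ensemble five cnn models

axis = 0 is important: when ensembling, set axis to 0 so that the mean is taken across models rather than across samples or classes.

Ensemble by averaging the predicted probabilities of the five models.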
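
To see why `axis=0` matters, here is a minimal sketch with made-up predictions. The arrays `pred_a`, `pred_b`, and `pred_c` are hypothetical stand-ins for `model.predict` outputs (2 samples, 4 classes), not values from this notebook.

```python
import numpy as np

# Hypothetical predictions from 3 models for 2 samples over 4 classes.
pred_a = np.array([[0.6, 0.2, 0.1, 0.1], [0.1, 0.3, 0.5, 0.1]])
pred_b = np.array([[0.4, 0.4, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]])
pred_c = np.array([[0.5, 0.1, 0.3, 0.1], [0.1, 0.6, 0.2, 0.1]])

stacked = np.array([pred_a, pred_b, pred_c])   # shape (models, samples, classes) = (3, 2, 4)

# axis=0 averages across models and keeps one probability row per sample.
avg_over_models = stacked.mean(axis=0)         # shape (2, 4)

# axis=1 would instead average across samples within each model: not an ensemble.
avg_over_samples = stacked.mean(axis=1)        # shape (3, 4)

# Final ensemble decision: most probable class per sample.
ensemble_classes = avg_over_models.argmax(axis=1)
print(avg_over_models.shape, avg_over_samples.shape, ensemble_classes)
```

In this toy case `pred_a` alone would pick class 2 for the second sample, while the averaged probabilities pick class 1.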

In [17]:
pred_ensemble = np.mean([pred_v1, pred_v2, pred_v3, pred_v4, pred_v5], axis=0)

In [18]:
pred_ensemble = np.argmax(pred_ensemble, axis=1)
accuracy_score(pred_ensemble, y_test)

0.6009

In [19]:
pred_ensemble

array([3, 1, 8, ..., 5, 2, 7])

### (practice) ensemble + 2 cnn models
Train two additional models that meet the conditions below and add them to the five-model ensemble above.
- n_hidden_node : 512, dropout_prob : 0.5
- n_hidden_node : 256, dropout_prob : 0.5

In [33]:
cnn_v6 = cnn_model(512, 0.5)
cnn_v7 = cnn_model(256,0.5)

In [34]:
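# Validation needs one-hot labels; rebuild them from test_labels because y_test was converted to integer labels earlier.
cnn_v6.fit(train_images, y_train, epochs = 1, validation_data=(test_images, tf.keras.utils.to_categorical(test_labels, 10)), batch_size=128)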



<keras.src.callbacks.History at 0x7c32fd2063b0>

In [35]:
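# Validation needs one-hot labels; rebuild them from test_labels because y_test was converted to integer labels earlier.
cnn_v7.fit(train_images, y_train, epochs = 1, validation_data=(test_images, tf.keras.utils.to_categorical(test_labels, 10)), batch_size=128)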



<keras.src.callbacks.History at 0x7c32fba27040>

In [36]:
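# Predict with the two new models (cnn_v6 and cnn_v7), not cnn_v1.
# Note: the outputs recorded in the rest of this section came from a run that used
# cnn_v1 for both predictions, so the two arrays and the scores match cnn_v1's.
pred_v6 = cnn_v6.predict(test_images)
pred_v7 = cnn_v7.predict(test_images)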

print(pred_v6)
print(pred_v7)

[[5.85275283e-03 4.65413556e-03 1.52443657e-02 ... 3.47337546e-03
  5.24012093e-03 1.09879870e-03]
 [4.79215346e-02 3.10375512e-01 6.21480722e-05 ... 4.47705825e-05
  5.21570504e-01 1.19503267e-01]
 [1.60968415e-02 1.09989159e-02 4.08497763e-05 ... 3.21558822e-04
  9.03887212e-01 6.71416298e-02]
 ...
 [5.07463061e-04 1.00349856e-03 7.22040385e-02 ... 5.98934256e-02
  1.11424619e-04 4.06018255e-04]
 [1.05366066e-01 1.47327110e-01 1.67274207e-01 ... 1.61469892e-01
  3.05379126e-02 4.86348756e-02]
 [4.74278932e-05 2.98753775e-06 2.96944083e-04 ... 9.64357376e-01
  5.02814828e-06 6.23182086e-06]]
[[5.85275283e-03 4.65413556e-03 1.52443657e-02 ... 3.47337546e-03
  5.24012093e-03 1.09879870e-03]
 [4.79215346e-02 3.10375512e-01 6.21480722e-05 ... 4.47705825e-05
  5.21570504e-01 1.19503267e-01]
 [1.60968415e-02 1.09989159e-02 4.08497763e-05 ... 3.21558822e-04
  9.03887212e-01 6.71416298e-02]
 ...
 [5.07463061e-04 1.00349856e-03 7.22040385e-02 ... 5.98934256e-02
  1.11424619e-04 4.06018255e-04]

In [38]:
y_test = test_labels.ravel()   # Integer class labels, shape (10000,); works whether or not y_test is still one-hot.

score_v6 = accuracy_score(np.argmax(pred_v6, axis=1), y_test)
score_v7 = accuracy_score(np.argmax(pred_v7, axis=1), y_test)

print(f'score_v6 : {score_v6}')
print(f'score_v7 : {score_v7}')

score_v6 : 0.5967
score_v7 : 0.5967


- Ensemble the original 5 models + the 2 new models

In [42]:
ensemble = np.mean([pred_v1, pred_v2, pred_v3, pred_v4, pred_v5,pred_v6,pred_v7], axis=0)

In [43]:
ensemble = np.argmax(ensemble, axis=1)
accuracy_score(ensemble, y_test)

0.6043
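
The recorded `score_v6` and `score_v7` both equal `score_v1` (0.5967), which suggests that the three prediction arrays were identical in that run. Under that assumption, a plain mean over seven arrays behaves like a weighted mean that counts the first model three times. The sketch below checks this identity on random data; `random_probs` is a hypothetical helper standing in for `model.predict`, and the arrays are not the notebook's predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_probs(n_samples, n_classes):
    """Hypothetical stand-in for model.predict output: each row sums to 1."""
    raw = rng.random((n_samples, n_classes))
    return raw / raw.sum(axis=1, keepdims=True)

p1, p2, p3, p4, p5 = (random_probs(6, 10) for _ in range(5))

# Plain mean over seven arrays in which p1 appears three times.
plain_mean = np.mean([p1, p2, p3, p4, p5, p1, p1], axis=0)

# Equivalent weighted mean over the five distinct arrays.
weights = np.array([3, 1, 1, 1, 1]) / 7
weighted_mean = np.average([p1, p2, p3, p4, p5], axis=0, weights=weights)

print(np.allclose(plain_mean, weighted_mean))
```

With distinct predictions from `cnn_v6` and `cnn_v7`, each of the seven models would contribute equally to the average.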