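# ANN 04: Building ANN Models in Depth

This chapter takes a closer look at the choices involved in building an ANN model.

📌 Goals:
- Compare activation functions (ReLU, Sigmoid, Tanh)
- Compare optimizers (Adam, SGD, RMSprop)
- Prevent overfitting with Dropout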


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
import platform
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Font setup so that Korean text in plots renders correctly
if platform.system() == 'Linux':
    plt.rcParams['font.family'] = 'NanumGothic'
elif platform.system() == 'Darwin':
    plt.rcParams['font.family'] = 'AppleGothic'
else:
    plt.rcParams['font.family'] = 'Malgun Gothic'

plt.rcParams['axes.unicode_minus'] = False

# Load Fashion-MNIST and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

print("✅ Data ready:", x_train.shape, y_train.shape)


✅ Data ready: (60000, 28, 28) (60000,)


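## 🔑 Comparing Activation Functions
- ReLU: the most common default choice for hidden layers
- Sigmoid: output in the range 0 to 1, but prone to vanishing gradients in deep networks
- Tanh: output in the range -1 to 1; zero-centered, which often makes it easier to train than Sigmoid, although it still saturates for large inputs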
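
One common way to see why Sigmoid is prone to vanishing gradients is to look at the largest slope each activation can have. The NumPy sketch below is illustrative and not part of the original notebook; the helper names (`d_sigmoid`, `d_tanh`, `d_relu`) are made up for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Derivatives of each activation with respect to its input x
def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2

def d_relu(x):
    return (x > 0).astype(float)

# Evaluate the derivatives on a grid and record the largest value of each
x = np.linspace(-6.0, 6.0, 1201)
max_grads = {
    "sigmoid": d_sigmoid(x).max(),
    "tanh": d_tanh(x).max(),
    "relu": d_relu(x).max(),
}

for name, g in max_grads.items():
    print(f"{name:8s} max gradient: {g:.2f}")
```

The Sigmoid derivative never exceeds 0.25, so a gradient passed back through several Sigmoid layers is multiplied by a factor of at most 0.25 at each one. Tanh and ReLU can pass a slope of up to 1, which is part of the reason they tend to train more easily in deeper networks.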


In [2]:
model_sigmoid = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(128, activation="sigmoid"),
    layers.Dense(10, activation="softmax")
])

model_sigmoid.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

history_sigmoid = model_sigmoid.fit(x_train, y_train,
                                    epochs=5,
                                    validation_split=0.1,
                                    batch_size=32,
                                    verbose=0)

acc_sigmoid = model_sigmoid.evaluate(x_test, y_test, verbose=0)[1]
print(f"Sigmoid model accuracy: {acc_sigmoid:.4f}")


Sigmoid model accuracy: 0.8670


In [3]:
model_tanh = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(128, activation="tanh"),
    layers.Dense(10, activation="softmax")
])

model_tanh.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])

history_tanh = model_tanh.fit(x_train, y_train,
                              epochs=5,
                              validation_split=0.1,
                              batch_size=32,
                              verbose=0)

acc_tanh = model_tanh.evaluate(x_test, y_test, verbose=0)[1]
print(f"Tanh model accuracy: {acc_tanh:.4f}")


Tanh model accuracy: 0.8732


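## ⚡ Comparing Optimizers
- Adam: adaptive per-parameter learning rates; a common default
- SGD: stochastic gradient descent; simple, but often slower to converge without momentum or learning-rate tuning
- RMSprop: another adaptive method, frequently used with recurrent neural networks (RNNs)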
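
To make "adaptive learning rate" concrete, the sketch below writes out one update step of each optimizer in plain NumPy, using the textbook forms of the update rules. It is not the Keras implementation, which may differ in details such as where epsilon is applied. The function names and the `state` dictionaries are hypothetical, and the default hyperparameters are assumed to match the usual Keras defaults.

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    """Plain SGD: the step is proportional to the raw gradient."""
    return w - lr * grad

def rmsprop_step(w, grad, state, lr=0.001, rho=0.9, eps=1e-7):
    """RMSprop: divide the gradient by a running RMS of recent gradients."""
    state["v"] = rho * state["v"] + (1.0 - rho) * grad ** 2
    return w - lr * grad / (np.sqrt(state["v"]) + eps)

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Adam: bias-corrected running mean of gradients over their running RMS."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad ** 2
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Two parameters whose gradients differ by four orders of magnitude
grad = np.array([100.0, 0.01])
w0 = np.zeros(2)

w_sgd = sgd_step(w0, grad)
w_rms = rmsprop_step(w0, grad, {"v": np.zeros(2)})
w_adam = adam_step(w0, grad, {"t": 0, "m": np.zeros(2), "v": np.zeros(2)})

print("SGD    :", w_sgd)
print("RMSprop:", w_rms)
print("Adam   :", w_adam)
```

With SGD, the first step is 10,000 times larger for the first parameter than for the second. RMSprop and Adam normalize by the gradient magnitude, so both parameters move by nearly the same amount. This per-parameter scaling is what "adaptive" refers to.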


In [4]:
optimizers = ["adam", "sgd", "rmsprop"]
results = {}

for opt in optimizers:
    model = keras.Sequential([
        keras.Input(shape=(28, 28)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5,
              validation_split=0.1, batch_size=32, verbose=0)
    acc = model.evaluate(x_test, y_test, verbose=0)[1]
    results[opt] = acc

results


{'adam': 0.8711000084877014,
 'sgd': 0.8381999731063843,
 'rmsprop': 0.873199999332428}
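
The raw dictionary is easier to read once it is ranked. The short sketch below copies the accuracies printed above and sorts them. Because the notebook fixes no random seed, gaps of a few tenths of a percentage point between single runs are probably not meaningful.

```python
# Test accuracies copied from the run above
results = {
    "adam": 0.8711000084877014,
    "sgd": 0.8381999731063843,
    "rmsprop": 0.873199999332428,
}

# Sort optimizers from highest to lowest test accuracy
ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
best_acc = ranked[0][1]

for name, acc in ranked:
    print(f"{name:8s} {acc:.4f}  (gap to best: {best_acc - acc:.4f})")
```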

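## 🛡️ Applying Dropout
- Randomly switches off a fraction of neurons at each training step to reduce overfitting; it is inactive at inference time
- Rates between 0.2 and 0.5 are typical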
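
The mechanism can be sketched in a few lines of NumPy. This is a simplified illustration of "inverted" dropout, in which the surviving units are scaled up by `1 / (1 - rate)` so the expected activation stays the same. The `dropout` function here is a hypothetical stand-in, not the actual `layers.Dropout` code.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time the layer passes inputs through unchanged
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True for the units that are kept
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 128))

train_out = dropout(activations, rate=0.3, training=True, rng=rng)
test_out = dropout(activations, rate=0.3, training=False, rng=rng)

print("fraction zeroed in training:", np.mean(train_out == 0.0))
print("mean activation in training:", train_out.mean())
print("unchanged at inference:", np.array_equal(test_out, activations))
```

With `rate=0.3`, roughly 30% of the units are zeroed on each training pass, while the mean activation stays close to its original value because of the rescaling.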


In [5]:
model_dropout = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax")
])

model_dropout.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

history_dropout = model_dropout.fit(x_train, y_train,
                                    epochs=10,
                                    validation_split=0.1,
                                    batch_size=32,
                                    verbose=0)

acc_dropout = model_dropout.evaluate(x_test, y_test, verbose=0)[1]
print(f"Dropout model accuracy: {acc_dropout:.4f}")


Dropout model accuracy: 0.8741
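
Test accuracy alone does not show whether overfitting was reduced. A more direct check is the gap between training and validation accuracy recorded in `history_dropout.history`. The helper below is a hypothetical sketch that works on any Keras-style history dictionary with `accuracy` and `val_accuracy` keys. The numbers in `example_history` are made up for illustration and are not results from this notebook.

```python
def summarize_history(history_dict):
    """Summarize final train/validation accuracy and the gap between them."""
    train_acc = history_dict["accuracy"][-1]
    val_accs = history_dict["val_accuracy"]
    # Epoch (1-based) with the highest validation accuracy
    best_epoch = max(range(len(val_accs)), key=lambda i: val_accs[i]) + 1
    return {
        "train_acc": train_acc,
        "val_acc": val_accs[-1],
        "gap": train_acc - val_accs[-1],
        "best_val_epoch": best_epoch,
    }

# Hypothetical values for illustration only; in the notebook you would pass
# history_dropout.history instead.
example_history = {
    "accuracy": [0.80, 0.85, 0.88, 0.91],
    "val_accuracy": [0.82, 0.86, 0.87, 0.865],
}

summary = summarize_history(example_history)
print(summary)
```

A large positive gap, or a best validation epoch well before the final epoch, suggests overfitting. Running the same summary on a model with and without Dropout would show how much the regularization helps.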


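## ✅ Summary
- Activation functions: ReLU is the usual default. In these single runs with Adam, ReLU (0.8711), Tanh (0.8732), and Sigmoid (0.8670) all landed within about one percentage point on this shallow network, so the ranking here is not conclusive.
- Optimizers: Adam (0.8711) and RMSprop (0.8732) performed similarly after 5 epochs, while SGD with default settings trailed (0.8382). Adam remains a reasonable default.
- Dropout: a standard way to reduce overfitting; the Dropout model reached 0.8741 test accuracy after 10 epochs.

👉 Next chapter: **A real Fashion-MNIST classification project using ANN**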
