# 07-2. Deep Neural Networks
DNN (Deep Neural Network)

---

## multi-layer

(Goal) Build a DNN made up of several layers, and look at different activation functions and optimizers.


### [1] hidden layer (**sigmoid**)
Instead of using a Flatten layer, we reshape each image into a 1-D array ourselves before training.

#### Load Data

In [38]:
from tensorflow import keras

(X_train, y_train), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()

#### Scaling

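Stochastic gradient descent is an algorithm that minimizes the value of the loss function. It is sensitive to the scale of each feature, so the features should be brought to a common scale in preprocessing.

Image pixels take values from 0 to 255, so we divide by 255 to normalize them to the range 0 to 1.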

In [41]:
train_scaled = X_train / 255.0
test_scaled = X_test / 255.0
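As a quick check of what the division does, the sketch below applies the same scaling to a made-up uint8 array and prints the value range before and after. `fake_images` is a hypothetical stand-in for `X_train`, not the real dataset.

```python
import numpy as np

# Hypothetical stand-in for X_train: 4 random 28x28 uint8 "images"
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 0      # force both extremes so the range is visible
fake_images[0, 0, 1] = 255

scaled = fake_images / 255.0  # uint8 becomes float64 in [0, 1]

print(fake_images.dtype, fake_images.min(), fake_images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape is unchanged; only the dtype and the value range differ.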

#### Split Train dataset for validation set


In [44]:
from sklearn.model_selection import train_test_split

train_scaled = train_scaled.reshape(-1, 28*28)

train_scaled, val_scaled, train_target, val_target = train_test_split(
    train_scaled, y_train, test_size=0.2, random_state=42
)
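The reshape and the 80/20 split above can be sketched on a tiny made-up array. The manual shuffle split below is only an illustration of the idea, not how `train_test_split` is implemented, and names such as `images`, `flat`, and `order` are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the scaled training images: 10 images of 28x28
images = np.arange(10 * 28 * 28, dtype=np.float64).reshape(10, 28, 28)
labels = np.arange(10)

# -1 lets NumPy infer the number of samples; each image becomes 784 values
flat = images.reshape(-1, 28 * 28)
print(flat.shape)  # (10, 784)

# Row i of `flat` is image i read row by row
assert np.array_equal(flat[3], images[3].ravel())

# Hand-rolled 80/20 shuffle split, only to illustrate the idea
rng = np.random.default_rng(42)
order = rng.permutation(len(flat))
n_val = int(len(flat) * 0.2)
val_idx, train_idx = order[:n_val], order[n_val:]
train_x, val_x = flat[train_idx], flat[val_idx]
train_y, val_y = labels[train_idx], labels[val_idx]
print(train_x.shape, val_x.shape)  # (8, 784) (2, 784)
```

With the 60,000 Fashion MNIST training images, `test_size=0.2` leaves 48,000 samples for training and 12,000 for validation.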

### 2. Configuring the neural network

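>[ **Activation functions** ]<br/>
> In the output layer, the activation converts the output of the linear function into probabilities; in the hidden layers, it adds nonlinearity.
>- Output layer
>    - Binary classification: sigmoid
>    - Multiclass classification: softmax
>- Hidden layers
>  - **Nonlinear** functions
>    - [1] sigmoid
>    - [2] ReLU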
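The activation functions named above can be written in a few lines of NumPy. This is a minimal sketch for intuition, not the Keras implementation, and the input `z` is made up.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Passes positive values through and clips negatives to 0
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row max for numerical stability, then normalize
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])  # made-up linear outputs
print(sigmoid(z))
print(relu(z))
print(softmax(z), softmax(z).sum())
```

The softmax outputs sum to 1, which is why it suits a multiclass output layer, while sigmoid gives a single probability for binary classification.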

In [13]:
import numpy as np
print(np.unique(y_train, return_counts=True))
print(train_scaled.shape)

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000],
      dtype=int64))
(48000, 784)


1) Layer setup, option 1

In [16]:
# hidden layer
dense1 = keras.layers.Dense(100, activation='sigmoid', input_shape=(784,))
# output layer
dense2 = keras.layers.Dense(10, activation='softmax')


model = keras.Sequential([dense1, dense2])


In [18]:
model.summary()
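The parameter counts that `model.summary()` reports can be worked out by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. `dense_params` below is a hypothetical helper written for this note, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 100)  # 784 pixels into 100 hidden units
output = dense_params(100, 10)   # 100 hidden units into 10 classes
print(hidden, output, hidden + output)  # 78500 1010 79510
```

The same numbers appear in the summary output shown later for the Flatten-based model.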

2) Layer setup, option 2

In [21]:
model = keras.Sequential([
    keras.layers.Dense(100, activation='sigmoid', input_shape=(784,), name='hidden'),
    keras.layers.Dense(10, activation='softmax', name='output')
])

model.summary()

3) Layer setup, option 3

Using Sequential's add() method

In [24]:
model = keras.Sequential()
model.add(keras.layers.Dense(100, activation='sigmoid', input_shape=(784,), name='hidden'))
model.add(keras.layers.Dense(10, activation='softmax',  name='output'))

### 3. Training and validation

In [29]:
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_scaled, train_target, epochs=5)

Epoch 1/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.7512 - loss: 0.7728
Epoch 2/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8474 - loss: 0.4269
Epoch 3/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8637 - loss: 0.3805
Epoch 4/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8715 - loss: 0.3528
Epoch 5/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8774 - loss: 0.3372


<keras.src.callbacks.history.History at 0x1e21ffdd1f0>
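The loss passed to `compile()`, `sparse_categorical_crossentropy`, takes integer class labels (like `train_target`) rather than one-hot vectors. For each sample it is the negative log of the probability the model assigns to the true class. The NumPy sketch below illustrates the formula with made-up probabilities; it is not the Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean of -log(probability assigned to the true class)."""
    rows = np.arange(len(y_true))
    return float(np.mean(-np.log(y_prob[rows, y_true])))

# Made-up softmax outputs for 2 samples and 3 classes
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8]])
y_true = np.array([0, 2])  # integer labels, not one-hot

loss = sparse_categorical_crossentropy(y_true, y_prob)
print(round(loss, 4))  # 0.2899
```

The loss shrinks toward 0 as the probability of the true class approaches 1, which matches the falling loss values in the training log.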

In [51]:
model.evaluate(val_scaled, val_target)



[0.3595207631587982, 0.871666669845581]

We can see that performance is higher than with the ANN.

### [2] hidden layer (**ReLU**)

This time, instead of changing the dimensions of the train set with np.reshape, we add a **Flatten layer** to the neural network.

- Layer setup

  Layers are added in order, from the first layer to the final output layer.


In [52]:
model = keras.Sequential()

model.add(keras.layers.Flatten(input_shape=(28, 28)))
model.add(keras.layers.Dense(100, activation='relu', name='hidden'))
model.add(keras.layers.Dense(10, activation='softmax', name='output'))

In [53]:
model.summary()

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_1 (Flatten)         (None, 784)               0         
                                                                 
 hidden (Dense)              (None, 100)               78500     
                                                                 
 output (Dense)              (None, 10)                1010      
                                                                 
Total params: 79510 (310.59 KB)
Trainable params: 79510 (310.59 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
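The summary lists 0 parameters for the Flatten layer because it has no weights to learn; it only rearranges each input. A NumPy sketch of the same operation on a made-up batch follows. `flatten` is a hypothetical helper, not the Keras layer.

```python
import numpy as np

def flatten(batch):
    """Keep the batch axis and merge all remaining axes into one."""
    return batch.reshape(batch.shape[0], -1)

batch = np.zeros((5, 28, 28))  # made-up batch of 5 blank images
out = flatten(batch)
print(out.shape)  # (5, 784)
```

This is the same result as calling `reshape(-1, 28*28)` on the data beforehand, which is why the Dense layers after it have the same parameter counts as in [1].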


- Data preparation

  Because the model has a Flatten layer, we need to rebuild the train set without reshaping it.

In [54]:
(X_train, y_train), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()

# Scaling
train_scaled = X_train / 255.0
test_scaled = X_test / 255.0

# train, val set
train_scaled, val_scaled, train_target, val_target = train_test_split(
    train_scaled, y_train, test_size=0.2, random_state=42
)

- Training and validation

In [55]:
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_scaled, train_target, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x79c94d782500>

In [56]:
model.evaluate(val_scaled, val_target)



[0.3419227600097656, 0.8794166445732117]

Validation accuracy is slightly higher than with the sigmoid function (about 0.879 versus 0.872).

## Optimizer

Gradient descent algorithms

- RMSprop (default)
- SGD
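To make the difference between these two optimizers concrete, here is a minimal sketch of their update rules on a toy one-parameter loss, f(w) = (w - 3)^2. The loss, the learning rate of 0.1, and the function names are assumptions chosen for illustration; this is not how Keras implements them.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def sgd_step(w, lr=0.1):
    # Plain gradient descent: step against the gradient
    return w - lr * grad(w)

def rmsprop_step(w, avg_sq, lr=0.1, rho=0.9, eps=1e-7):
    g = grad(w)
    # Running average of squared gradients
    avg_sq = rho * avg_sq + (1.0 - rho) * g ** 2
    # Scale the step by the recent gradient magnitude
    w = w - lr * g / (np.sqrt(avg_sq) + eps)
    return w, avg_sq

w_sgd, w_rms, avg_sq = 0.0, 0.0, 0.0
for _ in range(200):
    w_sgd = sgd_step(w_sgd)
    w_rms, avg_sq = rmsprop_step(w_rms, avg_sq)

print(w_sgd, w_rms)  # both end close to the minimum at w = 3
```

SGD uses the same learning rate for every step, while RMSprop divides each step by a running estimate of the gradient's size, so its step length adapts as training proceeds.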

In [59]:
model = keras.Sequential()

model.add(keras.layers.Flatten(input_shape=(28, 28)))
model.add(keras.layers.Dense(100, activation='relu', name='hidden'))
model.add(keras.layers.Dense(10, activation='softmax', name='output'))

In [60]:
model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_scaled, train_target, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x79c936c11750>

- Adam

In [61]:
model = keras.Sequential()

model.add(keras.layers.Flatten(input_shape=(28, 28)))
model.add(keras.layers.Dense(100, activation='relu', name='hidden'))
model.add(keras.layers.Dense(10, activation='softmax', name='output'))

In [62]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_scaled, train_target, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x79c94d915420>

In [63]:
model.evaluate(val_scaled, val_target)



[0.33459848165512085, 0.8745833039283752]
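The Adam optimizer used above combines a momentum-style running average of the gradient with an RMSprop-style running average of the squared gradient. The sketch below shows that update rule on a toy loss, f(w) = (w - 3)^2. The loss, the learning rate of 0.1, and the function names are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def adam_step(w, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-7):
    g = grad(w)
    m = beta1 * m + (1.0 - beta1) * g        # first moment (momentum)
    v = beta2 * v + (1.0 - beta2) * g ** 2   # second moment (squared gradient)
    m_hat = m / (1.0 - beta1 ** t)           # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):  # t starts at 1 so the bias correction is defined
    w, m, v = adam_step(w, m, v, t)

print(w)  # ends close to the minimum at w = 3
```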