<a href="https://colab.research.google.com/github/ESJoGithub/PythonStudy/blob/main/Python_220809_Softmax.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Softmax Practice**
___
> A multiclass logistic regression exercise

#### **Importing libraries**


In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.utils import to_categorical         # One-hot encoding: exactly one element of each vector is 1, the rest are 0
from tensorflow.keras.datasets import mnist               # MNIST handwritten digit data

#### **Loading and exploring the dataset**

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print("train_data(count, row, column): " + str(X_train.shape))                  # 60000 images, each 28 x 28 pixels (784 pixels)
print("test_data(count, row, column): " + str(X_test.shape))                    # 10000 images, each 28 x 28 pixels (784 pixels)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
train_data(count, row, column): (60000, 28, 28)
test_data(count, row, column): (10000, 28, 28)


In [3]:
print(X_train[0])                                                               # pixel values of the first image

[[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   3  18  18  18 126 136
  175  26 166 255 247 127   0   0   0   0]
 [  0   0   0   0   0   0   0   0  30  36  94 154 170 253 253 253 253 253
  225 172 253 242 195  64   0   0   0   0]
 [  0   0   0   0   0   0   0  49 238 253 253 253 253 253 253 253 253 251
   93  82  82  56  39   0   0   0   0   0]
 [  0   0   0   0   0   0   0  18 219 253 253 253 253 253 198 18

#### **Data normalization**
---
Min-max normalization: divide every pixel by the maximum value, 255, to scale the values into the range 0 to 1.
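
Because 8-bit pixel values always lie between 0 and 255, the general min-max formula `(x - min) / (max - min)` reduces to `x / 255`. The sketch below checks this on a small made-up array; the `pixels` values are illustrative and are not taken from MNIST.

```python
import numpy as np

# Hypothetical 2 x 3 "image" with 8-bit pixel values (not MNIST data).
pixels = np.array([[0, 51, 102], [153, 204, 255]], dtype=np.uint8)

# General min-max formula: (x - min) / (max - min).
# The range is fixed at 0..255 (the 8-bit range) instead of being read from the data.
x_min, x_max = 0.0, 255.0
scaled = (pixels.astype("float32") - x_min) / (x_max - x_min)

# With min = 0 this is the same as dividing by 255.
print(np.allclose(scaled, pixels / 255))
print(scaled.min(), scaled.max())
```

Using the fixed range rather than each image's own minimum and maximum keeps the scaling identical for the training and test sets.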

In [4]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

print(X_train[0])

[[0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.    

In [5]:
print("train target(count): " + str(y_train.shape))
print("test target(count): " + str(y_test.shape))

train target(count): (60000,)
test target(count): (10000,)


In [6]:
print("sample from train: " + str(y_train[0]))
print("sample from test: " + str(y_test[0]))

sample from train: 5
sample from test: 7


#### **Flattening the data**

In [7]:
# Flatten each 28 x 28 array into a 1 x 784 vector
input_dim = 784
X_train = X_train.reshape(60000, input_dim)
X_test = X_test.reshape(10000, input_dim)

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(60000, 784)
(60000,)
(10000, 784)
(10000,)
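
As a small illustration of what the reshape does, the sketch below flattens a made-up array of three 28 x 28 "images". It passes `-1` for the sample count so that NumPy infers it, which is an alternative to hard-coding 60000 and 10000; the `images` array is a hypothetical stand-in, not the MNIST data.

```python
import numpy as np

# Hypothetical stand-in for the image array: 3 "images" of 28 x 28 pixels.
images = np.arange(3 * 28 * 28, dtype="float32").reshape(3, 28, 28)

# -1 lets NumPy infer the number of samples from the array size.
flat = images.reshape(-1, 28 * 28)
print(flat.shape)  # (3, 784)

# Row-major order: the pixel at (row r, column c) lands at index r * 28 + c.
print(flat[0, 1 * 28 + 2] == images[0, 1, 2])  # True
```

Flattening only changes the layout, not the values, so no pixel information is lost; what the model no longer sees explicitly is which pixels were neighbors in the 2D grid.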


#### **Softmax**

In [8]:
# to_categorical one-hot encodes the integer labels: the index of the true class is 1, all others are 0
num_classes = 10
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
print(y_train[0])
print(y_test[0])

[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
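
To make the encoding concrete, here is a minimal NumPy sketch that produces the same kind of vectors for the two sample labels printed earlier (5 and 7). The helper name `one_hot` is hypothetical and is not part of Keras; it only illustrates the idea behind `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 array with a 1 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Fancy indexing: row i gets a 1 in column labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

encoded = one_hot([5, 7], num_classes=10)
print(encoded[0])              # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(encoded.argmax(axis=1))  # [5 7]
```

Taking `argmax` along each row recovers the original integer labels, which is useful later when comparing predictions against targets.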


#### **Training the model**

In [9]:
model = Sequential()
# Takes 784 input features and outputs a probability for each of the 10 classes
model.add(Dense(input_dim = input_dim, units = 10, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size = 2048, epochs = 100, verbose = 0)

<keras.callbacks.History at 0x7f29e2b83310>
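
The model above combines a softmax activation with a categorical cross-entropy loss. The NumPy sketch below shows the textbook definitions of both on a single made-up example. It is an illustration under stated assumptions, not Keras's actual implementation: the function names, the `eps` clipping value, and the `logits` values are all chosen here for demonstration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    # Subtracting the row maximum does not change the result but avoids overflow in exp.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0)  # guard against log(0)
    return float(-(y_true * np.log(y_prob)).sum(axis=-1).mean())

logits = np.array([[2.0, 1.0, 0.1]])   # hypothetical raw scores for 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])   # one-hot target: class 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=-1))  # each row sums to 1
print(loss)                # equals -log(probability assigned to the true class)
```

With a one-hot target, only the true class contributes to the loss, so minimizing it pushes the probability of the correct digit toward 1.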

#### **Testing the model**

In [10]:
score = model.evaluate(X_test, y_test)
print("Test accuracy: ", score[1])

Test accuracy:  0.8906999826431274
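
Accuracy for a softmax classifier is the fraction of samples whose highest-probability class matches the true class. The sketch below computes it by hand on made-up values; `y_prob` and `y_true` are hypothetical and far smaller than the real test set.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 samples over 3 classes.
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# Hypothetical one-hot targets: true classes 0, 1, 2, 1.
y_true = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

predicted = y_prob.argmax(axis=1)   # most probable class per sample
actual = y_true.argmax(axis=1)      # undo the one-hot encoding
accuracy = (predicted == actual).mean()
print(accuracy)  # 0.75
```

The last sample is misclassified (predicted 0, actual 1), so three of four predictions are correct.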


#### **Model summary**

In [11]:
# Each of the 10 output units has 784 weights + 1 bias = 785 parameters, so 785 * 10 = 7850 params in total
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 10)                7850      
                                                                 
Total params: 7,850
Trainable params: 7,850
Non-trainable params: 0
_________________________________________________________________
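
The parameter count reported in the summary can be reproduced with simple arithmetic. The variable names below are chosen for illustration.

```python
input_dim = 784   # flattened 28 x 28 image
units = 10        # one output unit per digit class

n_weights = input_dim * units   # one weight per (input pixel, output unit) pair
n_biases = units                # one bias per output unit
total_params = n_weights + n_biases

print(n_weights, n_biases, total_params)  # 7840 10 7850
```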


In [13]:
# Returns [weights, biases]: (w1, w2, ..., w784) for each of the 10 units, and (b1, b2, ..., b10)
model.layers[0].get_weights()

[array([[-0.07682078,  0.00281242, -0.05164642, ..., -0.06728801,
          0.01431503,  0.01354616],
        [ 0.08436514, -0.06946496, -0.08095019, ..., -0.06240658,
          0.02927181, -0.03011797],
        [ 0.00531667, -0.0665945 ,  0.04970215, ...,  0.07306484,
         -0.0698002 ,  0.08330605],
        ...,
        [ 0.0192989 , -0.017769  ,  0.04234835, ...,  0.01416206,
          0.00485062, -0.06023216],
        [-0.02951824,  0.04708681, -0.04098768, ...,  0.05143451,
          0.07796172, -0.05307161],
        [-0.03955812,  0.00792739, -0.03665379, ..., -0.02258029,
         -0.00876009,  0.05166209]], dtype=float32),
 array([-0.06339169,  0.1803238 , -0.04366475, -0.05248709,  0.05515117,
         0.15880404, -0.02960034,  0.09849787, -0.25631952, -0.04731343],
       dtype=float32)]
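
The two arrays returned above are the weight matrix of shape (784, 10) and the bias vector of shape (10,). A prediction from this single-layer model is the softmax of `x @ W + b`. The sketch below runs that forward pass with random stand-ins: `W`, `b`, and `x` are hypothetical values with the same shapes, not the trained weights or real images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins with the same shapes as the arrays from get_weights().
W = rng.normal(scale=0.05, size=(784, 10)).astype("float32")
b = np.zeros(10, dtype="float32")

# Two fake flattened "images" with values in [0, 1).
x = rng.random((2, 784), dtype=np.float32)

# Forward pass of a single Dense layer with softmax activation.
logits = x @ W + b
logits = logits - logits.max(axis=1, keepdims=True)  # stabilize exp
exp = np.exp(logits)
probs = exp / exp.sum(axis=1, keepdims=True)

print(probs.shape)           # (2, 10)
print(probs.argmax(axis=1))  # predicted class index for each fake image
```

Substituting the arrays from `get_weights()` for `W` and `b`, and rows of the flattened test set for `x`, should give predictions that match the trained model up to floating-point differences.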