# Lab 16-1 Softmax Classifier

- Goal: implement a softmax classifier directly in TensorFlow

## Softmax function

- A softmax classifier is a very useful method for predicting one of several classes

![](images/16/softmax_function.jpg)

- softmax
  - converts scores into probabilities
  - all the probabilities sum to 1
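
As a rough illustration of the two properties above, here is a minimal pure-Python sketch of softmax. The input scores are hypothetical, and the max-subtraction step is a common numerical-stability trick that is an addition here, not something taken from the lab itself.

```python
import math

def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp().
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
probs = softmax(scores)
print(probs)       # larger scores receive larger probabilities
print(sum(probs))  # the probabilities add up to 1 (up to floating-point rounding)
```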

## Implementing the softmax classifier in TensorFlow

$H_L(X) = Y$

```python
tf.matmul(X, W)+b
```
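
To make the shapes concrete, the sketch below spells out in plain Python what `tf.matmul(X, W)+b` computes: one score per class for every input row. The helper name `linear_scores` and the weight and bias values are hypothetical, chosen only for illustration.

```python
def linear_scores(X, W, b):
    """Return X·W + b for X of shape [n, d], W of shape [d, k], b of length k."""
    n_classes = len(b)
    scores = []
    for row in X:
        # One score per class: dot product of the row with a column of W, plus that class's bias.
        scores.append([
            sum(x * W[i][j] for i, x in enumerate(row)) + b[j]
            for j in range(n_classes)
        ])
    return scores

X = [[1, 2, 1, 1], [2, 1, 3, 2]]   # two examples, four features each
W = [[0.1, 0.0, -0.1],             # hypothetical 4x3 weight matrix
     [0.0, 0.2, 0.0],
     [0.3, 0.0, 0.1],
     [0.0, -0.1, 0.2]]
b = [0.0, 0.5, -0.5]               # hypothetical bias, one entry per class

logits = linear_scores(X, W, b)
print(logits)  # a 2x3 result: one row per example, one column per class
```

The bias is added to every row, which mirrors how TensorFlow broadcasts `b` across the batch dimension.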


### Softmax function

$s(y_i) = \frac{e^{y_i}}{\sum_{j}{e^{y_j}}}$

```python
tf.nn.softmax(tf.matmul(X, W)+b)
```

- The vector fed into softmax is called the logits

## Cost function: Cross Entropy

![](images/16/cost_function.jpg)

```python
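# Cross entropy: sum -Y * log(hypothesis) over the classes, then average over the examples
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)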
```
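
The cost takes, for each example, the negative sum of `Y * log(hypothesis)` over the classes and then averages over the examples. A minimal pure-Python sketch of that calculation follows. The function name `cross_entropy` and the probability values are hypothetical, and it assumes one-hot labels and strictly positive probabilities.

```python
import math

def cross_entropy(labels, probs):
    """Mean over examples of -sum(label * log(prob)) across classes."""
    per_example = [
        -sum(y * math.log(p) for y, p in zip(y_row, p_row))
        for y_row, p_row in zip(labels, probs)
    ]
    return sum(per_example) / len(per_example)

labels = [[0, 0, 1], [0, 1, 0]]            # one-hot labels
good = [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]  # hypothetical confident, correct predictions
bad = [[0.8, 0.1, 0.1], [0.8, 0.1, 0.1]]   # hypothetical confident, wrong predictions

print(cross_entropy(labels, good))  # small cost
print(cross_entropy(labels, bad))   # much larger cost
```

Because the labels are one-hot, only the probability assigned to the true class contributes, so the cost shrinks toward 0 as that probability approaches 1.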

## Code implementation

In [33]:
import tensorflow as tf

In [34]:
x_data = [
    [1, 2, 1, 1], 
    [2, 1, 3, 2],
    [3, 1, 3, 4],
    [4, 1, 5, 5],
    [1, 7, 5, 5],
    [1, 2, 5, 6],
    [1, 6, 6, 6],
    [1, 7, 7, 7]
]

y_data = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [1, 0, 0]
]

In [42]:
X = tf.placeholder('float', [None, 4])
Y = tf.placeholder('float', [None, 3])
nb_classes = 3

![](./images/16/diagram02.png)

![](./images/16/diagram03.png)

In [62]:
W = tf.Variable(tf.random_normal([4, 3]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

In [63]:
# tf.nn.softmax computes the softmax
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)

In [64]:
# Cross-entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y*tf.log(hypothesis), axis=1))

In [65]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

In [74]:
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(20001):
        sess.run(optimizer, feed_dict={X:x_data, Y:y_data})
        if step % 4000 == 0:
            print(step, sess.run(cost, feed_dict={X:x_data, Y:y_data}))
    
    print(sess.run([W, b]))
    print("Hypothesis:\n", sess.run(hypothesis, feed_dict={X:x_data, Y:y_data}))

0 4.28735
4000 0.0924706
8000 0.049332
12000 0.0334858
16000 0.0253051
20000 0.0203225
[array([[ -8.80258656,   1.67360055,   6.29956341],
       [ -1.8535434 ,  -0.25917545,   1.62591553],
       [ 11.6216116 ,  -1.63114655,  -6.47132111],
       [ -4.82830763,   3.79564261,   1.08306932]], dtype=float32), array([ -7.59788704,  -2.85889316,  10.57453918], dtype=float32)]
Hypothesis:
 [[  6.56443595e-13   6.30451041e-07   9.99999404e-01]
 [  8.19023699e-05   9.70981829e-03   9.90208268e-01]
 [  1.64007787e-16   2.13401467e-02   9.78659809e-01]
 [  1.24002796e-11   9.80948329e-01   1.90517027e-02]
 [  3.79352272e-02   9.60633039e-01   1.43173547e-03]
 [  2.01573055e-02   9.79842722e-01   7.81597986e-09]
 [  9.52186704e-01   4.78132255e-02   5.67613201e-09]
 [  9.97590423e-01   2.40955688e-03   9.88573915e-13]]
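
One way to read this output is to take the most probable class in each row and compare it with the label. The sketch below does that in plain Python, using the hypothesis values printed above (rounded) and the `y_data` labels from this lab. The `argmax` helper is a hypothetical stand-in, not part of the notebook.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])

# Hypothesis rows copied from the printed output above, rounded for brevity
hypothesis = [
    [6.56e-13, 6.30e-07, 9.99999e-01],
    [8.19e-05, 9.71e-03, 9.90208e-01],
    [1.64e-16, 2.13e-02, 9.78660e-01],
    [1.24e-11, 9.80948e-01, 1.91e-02],
    [3.79e-02, 9.60633e-01, 1.43e-03],
    [2.02e-02, 9.79843e-01, 7.82e-09],
    [9.52187e-01, 4.78e-02, 5.68e-09],
    [9.97590e-01, 2.41e-03, 9.89e-13],
]
y_data = [
    [0, 0, 1], [0, 0, 1], [0, 0, 1],
    [0, 1, 0], [0, 1, 0], [0, 1, 0],
    [1, 0, 0], [1, 0, 0],
]

predicted = [argmax(row) for row in hypothesis]
actual = [argmax(row) for row in y_data]
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
print(predicted)  # [2, 2, 2, 1, 1, 1, 0, 0]
print(accuracy)   # 1.0
```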


## Test & One-hot encoding

In [76]:
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(20001):
        sess.run(optimizer, feed_dict={X:x_data, Y:y_data})
        if step % 4000 == 0:
            print(step, sess.run(cost, feed_dict={X:x_data, Y:y_data}))

    # tf.argmax (the non-deprecated name for tf.arg_max) returns the index of the most probable class.
    # The pred_ names avoid shadowing the bias variable b and the builtin all.
    pred_a = sess.run(hypothesis, feed_dict={X:[[1, 11, 7, 9]]})
    print(pred_a, sess.run(tf.argmax(pred_a, 1)))
    print("----------------")
    pred_b = sess.run(hypothesis, feed_dict={X:[[1, 3, 4, 3]]})
    print(pred_b, sess.run(tf.argmax(pred_b, 1)))
    print("----------------")
    pred_c = sess.run(hypothesis, feed_dict={X:[[1, 1, 0, 1]]})
    print(pred_c, sess.run(tf.argmax(pred_c, 1)))
    print("----------------")
    # The same three examples fed as one batch
    pred_all = sess.run(hypothesis, feed_dict={X:[
        [1, 11, 7, 9],
        [1, 3, 4, 3],
        [1, 1, 0, 1]]})
    print(pred_all, sess.run(tf.argmax(pred_all, 1)))
    print("----------------")
    print("----------------")




0 3.88707
4000 0.0904877
8000 0.0486839
12000 0.0331624
16000 0.0251097
20000 0.0201909
[[  1.47682737e-08   1.00000000e+00   4.21667945e-09]] [1]
----------------
[[  9.99371588e-01   6.08863775e-04   1.95046359e-05]] [0]
----------------
[[  3.52540836e-19   3.60491015e-08   1.00000000e+00]] [2]
----------------
[[  1.47682737e-08   1.00000000e+00   4.21667945e-09]
 [  9.99371588e-01   6.08863775e-04   1.95046359e-05]
 [  3.52540836e-19   3.60491015e-08   1.00000000e+00]] [1 0 2]
----------------


![](./images/16/one_hot.jpg)
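
The argmax indices from the test run can be turned back into one-hot vectors. This is a minimal pure-Python sketch with a hypothetical helper named `one_hot`. In TensorFlow the `tf.one_hot` op plays a similar role.

```python
def one_hot(indices, n_classes):
    """Turn class indices into one-hot rows of length n_classes."""
    return [[1 if j == idx else 0 for j in range(n_classes)] for idx in indices]

predicted = [1, 0, 2]  # argmax results from the test run above
encoded = one_hot(predicted, 3)
print(encoded)  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```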

In [78]:
import datetime
print(datetime.datetime.now())

2017-06-05 07:19:43.827103
