# Lab 05 Logistic Classification (diabetes) - Eager Execution
* Logistic Classfication을 diabetes data를 활용하여 모델을 만들어 보도록 하겠습니다
### 기본 Library 선언 및 Tensorflow 버전 확인

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf

tf.random.set_seed(777)  # for reproducibility
print(tf.__version__)

2.3.0


#### Data

In [123]:
xy = np.loadtxt('data/data-03-diabetes.csv', delimiter = ',', dtype=np.float32)
x_train = xy[:, 0:-1]
y_train = xy[:, [-1]]

print(x_train.shape, y_train.shape)

(759, 8) (759, 1)


### Tensorflow Eager

- Tensorflow data API를 통해 학습시킬 값들을 담는다 (Batch Size는 한번에 학습시킬 Size로 정한다)

In [124]:
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(len(x_train))

- W와 b은 학습을 통해 생성되는 모델에 쓰이는 Wegith와 Bias (초기값을 variable : 0이나 Random값으로 가능 tf.random_normal([2, 1]) )

In [125]:
W = tf.Variable(tf.random.normal((8,1)), name = 'weight')
b = tf.Variable(tf.random.normal((1,)), name = 'bias')

- Sigmoid 함수를 가설로 선언
$$ sigmoid(x) = \frac{1}{1+e^{-x}} $$

In [126]:
def logistic_regression(features):
    hypothesis  = tf.divide(1., 1. + tf.exp(tf.matmul(features, W) + b))
    return hypothesis

### Cost 함수 정의

- $$ cost(h(x),y) = -log(h(x)) \qquad if \quad y = 1$$
- $$ cost(h(x),y) = -log(1 - h(x)) \qquad if \quad y = 0$$

> - 두 수식을 합치면 다음과 같다.

- $$ cost(h(x),y) = -ylog(h(x)) - (1-y)(log(1-h(x))) $$
- $$ \downarrow $$
- $$ cost(h(x),y) = -(ylog(h(x)) + (1-y)(log(1-h(x)))) $$

In [132]:
def loss_func(hypothesis, labels):
    cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1- labels) * tf.math.log((1 - hypothesis)))
    return cost

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

In [133]:
def accuracy_func(hypothesis, labels):
    predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, labels), dtype=tf.int32))
    return accuracy

- GradientTape를 통해 경사값을 계산

In [134]:
def grad(features,labels):
    with tf.GradientTape() as tape:
        loss_value = loss_func(logistic_regression(features),labels)
    return tape.gradient(loss_value, [W,b])

In [135]:
EPOCHS = 1001

for step in range(EPOCHS):
    for features, labels in iter(dataset):
        grads = grad(features, labels)
        optimizer.apply_gradients(grads_and_vars=zip(grads, [W,b]))
        if step % 100 == 0:
            print("Iter: {}, Loss: {:.4f}".format(step, loss_func(logistic_regression(features),labels)))

Iter: 0, Loss: 0.5657
Iter: 100, Loss: 0.5618
Iter: 200, Loss: 0.5582
Iter: 300, Loss: 0.5547
Iter: 400, Loss: 0.5515
Iter: 500, Loss: 0.5484
Iter: 600, Loss: 0.5455
Iter: 700, Loss: 0.5428
Iter: 800, Loss: 0.5402
Iter: 900, Loss: 0.5377
Iter: 1000, Loss: 0.5354
