# Binary Classification with the Logistic Function
- https://ko.wikipedia.org/wiki/로지스틱_회귀
- https://www.youtube.com/watch?v=6vzchGYEJBc&list=PLlMkM4tgfjnLSOjrEJN31gZATbcj_MpUm&index=11&spfreload=1
- Sigmoid function: $y = \frac{1}{1+\exp(-x)}$
- The output of a linear regression, or of any other real-valued function, can be fed into the sigmoid to produce a value in the range $(0,1)$.
- Binary classification: read that value as the probability of class 1 and threshold it (for example at 0.5).
- Logistic hypothesis model: $H(x) = \frac{1}{1 + \exp(-Wx)}$
- Cost function: the squared-error cost $cost(W) = \frac{1}{m} \sum_{i=1}^m (H(x^i) - y^i)^2$ is non-convex in $W$ once $H$ is a sigmoid, so gradient descent can get stuck in local minima.
- Instead, use the per-example cost $C(H(x), y) = -\log(H(x))$ if $y=1$ and $C(H(x), y) = -\log(1-H(x))$ if $y=0$. This results in the following cost function:

$cost(W) = \frac{1}{m}\sum_{i=1}^m C(H(x^i), y^i) = -\frac{1}{m}\sum_{i=1}^m \left[ y^i\log(H(x^i)) + (1-y^i)\log(1-H(x^i)) \right]$
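
To make the difference between the two costs concrete, the NumPy sketch below evaluates both per-example costs for a few predicted values of a positive example. The function names `sigmoid`, `cross_entropy`, and `squared_error` and the sample predictions are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(h, y):
    """Per-example log cost: -log(h) when y = 1, -log(1 - h) when y = 0."""
    return -y * np.log(h) - (1.0 - y) * np.log(1.0 - h)

def squared_error(h, y):
    """Per-example squared-error cost, for comparison."""
    return (h - y) ** 2

# Made-up predictions H(x) for an example whose true label is y = 1.
h = np.array([0.9, 0.5, 0.1, 0.01])
y = 1.0
for hi, ce, se in zip(h, cross_entropy(h, y), squared_error(h, y)):
    print("H=%.2f  cross-entropy=%.3f  squared error=%.3f" % (hi, ce, se))
```

The squared error never exceeds 1 however wrong the prediction is, while the log cost grows without bound as $H(x)$ approaches 0 for a positive example, so confident mistakes are penalized heavily.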

### cost function in tf
``cost = -tf.reduce_mean(Y*tf.log(H) + (1-Y)*tf.log(1-H))``
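
One practical caveat: taking the log of the sigmoid output directly can hit `log(0)` when the linear score is very large in magnitude. A common workaround is to compute the same cost from the raw score $z = Wx$ using the identity $\max(z, 0) - z y + \log(1 + \exp(-|z|))$. The NumPy sketch below compares the two forms; `naive_cost` and `stable_cost` are hypothetical names used only here. TensorFlow's `tf.nn.sigmoid_cross_entropy_with_logits` is a built-in that takes the raw scores in the same spirit.

```python
import numpy as np

def naive_cost(z, y):
    """Cross-entropy computed through the sigmoid output, as in the notes."""
    h = 1.0 / (1.0 + np.exp(-z))
    return np.mean(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h))

def stable_cost(z, y):
    """Same cost computed from the raw score z, avoiding log(0).

    Uses max(z, 0) - z*y + log(1 + exp(-|z|)), which is algebraically
    equal to the naive form for labels y in {0, 1}.
    """
    return np.mean(np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z))))

# Made-up scores and labels for illustration.
z = np.array([-2.0, 0.5, 3.0])
y = np.array([0.0, 1.0, 1.0])
print(naive_cost(z, y), stable_cost(z, y))
```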
### minimize
- ``a = tf.Variable(0.1)``

- ``opt = tf.train.GradientDescentOptimizer(a)``

- ``train = opt.minimize(cost)``
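
For intuition about what a gradient descent optimizer does with this cost, here is a minimal NumPy sketch of the update loop. It relies on the gradient of the cross-entropy cost with respect to $W$, which is $\frac{1}{m}\sum_{i=1}^m (H(x^i) - y^i)\,x^i$. The toy data, the learning rate, and the names `cost` and `gradient_step` are assumptions made for illustration; the data layout (one column per example, bias input in the first row) mirrors the notebook code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(W, X, Y):
    """Mean binary cross-entropy; X is [n_features, m], W is [1, n_features]."""
    H = sigmoid(W @ X)
    return np.mean(-Y * np.log(H) - (1.0 - Y) * np.log(1.0 - H))

def gradient_step(W, X, Y, lr):
    """One gradient descent update: W <- W - lr * dcost/dW."""
    m = X.shape[1]
    H = sigmoid(W @ X)           # predictions, shape [1, m]
    grad = (H - Y) @ X.T / m     # gradient, shape [1, n_features]
    return W - lr * grad

# Made-up toy data: first row is a constant bias input, second row one feature.
X = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0, 4.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])

W = np.zeros((1, 2))
before = cost(W, X, Y)
for _ in range(100):
    W = gradient_step(W, X, Y, lr=0.1)
after = cost(W, X, Y)
print("cost before:", before, "cost after:", after)
```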

In [3]:
import numpy as np
import tensorflow as tf

In [21]:
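# unpack=True transposes the file: each column of the file becomes one row of xy
xy = np.loadtxt('train-logistic-r.txt', unpack=True, dtype='float32')
xdata = xy[0:-1]  # all rows but the last: input features, shape [n_features, m]
ydata = xy[-1]    # last row: labels, shape [m]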

X=tf.placeholder(tf.float32)
Y=tf.placeholder(tf.float32)
W=tf.Variable(tf.random_uniform([1,len(xdata)],-1.,1.))

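# hypothesis: sigmoid of the linear score W X
h = tf.matmul(W, X)
hypo = tf.div(1., 1. + tf.exp(-h))

# cost: binary cross-entropy averaged over the training examples
cost = -tf.reduce_mean(Y * tf.log(hypo) + (1. - Y) * tf.log(1. - hypo))

# minimize the cost with gradient descent
a = tf.Variable(0.1)  # learning rate
optimizer = tf.train.GradientDescentOptimizer(a)
trainer = optimizer.minimize(cost)

# start a session and initialize the variables
ss = tf.Session()
ss.run(tf.global_variables_initializer())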

# train, printing the cost and the weights every 100 steps
maxiter = 2001
for step in range(maxiter):
    ss.run(trainer, feed_dict={X: xdata, Y: ydata})
    if step % 100 == 0:
        print('{}/{} '.format(step, maxiter),
              ss.run(cost, feed_dict={X: xdata, Y: ydata}),
              ss.run(W))
print('Finished Learning.')

0/2001  1.96969 [[ 0.88889956 -0.48498845 -0.45460814]]
100/2001  0.56253 [[-0.44403985 -0.16196413  0.4294897 ]]
200/2001  0.447474 [[-1.49681008 -0.07871286  0.59690756]]
300/2001  0.381553 [[-2.29517007 -0.00916002  0.71698141]]
400/2001  0.339596 [[-2.93226719  0.04101916  0.81775784]]
500/2001  0.31052 [[-3.46255231  0.07795615  0.90590245]]
600/2001  0.289028 [[-3.91833472  0.1060717   0.98470694]]
700/2001  0.272344 [[-4.31974936  0.12809558  1.05631089]]
800/2001  0.258897 [[-4.67998981  0.14574523  1.12221396]]
900/2001  0.247733 [[-5.00809193  0.16014707  1.18350482]]
1000/2001  0.238243 [[-5.31048346  0.17207024  1.24099517]]
1100/2001  0.23002 [[-5.59186983  0.18205783  1.29530132]]
1200/2001  0.222781 [[-5.85580206  0.19050591  1.34690356]]
1300/2001  0.216322 [[-6.10500956  0.19770892  1.39618087]]
1400/2001  0.210496 [[-6.3416338   0.20389302  1.44343603]]
1500/2001  0.20519 [[-6.56738329  0.20923269  1.48891592]]
1600/2001  0.200319 [[-6.78364277  0.21386638  1.53282344

In [25]:
# run the learned classifier on some test input data
# e.g. x1 = 4 hours of study, x2 = 3 classes attended ...
# Four test cases in total; the leading 1 in each case acts as the bias input
xtest = np.array([[1,2,2],
                  [1,5,5], 
                  [1, 4, 3], 
                  [1, 3, 5]]).transpose()
print ('test data: ')
print (xtest)

# compute the predicted probability of class 1 for each test case
prob = ss.run(hypo, feed_dict={X: xtest})  # hypo is the sigmoid output; only X needs to be fed
print('test result probability: ', prob)
print('test result True/False: ', prob > 0.5)


test data: 
[[1 1 1 1]
 [2 5 4 3]
 [2 5 3 5]]
test result probability:  [[ 0.02348239  0.88505471  0.1711487   0.83014882]]
test result True/False:  [[False  True False  True]]
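
The True/False step can be read another way: because the sigmoid equals 0.5 exactly when its input is 0, thresholding the probability at 0.5 is the same as checking the sign of the linear score $Wx$. The sketch below shows this in plain NumPy on the same four test cases. The weight vector is a hypothetical, rounded stand-in chosen for illustration; it is not the set of weights learned in the run above, so the probabilities it yields differ from the printed ones.

```python
import numpy as np

# Hypothetical weights for illustration only; NOT the values learned above.
W = np.array([[-6.0, 0.2, 1.5]])

# Same test inputs as the notebook: one column per case, bias input in the first row.
xtest = np.array([[1, 2, 2],
                  [1, 5, 5],
                  [1, 4, 3],
                  [1, 3, 5]]).T

z = W @ xtest                       # linear scores, shape [1, 4]
prob = 1.0 / (1.0 + np.exp(-z))     # sigmoid of the scores

# Both tests select the same cases.
print(prob > 0.5)
print(z > 0)
```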
