# Lecture 4. Multi-variable linear regression
---

## Recap
- Hypothesis <br />
$H(x) = Wx + b$


- Cost function <br />
$cost(W, b) = \frac{1}{m} \sum_{i=1}^{m} (H(x_i) - y_i)^2$


- Gradient descent algorithm
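
The three recap items fit together as sketched below. This is a minimal NumPy illustration, not the lecture's code: the toy data (generated from `y = 2x + 1`), the learning rate, and the step count are assumptions chosen only for demonstration.

```python
import numpy as np

# Hypothetical toy data (not from the lecture), generated from y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

W, b = 0.0, 0.0          # initial parameters
learning_rate = 0.05     # assumed step size

def cost(W, b):
    """Mean squared error of the hypothesis H(x) = W*x + b."""
    return np.mean((W * x + b - y) ** 2)

for step in range(5000):
    error = W * x + b - y
    # Partial derivatives of the cost with respect to W and b
    grad_W = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # Gradient descent update: move against the gradient
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(W, b, cost(W, b))
```

After the loop, `W` and `b` should be close to 2 and 1, the values used to generate the toy data.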

## Predicting exam score:
- regression using one input ( X )
 - one-variable ( one-feature )
   - X: study time


- regression using two inputs ( X1, X2 )
 - multi-variable ( multi-feature )
   - X1: study time
   - X2: attendance

##  Hypothesis

$$
\begin{align}
H(x) &= Wx + b \\
H(x_1, x_2) &= w_1 x_1 + w_2 x_2 + b
\end{align}
$$

##  Cost function

$$
\begin{align}
H(x_1, x_2) &= w_1 x_1 + w_2 x_2 + b \\
cost(W, b) &= \frac{1}{m} \sum_{i=1}^{m} (H(x_{1,i}, x_{2,i}) - y_i)^2
\end{align}
$$
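
As a quick check of the two-variable cost, the sketch below evaluates it on the data used in Lab 4 of this document. The function names `hypothesis` and `cost` and the two parameter settings are illustrative assumptions.

```python
import numpy as np

# Data from Lab 4 in this document
x1 = np.array([1.0, 0.0, 3.0, 0.0, 5.0])
x2 = np.array([0.0, 2.0, 0.0, 4.0, 0.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def hypothesis(w1, w2, b):
    """H(x1, x2) = w1*x1 + w2*x2 + b, evaluated for every sample."""
    return w1 * x1 + w2 * x2 + b

def cost(w1, w2, b):
    """Mean of the squared differences between predictions and targets."""
    return np.mean((hypothesis(w1, w2, b) - y) ** 2)

cost_zero = cost(0.0, 0.0, 0.0)   # all parameters zero: (1+4+9+16+25)/5 = 11
cost_fit = cost(1.0, 1.0, 0.0)    # parameters that reproduce y exactly: 0
print(cost_zero, cost_fit)
```

For this data `y = x1 + x2` holds for every sample, so the cost is zero at `w1 = w2 = 1`, `b = 0`.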

## Multi-variable
$$H(x_1, x_2, x_3, \cdots , x_n) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$$

### Matrix
- dot product

$$\begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix}\times \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} w_1x_1 + w_2x_2 + w_3x_3 \end{bmatrix}$$
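
The same row-times-column product can be sketched in NumPy. The numeric values are hypothetical, chosen only to make the arithmetic easy to verify by hand.

```python
import numpy as np

# Hypothetical values, chosen only to make the arithmetic easy to check
W = np.array([[1.0, 2.0, 3.0]])        # row vector, shape (1, 3)
X = np.array([[4.0], [5.0], [6.0]])    # column vector, shape (3, 1)

product = np.matmul(W, X)              # shape (1, 1)
print(product)                         # 1*4 + 2*5 + 3*6 = 32
```

A (1, 3) matrix times a (3, 1) matrix gives a (1, 1) matrix holding the weighted sum.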

### Hypothesis
$$H(X) = WX + b$$

### Hypothesis without $b$

$$
H(X) = W \cdot X = \begin{bmatrix} b & w_1& w_2 & w_3 \end{bmatrix}\times \begin{bmatrix} 1 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}
$$
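
A small sketch of why the bias can be dropped: prepending $b$ to the weights and a constant 1 to the inputs gives the same value as $WX + b$. The parameter and input values below are hypothetical.

```python
import numpy as np

# Hypothetical parameters and input
w = np.array([0.5, -1.0, 2.0])
b = 3.0
x = np.array([4.0, 5.0, 6.0])

# Form with an explicit bias term
with_bias = np.dot(w, x) + b

# Form without a separate bias: b becomes the weight of a constant input 1
W_aug = np.concatenate(([b], w))     # [b, w1, w2, w3]
X_aug = np.concatenate(([1.0], x))   # [1, x1, x2, x3]
without_bias = np.dot(W_aug, X_aug)

print(with_bias, without_bias)
```

Both expressions evaluate to the same number, which is why the lab can fold `b` into `W` by adding a row of ones to the input data.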

### Hypothesis using Transpose
- However, vectors are usually written as column vectors.
$$W = \begin{bmatrix} b \\ w_1 \\ w_2 \\ w_3 \end{bmatrix}$$

$$H(X) = W^T \cdot X = \begin{bmatrix} b \\ w_1 \\ w_2 \\ w_3 \end{bmatrix} ^ T\times \begin{bmatrix} 1 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
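
The transpose form also works for several samples at once if each sample is stored as one column of $X$. The sketch below assumes two features instead of three, uses the Lab 4 data, and sets $W$ to the values the lab fit approaches ($b = 0$, $w_1 = w_2 = 1$).

```python
import numpy as np

# Column vector W = [b, w1, w2]^T, set to the values the lab converges toward
W = np.array([[0.0], [1.0], [1.0]])              # shape (3, 1)

# Each column of X is one sample [1, x1, x2]^T (data from Lab 4)
X = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
              [1.0, 0.0, 3.0, 0.0, 5.0],
              [0.0, 2.0, 0.0, 4.0, 0.0]])        # shape (3, 5)

predictions = W.T @ X                            # (1, 3) @ (3, 5) -> (1, 5)
print(predictions)
```

The result has one prediction per sample and matches the lab targets `[1, 2, 3, 4, 5]`.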


# Lab 4. Implementing multi-variable linear regression in TensorFlow
---

In [1]:
import tensorflow as tf

x1_data = [1, 0, 3, 0, 5]
x2_data = [0, 2, 0, 4, 0]
y_data = [1, 2, 3, 4, 5]


W1 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
W2 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

b = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# hypothesis
hypothesis = W1 * x1_data + W2 * x2_data + b

# cost function
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

In [2]:
# Minimize
a = tf.Variable(0.1) # Learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

In [3]:
# Before starting, initialize the variables
init = tf.global_variables_initializer()

# Launch the graph
sess = tf.Session()
sess.run(init)

In [4]:
# Fit the line
for step in range(701):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W1), sess.run(W2), sess.run(b))

0 7.79911 [ 1.8252883] [ 0.81575698] [ 0.85557151]
20 0.0183498 [ 0.91558504] [ 0.89984107] [ 0.3211306]
40 0.00531305 [ 0.95457649] [ 0.94610536] [ 0.17279722]
60 0.00153835 [ 0.97555804] [ 0.97099978] [ 0.09298057]
80 0.000445417 [ 0.986848] [ 0.98439521] [ 0.05003196]
100 0.000128968 [ 0.99292308] [ 0.9916032] [ 0.02692176]
120 3.73412e-05 [ 0.99619198] [ 0.99548179] [ 0.01448635]
140 1.08118e-05 [ 0.99795091] [ 0.99756879] [ 0.00779495]
160 3.13069e-06 [ 0.99889749] [ 0.9986918] [ 0.00419442]
180 9.06467e-07 [ 0.99940681] [ 0.99929607] [ 0.00225699]
200 2.62475e-07 [ 0.99968076] [ 0.99962121] [ 0.00121447]
220 7.59972e-08 [ 0.99982816] [ 0.99979621] [ 0.00065348]
240 2.19949e-08 [ 0.99990755] [ 0.99989033] [ 0.00035168]
260 6.37517e-09 [ 0.99995023] [ 0.99994099] [ 0.00018924]
280 1.84687e-09 [ 0.99997324] [ 0.99996829] [ 0.00010178]
300 5.34385e-10 [ 0.99998569] [ 0.99998295] [  5.47844793e-05]
320 1.53591e-10 [ 0.99999225] [ 0.99999082] [  2.94691854e-05]
340 4.45567e-11 [ 0.9999
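
The lab code uses the TensorFlow 1.x graph-and-session API (`tf.Session`). To show what the optimizer does at each step, here is a plain NumPy sketch of the same fit with hand-written gradients. It is not the lab's implementation: the random seed and the variable names `w1`, `w2`, `grad_*` are assumptions, while the data, the learning rate of 0.1, and the 701 steps follow the lab.

```python
import numpy as np

# Data from the lab
x1 = np.array([1.0, 0.0, 3.0, 0.0, 5.0])
x2 = np.array([0.0, 2.0, 0.0, 4.0, 0.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

rng = np.random.default_rng(0)             # fixed seed, an assumption for repeatability
w1, w2, b = rng.uniform(-1.0, 1.0, size=3) # same initial range as the lab
learning_rate = 0.1                        # same value as `a` in the lab

for step in range(701):
    error = w1 * x1 + w2 * x2 + b - y
    # Partial derivatives of mean(error**2) with respect to each parameter
    grad_w1 = 2.0 * np.mean(error * x1)
    grad_w2 = 2.0 * np.mean(error * x2)
    grad_b = 2.0 * np.mean(error)
    # Update all parameters using gradients computed from the same error
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
    b -= learning_rate * grad_b

final_cost = np.mean((w1 * x1 + w2 * x2 + b - y) ** 2)
print(w1, w2, b, final_cost)
```

As in the TensorFlow output above, both weights should approach 1 and the bias should approach 0.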

## Matrix form with $b$

In [5]:
x_data = [[1., 0., 3., 0., 5.],
          [0., 2., 0., 4., 0.]]
y_data = [1, 2, 3, 4, 5]

W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
b = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# Hypothesis
hypothesis = tf.matmul(W, x_data) + b # matrix multiplication

# Same as before from here on
# Cost func
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1)
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# init
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Fitting
for step in range(701):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))

0 8.96695 [[ 2.01941442  0.98389363]] [ 0.41775191]
20 0.00090749 [[ 0.98122841  0.97772628]] [ 0.0714149]
40 0.000262758 [[ 0.9898985   0.98801464]] [ 0.0384275]
60 7.60797e-05 [[ 0.99456447  0.99355078]] [ 0.02067753]
80 2.20277e-05 [[ 0.99707514  0.99652976]] [ 0.01112636]
100 6.37793e-06 [[ 0.99842614  0.99813265]] [ 0.00598696]
120 1.84673e-06 [[ 0.99915314  0.99899524]] [ 0.00322155]
140 5.34699e-07 [[ 0.99954432  0.99945933]] [ 0.00173349]
160 1.54846e-07 [[ 0.99975473  0.99970907]] [ 0.00093274]
180 4.48044e-08 [[ 0.99986809  0.99984348]] [ 0.0005019]
200 1.29916e-08 [[ 0.99992907  0.99991578]] [ 0.0002701]
220 3.76073e-09 [[ 0.99996179  0.99995464]] [ 0.00014535]
240 1.09193e-09 [[ 0.9999795   0.99997562]] [  7.82720817e-05]
260 3.17735e-10 [[ 0.99998897  0.99998689]] [  4.20848955e-05]
280 8.99149e-11 [[ 0.99999404  0.99999297]] [  2.26537795e-05]
300 2.70717e-11 [[ 0.99999678  0.99999619]] [  1.21538287e-05]
320 7.25322e-12 [[ 0.99999827  0.99999797]] [  6.54622318e-06]
340 
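
The matrix form relies on shapes lining up. The NumPy sketch below mirrors the shapes used in this cell; NumPy stands in for TensorFlow here, and `W` and `b` are set to the values the fit approaches rather than to random initial values.

```python
import numpy as np

x_data = np.array([[1.0, 0.0, 3.0, 0.0, 5.0],
                   [0.0, 2.0, 0.0, 4.0, 0.0]])   # shape (2, 5): features x samples
y_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])     # shape (5,)

W = np.array([[1.0, 1.0]])                       # shape (1, 2)
b = np.array([0.0])                              # shape (1,)

# (1, 2) @ (2, 5) -> (1, 5); b is broadcast across all five columns
hypothesis = np.matmul(W, x_data) + b

# (1, 5) - (5,) also broadcasts, giving one residual per sample
residual = hypothesis - y_data
print(hypothesis.shape, residual.shape)
```

Storing one sample per column is what lets a single matrix product replace the separate `W1 * x1_data + W2 * x2_data` terms.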

## Matrix form without $b$

In [6]:
x_data = [[1, 1, 1, 1, 1],
          [1., 0., 3., 0., 5.],
          [0., 2., 0., 4., 0.]]
y_data = [1, 2, 3, 4, 5]

W = tf.Variable(tf.random_uniform([1, 3], -1.0, 1.0))

# Hypothesis
hypothesis = tf.matmul(W, x_data) # matrix multiplication without b

# Same as before from here on
# Cost func
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1)
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# init
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Fitting
for step in range(701):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W))

0 0.807682 [[ 0.01179713  1.32805276  0.89604843]]
20 6.68572e-05 [[-0.01938367  1.00509572  1.0060457 ]]
40 1.93572e-05 [[-0.01043018  1.00274181  1.0032531 ]]
60 5.60484e-06 [[-0.00561236  1.00147533  1.00175047]]
80 1.62295e-06 [[-0.00301994  1.00079393  1.00094199]]
100 4.69942e-07 [[-0.00162507  1.00042713  1.00050688]]
120 1.36049e-07 [[ -8.74443213e-04   1.00022984e+00   1.00027275e+00]]
140 3.93617e-08 [[ -4.70500003e-04   1.00012362e+00   1.00014675e+00]]
160 1.14116e-08 [[ -2.53152772e-04   1.00006652e+00   1.00007904e+00]]
180 3.29636e-09 [[ -1.36225150e-04   1.00003576e+00   1.00004256e+00]]
200 9.53372e-10 [[ -7.33184061e-05   1.00001931e+00   1.00002289e+00]]
220 2.75785e-10 [[ -3.94248164e-05   1.00001037e+00   1.00001228e+00]]
240 8.20393e-11 [[ -2.12167888e-05   1.00000560e+00   1.00000668e+00]]
260 2.30784e-11 [[ -1.14249369e-05   1.00000310e+00   1.00000358e+00]]
280 6.68763e-12 [[ -6.17257456e-06   1.00000155e+00   1.00000191e+00]]
300 2.08331e-12 [[ -3.32585705e-06

## Loading data from file
- `./data/train.txt`

In [7]:
import tensorflow as tf
import numpy as np

# loading data from .txt file
xy = np.loadtxt('./data/train.txt', unpack=True, dtype='float32')
x_data = xy[0: -1]
y_data = xy[-1]

print(x_data)
print(y_data)

[[ 1.  1.  1.  1.  1.]
 [ 1.  0.  3.  0.  5.]
 [ 0.  2.  0.  4.  0.]]
[ 1.  2.  3.  4.  5.]
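
The contents of `./data/train.txt` are not shown in this document. The sketch below assumes a layout reconstructed from the printed arrays (one sample per row, with columns `x0`, `x1`, `x2`, `y`) and uses `io.StringIO` in place of the real file. It illustrates that `unpack=True` transposes the loaded array, so each column of the file becomes one row of `xy`.

```python
import io
import numpy as np

# Assumed file layout, reconstructed from the printed arrays:
# one sample per row, columns are x0 (always 1), x1, x2, y
file_text = """1 1 0 1
1 0 2 2
1 3 0 3
1 0 4 4
1 5 0 5
"""

# io.StringIO stands in for './data/train.txt'
xy = np.loadtxt(io.StringIO(file_text), unpack=True, dtype='float32')

x_data = xy[0:-1]   # every row except the last: features, shape (3, 5)
y_data = xy[-1]     # last row: targets, shape (5,)

print(xy.shape, x_data.shape, y_data.shape)
```

Because the features end up as rows, `len(x_data)` equals the number of features (including the constant `x0`), which is how the next cell sizes `W`.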


In [8]:
W = tf.Variable(tf.random_uniform([1, len(x_data)], -5.0, 5.0))

# hypothesis
hypothesis = tf.matmul(W, x_data)

# cost func
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1) # learning rate
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# init
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Fitting
for step in range(701):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W))

0 17.1112 [[ 2.72396231 -1.16091442 -0.28251207]]
20 0.516211 [[ 1.70325065  0.55226159  0.46876365]]
40 0.149465 [[ 0.91650343  0.75907713  0.71414685]]
60 0.0432762 [[ 0.49316201  0.87036163  0.84618503]]
80 0.0125303 [[ 0.2653659   0.93024284  0.91723359]]
100 0.00362803 [[ 0.1427909   0.96246421  0.95546418]]
120 0.00105046 [[ 0.07683446  0.97980231  0.97603571]]
140 0.000304153 [[ 0.04134394  0.98913193  0.98710501]]
160 8.80651e-05 [[ 0.02224681  0.99415201  0.99306136]]
180 2.54985e-05 [[ 0.01197078  0.99685317  0.99626637]]
200 7.38272e-06 [[ 0.00644138  0.99830663  0.99799091]]
220 2.13757e-06 [[ 0.00346609  0.99908894  0.99891895]]
240 6.19016e-07 [[ 0.00186507  0.99950975  0.99941832]]
260 1.79199e-07 [[ 0.00100356  0.99973613  0.99968696]]
280 5.1887e-08 [[  5.40030131e-04   9.99858141e-01   9.99831557e-01]]
300 1.50363e-08 [[  2.90553726e-04   9.99923646e-01   9.99909401e-01]]
320 4.35173e-09 [[  1.56376540e-04   9.99958932e-01   9.99951243e-01]]
340 1.26054e-09 [[  8.4149