# Linear Regression
- Linear regression is one of the simplest models for a supervised learning problem.
- Given a set of data points as training data, we find the linear function that best fits them.

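- The general formula for a linear function is:
$$y(x_1, x_2, x_3, \dots, x_k) = w_1x_1+w_2x_2+w_3x_3+\dots+w_kx_k+b$$
- Its matrix (or tensor) form:
$$Y=XW^T + b$$ where $X=(x_1,\dots,x_k)$ and $W=(w_1,\dots,w_k)$. In the code below, `W` is stored as a column of shape `[k, 1]`, so the product is written `tf.matmul(X, W)`.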
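
To make the two forms concrete, here is a minimal pure-Python sketch that evaluates the scalar formula for one sample and then applies it row by row, which is what the matrix form does. The weights, bias, and helper names (`predict`, `predict_batch`) are hypothetical and chosen only for illustration; they are not part of the TensorFlow code in this notebook.

```python
# Hypothetical weights and bias, chosen only for illustration.
w = [2.0, 0.5]
b = 1.0

def predict(x, w, b):
    """Scalar form: y = w_1*x_1 + ... + w_k*x_k + b for one sample x."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b

def predict_batch(X, w, b):
    """Matrix form: one prediction per row of X (the role of X W^T + b)."""
    return [predict(row, w, b) for row in X]

X = [[84.0, 46.0], [73.0, 20.0]]
print(predict(X[0], w, b))     # 2*84 + 0.5*46 + 1 = 192.0
print(predict_batch(X, w, b))  # [192.0, 157.0]
```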

In [1]:
import tensorflow as tf

In [2]:
# Initialize variables/model parameters
W = tf.Variable(tf.zeros([2, 1]), name="weights")
b = tf.Variable(0., name="bias")

In [3]:
def inference(X):
    # X: [n, 2] inputs; W: [2, 1] weights; result: [n, 1] predictions
    return tf.matmul(X, W) + b

- Time to define how to compute the loss. For this simple model we use the squared error: the sum, over all training examples, of the squared difference between each expected value and the corresponding predicted value. It is also known as the **L2 loss function** (the squared **L2 norm** of the error vector).

$$\text{loss} = \sum_i \left(y_i - y_{\text{predicted},i}\right)^2$$

In [4]:
def loss(X,Y):
    Y_predicted = inference(X)
    return tf.reduce_sum(tf.squared_difference(Y, Y_predicted))
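
One thing worth checking in the cell above is the shapes. `inference(X)` returns a tensor of shape `[n, 1]`, while the targets built by `inputs()` in this notebook are a flat vector of shape `[n]`. Under the usual broadcasting rules, subtracting the two yields an `[n, n]` matrix, so the sum runs over every (prediction, target) combination instead of matched pairs. The pure-Python sketch below mimics both sums with made-up numbers; it illustrates the broadcasting effect and is not TensorFlow code.

```python
# Made-up targets and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

# Intended loss: pair each target with its own prediction.
matched = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))

# What an [n] versus [n, 1] broadcast computes:
# every target is compared against every prediction.
broadcast = sum((y - p) ** 2 for p in y_pred for y in y_true)

print(matched)    # 1 + 0 + 4 = 5.0
print(broadcast)  # 35 + 8 + 56 = 99.0
```

If matched pairs are what is intended, one option is to make the shapes agree before subtracting, for example with `tf.squeeze(Y_predicted, axis=1)`. The loss values recorded at the end of this notebook appear to come from the code as written.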

# Dataset
We are going to work with a dataset that relates age in years and weight in kilograms to blood fat content (http://people.sc.fsu.edu/~jburkardt/datasets/regression/x09.txt). Each input row in the code below is `[weight, age]`.

In [5]:
def inputs():
    weight_age = [[84, 46], [73, 20], [65, 52], 
                  [70, 30], [76, 57], [69, 25],
                  [63, 28], [72, 36], [79, 57],
                  [75, 44]]
    
    blood_fat_content = [354, 190, 405, 263, 451, 302,
                         288, 385, 402, 365]
    
    return tf.to_float(weight_age), tf.to_float(blood_fat_content)
    

In [6]:
def train(total_loss):
    learning_rate = 0.000001
    return tf.train.GradientDescentOptimizer(learning_rate)\
                   .minimize(total_loss)
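
`GradientDescentOptimizer.minimize` derives the gradients automatically. As a rough picture of what a single update does for the summed squared error, here is a hand-written pure-Python sketch. The data, the learning rate, and the helper names (`predict`, `squared_error`, `gradient_step`) are hypothetical, and this is not how TensorFlow implements the optimizer internally.

```python
def predict(x, w, b):
    """Linear prediction for one sample."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b

def squared_error(X, Y, w, b):
    """Summed squared error over matched (sample, target) pairs."""
    return sum((y - predict(x, w, b)) ** 2 for x, y in zip(X, Y))

def gradient_step(X, Y, w, b, learning_rate):
    """One plain gradient-descent update on the summed squared error."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in zip(X, Y):
        residual = y - predict(x, w, b)
        # d/dw_j (y - y_pred)^2 = -2 * residual * x_j
        for j, x_j in enumerate(x):
            grad_w[j] += -2.0 * residual * x_j
        # d/db (y - y_pred)^2 = -2 * residual
        grad_b += -2.0 * residual
    new_w = [w_j - learning_rate * g for w_j, g in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b

# Made-up data, for illustration only.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
Y = [5.0, 4.0, 9.0]
w, b = [0.0, 0.0], 0.0

before = squared_error(X, Y, w, b)
w, b = gradient_step(X, Y, w, b, learning_rate=0.01)
after = squared_error(X, Y, w, b)
print(before, after)  # the loss should drop after the step
```

The learning rate matters: with inputs as large as the ones in this dataset, a step that is too big makes the loss grow instead of shrink, which is presumably why the notebook uses such a small value.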

In [8]:
def evaluate(sess, X, Y):
    # Predict blood fat content for two unseen [weight, age] inputs
    print(sess.run(inference([[71., 35.]]))) # ~ 263
    print(sess.run(inference([[70., 23.]]))) # ~ 302

In [13]:
# Launch the graph in a session
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    
    X, Y = inputs()
    total_loss = loss(X, Y)
    train_op = train(total_loss)
    
    initial_step = 0
    training_steps = 100
    for step in range(initial_step, training_steps):
        sess.run([train_op])
        if step % 10 == 0:
            print("loss: ", sess.run([total_loss]))
            
    evaluate(sess, X, Y)
    # No explicit sess.close() needed: the `with` block closes the session on exit.

loss:  [2413308.5]
loss:  [686254.88]
loss:  [667614.25]
loss:  [655491.19]
loss:  [647606.69]
loss:  [642478.69]
loss:  [639143.06]
loss:  [636973.25]
loss:  [635561.38]
loss:  [634642.69]
[[ 331.31930542]]
[[ 329.05212402]]
