# Getting started with TensorFlow - II

_Notes compiled from Chapter 9 of Hands-On Machine Learning with Scikit-Learn and TensorFlow_

## Implementing Gradient Descent - Manual Computation vs Autodiff

We will use the California Housing Dataset as an example in this notebook.

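```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x graph API (tf.Session, tf.reset_default_graph)
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()
m, n = housing.data.shape  # m instances, n features

# Prepend a column of ones (x0 = 1) intended for the bias term
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

# Standardize the inputs for gradient descent.
# Note: the bias column is scaled as well, and a constant column standardizes to all zeros.
scaler = StandardScaler()
scaled_housing_data_plus_bias = scaler.fit_transform(housing_data_plus_bias)

n_epoochs = 1500
learning_rate = 0.01
```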
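
One detail of the preprocessing above is worth checking: standardizing a column subtracts its mean, so a constant column of ones turns into zeros and can no longer act as an intercept. The sketch below illustrates this with plain NumPy on made-up data. The `standardize` helper is hypothetical; it only mimics how `StandardScaler` is assumed to treat a zero-variance column (the scale stays at 1) and is not scikit-learn's code. It also shows the alternative ordering: scale the features first, then prepend the bias column.

```python
import numpy as np

def standardize(a):
    """Center each column and divide by its standard deviation.

    Hypothetical helper: zero-variance columns keep a scale of 1,
    which is how StandardScaler is assumed to behave here.
    """
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    std[std == 0.0] = 1.0
    return (a - mean) / std

rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=2.0, size=(6, 2))  # made-up data
ones = np.ones((features.shape[0], 1))

# Order used above: add the bias column, then scale everything.
bias_then_scale = standardize(np.c_[ones, features])

# Alternative order: scale the features, then add the bias column.
scale_then_bias = np.c_[ones, standardize(features)]

print(bias_then_scale[:, 0])  # bias column after scaling: all zeros
print(scale_then_bias[:, 0])  # bias column left untouched: all ones
```

With the first ordering the model has no effective intercept, which may explain a training loss that plateaus well above the variance of the target.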

*  **Manually Computing the Gradients:**

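For linear regression, the mean squared error is $\mathrm{MSE}(X, h_\theta) = \frac{1}{m} \sum_{i=1}^{m} (\theta^{T} x^{(i)} - y^{(i)})^{2}$, where $x^{(i)}$ is the feature vector of the $i$-th instance (the $i$-th row of $X$) and $y^{(i)}$ is its target.

The gradient vector of the cost function is $\nabla_{\theta}\, \mathrm{MSE}(\theta) = \frac{2}{m} X^{T} (X\theta - y)$.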
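
A quick way to gain confidence in the closed-form gradient is to compare it with a finite-difference approximation. The following is a minimal NumPy sketch on synthetic data (not the housing data); the function names `mse`, `mse_gradient`, and `numerical_gradient` are illustrative choices, not part of any library.

```python
import numpy as np

def mse(X, y, theta):
    """Mean squared error of the linear model X @ theta."""
    residual = X @ theta - y
    return float(np.mean(residual ** 2))

def mse_gradient(X, y, theta):
    """Closed-form gradient: (2/m) * X^T (X theta - y)."""
    m = X.shape[0]
    return (2.0 / m) * X.T @ (X @ theta - y)

def numerical_gradient(X, y, theta, eps=1e-6):
    """Central finite differences, one coordinate of theta at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[j, 0] = eps
        grad[j, 0] = (mse(X, y, theta + step) - mse(X, y, theta - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(42)
X = np.c_[np.ones((50, 1)), rng.normal(size=(50, 3))]  # synthetic data with a bias column
y = rng.normal(size=(50, 1))
theta = rng.uniform(-1.0, 1.0, size=(4, 1))

analytic = mse_gradient(X, y, theta)
numeric = numerical_gradient(X, y, theta)
print(np.max(np.abs(analytic - numeric)))  # should be very small
```

Because the MSE is quadratic in theta, the central difference agrees with the analytic gradient up to floating-point rounding.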

```python
tf.reset_default_graph()

# Creating the computational graph

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

# Parameters start at random values in [-1, 1)
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# Gradient of the MSE with respect to theta: (2/m) * X^T (X theta - y)
manual_gradients = (2.0 / m) * tf.matmul(tf.transpose(X), error)
# One batch gradient descent step
training_op = tf.assign(theta, theta - learning_rate * manual_gradients)

init = tf.global_variables_initializer()

# Executing the operations in the graph

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epoochs):
        if epoch % 100 == 0:
            print("Epoch: ", epoch, "  MSE= ", mse.eval())
        sess.run(training_op)

    opt_theta = theta.eval()
    # print(opt_theta)
```

```
Epoch:  0   MSE=  11.195537
Epoch:  100   MSE=  4.973089
Epoch:  200   MSE=  4.867675
Epoch:  300   MSE=  4.8479214
Epoch:  400   MSE=  4.8356853
Epoch:  500   MSE=  4.82692
Epoch:  600   MSE=  4.8205585
Epoch:  700   MSE=  4.815932
Epoch:  800   MSE=  4.8125615
Epoch:  900   MSE=  4.8101034
Epoch:  1000   MSE=  4.8083076
Epoch:  1100   MSE=  4.8069935
Epoch:  1200   MSE=  4.80603
Epoch:  1300   MSE=  4.8053217
Epoch:  1400   MSE=  4.8047996
```


*  **Using Autodiff for Computing the Gradients:**

TensorFlow uses reverse-mode autodiff, which needs only $n_{outputs} + 1$ graph traversals to compute all the gradients: one forward pass to evaluate the graph, then one reverse pass per output. With a single scalar output such as the MSE, that is two traversals regardless of the number of parameters.

The _gradients()_ function takes an op (e.g., mse) and a list of variables (e.g., theta), and creates a list of ops (one per variable) that compute the gradients of the op with respect to each variable. In the snippet below, the autodiff_gradients node computes the gradient vector of the MSE with respect to theta.
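
To make the idea of a single reverse traversal concrete, here is a toy reverse-mode autodiff for scalar values in plain Python. It is a sketch for illustration only, not how TensorFlow implements `tf.gradients`; the `Node` class and `backward` function are hypothetical names, and only addition, subtraction, and multiplication are supported.

```python
class Node:
    """A scalar value in a computation graph that records its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Node(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) for every node in one reverse traversal."""
    order, seen = [], set()

    def visit(node):
        # Post-order DFS: inputs are appended before the nodes that use them.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed order: each node is processed after all of its consumers.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Squared error of a one-feature linear model: (w * x + b - y)^2
w, b = Node(0.5), Node(-1.0)
x, y = Node(3.0), Node(2.0)
err = w * x + b - y
loss = err * err
backward(loss)
print(w.grad, b.grad)  # -9.0 -3.0
```

The forward pass happens while the expression is built, and one call to `backward` yields the gradients for both `w` and `b`, matching the hand-derived values $2(wx + b - y)x = -9$ and $2(wx + b - y) = -3$.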

```python
tf.reset_default_graph()

# Creating the computational graph

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# tf.gradients returns one gradient op per variable; take the one for theta
autodiff_gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * autodiff_gradients)

init = tf.global_variables_initializer()

# Executing the operations in the graph

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epoochs):
        if epoch % 100 == 0:
            print("Epoch: ", epoch, "  MSE= ", mse.eval())
        sess.run(training_op)

    opt_theta = theta.eval()
    # print(opt_theta)
```

```
Epoch:  0   MSE=  10.54808
Epoch:  100   MSE=  4.9338737
Epoch:  200   MSE=  4.871462
Epoch:  300   MSE=  4.85152
Epoch:  400   MSE=  4.8384757
Epoch:  500   MSE=  4.8290787
Epoch:  600   MSE=  4.822237
Epoch:  700   MSE=  4.8172426
Epoch:  800   MSE=  4.8135915
Epoch:  900   MSE=  4.8109164
Epoch:  1000   MSE=  4.808952
Epoch:  1100   MSE=  4.807507
Epoch:  1200   MSE=  4.806441
Epoch:  1300   MSE=  4.805652
Epoch:  1400   MSE=  4.805066
```


## Visualizing the graph using TensorBoard

```python
from datetime import datetime

# Use a timestamped log directory so that each run is kept separate
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

tf.reset_default_graph()

# Creating the computational graph

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
autodiff_gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * autodiff_gradients)

init = tf.global_variables_initializer()

# Write the graph definition to the log directory for TensorBoard
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

# Executing the operations in the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epoochs):
        if epoch % 100 == 0:
            print("Epoch: ", epoch, "  MSE= ", mse.eval())
        sess.run(training_op)

    opt_theta = theta.eval()

file_writer.close()
```

```
Epoch:  0   MSE=  8.291836
Epoch:  100   MSE=  5.0304093
Epoch:  200   MSE=  4.9413304
Epoch:  300   MSE=  4.904329
Epoch:  400   MSE=  4.877926
Epoch:  500   MSE=  4.8586082
Epoch:  600   MSE=  4.844426
Epoch:  700   MSE=  4.833989
Epoch:  800   MSE=  4.8262877
Epoch:  900   MSE=  4.8205886
Epoch:  1000   MSE=  4.8163576
Epoch:  1100   MSE=  4.8132057
Epoch:  1200   MSE=  4.8108487
Epoch:  1300   MSE=  4.809079
Epoch:  1400   MSE=  4.807745
```


![TensorBoard visualization of the computational graph](TensorboardGraphVis.png "TensorBoard graph visualization")
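
The timestamped `logdir` used above gives every run its own directory, so TensorBoard can show runs separately instead of merging their event files. A minimal sketch of that naming scheme as a reusable function follows; `make_logdir` is a hypothetical helper, and the fixed timestamps are only there to make the output reproducible.

```python
from datetime import datetime, timezone

def make_logdir(root_logdir="tf_logs", now=None):
    """Return a run-specific log directory such as tf_logs/run-20240101120000/."""
    if now is None:
        now = datetime.now(timezone.utc)  # current UTC time for real runs
    stamp = now.strftime("%Y%m%d%H%M%S")
    return "{}/run-{}/".format(root_logdir, stamp)

# Two runs started at different times get different directories.
first = make_logdir(now=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc))
second = make_logdir(now=datetime(2024, 1, 1, 12, 5, 30, tzinfo=timezone.utc))
print(first)   # tf_logs/run-20240101120000/
print(second)  # tf_logs/run-20240101120530/
```

TensorBoard can then be pointed at the root directory, for example with `tensorboard --logdir tf_logs`, to browse the graphs written by each run.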