# TensorFlow applications
In this notebook we explore how to:
- Run a TensorFlow session
- Manage multiple graphs
- Evaluate nodes
- Run models
- Use gradient descent
- Calculate gradients
    - Manually and
    - With TensorFlow's autodiff method
- Optimize gradient descent

The aim is to be able to start projects in TensorFlow on our own when required.

## Ways to run TensorFlow sessions

In [1]:
# Note: this notebook uses the TensorFlow 1.x graph/session API (tf.Session etc.)
import tensorflow as tf

Creating a computation graph

In [2]:
x = tf.Variable(3, name='x')
y = tf.Variable(4, name='y')

In [3]:
f = x*x*y + y + 2

To evaluate the graph, we need to open a session, initialize the variables, and run `f`.

In [4]:
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)

result = sess.run(f)

print(result)

sess.close()

42


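Instead of calling `sess.run()` for every operation, we can use a `with` block: inside it the session is the default session, and it is closed automatically at the end of the block.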

In [5]:
with tf.Session() as sess:
    x.initializer.run() # = tf.get_default_session().run(x.initializer)
    y.initializer.run()
    result = f.eval() # = tf.get_default_session().run(f)

Nor do we have to run the initializer of every variable individually.

In [6]:
init = tf.global_variables_initializer() # creates a node which does the initialization

with tf.Session() as sess:
    init.run() # this is where initialization happens
    result = f.eval()

Finally, in Jupyter we can use `InteractiveSession`, which sets itself as the default session. It does not require a `with` block, but it still needs to be closed at the end.

In [7]:
sess = tf.InteractiveSession()

init.run()
result = f.eval()
print(result)

sess.close()

42


## Graph Management

New nodes are added to the default graph automatically.

In [8]:
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()

True

If we want to use multiple graphs, we need to create a new one and temporarily make it the default inside a `with` block.

In [9]:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)

In [10]:
x2.graph is graph

True

In [11]:
x2.graph is tf.get_default_graph()

False
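
The behaviour above can be pictured as a stack of default graphs. The following is only a toy model in plain Python, not TensorFlow's implementation: `ToyGraph`, `as_default`, `variable` and `_default_stack` are hypothetical names made up for this sketch.

```python
from contextlib import contextmanager


class ToyGraph:
    """Hypothetical stand-in for a computation graph: just a list of node names."""

    def __init__(self):
        self.nodes = []


# Stack of default graphs; the bottom entry plays the role of the global default.
_default_stack = [ToyGraph()]


def get_default_graph():
    return _default_stack[-1]


@contextmanager
def as_default(graph):
    # Make `graph` the default only for the duration of the with block.
    _default_stack.append(graph)
    try:
        yield graph
    finally:
        _default_stack.pop()


def variable(name):
    """Register a node in whichever graph is currently the default and return that graph."""
    graph = get_default_graph()
    graph.nodes.append(name)
    return graph


g1 = variable("x1")              # lands in the global default graph
new_graph = ToyGraph()
with as_default(new_graph):
    g2 = variable("x2")          # lands in new_graph

print(g1 is get_default_graph())  # True
print(g2 is new_graph)            # True
print(g2 is get_default_graph())  # False: the old default is restored after the block
```

The point of the sketch is that the default is restored when the `with` block exits, which is why `x2.graph is tf.get_default_graph()` returns `False` outside the block.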

## Node evaluation

When we evaluate a node, TF automatically determines which nodes it depends on and evaluates those first.

In [12]:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

In [13]:
sess = tf.InteractiveSession()

print(y.eval())
print(z.eval())

sess.close()

10
15


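- Although `y` and `z` both depend on `w` and `x`, TF does not reuse results between the two `eval()` calls: `w` and `x` are calculated each time.
- Node values are dropped between graph runs, while variable values are kept from initialization until the session is closed.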

Efficient evaluation: fetching `y` and `z` in a single graph run means `w` and `x` are not evaluated more than once.

In [14]:
sess = tf.InteractiveSession()

y_val, z_val = sess.run([y, z]) # Evaluating both in one run
print(y_val)
print(z_val)

sess.close()

10
15
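
To make the difference between the two evaluation styles concrete, here is a toy model in plain Python. It is a sketch of the idea only, not how TensorFlow is implemented: the `Node` class, the `run` function and the `*_node` names are hypothetical. Each call to `run` uses a fresh cache, which mimics node values being dropped between graph runs.

```python
class Node:
    """Hypothetical toy node: a function of other nodes, with an evaluation counter."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self.n_evals = 0

    def compute(self, cache):
        # Within one run, each node is computed at most once.
        if self not in cache:
            args = [node.compute(cache) for node in self.inputs]
            self.n_evals += 1
            cache[self] = self.fn(*args)
        return cache[self]


def run(fetches):
    """Evaluate all fetches in one pass; the cache is discarded afterwards."""
    cache = {}
    return [node.compute(cache) for node in fetches]


w_node = Node(lambda: 3)
x_node = Node(lambda a: a + 2, w_node)
y_node = Node(lambda a: a + 5, x_node)
z_node = Node(lambda a: a * 3, x_node)

# Two separate runs: x (and w) are computed twice.
y_sep = run([y_node])[0]
z_sep = run([z_node])[0]
separate_evals = x_node.n_evals

# One combined run: x (and w) are computed only once.
x_node.n_evals = 0
y_joint, z_joint = run([y_node, z_node])
combined_evals = x_node.n_evals

print(y_sep, z_sep, separate_evals)      # 10 15 2
print(y_joint, z_joint, combined_evals)  # 10 15 1
```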


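- Single-process TF: sessions do not share state, even if they reuse the same graph; each session has its own copy of every variable.
- Distributed TF: variable state is stored on the servers rather than in the sessions, so multiple sessions can share the same variables.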

## Linear regression

In [15]:
import numpy as np

In [16]:
# Fetching from datasets
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

Computing the Normal Equation with TF instead of NumPy lets the computation run on the GPU automatically, provided one is available and TensorFlow was installed with GPU support.

In [17]:
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")

XT = tf.transpose(X)
# Normal Equation: theta = (X^T X)^(-1) X^T y
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

In [18]:
with tf.Session() as sess:
    theta_value = theta.eval()
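
As a cross-check of the formula used above, the same Normal Equation can be sketched in NumPy. This sketch does not use the housing data: the synthetic data set, `true_theta` and all `*_demo` names are assumptions made for illustration. It compares the explicit inverse with NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression problem (illustrative only, not the housing data).
m_demo, n_demo = 200, 3
features = rng.normal(size=(m_demo, n_demo))
X_demo = np.c_[np.ones((m_demo, 1)), features]            # add the bias column
true_theta = np.array([[4.0], [1.5], [-2.0], [0.5]])
y_demo = X_demo @ true_theta + 0.01 * rng.normal(size=(m_demo, 1))

# Normal Equation: theta = (X^T X)^(-1) X^T y
theta_inv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo

# Least-squares solver, which avoids forming the inverse explicitly.
theta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print(theta_inv.ravel())
print(theta_lstsq.ravel())
```

Both estimates should agree closely with each other and with `true_theta`; the solver-based variant is generally the more numerically robust choice when `X^T X` is close to singular.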

## Gradient descent

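Here we use Batch Gradient Descent instead of the Normal Equation.
First we compute the gradients 'manually'.

We scale the features before running gradient descent, which helps it converge faster. Note that the cell below also scales the bias column: as far as we can tell, `StandardScaler` centers this constant column of ones to all zeros, so the model effectively trains without an intercept, which is the likely reason the MSE values further down level off near 4.8. Scaling `housing.data` first and adding the column of ones afterwards would avoid this.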

In [19]:
from sklearn.preprocessing import StandardScaler

In [20]:
scaler = StandardScaler()
scaled_dat = scaler.fit_transform(housing_data_plus_bias)
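
The order of scaling and adding the bias column matters, and the following NumPy sketch illustrates why. It uses made-up data and a hand-written `standardize` helper (a hypothetical function that, as an assumption, treats zero-variance columns by leaving their scale at 1) rather than `StandardScaler` itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small made-up feature matrix (illustrative only).
raw = rng.normal(loc=5.0, scale=2.0, size=(6, 2))
with_bias = np.c_[np.ones((6, 1)), raw]


def standardize(a):
    """Subtract the column mean and divide by the column standard deviation."""
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    std[std == 0.0] = 1.0  # avoid dividing by zero for constant columns
    return (a - mean) / std


# Scaling after adding the bias: the column of ones is centered to zeros.
scaled_all = standardize(with_bias)

# Scaling first, then adding the bias: the column of ones survives.
scaled_then_bias = np.c_[np.ones((6, 1)), standardize(raw)]

print(scaled_all[:, 0])
print(scaled_then_bias[:, 0])
```

In the second variant the remaining feature columns are identical, and only the bias column differs, so the model keeps its intercept term.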

### Setting the parameters

In [21]:
n_epochs = 1000
learning_rate = 0.01

In [22]:
X = tf.constant(scaled_dat, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')

### Calculating the theta and predictions
`random_uniform()` creates a node which generates a tensor of random values, here drawn uniformly from the range -1.0 to 1.0.

In [23]:
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')

Mean squared error

In [24]:
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')

### Calculating gradient steps

`assign()` creates a node which assigns a new value to a variable

In [25]:
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)
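
The gradient expression `2/m * X^T (X theta - y)` used above can be sanity-checked against a finite-difference approximation. The sketch below does this in NumPy on random data; the `*_chk` names, `mse_fn` and the step size `eps` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Random problem of small size (illustrative only).
m_chk, n_chk = 50, 3
X_chk = np.c_[np.ones((m_chk, 1)), rng.normal(size=(m_chk, n_chk))]
y_chk = rng.normal(size=(m_chk, 1))
theta_chk = rng.uniform(-1.0, 1.0, size=(n_chk + 1, 1))


def mse_fn(theta):
    err = X_chk @ theta - y_chk
    return float(np.mean(err ** 2))


# Analytic gradient of the MSE, same formula as in the cell above.
analytic = 2 / m_chk * X_chk.T @ (X_chk @ theta_chk - y_chk)

# Central finite differences, one parameter at a time.
eps = 1e-6
numeric = np.zeros_like(theta_chk)
for j in range(theta_chk.shape[0]):
    step = np.zeros_like(theta_chk)
    step[j, 0] = eps
    numeric[j, 0] = (mse_fn(theta_chk + step) - mse_fn(theta_chk - step)) / (2 * eps)

max_diff = float(np.max(np.abs(analytic - numeric)))
print(max_diff)
```

A very small `max_diff` indicates that the hand-derived formula matches the numerical estimate.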

### Running the model

In [26]:
init = tf.global_variables_initializer()

In [27]:
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, 'MSE = ', mse.eval())
        sess.run(training_op)
    
    best_theta = theta.eval()

Epoch 0 MSE =  8.592235
Epoch 100 MSE =  5.1549826
Epoch 200 MSE =  5.0322986
Epoch 300 MSE =  4.9699297
Epoch 400 MSE =  4.925574
Epoch 500 MSE =  4.893282
Epoch 600 MSE =  4.8696885
Epoch 700 MSE =  4.8524218
Epoch 800 MSE =  4.839764
Epoch 900 MSE =  4.8304553
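
For comparison, the same training loop can be written without a graph or session. This is a minimal NumPy sketch on synthetic data, not the housing data: the `*_gd` names and `true_theta_gd` are assumptions, while the learning rate and number of epochs are the ones used above.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic, already standardized features plus a bias column (illustrative only).
m_gd, n_gd = 200, 3
X_gd = np.c_[np.ones((m_gd, 1)), rng.normal(size=(m_gd, n_gd))]
true_theta_gd = np.array([[2.0], [1.0], [-1.0], [0.5]])
y_gd = X_gd @ true_theta_gd + 0.1 * rng.normal(size=(m_gd, 1))

n_epochs_gd = 1000
learning_rate_gd = 0.01

theta_gd = rng.uniform(-1.0, 1.0, size=(n_gd + 1, 1))
for epoch in range(n_epochs_gd):
    error_gd = X_gd @ theta_gd - y_gd
    if epoch % 100 == 0:
        print("Epoch", epoch, "MSE =", float(np.mean(error_gd ** 2)))
    gradients_gd = 2 / m_gd * X_gd.T @ error_gd               # batch gradient of the MSE
    theta_gd = theta_gd - learning_rate_gd * gradients_gd     # gradient descent step

# Reference solution from a least-squares solver.
theta_ref, *_ = np.linalg.lstsq(X_gd, y_gd, rcond=None)
```

On this well-conditioned toy problem, `theta_gd` should end up very close to `theta_ref`.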


## Using autodiff
Deriving the gradients by hand, as in the code above, would be tedious and error-prone for deep neural networks. Instead, we can use TF's autodiff, which calculates the gradients automatically and efficiently.

In [28]:
gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * gradients)

In [29]:
init = tf.global_variables_initializer()

In [30]:
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, 'MSE = ', mse.eval())
        sess.run(training_op)
    
    best_theta = theta.eval()

Epoch 0 MSE =  8.610222
Epoch 100 MSE =  4.914713
Epoch 200 MSE =  4.8465962
Epoch 300 MSE =  4.834479
Epoch 400 MSE =  4.826292
Epoch 500 MSE =  4.8203273
Epoch 600 MSE =  4.8159437
Epoch 700 MSE =  4.8127193
Epoch 800 MSE =  4.8103423
Epoch 900 MSE =  4.808587


## Automatic gradient calculation methods
- Numerical differentiation
- Symbolic differentiation
- Forward-mode autodiff
- Reverse-mode autodiff (this is the one TF uses)
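
Two of the methods listed above can be compared on the function from the beginning of the notebook, `f = x*x*y + y + 2` at `x = 3`, `y = 4`. The sketch below is a toy reverse-mode implementation in plain Python next to a numerical estimate. It is not TensorFlow's implementation: the `Var` class and the `backward` function are hypothetical and only support `+` and `*`.

```python
class Var:
    """Hypothetical scalar node that records how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuples of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse pass: propagate d(output)/d(node) from the output to the inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order: every node is handled before any of its parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


def f(x, y):
    return x * x * y + y + 2


# Reverse-mode autodiff: one forward pass, one backward pass, all gradients.
x_v, y_v = Var(3.0), Var(4.0)
out = f(x_v, y_v)
backward(out)
print(out.value, x_v.grad, y_v.grad)   # 42.0 24.0 10.0

# Numerical differentiation: one pair of function evaluations per input.
eps = 1e-6
num_dx = (f(3.0 + eps, 4.0) - f(3.0 - eps, 4.0)) / (2 * eps)
num_dy = (f(3.0, 4.0 + eps) - f(3.0, 4.0 - eps)) / (2 * eps)
print(num_dx, num_dy)
```

The reverse pass yields the exact partial derivatives `2xy = 24` and `x^2 + 1 = 10` in a single sweep, whereas the numerical estimate is approximate and needs extra function evaluations for every input.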

## Automatic Optimization
Besides the automatic gradient calculation, TF also provides a range of ready-made optimizers.

In [31]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

In [32]:
init = tf.global_variables_initializer()

In [33]:
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, 'MSE = ', mse.eval())
        sess.run(training_op)
    
    best_theta = theta.eval()

Epoch 0 MSE =  11.116577
Epoch 100 MSE =  4.8948646
Epoch 200 MSE =  4.840685
Epoch 300 MSE =  4.831991
Epoch 400 MSE =  4.82605
Epoch 500 MSE =  4.8214417
Epoch 600 MSE =  4.8178387
Epoch 700 MSE =  4.8149915
Epoch 800 MSE =  4.8127313
Epoch 900 MSE =  4.810933
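
The benefit of an optimizer object is that the update rule can be swapped without touching the training loop. The NumPy sketch below illustrates this idea with plain gradient descent and a momentum variant on a toy quadratic loss. The classes `GradientDescentSketch` and `MomentumSketch`, the `step` method and the toy loss are hypothetical and are not TensorFlow's API.

```python
import numpy as np


class GradientDescentSketch:
    """Hypothetical plain gradient descent: theta <- theta - lr * grad."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def step(self, theta, grad):
        return theta - self.learning_rate * grad


class MomentumSketch:
    """Hypothetical momentum variant: keeps a running velocity of past gradients."""

    def __init__(self, learning_rate, momentum=0.9):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.velocity = None

    def step(self, theta, grad):
        if self.velocity is None:
            self.velocity = np.zeros_like(theta)
        self.velocity = self.momentum * self.velocity - self.learning_rate * grad
        return theta + self.velocity


target = np.array([[1.0], [-2.0], [0.5]])


def loss_grad(theta):
    # Gradient of the toy loss ||theta - target||^2
    return 2.0 * (theta - target)


def train(optimizer, n_steps=500):
    # The loop is identical for every optimizer; only the update rule changes.
    theta = np.zeros_like(target)
    for _ in range(n_steps):
        theta = optimizer.step(theta, loss_grad(theta))
    return theta


theta_plain = train(GradientDescentSketch(learning_rate=0.1))
theta_momentum = train(MomentumSketch(learning_rate=0.1))
print(theta_plain.ravel())
print(theta_momentum.ravel())
```

Both runs should end up at `target`; in the TF code above, the analogous swap would be replacing `GradientDescentOptimizer` with another optimizer class while keeping `minimize(mse)` and the session loop unchanged.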


## Conclusion
We looked through the most important functions in TensorFlow required to start a project in it.