# Chapter 9
## Up and Running with Tensorflow
Refer to the book for other libraries and why tensorflow is sooo cooool! Anywho, hello tensorflow!

In [1]:
import tensorflow as tf

x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

And thats it! Do note, **this does not perform any computation, even though it looks like it does**, it just creates a computation graph. Even the variables are not initialized yet! To evaluate this, you need to start a tf session and use it to initialize the varables and evaulate f.

In [2]:
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval

While this works, it's really tedious... Here's a simpler version of this!!!

In [3]:
init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    result = f.eval()

Note that a tensorflow program is split into two parts: the first part builds the computation graph (called the _construction phase_), and the second part runs it (called the _execution phase_).

## Managing Graphs
Anything made gets added to the default graph,but you can manage multiple independant graphs too!

In [4]:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
    
print(x2.graph is graph)
print(x2.graph is tf.get_default_graph())

True
False


Something very important to note is that all node values are dropped between graph runs, except variable values. Here is an efficient example:

In [6]:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val)
    print(z_val)

10
15


## Linear Regression with TensorFlow
Heres an example

In [9]:
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()

## Implementing Gradient Descent
We'll try and implement Batch Gradient Descent! First we will do this manually, then use autodiff, then use TensorFlow's out-of-the-box optimizers!

**NOTE:** When performing Gradient Descent, be sure to normalize the input, otherwise training will take much longer!!!

### Manually Computing the Gradients
Here's a breif run down:
- `random_uniform()` creates a node in the graph that will generate a tensor containing random values.
- `assign()` creaes a node that will assign a new value to a variable
- The main loop executes training repeatedly and every 100 iters it prints out the MSE.

In [11]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaled_housing_data_plus_bias = scaler.fit_transform(housing_data_plus_bias)

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE = ", mse.eval())
        sess.run(training_op)
        
    best_theta = theta.eval()

Epoch 0 MSE =  5.84069
Epoch 100 MSE =  4.9645343
Epoch 200 MSE =  4.9123673
Epoch 300 MSE =  4.8822002
Epoch 400 MSE =  4.8605113
Epoch 500 MSE =  4.8448167
Epoch 600 MSE =  4.833452
Epoch 700 MSE =  4.8252196
Epoch 800 MSE =  4.81925
Epoch 900 MSE =  4.814919


### Using autodiff
While it's fairly easy to do the following, it becomes a real hassle to do that every single time, plus this leads to more error prone code! We could use _symbolic differnetiation_ to automatically find the equations for the partial derivatives, but the resulting code would not be efficient.

Thankfully TensorFlow can automatically and efficiently compute the gradients for us. Just replace the `gradients` code above with this:
```python
gradients = tf.gradients(mse, [theta])[0]
```

### Using an Optimizer
It gets even easier, just swap out `gradients` and `training_op` with the following:
```python
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
```
If you want to use another optimizer, you just need to change one line! For example, we can use a momentum optimizer(See Chapter 11)
```python
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate,
                                       momentum=0.9)
```

## Feeding Data to the Training Algorithm
Now we're going to implement Mini-batch Gradient Descent. To do this X and y will utilize special nodes (placeholders). To create one just use `placeholder()` and specify the type and optinally the shape.

To implement Mini-batch Gradient Descent we can tweak the existing code sligtly like so.

In [13]:
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices] # not shown
    y_batch = housing.target.reshape(-1, 1)[indices] # not shown
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()

## Saving and Restoring Models
Once you trained your model, you should save it so you can come back to it whenever you want! Moreover, save friggin checkpoints when training! Saving tensorflow models is super easy, just create a `Saver` node at the end of the construction phase and call its `save()` method to save the model.

```python
[...]
init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            save_path = saver.save(sess, "/tmp/my_model.ckpt")
        
        sess.run(training_op)
        
    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")
```

Restoring the model is easy, make a `Saver` at the end of the constuction phase just like before, and at the beginning of the execution phase, do the following `restore()` frome the `Saver` object:

```python
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    [...]
```
By default `Saver` saves and restores all variables under their own name, but if more control is needed, you can specify which variables to save or restore, and what names to use! Example here
```python
saver = tf.train.Saver({"weights": theta})
```

## Visualizing the Graph and Training Curves Using TensorBoard

After doin all of that magics, you're going to want to start properly logging your data! To do so, take note of the data using classic date and time stuffs as follows:

```python
from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)
```

After constructing your model, but before running it, add the following lines:

```python
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
```

Now you need to remeber to log data back every once and a while like so:
```python
[...]
for batch_index in range(n_batches):
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    if batch_index % 10 == 0:
        summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
        step = epoch * n_batches + batch_index
        file_writer.add_summary(summary_str, step)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
[...]
```

And at the end of the program, don't forget to call `file_writer.close()`!

Start tensorboard by using the following:
```bash
$ tenorboard --logdir tf_logs/
```
If you want to look at it in Jupyter, call the `show_graph()` function!

### Name Scopes
To avoid clutter in the future with NN, create namescopes as follows:
```python
with tf.name_scope("loss") as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")
```

## Modularity
Rather than repeat yourself if you want to create multiple instances of the same model, tensorflow lets you stay DRY by letting you create functions to build your model! Example as follows:

```python
def relu(x):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, 0., name="relu")
    
n_features = 3
X = tf.placeholder(tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")
```

To format the graph on Tensorboard even nicer, do the following and add a `name_scope()`:

```python
def relu(X):
    with tf.name_scope("relu"):
        [...]
```

## Sharing Variables
One way of sharing variables in TensorFlow is to simply definite before used and pass it to every funcition that uses it. This tends to be rather tedious thought. Some people make a dicitonary of items and pass those instead. Others make classes for each module. Or you can set the shared variable of the `relu()` function as an attribute like so:

```python
def relu(x)
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):
            relu.threshold = tf.Variable(0.0, name="threshold")
        [...]
        return tf.maximum(z, relu.threshold, name="max")
```

Another method is to use the `get_variable()` tf method like so:

```python
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(),
        initializer=tf.constant_initializer(0.0))
```

If you want to reuse the variable, you must explicitly say so:

```python
with tf.variable_scope("relu", reuse=true):
    threshold = tf.get_variable("threshold")
```

