# CSCI 1470 Lab 02: Introduction to Tensorflow #

In this lab, we introduce [Tensorflow](http://tensorflow.org/), a cutting-edge library for developing and evaluating deep neural network models.

Tensorflow is a relatively new library (a couple of years old), and the developers at Google released version 1.3 just a couple of weeks ago. For this class, **assume all labs and projects will use Tensorflow version 1.3**.

The next couple of cells walk through the Tensorflow installation process, with all the information necessary to get Tensorflow working on both the department machines and your local computer.

We then walk through how Tensorflow works and some of its basic functionality. There are 3 check-off questions, each of which requires you and your partner to write code, so that you gain familiarity with the library and its API.

**Make sure to get all 3 questions checked off by your TA to get credit for this lab!**

## Installing Tensorflow ##

##### If Using Local Machine #####

Please follow the instructions at this webpage to install Tensorflow on your local machine: [https://www.tensorflow.org/install/](https://www.tensorflow.org/install/). 

If you run into any trouble (either during lab, or if installing at home later), please post on Piazza or come to Hours.

##### If Using Department Machine (directly, or via ssh) #####

All department machines can run Tensorflow through our cs147 virtual environment. To run Tensorflow, please run the following command (in your terminal):

```bash
source /course/cs1470/cs147-tf-cpu/bin/activate
```

The above command activates the Python Virtual Environment, meaning any subsequent invocation of python or ipython in that terminal will be able to access the libraries installed in the given environment. Virtual Environments are a nice tool for keeping your Python installations organized and clean, and we highly recommend using them in this class (and beyond!).
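
If you want to confirm that the activation worked, the following is a minimal sketch of a check you can run from python. The helper name `in_virtual_environment` is made up for illustration; it relies only on the standard `sys` module. Older virtualenv releases set `sys.real_prefix`, while newer tools make `sys.base_prefix` differ from `sys.prefix`.

```python
import sys

def in_virtual_environment():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Hypothetical helper: older virtualenv versions set sys.real_prefix,
    # newer tools expose the original installation as sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix

# Path of the interpreter that is actually running
print(sys.executable)
print(in_virtual_environment())
```

After activating the environment on a department machine, we would expect the printed interpreter path to sit under the environment's directory and the check to print True.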

To get the virtual environment to load in IPython Notebook, you will need to run the following commands (after activating the cs147-tf-cpu environment):

```bash
pip install --user ipykernel
python -m ipykernel install --user --name=cs147-tf-cpu
```

Then, enter/load an IPython Notebook, and change your kernel to cs147-tf-cpu using the kernel tab at the top.

### Checkoff - Import Tensorflow ###

Run the following python code, and have your Lab TA check you off if there are no errors.

```python
import tensorflow as tf

print tf.__version__
```

```
1.4.0
```


## Tensorflow 101 - The Computation Graph ##

Writing Tensorflow code is very different from writing regular Python code. This is because the bulk of the computation in Tensorflow is executed outside the Python interpreter, in a high-performance C++ backend. Instead of making you write that code directly, Tensorflow provides a Python API, both for building the computations and for reading and writing data from/to this backend.

Here are the things to keep in mind.

The core of Tensorflow is the **Computation Graph**. Tensorflow starts and ends with this graph - every operation you define, every intermediate value you calculate lives here.
    
  + This graph consists of nodes and edges. Nodes are **Operations** that take zero or more Tensors as input and produce new, resulting Tensors after applying a given transformation (e.g. addition, subtraction, matrix multiplication). The edges are **Tensors**, or multi-dimensional arrays (e.g. vectors, matrices, 3D, 4D), that carry values from one Operation to the next.
  + All of these Tensors and Operations live in Tensorflow's high-performance backend, outside the Python interpreter. That means you can't print/peek into the Tensors like you would in regular Python.
  + This graph defines a "flow" of Tensors, starting from an input, resulting in a desired output. Hence the library's name, **Tensorflow**.
  
There are two steps to every Tensorflow program:

  + Defining the graph: Use Tensorflow's Python API to set up the inputs, all the transformations, the operations to run, and the desired outputs to collect.
  + Interacting with the graph: Feed inputs into the Computation Graph, and collect outputs, transforming normal Python objects into Tensors to be read by the Computation Graph.
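
To make the two steps concrete, here is a toy sketch in plain Python. It is not how Tensorflow is implemented, and every name in it (`Node`, `constant`, `graph_input`, `add`, `multiply`, `run`) is made up for illustration. It only shows the idea that building the graph computes nothing, and that values appear only once you feed inputs and ask for an output.

```python
# Toy illustration only: this is NOT how Tensorflow is implemented.
# Every class and function name below is hypothetical.

class Node(object):
    """One operation in the toy graph, plus the nodes that feed into it."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def graph_input():
    # Stands in for a value that is supplied later, at run time
    return Node("input")

def add(a, b):
    return Node("add", inputs=(a, b))

def multiply(a, b):
    return Node("mul", inputs=(a, b))

def run(node, feed=None):
    """Evaluate `node` by first evaluating everything it depends on."""
    feed = feed or {}
    if node.op == "const":
        return node.value
    if node.op == "input":
        return feed[node]
    args = [run(n, feed) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: %s" % node.op)

# Step 1: define the graph. Nothing is computed yet.
x = graph_input()
y = add(multiply(x, constant(3)), constant(2))   # describes y = x * 3 + 2
print(y.op)       # prints: add (y is a description, not a number)

# Step 2: interact with the graph by feeding an input and fetching an output.
result = run(y, feed={x: 4})
print(result)     # prints: 14
```

Note that `y` itself never holds a number. Only the call to `run` produces one, and running the same graph with a different fed value gives a different result without redefining anything.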
  
We'll talk about how to define the Computation Graph soon. However, the crucial thing to understand is the second step: how to feed data to the Computation Graph, and how to read outputs back.

The device that allows us to communicate with the computation graph is the **Session.**

### Tensorflow Sessions - Portal to the Computation Graph ###

All of your interaction with the Computation Graph will happen through a **Session** object. Sessions let you load Python objects into the Graph, run arbitrary operations, and fetch the values of given Tensors.

To better understand this, consider the following example:

```python
import tensorflow as tf

# Define a Constant Tensor on the Computation Graph
hello = tf.constant('Hello World!')

# Print the Tensor - this is not a normal Python object - it's a Tensor!!!
print hello

# Create a Session
session = tf.Session()

# Run (Fetch) the given Tensor, and print its value - this is a normal Python object (str)
print session.run(hello)

# Close the session
session.close()
```

```
Tensor("Const:0", shape=(), dtype=string)
Hello World!
```


Sessions act a lot like files in Python. They need to be opened via the `tf.Session()` constructor and assigned to a variable. Then, to read a Tensor's value, you call `session.run(val)`, which evaluates the set of operations that produce the given Tensor. Finally, you must close the session to end the interaction.

You can also *feed* values into a Computation Graph via **feed_dicts** (feed dictionaries). These dictionaries map special Tensor objects called placeholders, which denote inputs coming from normal Python, to the actual raw Python objects to use as their values.

An example is as follows:

```python
# Define a normal Python string
hello = "Hello World!"

# Define a Placeholder Tensor on the Computation Graph - note that you have to define the type of a placeholder!
string_tensor = tf.placeholder(dtype=tf.string)

# Print the String Tensor - See what it looks like!
print string_tensor

# Create a Session
session = tf.Session()

# Run (Fetch) the given placeholder, but after feeding in the value in `hello`
print session.run(string_tensor, feed_dict={string_tensor: hello})

# Close Session
session.close()
```

```
Tensor("Placeholder:0", dtype=string)
Hello World!
```


Play around with the above example to make sure that everything makes sense.

### Checkoff - Placeholders, Operations, and Sessions ###

The following code block is incomplete. Using your knowledge of Placeholders, the Computation Graph, and Sessions, have the following code print "Hello NAME!" after reading your name from STDIN.

Hint: You might want to look at the Tensorflow API/Stack Overflow for how to concatenate String Tensors.

Fill out the code block, and have your Lab TA check you and your partner off after you succeed.

```python
# Get your name from stdin
name = raw_input("Enter your name here: ")

# Define a Placeholder Tensor for your name
name_tensor = tf.placeholder(dtype=tf.string)

# Define a Constant Tensor
hello_tensor = tf.constant("Hello ")
exclamation_tensor = tf.constant("!")

# OPERATIONS GO HERE
hello_name_tensor = hello_tensor + name_tensor + exclamation_tensor

# Create a Session
session = tf.Session()

# SESSION LOGIC GOES HERE
hello_name = session.run(hello_name_tensor, feed_dict={name_tensor: name})
print hello_name

# Close Session
session.close()
```

```
Enter your name here: Babak
Hello Babak!
```


### Putting it all Together - Placeholder Shapes, Variables, and Neural Networks ###

There are two other big things to understand about Tensorflow. The first is related to Placeholders, which we've seen before. Whereas in the above examples we only defined our placeholders with a dtype (the type of input they will hold), placeholders are usually also defined by their *shape*, or array dimensions (like in Numpy).

Consider the following:

```python
mnist_placeholder = tf.placeholder(dtype=tf.float32, shape=[784])
```

Here, we define a placeholder of type float32 with shape [784]. In other words, our mnist_placeholder holds float vectors with 784 elements (hmm, seems awfully familiar).
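
As a reminder of where 784 comes from, here is a small plain-Python sketch, assuming the usual 28 x 28 pixel MNIST images flattened row by row. The all-zero `image` is a stand-in for illustration, not real data.

```python
rows, cols = 28, 28

# Stand-in for one grayscale image: 28 rows of 28 pixel intensities
image = [[0.0 for _ in range(cols)] for _ in range(rows)]

# Flatten row by row into a single vector, the form a shape-[784] placeholder expects
flat = [pixel for row in image for pixel in row]

print(len(flat))  # prints: 784
```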

The other big thing in Tensorflow is the **Variable.** Like we've seen in class, neural networks are defined by a set of parameters that we learn during the training process. These parameters are **Variables**: special Tensors that, unlike the placeholders and constants we've looked at before, hold values that persist and can be updated across calls to session.run.

Furthermore, Variables are defined with a special syntax, and must be initialized (via a call to session.run) before any other evaluation that uses them.

To make this clearer, consider the following Variables (these should look very familiar from your first homework).

```python
# Create Variable for MNIST Classifier Weight - initialize Variable to be zero matrix, of shape [784, 10]
W = tf.Variable(initial_value=tf.zeros(shape=[784, 10]))
print W

# Create Variable for MNIST Classifier Bias - initialize Variable to be zero vector with 10 elements
b = tf.Variable(initial_value=tf.zeros(shape=[10]))
print b

# Create Session
session = tf.Session()

# Initialize all Variables => Special call => REMEMBER THIS!
session.run(tf.global_variables_initializer())

# Close Session
session.close()
```

```
<tf.Variable 'Variable:0' shape=(784, 10) dtype=float32_ref>
<tf.Variable 'Variable_1:0' shape=(10,) dtype=float32_ref>
```


### Checkoff - MNIST Classifier in Tensorflow ###

We now have all the pieces we need to start using Tensorflow effectively. First, start by reading and working through the Tensorflow MNIST Tutorial: https://www.tensorflow.org/get_started/mnist/beginners

It has a lot of information we've already covered, and a lot of the code you'll need to become familiar with to start writing Neural Network Models in Tensorflow, including activation functions, matrix multiplication, and training via SGD. Note that in Tensorflow, all gradient calculations are taken care of for you.
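
For intuition about what the softmax activation, the cross-entropy loss, and the automatic gradients replace, here is a minimal plain-Python sketch for a single example with three classes. The function names and the example numbers are made up for illustration. It uses the standard result that, for a one-hot label, the gradient of the cross-entropy loss with respect to the logits is the softmax probabilities minus the label.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot):
    """Negative log-probability assigned to the correct class."""
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot) if y > 0)

logits = [2.0, 1.0, 0.1]     # hypothetical raw scores for 3 classes
label = [1.0, 0.0, 0.0]      # one-hot label: class 0 is correct

probs = softmax(logits)
loss = cross_entropy(probs, label)

# Gradient of the loss with respect to the logits: probabilities minus label
grad = [p - y for p, y in zip(probs, label)]

print(probs)
print(loss)
print(grad)
```

In the homework you derived and coded updates like this by hand. In Tensorflow you only define the loss, and the optimizer works out the gradients from the graph.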

Read the tutorial, and feel free to run the provided code, to make sure you understand what's going on. After doing so, fill out the following code so that it performs identically to the code you wrote in the first homework assignment.

Get checked off by your Lab TA to successfully complete Lab 02 - Tensorflow!

```python
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

# Read MNIST Dataset from TF Helper
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Setup Parameters
input_size, num_classes = 784, 10
num_train_steps = 10000
learning_rate = 0.5

# Create Placeholders, Variables
x_tensor = tf.placeholder(tf.float32, [1, input_size])    # Input Image => Single Vector, Dimension [1, 784]
y_tensor = tf.placeholder(tf.float32, [1, num_classes])   # Output Class => One-Hot Vector, Dimension [1, 10]
W = tf.Variable(tf.random_normal([input_size, num_classes], stddev=.1), trainable=True)   # Weights
b = tf.Variable(tf.random_normal([1, num_classes], stddev=.1), trainable=True)            # Bias

# Use operations to generate final logits (no softmax)
logits = tf.matmul(x_tensor, W) + b

# Get probabilities by using the softmax activation on the given logits
probabilities = tf.nn.softmax(logits)

# Compute Loss Value via TF Loss Helper
loss = tf.losses.softmax_cross_entropy(onehot_labels=y_tensor, logits=logits)

# Create Gradient Descent Optimizer, training operation for updating weights
sgd = tf.train.GradientDescentOptimizer(learning_rate)
train_op = sgd.minimize(loss)

# Create Session
session = tf.Session()

# Initialize all Variables
session.run(tf.global_variables_initializer())

# Training Loop!
for i in range(num_train_steps):
    # Get next element from the MNIST Training Data
    next_x, next_y = mnist.train.next_batch(1)

    # Collect/Run Loss, Training Operation via single call to session.run (note multiple fetches!)
    l, _ = session.run([loss, train_op], feed_dict={x_tensor: next_x, y_tensor: next_y})

    # Print Loss every so often
    if i % 1000 == 0:
        print 'Iteration %d\tLoss Value: %.3f' % (i, l)

# Evaluate Accuracy on Test Data (only the input needs to be fed to fetch probabilities)
correct, test_x, test_y = 0.0, mnist.test.images, mnist.test.labels
num_test = len(test_x)
for i in range(num_test):
    next_x, next_y = test_x[i], test_y[i]
    p = session.run(probabilities, feed_dict={x_tensor: [next_x]})
    if np.argmax(p) == np.argmax(next_y):
        correct += 1
print 'Test Accuracy: %.3f' % (correct / num_test)

# Close Session
session.close()
```

```
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

Iteration 0	Loss Value: 2.574
Iteration 1000	Loss Value: 0.000
Iteration 2000	Loss Value: 60.470
Iteration 3000	Loss Value: 0.000
Iteration 4000	Loss Value: 0.000
Iteration 5000	Loss Value: 0.000
Iteration 6000	Loss Value: 0.262
Iteration 7000	Loss Value: 0.000
Iteration 8000	Loss Value: 0.000
Iteration 9000	Loss Value: 0.000
Test Accuracy: 0.845
```
