# Lab7a: Introduction to Tensorflow #

In this lab, you will start working with [Tensorflow](http://tensorflow.org/), a cutting-edge library for developing and evaluating deep neural network models.

Tensorflow is a relatively new library (a few years old).  For this module, **assume all later labs and assignments will use Tensorflow version 1.4**.

First, read and work through the Tensorflow MNIST tutorial: https://www.tensorflow.org/get_started/mnist/beginners
For help setting up Tensorflow, follow the tutorial's link to the installation guide:
[https://www.tensorflow.org/install/](https://www.tensorflow.org/install/).

**After you have completed the above MNIST for Beginners tutorial**, complete the following three tasks.

### Task 1: Import Tensorflow ###

Run the following Python code. It should print the Tensorflow version you have installed, with no errors.

In [None]:
import tensorflow as tf

print(tf.__version__)

## Tensorflow 101 - The Computation Graph ##

Writing Tensorflow code is very different from writing regular Python code. This is because the bulk of the computation in Tensorflow is executed outside the Python interpreter, in a separate, high-performance backend written in C++. Instead of requiring you to write everything in C++ directly, Tensorflow defines a Python API, both for building these computations and for reading and writing data from/to the backend.

Here are the things to keep in mind.

The core of Tensorflow is the **Computation Graph**. Tensorflow starts and ends with this graph - every operation you define, every intermediate value you calculate lives here.
    
  + This graph consists of nodes and edges. The nodes are **Operations** that take zero or more Tensors as input and produce new Tensors by applying a given transformation (e.g. addition, subtraction, matrix multiplication). The edges are **Tensors**, multi-dimensional arrays (e.g. 1D vectors, 2D matrices, 3D or 4D arrays) that carry values from one Operation to the next.
  + All of these Tensors and Operations live in this separate, high-performance backend. That means you can't print or peek into Tensors the way you would with regular Python objects.
  + This graph defines a "flow" of Tensors, starting from an input, resulting in a desired output. Hence the library's name, **Tensorflow**.
  
There are two steps to every Tensorflow program:

  + Defining the graph: Use Tensorflow's Python API to set up the inputs, all the transformations, the operations to run, and the desired outputs to collect.
  + Interacting with the graph: Feed inputs into the Computation Graph and collect outputs. Normal Python objects are converted into Tensors on the way in, and Tensor values are converted back into Python objects on the way out.
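
The two steps above can be illustrated without Tensorflow at all. The following is a toy sketch in plain Python, not Tensorflow's API: the names `Node`, `placeholder`, `constant`, `add`, and `run` are hypothetical helpers invented for this illustration. It only mimics the "define first, run later" pattern, in which building the graph computes nothing until inputs are fed in.

```python
# Toy illustration only: plain Python that mimics "define the graph, then run it".
# None of these names belong to Tensorflow; they are hypothetical helpers.

class Node:
    """A graph node: an operation plus the nodes whose outputs feed it."""
    def __init__(self, op=None, inputs=(), name=None):
        self.op = op          # callable computing this node's value (None for inputs)
        self.inputs = inputs  # upstream nodes
        self.name = name


def placeholder(name):
    # An input node: it has no operation, so its value must be fed at run time.
    return Node(name=name)


def constant(value):
    # A node that always produces the same value.
    return Node(op=lambda: value)


def add(a, b):
    # A node that adds the values produced by two upstream nodes.
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def run(node, inputs=None):
    """Evaluate `node`, using `inputs` (a dict of node -> value) for fed nodes."""
    inputs = inputs or {}
    if node in inputs:
        return inputs[node]
    if node.op is None:
        raise ValueError("No value fed for input %r" % node.name)
    args = [run(upstream, inputs) for upstream in node.inputs]
    return node.op(*args)


# Step 1: define the graph. Nothing is computed here.
x = placeholder("x")
two = constant(2)
total = add(x, two)

# Step 2: interact with the graph by feeding an input and fetching an output.
result = run(total, inputs={x: 40})
print(result)
```

Note that `total` is only a description of a computation. Its value exists only once `run` is called with a concrete input, which is the role a Session plays in Tensorflow.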
  
We'll be talking about how to define the Computation Graph soon. However, the crucial thing to understand is this second point - how to feed data to the computation graph, and how to read outputs back.

The device that allows us to communicate with the computation graph is the **Session.**

### Tensorflow Sessions - Portal to the Computation Graph ###

All of your interaction with the Computation Graph will happen through a **Session** object. Sessions not only let you load Python objects into the Graph, they also let you run arbitrary operations and fetch the values of given Tensors.

To better understand this, consider the following example:

In [None]:
import tensorflow as tf

# Define a Constant Tensor on the Computation Graph
hello = tf.constant('Hello World!')

# Print the Tensor - this is not a normal Python object - it's a Tensor!!!
print(hello)

# Create a Session
session = tf.Session()

# Run (Fetch) the given Tensor, and print its value - this is a normal Python object (a bytes string in Python 3)
print(session.run(hello))

# Close the session
session.close()

Sessions act a lot like files in Python. They need to be opened via the ```tf.Session()``` constructor and assigned to a variable. Then, to read a Tensor's value, you call ```session.run(val)```, which evaluates the set of operations that produce the given Tensor. Finally, you must close the session to end the interaction and release its resources.
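
Because sessions behave like files, they can also be managed with a `with` block, which closes the session automatically when the block ends. The following is a minimal sketch, assuming Tensorflow 1.x as used in this lab. The `hasattr` fallback for 2.x installs is an addition of this example, not part of the lab setup.

```python
import tensorflow as tf

# Assumption: the lab targets Tensorflow 1.x. On a 2.x install, tf.Session is
# not available at the top level, so fall back to the 1.x compatibility module.
if not hasattr(tf, "Session"):
    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()

# Define a Constant Tensor on the Computation Graph
greeting = tf.constant("Hello World!")

# The session is opened here and closed automatically at the end of the block,
# even if an exception is raised inside it.
with tf.Session() as session:
    value = session.run(greeting)

print(value)
```

This form removes the need for an explicit `session.close()` call, which is easy to forget when a cell fails partway through.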

You can also *feed* values into a Computation Graph, via **feed_dicts** (feed dictionaries). These dictionaries map special Tensor objects called **placeholders**, which stand in for inputs coming from normal Python, to the actual raw Python objects to feed in.

An example is as follows:

In [None]:
# Define a normal Python string
hello = "Hello World!"

# Define a Placeholder Tensor on the Computation Graph - note that you have to define the type of a placeholder!
string_tensor = tf.placeholder(dtype=tf.string)

# Print the String Tensor - See what it looks like!
print(string_tensor)

# Create a Session
session = tf.Session()

# Run (Fetch) the given placeholder, but after feeding in the value in `hello`
print(session.run(string_tensor, feed_dict={string_tensor: hello}))

# Close Session
session.close()

Play around with the above example to make sure that everything makes sense.

### Task 2: Placeholders, Operations, and Sessions ###

The following code block is incomplete. Using your knowledge of Placeholders, the Computation Graph, and Sessions, have the following code print "Hello NAME!" after reading your name from STDIN.

Hint 1: You might want to look at the Tensorflow API docs or Stack Overflow for how to concatenate string Tensors in Tensorflow.

Hint 2: Check out `tf.add`

In [None]:
# Get your name from stdin
name = input("Enter your name here: ")

# Define a Placeholder Tensor for your name
name_tensor = tf.placeholder(dtype=tf.string)

# Define a Constant Tensor
hello_tensor = tf.constant("Hello ")
exclamation_tensor = tf.constant("!")

# OPERATIONS GO HERE
hello_name_tensor = ?

# Create a Session
session = tf.Session()

# RUN YOUR SESSION HERE, AND DISPLAY THE RESULTS
hello_name = ?
print(hello_name)

# Close Session
session.close()

### Putting it all Together - Placeholder Shapes, Variables, and Neural Networks ###

There are two other big things to understand about Tensorflow. The first is related to Placeholders, which we've seen before. In the above examples, we defined our placeholders with only a dtype (the type of input they will hold). Usually, placeholders are also defined by their *shape*, or array dimensions (like in Numpy).

Consider the following:

In [None]:
mnist_placeholder = tf.placeholder(dtype=tf.float32, shape=[784])

Here, we define a placeholder of type float32, with a shape (dimension) of 784. In other words, our mnist_placeholder stores float vectors with 784 elements (hmm, seems awfully familiar).
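
As a reminder of where 784 comes from: an MNIST image is 28 x 28 pixels, and flattening it gives a vector of 28 * 28 = 784 values. A small Numpy sketch of that flattening follows. The random array is only a stand-in for a real image, and the variable names are chosen for this illustration.

```python
import numpy as np

# Stand-in for one grayscale MNIST image: 28 x 28 pixel intensities in [0, 1).
# Random values are used here; a real image would come from the dataset.
rng = np.random.RandomState(0)
image = rng.rand(28, 28).astype(np.float32)

# Flatten the 2D image into a single vector, row by row.
flat = image.reshape(784)

print(flat.shape, flat.dtype)
```

A vector like `flat` has the dtype and shape that `mnist_placeholder` above expects to be fed.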

The other big things in Tensorflow are **Variables.** As we've seen in class, neural networks are defined by a set of parameters that we learn during the training process. In Tensorflow, these parameters are **Variables**: special Tensors whose values persist and can be updated across calls to session.run, unlike the placeholders and constants we've looked at before.

Furthermore, Variables are defined with a special syntax, and must be initialized (via a call to session.run) before they are used in any other evaluation.

Here is an example.

In [None]:
# Create Variable for MNIST Classifier Weight - initialize Variable to be zero matrix, of shape [784, 10]
W = tf.Variable(initial_value=tf.zeros(shape=[784, 10]))
print(W)

# Create Variable for MNIST Classifier Bias - initialize Variable to be zero vector with 10 elements
b = tf.Variable(initial_value=tf.zeros(shape=[10]))
print(b)

# Create Session
session = tf.Session()

# Initialize all Variables => Special call => REMEMBER THIS!
session.run(tf.global_variables_initializer())

# Close Session
session.close()

### Task 3: MNIST Classifier in Tensorflow ###

We now have all the pieces we need to start using Tensorflow effectively.  Below is a partially implemented MNIST classifier.

It draws on material we've already covered, and introduces much of the code you'll need to become familiar with to write neural network models in Tensorflow, including activation functions, matrix multiplication, and training via SGD. Note that in Tensorflow, all gradient calculations are taken care of for you.
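
To make concrete what Tensorflow automates, here is a rough Numpy sketch of a single SGD step for a softmax classifier on one example. It is an illustration under stated assumptions, not the Tensorflow solution to the task: the input is random rather than a real MNIST image, the label (class 3) is arbitrary, and the helper names `softmax` and `cross_entropy` are defined here for this example only. The gradient is written out by hand, which is exactly the part Tensorflow derives for you.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max before exponentiating, for numerical stability.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)


def cross_entropy(probabilities, one_hot_labels):
    # Mean negative log-probability assigned to the true class.
    eps = 1e-12  # avoids log(0)
    return -np.mean(np.sum(one_hot_labels * np.log(probabilities + eps), axis=1))


rng = np.random.RandomState(0)
input_size, num_classes = 784, 10
learning_rate = 0.5

# Stand-in data: one random "image" and an arbitrary one-hot label (class 3).
x = rng.rand(1, input_size)
y = np.zeros((1, num_classes))
y[0, 3] = 1.0

# Parameters drawn from a normal distribution with stddev 0.1.
W = rng.normal(scale=0.1, size=(input_size, num_classes))
b = rng.normal(scale=0.1, size=(num_classes,))

# Forward pass: logits = x*W + b, then softmax, then the loss.
logits = x.dot(W) + b
probabilities = softmax(logits)
loss_before = cross_entropy(probabilities, y)

# Backward pass, by hand: for softmax followed by cross-entropy, the gradient
# of the mean loss with respect to the logits is (probabilities - y) / batch_size.
grad_logits = (probabilities - y) / x.shape[0]
grad_W = x.T.dot(grad_logits)
grad_b = grad_logits.sum(axis=0)

# Gradient descent update.
W -= learning_rate * grad_W
b -= learning_rate * grad_b

loss_after = cross_entropy(softmax(x.dot(W) + b), y)
print('Loss before: %.3f\tLoss after: %.3f' % (loss_before, loss_after))
```

In the Tensorflow version, the entire backward pass and update collapse into the single training operation returned by the optimizer's `minimize` call.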

Complete the following code according to the provided comments, and try it out!

In [None]:
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

# Read MNIST Dataset from TF Helper
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Setup Parameters
input_size, num_classes = 784, 10
num_train_steps = 10000
learning_rate = 0.5

# Create Placeholders, Variables
x_tensor = ?     # Placeholder for Input Image => Single Vector, Dimension [1, 784]
y_tensor = ?     # Placeholder for Output Class => Vector, One-Hot, Dimension [1, 10]
W = ?            # Weights, use tf.random_normal with stddev = .1
b = ?            # Bias, use tf.random_normal with stddev = .1

# Use operations to generate final logits (no softmax)
# logits = x*W + b
logits = ?

# Get probabilities by using the softmax activation on the given logits
probabilities = tf.nn.softmax(logits)

# Compute Loss Value via TF Loss Helper
loss = tf.losses.softmax_cross_entropy(onehot_labels=y_tensor, logits=logits)

# Create Gradient Descent Optimizer, training operation for updating weights
sgd = tf.train.GradientDescentOptimizer(learning_rate)
train_op = sgd.minimize(loss)

# Create Session
session = tf.Session()

# Initialize all Variables
session.run(tf.global_variables_initializer())

# Training Loop!
for i in range(num_train_steps):
    # Get next element from the MNIST Training Data
    next_x, next_y = mnist.train.next_batch(1)
    
    # Collect/Run Loss, Training Operation via single call to session.run (note multiple fetches!)
    l, _ = session.run([loss, train_op], feed_dict={x_tensor: next_x, y_tensor: next_y})
    
    # Print Loss every so often
    if i % 1000 == 0:
        print('Iteration %d\tLoss Value: %.3f' % (i, l))
        
# Evaluate Accuracy on Test Data
correct, test_x, test_y = 0.0, mnist.test.images, mnist.test.labels
for i in range(10000):
    next_x, next_y = test_x[i], test_y[i]
    p = session.run([probabilities], feed_dict={x_tensor: [next_x], y_tensor: [next_y]})
    if np.argmax(p[0]) == np.argmax(next_y):
        correct += 1
print('Test Accuracy: %.3f' % (correct / 10000.0))