# TensorFlow Basics Tutorial
## What is TensorFlow?
TensorFlow is a library for mathematical computation. It has many applications, but is primarily used for neural networks. Although APIs exist for other languages such as C++ and Java, we will be using the Python API because it is the most developed.

In [1]:
# This is a comment. Any text after the "#" is not run by Python and is there to describe the code.

# The next few lines make sure that the code works in both Python 3 and Python 2 (feel free to ignore if you don't understand). 
from __future__ import print_function
from __future__ import division
from __future__ import absolute_import

# In order to use TensorFlow, we need to tell Python that we want to use it.
import tensorflow as tf     # import tensorflow, any commands will be prefixed with "tf"
import numpy as np          # import numpy, any commands prefixed with "np"

## Tensors
A tensor is just a multidimensional array, similar to an ndarray in NumPy. Any value we use in TensorFlow will be a tensor.

In [2]:
tf.reset_default_graph()  # Don't worry about what this does, we will explain later

x = tf.constant([[2,4],[1,3]], name='example1')    # create a new tf.constant called "x". It is a 2x2 Tensor. 
print(x)

x = x*5     # multiply each element of "x" by 5. Re-assign "x" to this new tensor object. 
print(x)

x = x+1     # add 1 to each element of "x". Re-assign "x" to this new tensor object.
print(x)

Tensor("example1:0", shape=(2, 2), dtype=int32)
Tensor("mul:0", shape=(2, 2), dtype=int32)
Tensor("add:0", shape=(2, 2), dtype=int32)


Each tensor object has a value that either we specify directly (like in the 1st example) or is the result of other operations (like the 2nd and 3rd examples). Printing a Tensor displays the following:
- name
- shape
- data type  

Much like Java objects or C/C++ pointers, each time we re-assign `x` to a new tensor, the original tensor still exists on the computer somewhere; we have just made `x` refer to a different tensor object (you will notice that the name changed). The original Tensors were never modified. Instead, each operation on a tensor created a new Tensor.
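The same re-binding behavior can be seen with ordinary Python objects. The snippet below is a plain-Python analogy rather than TensorFlow code: tuples stand in for Tensors because they also cannot be modified in place, and the name `original` is made up for illustration.

```python
# Plain-Python analogy (no TensorFlow): tuples, like Tensors, are never modified in place.
x = (2, 4, 1, 3)
original = x                      # keep a second name for the first object

x = tuple(v * 5 for v in x)       # builds a NEW tuple and re-binds the name "x" to it

print(original)                   # the first object is unchanged
print(x)                          # "x" now refers to the new object
print(x is original)              # False: these are two distinct objects
```

The first tuple is still reachable through `original`, just as the tensor named `example1` still exists after `x` is re-assigned.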
  
More importantly, we cannot get the actual value using the `print()` command. This is because all we have done so far is tell TensorFlow *how* to compute `x`; we have not told it to actually *do* any computation. Tensors *represent* values, rather than *being* the actual values themselves.
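To make the idea of "representing a value" concrete, here is a minimal sketch of deferred computation in plain Python. The class `ToyNode` and its `evaluate()` method are hypothetical names invented for this illustration; this is not how TensorFlow is implemented, only a toy with the same flavor.

```python
# Toy sketch of deferred computation (hypothetical class, not TensorFlow's implementation).
class ToyNode:
    """Describes how to compute a value without computing it."""

    def __init__(self, name, func, inputs=()):
        self.name = name        # label shown when the node is printed
        self.func = func        # function that produces the value
        self.inputs = inputs    # nodes whose values this node needs

    def __repr__(self):
        return 'ToyNode("{}")'.format(self.name)

    def evaluate(self):
        # Only now is any arithmetic performed.
        args = [node.evaluate() for node in self.inputs]
        return self.func(*args)


const = ToyNode("example1", lambda: [2, 4, 1, 3])
times5 = ToyNode("mul", lambda values: [v * 5 for v in values], inputs=(const,))
plus1 = ToyNode("add", lambda values: [v + 1 for v in values], inputs=(times5,))

print(plus1)             # shows the description only: ToyNode("add")
print(plus1.evaluate())  # performs the computation: [11, 21, 6, 16]
```

Printing the node only shows its description; the numbers appear once evaluation is requested as a separate, explicit step.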

## Sessions
To tell TensorFlow to actually compute things, we need to create a session. A good way to think about a session is as an administrator that manages all the tensors and operations you have defined. It will compute only what needs to be computed, keep track of variables we will need later, and try to optimize the speed of the operations.

In [3]:
y = tf.constant(2)   # Another constant Tensor. This one is just a scalar (a single number)
z = x/y              # Make a new tensor by dividing x (from before) by y. Assign it to z

print('Value of x in Python:', x) 
print('Value of y in Python:', y) 
print('Value of z in Python:', z) 

Value of x in Python: Tensor("add:0", shape=(2, 2), dtype=int32)
Value of y in Python: Tensor("Const:0", shape=(), dtype=int32)
Value of z in Python: Tensor("truediv:0", shape=(2, 2), dtype=float64)


In [4]:
sess = tf.InteractiveSession() # To actually compute things, we need to create a session!

In [5]:
# The eval() function asks the session to compute the value of the Tensor and return the result.
x_eval = x.eval() 
y_eval = y.eval()
z_eval = z.eval()

print('Value of x in TensorFlow: \n', x_eval, '\n Data type of the result:', type(x_eval), '\n')
print('Value of y in TensorFlow: \n', y_eval, '\n Data type of the result:', type(y_eval), '\n')
print('Value of z in TensorFlow: \n', z_eval, '\n Data type of the result:', type(z_eval), '\n')

Value of x in TensorFlow: 
 [[11 21]
 [ 6 16]] 
 Data type of the result: <class 'numpy.ndarray'> 

Value of y in TensorFlow: 
 2 
 Data type of the result: <class 'numpy.int32'> 

Value of z in TensorFlow: 
 [[  5.5  10.5]
 [  3.    8. ]] 
 Data type of the result: <class 'numpy.ndarray'> 



Again, notice how `print(x)` can't give us the value, but `print(x.eval())` can! `x.eval()` returns a NumPy ndarray (or a NumPy scalar such as `numpy.int32` for a scalar Tensor like `y`) because TensorFlow needs a way to hand the actual values back to Python, and NumPy is the natural way to represent a multidimensional array.

Although `x.eval()` is a convenient way to evaluate `x`, it is more common to use the `session.run()` method. `session.run()` takes a Tensor or a list of Tensors as its first argument, evaluates each one, and returns the results. This first argument is called the "fetches".

In [6]:
# In Python, you can unpack a list like this: `a,b,c = [x1,x2,x3]`
# Since sess.run is asked to fetch the values of 3 Tensors, it returns a list of 3 ndarrays.
# They are then assigned to each of `x_eval`, `y_eval`, and `z_eval`
x_eval, y_eval, z_eval = sess.run([x, y, z])

print('Value of x in TensorFlow: \n', x_eval, '\n')
print('Value of y in TensorFlow: \n', y_eval, '\n')
print('Value of z in TensorFlow: \n', z_eval, '\n')

Value of x in TensorFlow: 
 [[11 21]
 [ 6 16]] 

Value of y in TensorFlow: 
 2 

Value of z in TensorFlow: 
 [[  5.5  10.5]
 [  3.    8. ]] 



## Graphs
A TensorFlow Graph is how TensorFlow internally represents and remembers all the operations and Tensors that you define. TensorFlow uses the graph to know exactly which Tensors to compute based on the fetches you provide in the `session.run()` method. TensorFlow automatically uses a default graph, which is why we never had to call any functions to make one, although we can manually specify one if we want to.

Below is the graph for a simple Logistic Regression:

<img src='http://ischlag.github.io/images/graph_mess.png' alt='Image from Imanol Schlag blog'>

( ͡°_ʖ ͡°) Wow, that's pretty complicated actually.

Luckily TensorFlow has a way for us to group our tensors and operations together using `tf.variable_scope()`. We will get into how to actually use it at another time, but here is the same graph after grouping some of these operations together:

<img src='http://ischlag.github.io/images/graph_example.png' alt='Image from Imanol Schlag blog'>

( ͡~ ͜ʖ ͡°) Much better.

Each node in the graph takes in Tensors and outputs Tensors. The lines (directed edges) in the graph represent the flow of data through the graph, which means they also tell TensorFlow which operations depend on which. This is how TensorFlow knows that to compute the loss value, it first needs to compute the softmax function, and the matrix multiplication, etc. The graph can also be cleared at any time using `tf.reset_default_graph()`.

Don't worry too much about the details of this specific graph!
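The idea that the edges tell TensorFlow what must be computed first can be sketched with a toy dependency graph in plain Python. Everything here (`toy_graph`, `toy_run`, and the node names) is hypothetical and far simpler than TensorFlow's real data structures; scalars stand in for Tensors.

```python
# Toy dependency graph (hypothetical names; not TensorFlow's data structures).
# Each entry maps a node name to (function, names of the nodes it depends on).
toy_graph = {
    "x":       (lambda: 3.0, []),
    "w":       (lambda: 2.0, []),
    "product": (lambda x, w: x * w, ["x", "w"]),
    "loss":    (lambda p: (p - 1.0) ** 2, ["product"]),
    "unused":  (lambda x: x + 100.0, ["x"]),
}


def toy_run(graph, fetch):
    """Evaluate `fetch` and return (value, names of every node that was computed)."""
    computed = {}

    def visit(name):
        if name not in computed:
            func, deps = graph[name]
            # Dependencies are evaluated first, then the node itself.
            computed[name] = func(*[visit(dep) for dep in deps])
        return computed[name]

    return visit(fetch), sorted(computed)


value, touched = toy_run(toy_graph, "loss")
print(value)     # 25.0
print(touched)   # ['loss', 'product', 'w', 'x']  ("unused" was never computed)
```

Fetching `"loss"` forces `"product"`, `"x"`, and `"w"` to be computed, while `"unused"` is skipped because nothing that was fetched depends on it.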

## TensorFlow Objects
### Constants
A `tf.constant` is a tensor that does not change: it remains the same throughout the duration of a Session. It can be used to store values and is hard-coded into the graph.

### Placeholders
In our first example, we used a `tf.constant` as the source of the data we wished to operate on. The problem with this is that the data can never change. What we really want is a Tensor that acts as a placeholder for our data, so that we can then feed in whatever data we have (e.g. new training/test data!).

This is what the `tf.placeholder` is for. A placeholder is a Tensor that we feed data into; it cannot be evaluated until it receives data. Once we have created one, we can feed values into it using the `feed_dict` argument of the `session.run()` method. We pass a Python dict as the `feed_dict` argument, and the dict tells TensorFlow which placeholders receive which data. If we try to evaluate a Tensor that depends on a `tf.placeholder` without feeding in data through `feed_dict`, TensorFlow will throw an error.

In [7]:
sess.close()                 # close the old session to free its resources
tf.reset_default_graph()     # get rid of the previous graph

# shape=[None,2] means that the data can have any sized first dimension, but must have second dimension size of 2.
x = tf.placeholder(tf.float32, shape=[None, 2], name='x')
print(x)

x2 = x**2                    # square each element of x
print(x2)   

Tensor("x:0", shape=(?, 2), dtype=float32)
Tensor("pow:0", shape=(?, 2), dtype=float32)


Notice that we had to specify a shape for `x`. This is because TensorFlow wants to know beforehand the shape of the Tensor so that it can properly define the graph. Because we set the shape as `[None, 2]`, `x` can be fed data of shape `[1,2]` or `[1337,2]`, but not `[2,3]` or `[10,4]` or `[12, 2, 2]` for example. 
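The compatibility rule described above can be written as a small plain-Python check. The helper `shape_matches` is a hypothetical name used only for this illustration; it is not a TensorFlow function, just a sketch of the rule that `None` accepts any size while every other dimension must match exactly.

```python
def shape_matches(spec, shape):
    """Return True if a concrete `shape` fits a `spec` in which None means "any size"."""
    if len(spec) != len(shape):
        return False              # a different number of dimensions never fits
    return all(want is None or want == got for want, got in zip(spec, shape))


spec = [None, 2]
for candidate in ([1, 2], [1337, 2], [2, 3], [10, 4], [12, 2, 2]):
    print(candidate, shape_matches(spec, candidate))
```

Only `[1, 2]` and `[1337, 2]` pass: the others either have the wrong second dimension or the wrong number of dimensions.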

In [8]:
sess = tf.InteractiveSession()

try:
    sess.run(x2)             # will throw an error
    print("Success?")        # we won't ever get to print this because we will get an error before we reach it

except Exception as err:     # if an error occurs, the following code should execute
    print('Error!')

Error!


We got an error because when we try to evaluate `x2`, TensorFlow looks at the graph and sees that it depends on `x`. But we never fed any data into `x`!   
Now let's do it correctly:

In [9]:
input_1 = np.array([[2,4],
                    [1,3]])  # 2x2 ndarray (numpy multidimensional array) as input

input_2 = np.array([[5,6],
                    [7,8],
                    [9,10]]) # 3x2 ndarray

# In Python, a dict is a structure that maps keys to values, and looks like {key1: value1, key2: value2, ...}
# `{x: input_1}` is a dict that says that the Tensor `x` will be fed the data from the ndarray `input_1`

print('Result 1:\n', sess.run(x2, feed_dict={x: input_1}))
print('\n Result 2:\n', sess.run(x2, feed_dict={x: input_2}))

Result 1:
 [[  4.  16.]
 [  1.   9.]]

 Result 2:
 [[  25.   36.]
 [  49.   64.]
 [  81.  100.]]


### Variables
So we have `tf.constant` which always keeps the same value, and `tf.placeholder` which we can manually set the value of for each `session.run()` call. However, we have nothing that can hold a state between `session.run()` calls that can be updated and read from by TensorFlow. To fix this, we will use a `tf.Variable`. 

You can use a `tf.Variable` anywhere you would use a Tensor, and you can also update it using TensorFlow operations. Unlike the value of an ordinary Tensor, which is recomputed on every `session.run()` call, the value of a `tf.Variable` persists between calls.

We primarily use `tf.Variable` to hold our parameters (weights, bias, etc) because TensorFlow can update them over the course of several `session.run()` calls. To create one, you must give it an "initializer", which can just be some other Tensor. This is what the `tf.Variable` will use as its initial value, although it can be updated later. 

In [10]:
sess.close()                # close the previous session first
tf.reset_default_graph()    # then clear the graph

x = tf.Variable(tf.random_normal([10,2]), name="X")  # Create the `tf.Variable` from a random Tensor of shape [10,2]
print("Value of x:", x)
print("Type of x:", type(x))

Value of x: Tensor("X/read:0", shape=(10, 2), dtype=float32)
Type of x: <class 'tensorflow.python.ops.variables.Variable'>


Importantly, a `tf.Variable` has to be initialized before using it in a session. The following code will throw an error:

In [11]:
sess = tf.InteractiveSession()
try:
    print(sess.run(x))
    print('Success?')  # Won't ever reach this because of the error
except Exception:
    print('Error!')

Error!


For this reason we must run the op returned by `tf.global_variables_initializer()` before trying to evaluate the `tf.Variable`. Running it evaluates the operation or tensor that was specified as the initializer for each `tf.Variable` and assigns the result as the variable's initial value.
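As a rough mental model, a variable can be pictured as an object that remembers its initializer but holds no value until initialization is explicitly run. The class `ToyVariable` below is a hypothetical plain-Python stand-in written for this tutorial, not TensorFlow's implementation.

```python
class ToyVariable:
    """Holds state that must be initialized before it can be read (hypothetical class)."""

    def __init__(self, initial_value):
        self._initial_value = initial_value   # remembered, but not yet stored as the value
        self._value = None
        self._initialized = False

    def initialize(self):
        # Plays the role of running the initializer op.
        self._value = self._initial_value
        self._initialized = True

    def read(self):
        if not self._initialized:
            raise RuntimeError("attempting to read an uninitialized variable")
        return self._value

    def assign(self, new_value):
        # The stored value survives between calls, so later reads see the update.
        self._value = new_value
        self._initialized = True


v = ToyVariable(5.0)
try:
    v.read()
except RuntimeError:
    print("Error!")       # reading before initialization fails

v.initialize()
print(v.read())           # 5.0
v.assign(4.0)
print(v.read())           # 4.0
```

The last two lines also show why variables are useful for parameters: an update made in one step is still there when the value is read in the next.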

In [12]:
init = tf.global_variables_initializer()  # This is just a TensorFlow operation. Nothing is run just yet.

sess.run(init)  # Now we actually run the initialization op

print(sess.run(x))  # Evaluate the `tf.Variable`

[[-0.73532027  0.42688787]
 [-1.07123363 -0.71279907]
 [-0.53311396 -0.54464442]
 [ 2.07838821  1.40642881]
 [-0.56161886 -0.31596965]
 [-1.15365088 -0.09548074]
 [ 0.99361444 -1.09512687]
 [ 2.47602916  0.90689105]
 [ 0.54518884  0.01539329]
 [-1.09747756 -0.35992196]]


It might still not be clear how this is any different from what we have been doing so far, but the true power of `tf.Variable` will become evident in the following section.

## Optimization
One strength of TensorFlow is that the programmer doesn't have to know how to calculate the gradient of the function being optimized: TensorFlow derives the gradients from the graph automatically. The library also comes with several optimizers commonly used in deep learning, including Gradient Descent, Momentum, and Adam. Below we use an optimizer to minimize the function $y=x^2$, starting from an initial value of $x=5$. Obviously the correct value is for $x$ to just be zero, so let's see if TensorFlow figures that out.

In [13]:
sess.close()                        # close the previous session first
tf.reset_default_graph()            # then clear the graph
sess = tf.InteractiveSession()      # start a new one

x = tf.Variable(5.0)                # let x be a variable with initial value x=5.0
y = x**2                            # let y = x^2

train_step = tf.train.GradientDescentOptimizer(0.1).minimize(y) # gradient descent to minimize y, with learning rate=0.1

sess.run(tf.global_variables_initializer())     # initialize all variables

for i in range(40):                             # run 40 training steps of gradient descent
    x_val, y_val = sess.run([x, y])
    if i%5 == 0:
        print ('Step', '{:1}'.format(i), ': x =', '{0:.3f}'.format(x_val), 'y =', '{0:.3f}'.format(y_val))
    sess.run(train_step)

Step 0 : x = 5.000 y = 25.000
Step 5 : x = 1.638 y = 2.684
Step 10 : x = 0.537 y = 0.288
Step 15 : x = 0.176 y = 0.031
Step 20 : x = 0.058 y = 0.003
Step 25 : x = 0.019 y = 0.000
Step 30 : x = 0.006 y = 0.000
Step 35 : x = 0.002 y = 0.000


Notice that the value of the variable `x` changes with training and converges to (nearly) the global minimum at $x=0$, as expected. It never lands exactly on zero because gradient descent approaches the minimum iteratively: with this learning rate, each step replaces $x$ with $x - 0.1 \cdot 2x = 0.8x$, which shrinks $x$ but never makes it zero. The Session is still responsible for running the optimization operation, and a list of operations/tensors can be passed to the `session.run()` function to extract multiple values from the model. We don't need a `feed_dict` because there are no placeholders defined.
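For comparison, here is the same optimization written by hand in plain Python, as a sketch of what a gradient descent step does for this particular function. The gradient $dy/dx = 2x$ is typed in manually, which is exactly the work TensorFlow does for us; the names `learning_rate`, `gradient`, and `history` are chosen for this illustration.

```python
# Plain-Python version of the same optimization (no TensorFlow).
# The gradient of y = x**2 is written out by hand here.
learning_rate = 0.1
x = 5.0
history = []                          # value of x before each update

for step in range(40):
    history.append(x)
    gradient = 2.0 * x                # dy/dx for y = x**2
    x = x - learning_rate * gradient  # gradient descent update; equals 0.8 * x

for step in range(0, 40, 5):
    print('Step {}: x = {:.3f} y = {:.3f}'.format(step, history[step], history[step] ** 2))
```

The printed values should agree with the TensorFlow output above to the displayed precision, since both apply the same update rule from the same starting point.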