Now that we've gotten our hands dirty with Python and some of its important operations, it's time to take on Tensorflow. Tensorflow is Google's platform for deep learning, and it is one of the most important, if not the most important, deep learning platforms available to the public. It can be used on its own or through a higher-level framework such as Keras, which can be set to use Tensorflow as a backend.

*Note: consider adding for visual clarity https://blog.jakuba.net/2017/05/30/tensorflow-visualization.html*

### Foreword - Take a Deep Breath...

Fear not, but Tensorflow is extraordinarily complex - almost too much so. It has layers upon layers of abstractions and a horde of APIs which seem to rise to and fall from prominence each day. Backwards compatibility is often an issue. Such is the life of a rapidly developing ecosystem, I suppose.

Because of the above, we will come across many different ways of writing Tensorflow programs as we explore some of the most interesting models currently available. It is COMPLETELY NORMAL to feel overwhelmed and confused at first. Do not worry about this. Our method in this course is not to try to understand everything at once, but to focus on areas of interest and understand how they work and how we can work with them. We will expand our knowledge in this fashion, by combining what we learn. In time, the whole picture of Tensorflow and machine/deep learning will become clearer.

### High-Level / Low-Level

One can create the same model in 30 lines of code or in hundreds of lines. We are going to do both. Why? Well, the 30-line model is so *high-level* that we won't really understand what those 30 lines of code are doing, because everything is hidden behind high-level abstractions. So we will also look at lower-level code, which has many more lines but shows us more clearly what exactly is happening. Once we understand how the model works at a lower level, we can work more knowledgeably and quickly with higher-level programming again.


### Tensorflow and Python - BFFs Forever?

Tensorflow works hand-in-hand with Python, and many of its methods closely mirror Python/NumPy code. This is generally a good thing because it gives us familiarity as we move between the two. Despite these similarities, there is a fundamental difference in how they execute: **immediate/eager evaluation** vs. **graph computation**.

In short, Python evaluates (runs) code as soon as it reaches it. Tensorflow, on the other hand, does not immediately run its operations. Instead, it builds a so-called *graph*, which describes how each variable and operation is connected to the others. In order to execute (run) the graph, we must explicitly tell Tensorflow to do so. This will become apparent in the example code below.
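
To make the contrast concrete before touching Tensorflow itself, here is a minimal sketch of deferred evaluation in plain Python. The names `Const`, `Add` and `run` are made up for this illustration and are not Tensorflow APIs; the only point is that building the structure and computing the result are two separate steps.

```python
# A toy "graph" in plain Python: building nodes does no arithmetic;
# only run() walks the structure and computes a value.
# Const, Add and run are hypothetical names, not Tensorflow APIs.

class Const:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

def run(node):
    """Evaluate a node by recursively evaluating its inputs."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Add):
        return run(node.left) + run(node.right)
    raise TypeError("unknown node type: %r" % (node,))

eager_sum = 2 + 3                     # Python computes this immediately
graph_sum = Add(Const(2), Const(3))   # this only records the structure

print(eager_sum)        # 5
print(run(graph_sum))   # 5, computed only when we ask for it
```

Printing `graph_sum` directly would show an object, not a number, which is the same surprise we are about to see with Tensorflow.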

Final note - this introductory code has been taken from several sources, which often do a better job explaining than I can, so we will follow along with some external sites as we go. To start with, please open [this tutorial](https://jacobbuckman.com/post/tensorflow-the-confusing-parts-1/#understanding-tensorflow) to read along with the example code below.

In [1]:
import tensorflow as tf

# create a tf.Variable
one_node = tf.Variable(2)

#create a Python variable
py_var = 2

print("tf graph element: %s" % (one_node,),"python variable execution: %s" % (py_var,))

tf graph element: <tf.Variable 'Variable:0' shape=() dtype=int32_ref> python variable execution: 2


Note that the Tensorflow code above did not evaluate to a value, but instead returned a graph element (here a tf.Variable; most graph elements we will meet are **tensors** - more on that later). On the other hand, the Python variable was immediately evaluated and returned 2.

This creates some very important differences. For example, in Python if we assign 2 to the same variable several times, we still have 1 variable with a value of 2. If we then assign this variable a value of 3, we have the same 1 variable but with a value of 3. In Tensorflow, the graph does something quite different. Let's see for ourselves below.

In [2]:
two_node = tf.constant(2)
print(two_node)

Tensor("Const:0", shape=(), dtype=int32)


In [3]:
two_node = tf.constant(2)
two_node = tf.constant(2)
two_node = tf.constant(2)
tf.constant(3)

<tf.Tensor 'Const_4:0' shape=() dtype=int32>

Notice that in the first code block we have 'Const' and then in the second code block we have 'Const_4'. This means that each call to tf.constant adds a new constant node to the graph, regardless of which Python variable we assign the result to.

If we don't wish for this to happen, then we should create "pointer" variables that refer to an existing node instead of calling tf.constant again.

In [4]:
pointer_at_two_node = two_node
print(pointer_at_two_node)

Tensor("Const_3:0", shape=(), dtype=int32)


Above, we can notice 2 things. First, we didn't create another tf.constant but simply pointed to the one already stored in two_node. Second, the tf.constant is 'Const_3' instead of 'Const_4' because we created 'Const_4' without assigning it to a Python variable. The variable we are "pointing" to contains 'Const_3'.
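
The naming behaviour above can be mimicked in a few lines of plain Python. This is only a sketch: `ToyGraph` and its `constant` method are hypothetical, and the one thing borrowed from Tensorflow is the 'Const', 'Const_1', ... naming pattern seen in the output.

```python
# Toy model of graph auto-naming. ToyGraph is hypothetical; it only
# imitates the Const, Const_1, ... pattern shown in the outputs above.

class ToyGraph:
    def __init__(self):
        self.nodes = {}     # unique name -> stored value
        self._counts = {}   # base name -> how many times it was used

    def constant(self, value, base="Const"):
        n = self._counts.get(base, 0)
        self._counts[base] = n + 1
        name = base if n == 0 else "%s_%d" % (base, n)
        self.nodes[name] = value
        return name  # stands in for the tensor handle

g = ToyGraph()
two_node = g.constant(2)
two_node = g.constant(2)   # rebinding the Python name still adds a node
alias = two_node           # plain assignment adds nothing to the graph

print(two_node, alias)     # Const_1 Const_1
print(sorted(g.nodes))     # ['Const', 'Const_1']
```

The graph keeps both nodes even though the Python name `two_node` only refers to the latest one.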

So what we have created so far are called **nodes** in Tensorflow - that is, individual values. When we want to compute something from them, we add an **operation**.

In [5]:
two_node = tf.constant(2)
three_node = tf.constant(3)
sum_node = two_node + three_node ## equivalent to tf.add(two_node, three_node)
print(sum_node)

Tensor("add:0", shape=(), dtype=int32)


Though very simple, we have our 1st **computational graph**. Notice that it has not been evaluated yet, for when we print sum_node we get the operation "add", not 5 like we would if this were just in Python.

In order to evaluate a graph, we need to begin what is called a **session**. A session is what actually executes (evaluates) the graph.

In [6]:
sess = tf.Session()

print(sess.run(sum_node))

5


Perhaps we want the values of nodes as well as operations:

In [7]:
print(sess.run([two_node, sum_node]))

[2, 5]


And we can return them to Python variables if we want:

In [8]:
node_val, op_result = sess.run([two_node, sum_node])
print(node_val, op_result, node_val+op_result)

2 5 7


Sometimes when creating a graph, we want to add in values at a later point (for instance, perhaps we need to insert an image file into our graph and evaluate it). To do so, we can make use of **tf.placeholder** and **feed_dict**. As the name suggests, placeholder simply holds an empty location in the graph. It has no value until we pass in a value using feed_dict. 
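
As a rough picture of what a placeholder does, here is a plain-Python sketch in which a named slot gets its value only at run time. `Placeholder`, `Add` and `run` are hypothetical stand-ins, not Tensorflow APIs, and the `feed_dict` argument merely imitates the idea.

```python
# Toy placeholder: an empty slot whose value is supplied when we run.
# Placeholder, Add and run are hypothetical, not Tensorflow APIs.

class Placeholder:
    def __init__(self, name):
        self.name = name

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

def run(node, feed_dict=None):
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("no value fed for placeholder %r" % node.name)
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.left, feed_dict) + run(node.right, feed_dict)
    return node  # plain Python numbers act as constants

x = Placeholder("x")
x_plus_three = Add(x, 3)

print(run(x_plus_three, feed_dict={x: 2}))    # 5
print(run(x_plus_three, feed_dict={x: 10}))   # 13
```

The same structure is evaluated twice with different inputs, and running it without feeding `x` raises an error.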

Below we'll feed a tf.placeholder with a Python variable of value 2.

In [9]:
input_placeholder = tf.placeholder(tf.int32)
feed_value = 2
print(sess.run(input_placeholder, feed_dict={input_placeholder: feed_value}))

2


So far we've evaluated graphs that contain static or "no-ancestor" nodes - they don't update. However, with machine/deep learning, **we often need nodes that update** - this is how they learn! Most of the **parameters** we want to train/learn will be implemented as Tensorflow **variables**. 

Also, we've only been dealing with **single values**, called **scalars**, so far. Generally, we will be working with **matrices (and "tensors")** of varying dimensions. To do this, we give each variable a **shape**, which lists its size along each dimension.

Finally, we can also give the variable a **name** to make it easier to identify, or Tensorflow will do it for us (e.g. 'Const_5', etc.).
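
To make the idea of a shape concrete, the small helper below computes the shape of a regular nested Python list. `shape_of` is a hypothetical helper written for this illustration, and it assumes every sub-list at a given depth has the same length.

```python
def shape_of(value):
    """Return the shape of a regular nested list as a list of ints."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        # descend into the first element; assumes the nesting is regular
        value = value[0] if value else None
    return shape

print(shape_of(5))                        # []      (a scalar)
print(shape_of([1, 2, 3]))                # [3]     (a vector)
print(shape_of([[1, 2, 3], [4, 5, 6]]))   # [2, 3]  (a 2x3 matrix)
```

The last result, [2, 3], is the kind of shape list we pass when declaring a variable for a 2x3 matrix.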

In [10]:
count_variable = tf.get_variable("count", [2, 3])
zero_node = tf.constant([[1, 2, 3], [4, 5, 6]])
assign_node = tf.assign(count_variable, zero_node)
sess = tf.Session()
sess.run(assign_node)
print(sess.run(count_variable))

TypeError: Input 'value' of 'Assign' Op has type int32 that does not match type float32 of argument 'ref'.

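Note that we received an error for data type incompatibility: tf.get_variable defaulted to float32 (the 'ref' in the message), while our constant of Python integers was created as int32. We often need to declare which data type we want to work with to avoid this error.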

In [11]:
count_variable = tf.get_variable("count", [2, 3], dtype=tf.float32)
zero_node = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
assign_node = tf.assign(count_variable, zero_node)
sess = tf.Session()
sess.run(assign_node)
print(sess.run(count_variable))

ValueError: Variable count already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "<ipython-input-10-5dd14a5ec98a>", line 1, in <module>
    count_variable = tf.get_variable("count", [2, 3])
  File "/Users/jonathansherman/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/Users/jonathansherman/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):


Oh, now another error - this time for variable reuse. This is because we have given the variable a specific name, so Tensorflow cannot simply add another variable like it did when we allowed it to autoname variables. In the earlier cell, the count_variable line ran successfully before the error occurred further down, so "count" already exists in the graph; Tensorflow thinks we are trying to add it again and blocks us. A simple workaround when this happens: comment out the count_variable line until the block of code works, then include it next time.

In [12]:
#count_variable = tf.get_variable("count", [2, 3], dtype=tf.float32)
zero_node = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
assign_node = tf.assign(count_variable, zero_node)
sess = tf.Session()
sess.run(assign_node)
print(sess.run(count_variable))

[[1. 2. 3.]
 [4. 5. 6.]]


Or reset the "default graph" which will clear variables and let us start anew.

In [13]:
tf.reset_default_graph()
count_variable = tf.get_variable("count", [2, 3], dtype=tf.float32)
zero_node = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
assign_node = tf.assign(count_variable, zero_node)
sess = tf.Session()
sess.run(assign_node)
print(sess.run(count_variable))

[[1. 2. 3.]
 [4. 5. 6.]]


Great, so it finally worked, but perhaps it is still confusing. Why this "get_variable" and "assign" process? Why can't we just create a variable in one step? Well, we can with **tf.Variable()**, but tf.get_variable checks the variable's name first, which makes sure we aren't creating unnecessary graph elements like we've shown before. For instance, if there already is a variable with the name we are looking for, we don't want to create another. tf.Variable() will always try to create another.

In [14]:
blocked_additional_count_variable = tf.get_variable("count", [2, 3], dtype=tf.float32)

ValueError: Variable count already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "<ipython-input-13-63a0e12acad5>", line 2, in <module>
    count_variable = tf.get_variable("count", [2, 3], dtype=tf.float32)
  File "/Users/jonathansherman/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/Users/jonathansherman/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):


In [15]:
create_additional_count_var = tf.Variable(0, name = 'count')
print(create_additional_count_var)

<tf.Variable 'count_1:0' shape=() dtype=int32_ref>


See above: get_variable raised an error as we expected, whereas tf.Variable created a new node with a 0 value, autonamed 'count_1'. If this is not what we wanted, we will have problems with our graph.
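
The difference between the two creation styles can be sketched with a toy variable store in plain Python. `ToyStore` and its methods are hypothetical, and the `reuse` flag here is a simplification of the option mentioned in the error message, not a faithful copy of Tensorflow's variable scopes.

```python
# Toy variable store contrasting the two creation styles described above.
# ToyStore, get_variable and variable are hypothetical names.

class ToyStore:
    def __init__(self):
        self.vars = {}

    def get_variable(self, name, initial=0.0, reuse=False):
        """Create `name`, or return the existing entry when reuse=True."""
        if name in self.vars:
            if not reuse:
                raise ValueError("Variable %s already exists" % name)
            return name
        self.vars[name] = initial
        return name

    def variable(self, initial, name="Variable"):
        """Always create a new entry, renaming on collision."""
        unique, i = name, 0
        while unique in self.vars:
            i += 1
            unique = "%s_%d" % (name, i)
        self.vars[unique] = initial
        return unique

store = ToyStore()
print(store.get_variable("count"))               # count
print(store.variable(0, name="count"))           # count_1
print(store.get_variable("count", reuse=True))   # count (the existing one)
try:
    store.get_variable("count")
except ValueError as err:
    print("blocked:", err)
```

The name-checking style protects us from silent duplicates, while the always-create style quietly grows the store.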

Finally, there is another way to give variables their values other than using the 'assign' method. We can use "initializers", which set variables to an initial value when we run the initialization op. Since these values will generally be updated by our training, we often don't care what the initial value is, or just set it to 0.

In [None]:
const_init_node = tf.constant_initializer(0.)
count_variable = tf.get_variable("count", [], initializer=const_init_node)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
print(sess.run(count_variable))

Oh yeah, that annoying "Variable count already exists" error again. We've "accidentally" created a bunch of nodes we don't want, and are trying to create a variable whose name is already taken. Let's reset the graph and get this right by using **tf.reset_default_graph**.

In [None]:
tf.reset_default_graph()

const_init_node = tf.constant_initializer(0.)
count_variable = tf.get_variable("count", [], initializer=const_init_node)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
print(sess.run(count_variable))

Hopefully this gives you an idea of how Tensorflow works with, but is different from, Python, and how building a graph of nodes and operations and *then* evaluating it differs from the immediate evaluation we are probably used to with Python or other programming languages.

There is plenty more to learn about Tensorflow, but this is a good start. 

Let's see how this is put together in a toy **linear regression** example.

Most importantly, pay attention to how the variables are updated as the model trains and evaluates. This is called **optimization**, and a typical optimization will do something along the lines of the following:

1. Get an input and true_output
2. Compute a “guess” based on the input and your parameters
3. Compute a “loss” based on the difference between your guess and the true_output
4. Update the parameters according to the gradient of the loss
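
Before looking at the Tensorflow version, the four steps above can be sketched in plain Python with hand-derived gradients. This is an illustration under stated assumptions, not what Tensorflow does internally: the target parameters, the seed and the learning rate of 0.1 are all chosen here just so the toy run converges quickly.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

true_m, true_b = 0.7, 0.3   # assumed target parameters for this sketch
m, b = 0.0, 0.0             # parameters to learn
learning_rate = 0.1         # assumed value, picked for fast convergence

for step in range(5000):
    # (1) get an input and its true output
    x = random.random()
    y_true = true_m * x + true_b
    # (2) compute a guess from the input and the current parameters
    y_guess = m * x + b
    # (3) compute the squared-error loss
    error = y_guess - y_true
    loss = error ** 2
    # (4) step each parameter against its gradient:
    #     d(loss)/dm = 2 * error * x   and   d(loss)/db = 2 * error
    m -= learning_rate * 2 * error * x
    b -= learning_rate * 2 * error

print("learned m=%.3f, b=%.3f" % (m, b))
```

The learned values should end up very close to the assumed targets. In the Tensorflow version, step (4) is what the optimizer works out for us, so we never write the gradients by hand.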

In [None]:
### build the graph
## first set up the parameters
m = tf.get_variable("m", [], initializer=tf.constant_initializer(0.))
b = tf.get_variable("b", [], initializer=tf.constant_initializer(0.))
init = tf.global_variables_initializer()

## then set up the computations
input_placeholder = tf.placeholder(tf.float32)
output_placeholder = tf.placeholder(tf.float32)

x = input_placeholder
y = output_placeholder
y_guess = m * x + b

loss = tf.square(y - y_guess)

## A "Hyperparameter" - you control this!
learning_rate = 0.001

## finally, set up the optimizer and minimization node
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = optimizer.minimize(loss)

### start the session
sess = tf.Session()
sess.run(init)

### perform the training loop
import random

## set up problem
true_m = random.random()
true_b = random.random()

for update_i in range(10000):
    ## (1) get the input and output
    input_data = random.random()
    output_data = true_m * input_data + true_b
    
    ## (2), (3), and (4) all take place within a single call to sess.run()!
    _loss, _ = sess.run([loss, train_op], feed_dict={input_placeholder: input_data, output_placeholder: output_data})
    print(update_i, _loss)
    

### finally, print out the values we learned for our two variables
print("True parameters:     m=%.4f, b=%.4f" % (true_m, true_b))
print("Learned parameters:  m=%.4f, b=%.4f" % tuple(sess.run([m, b])))

Often when we are developing a model, we want to inspect the graph, its variables and its performance. We can (and usually will) just print or plot the values for quick visualization. However, Tensorflow has more powerful tools for inspecting our graph and model, such as Tensorboard. Below we will run the same toy linear regression model but save the variable data we are interested in.

In [None]:
import datetime

tf.reset_default_graph()

### build the graph
## first set up the parameters
m = tf.get_variable("m", [], initializer=tf.constant_initializer(0.))
## save our variable for Tensorboard
tf.summary.histogram("m_prediction",m)
b = tf.get_variable("b", [], initializer=tf.constant_initializer(0.))
tf.summary.histogram("b_prediction",b)

init = tf.global_variables_initializer()

## then set up the computations
input_placeholder = tf.placeholder(tf.float32, name="input")
output_placeholder = tf.placeholder(tf.float32, name="output")

x = input_placeholder
y = output_placeholder
tf.summary.histogram("y_true",output_placeholder)
y_guess = m * x + b
tf.summary.histogram("y_guess",y_guess)

loss = tf.square(y - y_guess)
tf.summary.scalar("loss", loss)

## finally, set up the optimizer and minimization node
optimizer = tf.train.GradientDescentOptimizer(1e-3)
train_op = optimizer.minimize(loss)

### start the session
sess = tf.Session()
## merge all our Tensorboard summaries and create file directory
summaryMerged = tf.summary.merge_all()
filename="./summary_log/run"+datetime.datetime.now().strftime("%Y-%m-%d--%H-%M-%S")
writer = tf.summary.FileWriter(filename, sess.graph)

## run variables initializer
sess.run(init)

### perform the training loop
import random

## set up problem
true_m = random.random()
true_b = random.random()

for update_i in range(20000):
    ## (1) get the input and output
    input_data = random.random()
    output_data = true_m * input_data + true_b
    
    ## (2), (3), and (4) all take place within a single call to sess.run()!
    _loss, _, sumOut = sess.run([loss, train_op,summaryMerged], feed_dict={input_placeholder: input_data, output_placeholder: output_data})
    print(update_i, _loss)
    
    if update_i % 100 == 0:
        ## write all of the variables to file
        writer.add_summary(sumOut, update_i)

### finally, print out the values we learned for our two variables
print("True parameters:     m=%.4f, b=%.4f" % (true_m, true_b))
print("Learned parameters:  m=%.4f, b=%.4f" % tuple(sess.run([m, b])))

Now, open up your terminal and cd to the directory our program is in.

Then run ```tensorboard --logdir=./summary_log/``` and copy the http address it prints into a new browser window. You should see our graph, scalars, histograms and more. Run the model more than once, and you can compare the results between runs. Very useful!

For more details see the source I adapted this from: https://thecodacus.com/tensorboard-tutorial-visualize-networks-graphically/

We can also just visualize our graph inline with Jupyter via [the following code](https://stackoverflow.com/questions/38189119/simple-way-to-visualize-a-tensorflow-graph-in-jupyter):

In [None]:
import numpy as np
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so assign bytes rather than str (Python 3)
                tensor.tensor_content = b"<stripped %d bytes>" % size
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [None]:
show_graph(tf.get_default_graph().as_graph_def())

You can zoom and click each node and operation to see its inputs and outputs, which can be very helpful for debugging or just understanding what is going on.

Furthermore, and importantly, we can save and reload trained models. This is essential for any serious machine/deep learning project that takes significant time to train. Now we can train something for weeks or months, or even forever.

In [None]:
tf.reset_default_graph()

### build the graph
## first set up the parameters
m = tf.get_variable("m", [], initializer=tf.constant_initializer(0.))
## save our variable for Tensorboard
tf.summary.histogram("m_prediction",m)
b = tf.get_variable("b", [], initializer=tf.constant_initializer(0.))
tf.summary.histogram("b_prediction",b)

init = tf.global_variables_initializer()

## then set up the computations
input_placeholder = tf.placeholder(tf.float32)
output_placeholder = tf.placeholder(tf.float32)

x = input_placeholder
y = output_placeholder
tf.summary.histogram("y_true",output_placeholder)
y_guess = m * x + b
tf.summary.histogram("y_guess",y_guess)

loss = tf.square(y - y_guess)
tf.summary.scalar("loss", loss)

# a "hyperparameter"
learning_rate = 0.001

## finally, set up the optimizer and minimization node
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = optimizer.minimize(loss)

### start the session
sess = tf.Session()
## merge all our Tensorboard summaries and create file directory
summaryMerged = tf.summary.merge_all()
filename="./summary_log/run"+datetime.datetime.now().strftime("%Y-%m-%d--%H-%M-%S")
writer = tf.summary.FileWriter(filename, sess.graph)

## create tf.saver
saver = tf.train.Saver()

## run variables initializer
sess.run(init)

### perform the training loop
import random

## set up problem
true_m = random.random()
true_b = random.random()

for update_i in range(20000):
    ## (1) get the input and output
    input_data = random.random()
    output_data = true_m * input_data + true_b
    
    ## (2), (3), and (4) all take place within a single call to sess.run()!
    _loss, _, sumOut = sess.run([loss, train_op,summaryMerged], feed_dict={input_placeholder: input_data, output_placeholder: output_data})
    print(update_i, _loss)
    
    if update_i % 100 == 0:
        ## write all of the variables to file
        writer.add_summary(sumOut, update_i)

## save to directory
save_path = saver.save(sess, './model/weights')

### finally, print out the values we learned for our two variables
print("True parameters:     m=%.4f, b=%.4f" % (true_m, true_b))
print("Learned parameters:  m=%.4f, b=%.4f" % tuple(sess.run([m, b])))
print("Model saved in path: %s" % save_path)

So now we should have a folder "model" with our weight files saved (these are all of our variables). We can now reload the model and continue working with it, without having to retrain from scratch. More on working with and saving models here: https://www.tensorflow.org/guide/saved_model
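
The idea of saving and reloading can be illustrated without Tensorflow at all. The sketch below writes a dictionary of parameters to a JSON file and reads it back into a "reset" model. Tensorflow's checkpoint files use a different format; the file name and the parameter values here are made up for the example.

```python
import json
import os
import tempfile

def save_params(params, path):
    """Write a dict of parameter values to disk as JSON."""
    with open(path, "w") as f:
        json.dump(params, f)

def load_params(path):
    """Read the parameter dict back from disk."""
    with open(path) as f:
        return json.load(f)

# made-up values standing in for learned weights
trained = {"m": 0.7012, "b": 0.2987}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.json")   # hypothetical file name
    save_params(trained, path)

    fresh = {"m": 0.0, "b": 0.0}    # a "reset" model
    fresh.update(load_params(path)) # restore the saved values

print(fresh == trained)   # True
```

The pattern is the same one we are about to use: reset the variables, restore from disk, and check that the trained values are back.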

In [None]:
# the with-block opens the session and closes it for us when the block ends
with tf.Session() as sess:
    # Reset all variables to initial state
    sess.run(init)
    
    # Verify reset variable
    print('reset value of var m',m.eval())
    
    reader = tf.train.Saver()
    reader.restore(sess, './model/weights')
    
    # Verify restored variable
    print('restored value of var m',m.eval())

One tricky but essential aspect of working with Tensorflow is managing graphs and sessions. Recall that the graph contains all the nodes and operations we build for our learning model. We call a session when we want to evaluate the graph.

Things can get confusing though if we:

1. have more than one graph,
2. need to run more than one session,
3. want to reset or update graph elements, or
4. want to access graph elements for input or output.

We won't cover all techniques here, but let's get a bit more familiar with graph and session usage before moving on.

We can assign our default graph (which is automatically instantiated by Tensorflow) to a variable if we wish. Normally we wouldn't do this - it is just to demonstrate. More typically, you would name a graph at the outset, or when constructing a second, different graph.

In [None]:
graph = tf.get_default_graph()

In [None]:
# Verify it's the graph we want
show_graph(graph)

Another way to access all of the nodes in our graph is to iterate over them via graph_def:

In [None]:
graphdef = graph.as_graph_def()
for node in graphdef.node:
    print(node)

Similarly, we can access all of the operations in our graph:

In [None]:
for op in graph.get_operations(): 
    print(op.name)

And get a specific operation by name:

In [None]:
some_op = graph.get_operation_by_name("gradients/mul_grad/Shape")
print(some_op)

Or filter operations by name and type:

In [None]:
layers = [op.name for op in graph.get_operations() if op.type=='Const' and 'gradients' in op.name]
print(layers)

In [None]:
#close and clear our default graph
sess.close()
tf.reset_default_graph()

In [None]:
show_graph(tf.get_default_graph())

In [None]:
train_graph = tf.Graph()
eval_graph = tf.Graph()
infer_graph = tf.Graph()

x = tf.constant(0)

with train_graph.as_default():
    with tf.variable_scope("share", reuse=True):
        a = tf.constant([[-1, -2, -23], [-4, -5, -6]])
        b = tf.constant([[1, 2, 3], [4, 5, 6]])
        c = tf.add(a,b)

with eval_graph.as_default():
    with tf.variable_scope("share_again", reuse=True):
        c1 = c

with infer_graph.as_default():
    with tf.variable_scope("sharing_through_2", reuse=True):
        c2 = c1

In [None]:
show_graph(train_graph)