# TensorFlow 101

_DISCLAIMER: this is NOT a TensorFlow tutorial/course. Please find and follow one after this appetizer!_

In [1]:
import tensorflow as tf
import numpy as np

In [2]:
tf.__version__

'1.0.0'

## Create a graph and run it in a TF session

A note on `reset_graph()`. While experimenting (in a Jupyter notebook or a Python shell) it is common to run the same commands many times. With graphs, this means you may end up with a default graph containing many duplicate nodes. One solution is to restart the Python shell or the Jupyter kernel frequently, which is pretty inconvenient. It is more convenient to simply reset the default graph by running

    tf.reset_default_graph() 

or a customized wrapper around it, such as the one defined below:

    reset_graph()

In [3]:
# to make this notebook's output stable across runs
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

In [None]:
reset_graph()

Start simple.

In [None]:
reset_graph()   # note: I put this at the beginning of a cell every time I want to create a new graph from scratch

a = tf.constant(3)
b = tf.constant(5)
s = a + b

What did I do?

In [None]:
a

In [None]:
b

In [None]:
s

These are all *tensors*. A "tensor" is **a piece of data that flows through a graph**: this is the reason behind the "TensorFlow" name, by the way.

Where is this graph? If you specify nothing, you are working on the "default" one.

In [None]:
tf.get_default_graph()

Let's explain a bit.

A TensorFlow computation is, in general, represented as a dataflow graph. A `tf.Graph` contains a set of `tf.Operation` objects, which represent units of computation, and `tf.Tensor` objects, which represent the units of data that flow between operations. A default graph is always registered, and is accessible by calling `tf.get_default_graph()`.

To add an operation to the default graph, simply call one of the functions that define a new operation. In other words, while you build a computation graph, any node you create is automatically added to the default graph.

In [None]:
reset_graph()   # note: I put this at the beginning of a cell every time I want to create a new graph from scratch

a = tf.constant(3)
b = tf.constant(5)
s = a + b

s.graph is tf.get_default_graph()
#assert s.graph is tf.get_default_graph()

Another typical usage involves the `tf.Graph.as_default()` context manager, which overrides the current default graph for the lifetime of the context.

(Indeed, sometimes you want to manage multiple independent graphs: you can always create a new graph and temporarily make it the default graph inside a `with` block.)

In [None]:
reset_graph() 

my_graph = tf.Graph()
with my_graph.as_default(): # i.e. "using my_graph as the default graph, do as follows.."
    a = tf.constant(3)      # Define operations and tensors in `my_graph`.
    b = tf.constant(5)
    s = a + b
    
s.graph is my_graph

But:

In [None]:
s.graph is tf.get_default_graph()

.. because you are now outside the context in which `my_graph` was the default graph.
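
The behaviour above can be mimicked in plain Python. The following is only a toy sketch, not how TensorFlow is implemented: `ToyGraph`, `ToyNode` and the module-level stack are hypothetical names invented for illustration. It assumes that the "default graph" is simply the top of a stack, and that a context manager pushes a graph on entry and pops it on exit.

```python
import contextlib


class ToyGraph:
    """Hypothetical stand-in for a graph: just a list of nodes."""
    def __init__(self):
        self.nodes = []


# The top of this stack plays the role of the "default graph".
_default_stack = [ToyGraph()]


def get_default_graph():
    return _default_stack[-1]


@contextlib.contextmanager
def as_default(graph):
    """Make `graph` the default inside a `with` block, then restore the previous one."""
    _default_stack.append(graph)
    try:
        yield graph
    finally:
        _default_stack.pop()


class ToyNode:
    """Hypothetical node: registers itself in the default graph when it is created."""
    def __init__(self, name):
        self.name = name
        self.graph = get_default_graph()
        self.graph.nodes.append(self)


my_graph = ToyGraph()
with as_default(my_graph):
    a = ToyNode("a")          # created while my_graph is the default
b = ToyNode("b")              # created after the context has ended

print(a.graph is my_graph)              # True
print(b.graph is my_graph)              # False
print(get_default_graph() is my_graph)  # False
```

A node remembers the graph it was created in, which is why `a.graph is my_graph` stays true even after the `with` block has ended.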

Try one more (in the default Graph).

In [None]:
reset_graph()

x = tf.Variable(3, name="x")   # note: Variable with capital V
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

Note (for very precise people): `tf.constant()` is a function that adds a constant op (a node) to the graph, while `tf.Variable` is a class. The PEP 8 Python style guide says that "Class names should normally use the CapWords convention".

In [None]:
x

In [None]:
y

In [None]:
f

In [None]:
tf.get_default_graph()

Neither of the two code examples above actually performs any computation: each just creates a computation graph.

In fact, nothing has been evaluated yet, and the Variables are not even initialized.
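
To make the "build first, compute later" idea concrete, here is a minimal plain-Python sketch. It is a toy model and not TensorFlow's implementation: the `Node` class and the `as_node` and `run` helpers are hypothetical, and for simplicity `x` and `y` are treated as constants rather than Variables. Arithmetic on nodes only records which operation to perform; the numbers are combined only when `run` is called.

```python
class Node:
    """Hypothetical graph node: stores an operation and its inputs, not a result."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        return Node("add", (self, as_node(other)))

    def __mul__(self, other):
        return Node("mul", (self, as_node(other)))


def as_node(x):
    """Wrap plain Python numbers as constant nodes."""
    return x if isinstance(x, Node) else Node("const", value=x)


def run(node):
    """Execution phase: evaluate the inputs of a node first, then the node itself."""
    if node.op == "const":
        return node.value
    left, right = (run(i) for i in node.inputs)
    return left + right if node.op == "add" else left * right


# Construction phase: this only builds Node objects, no arithmetic happens yet.
x = as_node(3)
y = as_node(4)
f = x * x * y + y + 2

print(f.op)     # add
print(run(f))   # 42
```

Printing `f.op` shows that `f` is a description of work to be done; the value 42 appears only once the graph is run.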

Once you create a computation graph, next step is to evaluate it. To do so, you need to open a TF **session**, and in it you **initialize** the variables and **evaluate** the function (e.g. f as in the second example):

* it is the TF session that takes care of placing the operations onto **devices** such as CPUs or GPUs and running them
* it is the TF session that holds all the variable values (which is why you need to close the session to free up resources)

(_NOTE: in distributed TF, variable values are held by the servers rather than by the session_)

Let's see an example of code that creates a session, initializes the variables, evaluates f, then closes the session.

In [None]:
reset_graph()

x = tf.Variable(3, name="x")   # note: Variable with capital V
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)

print(result)

sess.close()

OK, but it is a bit cumbersome to repeat `sess.run()` every time, so fortunately there is a cleaner way (a context manager):

In [None]:
reset_graph()

x = tf.Variable(3, name="x")   # note: Variable with capital V
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

print(result)

Above, `f.eval()` means that you are asking to "_please evaluate the tensor f, i.e. run the part of the graph it depends on (in a session, with variables initialised)_". Inside the `with` block, `sess` is set as the default session, so `f.eval()` is equivalent to `sess.run(f)`. It is also quite clean, as the session is closed automatically as soon as you leave the context (`with`).

But.. do I really need to manually run the initializer once for every single variable I have? What if I have many?! Of course, there is a neat way: you can use the `tf.global_variables_initializer()` function, as below.

In [None]:
reset_graph()

x = tf.Variable(3, name="x")   # note: Variable with capital V
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

init = tf.global_variables_initializer()   # prepare an init node

with tf.Session() as sess:
    init.run()                             # actually initialize all variables
    result = f.eval()
    
print(result)

Note that `tf.global_variables_initializer()` does not perform the initialization immediately: it creates a node in the graph that will initialize all variables when it is run (the initial values themselves were already specified when the variables were created).

In [None]:
init

Note that, of course, initialisation is needed only for Variables, not for constants. E.g. if I take the constants-only graph from the earlier example and run the initializer anyway, it works fine:

In [None]:
reset_graph()

a = tf.constant(3)
b = tf.constant(5)
s = a + b

init = tf.global_variables_initializer()   # prepare an init node

with tf.Session() as sess:
    init.run()                             # actually initialize all variables
    result = s.eval()
    
print(result)


.. but the initialisation was actually unnecessary: if I skip it, everything still works:

In [None]:
reset_graph()

a = tf.constant(3)
b = tf.constant(5)
s = a + b

with tf.Session() as sess:
    result = s.eval()
    
print(result)

### Let's recap!

So, we have seen that a piece of TF code is typically split into two parts:

* **construction phase**: you build a computation graph
* **execution phase**: you run it

If you are using TF for ML (NOTE THAT TF IS NOT ONLY FOR ML!):

* **construction phase**: you build a computation graph representing your ML model and the computations required to train it
* **execution phase**: you run a loop that evaluates a training step repeatedly (e.g. one step per mini-batch), gradually improving the model parameters.

# Lifecycle of a node value

When you evaluate a node, TF automatically determines the set of nodes that it depends on and it evaluates these nodes first.

In [None]:
reset_graph()

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

#init = tf.global_variables_initializer() 

with tf.Session() as sess:
#    init.run()
    print(y.eval())  # 10
    print(z.eval())  # 15
    

The code above:
* resets the default graph
* defines a very simple graph
* starts a session
* runs the graph to evaluate $y$
* automatically detects that $y$ depends on $x$, which in turn depends on $w$, so it evaluates $w$, then $x$, then $y$, in this order, and returns the value of $y$
* runs the graph to evaluate $z$ (and again it first needs $w$, then $x$, to do so)

Note that it does _not_ re-use the results of the previous calculation of $w$ and $x$: the preceding code evaluates $w$ and $x$ twice! BAD!

**IMPORTANT: all node values are dropped between graph runs, except variable values, which are maintained by the session across graph runs. A variable starts its life when its initializer is run, and it ends when the session is closed.**

In order to evaluate $y$ and $z$ more efficiently, without evaluating $w$ and $x$ twice as in the preceding code block, you must ask TF to evaluate both $y$ and $z$ in just one graph run, as below:

In [None]:
reset_graph()

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:   # constants only, so no initializer is needed
    y_val, z_val = sess.run([y, z])
    print(y_val)  # 10
    print(z_val)  # 15

Within a single graph run, TF evaluates only the nodes that the requested outputs depend on, and it evaluates each of them just once.
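
The difference between two separate runs and one combined run can be made visible by counting evaluations. The sketch below is a plain-Python toy model, not TensorFlow code: the `graph` dictionary, the `run` function and the `eval_count` counter are hypothetical. It assumes that node values are cached only for the duration of a single run and dropped afterwards, as described above.

```python
from collections import Counter

# Hypothetical graph: each node name maps to (function, names of its input nodes).
graph = {
    "w": (lambda: 3, ()),
    "x": (lambda w: w + 2, ("w",)),
    "y": (lambda x: x + 5, ("x",)),
    "z": (lambda x: x * 3, ("x",)),
}
eval_count = Counter()


def run(fetches):
    """One 'graph run': node values are cached only while this call lasts."""
    cache = {}

    def evaluate(name):
        if name not in cache:
            fn, inputs = graph[name]
            cache[name] = fn(*(evaluate(i) for i in inputs))
            eval_count[name] += 1
        return cache[name]

    return [evaluate(name) for name in fetches]


# Two separate runs: w and x are each computed twice.
run(["y"])
run(["z"])
separate_runs = dict(eval_count)

# One run fetching both outputs: w and x are each computed once.
eval_count.clear()
y_val, z_val = run(["y", "z"])
single_run = dict(eval_count)

print(separate_runs)
print(single_run)
print(y_val, z_val)   # 10 15
```

In the combined run every node is counted exactly once, because `x` is already in the cache when `z` asks for it.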

_NOTE: In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session would have its own copy of every variable). In distributed TensorFlow, variable state is stored on the servers, not in the sessions, so multiple sessions can share the same variables._
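
The single-process case in the note can be pictured with a small toy model in plain Python. This is an illustrative sketch only: `ToyVariable` and `ToySession` are hypothetical classes, not TensorFlow APIs. It assumes that the graph stores only the definition of a variable (name and initial value), while each session keeps its own private copy of the current value.

```python
class ToyVariable:
    """Hypothetical variable: the graph only records its name and initial value."""
    def __init__(self, name, initial_value):
        self.name = name
        self.initial_value = initial_value


class ToySession:
    """Hypothetical single-process session: owns its own copy of variable values."""
    def __init__(self):
        self.values = {}

    def initialize(self, var):
        self.values[var.name] = var.initial_value   # the value starts its life here

    def assign_add(self, var, delta):
        self.values[var.name] += delta
        return self.values[var.name]

    def read(self, var):
        if var.name not in self.values:
            raise RuntimeError("variable %r is not initialized in this session" % var.name)
        return self.values[var.name]

    def close(self):
        self.values.clear()                         # ... and ends its life here


counter = ToyVariable("counter", 0)   # one definition, shared by both sessions

sess1, sess2 = ToySession(), ToySession()
sess1.initialize(counter)
sess2.initialize(counter)

sess1.assign_add(counter, 1)
sess1.assign_add(counter, 1)          # the value persists across calls within sess1

print(sess1.read(counter))   # 2
print(sess2.read(counter))   # 0

sess1.close()
```

After `sess1.close()`, reading `counter` through `sess1` raises an error in this toy model, mirroring the idea that a variable's value lives only as long as its session.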

## EXERCISES

1. Create a simple graph that calculates $ c = \exp(\sqrt 8 + 3) $. 

2. Now create a Session() and evaluate the operation that gives you the result of the equation above.

3. Create a graph that evaluates and prints both $ b = \sqrt 8 $ and $ c = \exp(\sqrt 8 + 3) $. Try to implement this in a way that only evaluates $ \sqrt 8 $ once.

**Tip**: TensorFlow's API documentation is available at:
https://www.tensorflow.org/versions/master/api_docs/python/

In [None]:
# try yourself in the cells above, and do not read below this line!

### Solution 1

In [None]:
reset_graph()

my_graph = tf.Graph()
with my_graph.as_default():
    c = tf.exp(tf.sqrt(tf.constant(8.)) + tf.constant(3.))
    # actually, this also works:   c = tf.exp(tf.sqrt(8.) + 3.)

### Solution 1+2

In [None]:
reset_graph()

my_graph = tf.Graph()
with my_graph.as_default():
    c = tf.exp(tf.sqrt(tf.constant(8.)) + tf.constant(3.))

with tf.Session(graph=my_graph):
    c_val = c.eval()

c_val

### Solution 3

In [None]:
reset_graph()

my_graph = tf.Graph()
with my_graph.as_default():
    b = tf.sqrt(8.)
    c = tf.exp(b + 3)
    
with tf.Session(graph=my_graph) as sess:
    b_val, c_val = sess.run([b, c])
    
print(b_val)
print(c_val)
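
As a sanity check, the same two quantities can be computed with the Python standard library alone. This is a cross-check outside TensorFlow, and the names `b_ref` and `c_ref` are introduced here only for illustration. Since `tf.sqrt(8.)` produces 32-bit floats, expect agreement only to about six or seven significant digits.

```python
import math

b_ref = math.sqrt(8)          # computed once and reused below
c_ref = math.exp(b_ref + 3)

print(b_ref)   # about 2.828
print(c_ref)   # about 339.8
```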

**Important**: a working but WRONG implementation is below. It gives the right result, but it runs the graph twice, once to evaluate `b`, and once to evaluate `c`.  Since `c` depends on `b`, it means that `b` will be evaluated twice. Not what we wanted.

In [None]:
# WRONG!
with tf.Session(graph=my_graph):
    b_val = b.eval()  # evaluates b
    c_val = c.eval()  # evaluates c, which means evaluating b again!

## END OF THE EXERCISE

## Display and inspect your graph

Display your graph (you do not need to understand all the code in the next cell: just run it. NOTE: it works only on Chrome.)

In [None]:
import numpy as np
from IPython.display import display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = b"<stripped %d bytes>"%size
    return strip_def

def show_graph(graph_def=None, max_const_size=32):
    """Visualize TensorFlow graph."""
    graph_def = graph_def or tf.get_default_graph()
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [None]:
reset_graph()

my_first_graph = tf.Graph()
with my_first_graph.as_default():
    b = tf.sqrt(8.)
    c = tf.exp(b + 3)
    
with tf.Session(graph=my_first_graph) as sess:
    b_val, c_val = sess.run([b, c])
    
print(b_val)
print(c_val)

In [None]:
show_graph(my_first_graph)

Now try this second graph:

In [None]:
reset_graph()

my_second_graph = tf.get_default_graph()

# construction
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

# execution
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

print(result)

In [None]:
show_graph(my_second_graph)

And re-try the same, as a third graph, with global variable initialisation:

In [None]:
reset_graph()

my_third_graph = tf.get_default_graph()

# construction
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2

init = tf.global_variables_initializer() 

# execution
with tf.Session() as sess:
    init.run()
    result = f.eval()

print(result)

In [None]:
show_graph(my_third_graph)

# Recap

TensorFlow operations (aka **ops**) can take any number of inputs and produce any number of outputs. E.g. the addition and multiplication ops each take two inputs and produce one output. Constants and variables take no input (they are called _source ops_). 

The inputs and outputs are multidimensional arrays that flow through the computation graph, called **tensors**. Just like NumPy arrays, tensors have a type and a shape (in fact, in the Python API, the values you get back when you evaluate tensors are NumPy ndarrays). They typically contain floats, but you can also use them to carry strings (arbitrary byte arrays).
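
Since the comparison is with NumPy arrays, a short NumPy-only sketch of "type and shape" may help. It does not use TensorFlow, and the array contents are arbitrary examples chosen for illustration.

```python
import numpy as np

# A 2x3 array of 32-bit floats.
floats = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]], dtype=np.float32)
print(floats.dtype, floats.shape)    # float32 (2, 3)

# A scalar is an array with an empty shape.
scalar = np.array(42, dtype=np.int32)
print(scalar.dtype, scalar.shape)    # int32 ()

# Byte strings can be stored too.
text = np.array([b"hello", b"tf"])
print(text.dtype.kind, text.shape)   # S (2,)
```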