# TensorFlow

TensorFlow is an extensive machine learning library built by Google that specializes in deep learning.

One of the advantages of TensorFlow is that it supports building end-to-end solutions. TensorFlow provides a variety of machine learning algorithms and models that can be used easily, and manages the execution of these algorithms during data preparation, training, evaluation, and deployment.

In particular, TensorFlow is compatible with a variety of CPU and GPU architectures, and can effectively manage clusters of computing resources, for purposes of replication and data-parallelization.

Additionally, TensorFlow provides compatible libraries for the web (Node.js and the browser), and for mobile and IoT devices. It also comes with a large toolkit that helps developers visualize and validate their implementations. In this note, we will use one of these tools, TensorBoard, to visualize TensorFlow graphs.

Below we import the needed extensions to Jupyter Notebook, and use the appropriate magic functions, so that Jupyter can use the visualization tool.

In [1]:
%load_ext tensorboard

def visualize(graph, name):
    from tensorflow.python.summary.writer.writer import FileWriter
    FileWriter('logs/'+name, graph=graph).close()
    %tensorboard --logdir logs/$name

## TensorFlow Versions

TensorFlow has two main versions: v1 and v2. From the user's perspective, the two differ substantially: v2 has many improvements that make its API friendlier and more efficient. However, both versions share many design ideas behind the scenes.

We will begin by looking at v1 briefly, and then contrast it with some of the new features in v2, to help us understand how TensorFlow operates from the perspective of framework design and embedding.

We begin by importing a v1 compatible version of TensorFlow, and disabling eager execution (more on this later). This ensures that the semantics of TensorFlow are very close to v1.

In [2]:
# import tf with a v1 compatible API
import tensorflow.compat.v1 as tf 

# disable eager execution
tf.disable_eager_execution()

# in reality, we are still using v2
# but we will have the illusion of using v1
tf.version.VERSION

'2.0.0'

## TensorFlow V1: Manual Graph Definition

In V1, TensorFlow has two important concepts: **Graph** and **Session**.

A Graph is similar to a blueprint. It consists of a *flow* of instructions. TensorFlow supports having several graphs, each built separately. Additionally, TensorFlow always has a default graph initialized, so that applications with a single graph do not need to initialize one manually.

In [3]:
g1 = tf.Graph()
g2 = tf.get_default_graph()

print('g1 and g2 are the same graph:', g1 is g2)
print('Both graphs are empty:', g1.get_operations(), g2.get_operations())

g1 and g2 are the same graph: False
Both graphs are empty: [] []
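
The idea of a default graph can be sketched in plain Python. This is a toy model under stated assumptions, not TensorFlow's implementation: ToyGraph, the module-level stack, and the naming scheme for constants are all hypothetical names chosen for illustration. The sketch shows how a context manager can temporarily switch which graph receives new operations.

```python
from contextlib import contextmanager

# The bottom of the stack plays the role of the implicit default graph.
_default_stack = []

class ToyGraph:
    """Hypothetical stand-in for a graph: just a list of operation names."""
    def __init__(self):
        self.operations = []

    @contextmanager
    def as_default(self):
        # Make this graph the default, then restore the previous one on exit.
        _default_stack.append(self)
        try:
            yield self
        finally:
            _default_stack.pop()

_default_stack.append(ToyGraph())

def get_default_graph():
    return _default_stack[-1]

def constant(value):
    # New operations go to whichever graph is currently the default.
    # The value is ignored here: this toy only records operation names.
    graph = get_default_graph()
    count = len(graph.operations)
    name = 'Const' if count == 0 else 'Const_%d' % count
    graph.operations.append(name)
    return name

g1 = ToyGraph()
g2 = get_default_graph()

with g1.as_default():
    inside = get_default_graph() is g1
    constant(1.0)
    constant(2.0)

restored = get_default_graph() is g2
print(inside, restored)
print(g1.operations, g2.operations)
```

Both constants end up in g1, while the original default graph stays empty and becomes the default again once the with block exits.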


A graph can be assigned instructions in some order using TensorFlow's API.

Graphs do not contain real data. Instead, they are made of symbolic handles and symbolic operations on them.

In [4]:
g1 = tf.Graph()

with g1.as_default():
    print('Our graph is now the default graph:', g1 is tf.get_default_graph())
    print('')

    a = tf.constant(1.0)
    b = tf.constant(2.0)
    # constants do not expose their values during graph construction
    print('a, b =', a, b)

    c = tf.add(a, b) # very verbose
    c = a + b # fortunately, TensorFlow uses operator overloading
    print(c) # + is not executed during construction

    print('\nThe graph:', g1)
    for op in g1.get_operations():
        print(op.name)

# the previous default graph has been restored, and it is still empty
print('\n', g2 is tf.get_default_graph(), g2.get_operations())
print(g2.get_operations())

Our graph is now the default graph: True

a, b = Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)
Tensor("add_1:0", shape=(), dtype=float32)

The graph: <tensorflow.python.framework.ops.Graph object at 0x7f27532e4588>
Const
Const_1
Add
add_1

 True []
[]
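
To make the role of operator overloading concrete, here is a minimal sketch in plain Python. It is a toy model, not how TensorFlow is implemented: the classes ToyGraph and Sym and the naming scheme for operations are hypothetical. The point is that a + b records an operation in the graph and returns a new symbolic handle, without computing any value.

```python
class ToyGraph:
    """Hypothetical graph: an ordered list of (name, op_type, input_names)."""
    def __init__(self):
        self.ops = []

    def add_op(self, op_type, inputs=()):
        name = '%s_%d' % (op_type, len(self.ops))
        self.ops.append((name, op_type, tuple(i.name for i in inputs)))
        return Sym(self, name)

class Sym:
    """Symbolic handle: knows its name and its graph, but holds no value."""
    def __init__(self, graph, name):
        self.graph = graph
        self.name = name

    def __add__(self, other):
        # `a + b` records an 'add' operation instead of computing a sum.
        return self.graph.add_op('add', (self, other))

    def __repr__(self):
        return 'Sym(%r)' % self.name

g = ToyGraph()
a = g.add_op('const')
b = g.add_op('const')
c = a + b

print(c)
for op in g.ops:
    print(op)
```

Printing c shows only a handle, much like the Tensor("add_1:0", ...) output above shows a name rather than the value 3.0.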


We can use TensorFlow's tools and the Jupyter notebook extension to visualize the graph above.

In [None]:
visualize(g1, 'g1')

<!-- for exporting to HTML
![turned into static image due to exporting to HTML](static/images/tensorflow-graph1.png)
-->

After a graph is built, it can be run within a session. At that point, the graph is instantiated with concrete values, and the operations are actually executed.

In [5]:
with tf.Session(graph=g1) as session:
    print('result:', session.run(c))
    print('a, b =', session.run(a), session.run(b))

result: 3.0
a, b = 1.0 2.0


In many ways, a graph is just a program, while a session is the act of running that program.
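
The program versus execution analogy can be sketched with a toy evaluator in plain Python. This is an illustration only: the dictionary-based graph format and the run function are hypothetical, and the evaluator is deliberately naive (it re-evaluates shared inputs). The same graph description is run twice with different values fed to its placeholder.

```python
import operator

# The "program": a description of operations, with no values computed yet.
# Each entry maps a node name to (kind, payload).
graph = {
    'a': ('const', 1.0),
    'p': ('placeholder', None),
    'sum': ('add', ('a', 'p')),
    'out': ('mul', ('sum', 'sum')),
}

BINARY_OPS = {'add': operator.add, 'mul': operator.mul}

def run(graph, fetch, feed=None):
    """Toy 'session run': evaluate `fetch` by recursively evaluating its inputs."""
    feed = feed or {}
    kind, payload = graph[fetch]
    if kind == 'const':
        return payload
    if kind == 'placeholder':
        return feed[fetch]  # the concrete value is supplied only at run time
    left, right = (run(graph, name, feed) for name in payload)
    return BINARY_OPS[kind](left, right)

first = run(graph, 'out', {'p': 2.0})
second = run(graph, 'out', {'p': 4.0})
print(first, second)
```

The graph is built once and never changes; only the values supplied at run time differ between the two calls.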

We can define more complicated graphs that include loops, conditionals, and variables, all using TensorFlow's API. Additionally, graph and session definition can be interleaved if desired.

In [7]:
g = tf.Graph()
with g.as_default():
    with tf.Session(graph=g) as session:
        t = tf.Variable(5.0, name='var')
        session.run(t.initializer)
    
        c = tf.constant(0.0)
        p = tf.placeholder('float')
    
        # this gets cumbersome with several nested conditionals or loops:
        # the body of every construct has to be a function or lambda
        t = tf.cond(t < p, lambda: t.assign(10), lambda: t.assign(0))
        result = tf.cond(tf.equal(t, c), lambda: tf.constant(1), lambda: tf.constant(0))

        print(session.run(result, {p: 8}))
        print(session.run(result, {p: 8}))
        print(session.run(result, {p: -1}))

0
1
1


An important thing to pay attention to is expressing the dependencies between operations. TensorFlow only runs the operations required to produce the requested result, and skips all others. It determines these dependencies from how tensors and variables are used in the graph definition. Here is a demonstration:

In [8]:
g = tf.Graph()
with g.as_default():
    with tf.Session(graph=g) as session:
        t = tf.Variable(10.0, name='var')
        session.run(t.initializer)
    
        # so far, t has no operations associated with it
        print(session.run(t))

        # surprisingly, this shows 10: the assignment operation was never
        # executed, because the fetched t does not depend on it, so
        # TensorFlow did not consider it relevant
        t.assign(1)
        print(session.run(t))

        # now this works: t refers to the result of the assignment, so
        # fetching t forces the assignment to run first
        t = t.assign(1)
        print(session.run(t))

10.0
10.0
1.0


We can also visualize the resulting graph as before.

In [None]:
visualize(g, 'g')

<!-- for exporting to HTML
![turned into static image due to exporting to HTML](static/images/tensorflow-graph2.png)
-->
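
The dependency-driven execution demonstrated above can be sketched as a graph traversal in plain Python. This is a toy model with hypothetical node names, not TensorFlow's scheduler: starting from the fetched node, we collect only the nodes it transitively depends on. An operation that nothing depends on, such as the bare t.assign(1) earlier, is never reached.

```python
# Hypothetical graph: each node lists the nodes it depends on.
deps = {
    'var': [],
    'assign_unused': ['var'],   # like a bare `t.assign(1)`: never fetched
    'assign_used': ['var'],     # like `t = t.assign(1)`: fetched below
    'read': ['var'],
}

def required_ops(deps, fetch):
    """Return the ops needed to produce `fetch`, in dependency order."""
    order = []
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in deps[node]:
            visit(parent)
        order.append(node)

    visit(fetch)
    return order

read_plan = required_ops(deps, 'read')
assign_plan = required_ops(deps, 'assign_used')
print(read_plan)
print(assign_plan)
```

Fetching the plain read never schedules either assignment, which mirrors why the second print in the cell above still showed 10.0.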

### Design Rationale

This seems complicated at first glance. Why write a Python program that builds a graph (i.e. a custom TensorFlow program), and then run it inside a session? Why not execute TensorFlow directly via Python constructs? Indeed, TensorFlow v2 supports this more straightforward embedding.

However, the reason TensorFlow started out with this abstract and convoluted design is important, and very relevant to things we have already seen in class. Assume the additional layer of a graph was not used, and the program ran directly in Python. It would be very difficult for TensorFlow to have complete information about the entire program. Instead, it could only observe the state of the program as it executes dynamically. This means that TensorFlow could not know for sure which future instructions will be executed (since they have not been reached yet!), and could not know for sure that running the program again will result in the same set of instructions (because of dynamic branching and looping).

TensorFlow requires this global picture of the program (or at least portions of it), since it is essential to being able to manage and distribute the operations of the program correctly and efficiently.

Therefore, TensorFlow had two options to get this global picture:
1. Implement some static code analyzer that can parse the user code and understand its structure.
2. Force users to specify their programs symbolically using TensorFlow's API.

As we have seen in class, the first option can be very challenging to build accurately, and may run into undecidability in the general case. It is burdensome to specify exactly which restrictions must be imposed on user code to make static analysis feasible for this application domain. Furthermore, it is difficult to communicate these restrictions effectively to programmers, or to enforce them in the static analyzer. This explains why TensorFlow chose the second approach.
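
The contrast between observing a running Python program and having it specified symbolically can be illustrated with a short sketch in plain Python. The function names and the tuple encoding of operations are hypothetical, chosen only for this illustration. Observing one execution of ordinary Python code reveals only the branch that was taken, while a symbolic conditional receives both branches as functions and can record both.

```python
def python_program(x, log):
    # Ordinary Python control flow: only the branch taken is ever executed.
    if x > 3:
        log.append('then-branch')
        return 0
    log.append('else-branch')
    return x

def symbolic_cond(pred, true_fn, false_fn):
    # A symbolic API is handed both branches as functions, so it can record
    # both without knowing which one will be taken at run time.
    return ('cond', pred, true_fn(), false_fn())

observed = []
python_program(1, observed)

recorded = symbolic_cond(
    ('greater', 'x', 3),
    lambda: ('const', 0),
    lambda: ('identity', 'x'),
)

print(observed)
print(recorded)
```

This is also why constructs such as tf.cond take the body of each branch as a function or lambda: the framework needs to see both branches before any value is known.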

## TensorFlow V2: Simplified Abstraction

TensorFlow v2 simplifies the programming abstraction, while keeping a lot of the performance benefits.

It re-designs how graphs are created, and how sessions are managed and run.

In particular, a TensorFlow computation using v2 can be defined in two ways:
1. **Eager Execution**: Computations expressed using the TensorFlow API run immediately. In effect, the underlying graph is executed eagerly as it is being constructed. This is enabled by default. While eager execution may reduce performance in certain cases, it is very beneficial for quick prototyping and debugging.

2. **TensorFlow Function**: Graphs can be constructed automatically from Python functions. This is similar to v1, and maintains the performance advantages of having access to the entire graph statically. However, it is a lot cleaner than how v1 operates: the function can be evaluated without the need to use explicit sessions, and the body of the function can use more native Python constructs and less of the TensorFlow API, especially for defining dependencies between operations.

We begin by importing the regular v2 TensorFlow API, and undoing our previous compatibility changes.

In [None]:
# Kill the kernel and all the changes made previously
import os
os._exit(00)

In [1]:
import tensorflow as tf # has eager execution enabled by default
tf.version.VERSION

'2.0.0'

### Eager Execution

Now, we will begin with something simple. We will create a couple of constants and add them together. Note that the output is immediately available.

In [2]:
a = tf.constant(1.0)
print(a) # note that a now has the value exposed

b = tf.constant(2.0)
c = a + b

print(c) # + is executed eagerly
print(c.numpy()) # the value as a numpy value

tf.Tensor(1.0, shape=(), dtype=float32)
tf.Tensor(3.0, shape=(), dtype=float32)
3.0


We can also do something more sophisticated. The example below is similar to one we saw earlier. Notice that because of eager execution, we can choose to encode conditionals and loops either using TensorFlow's API (e.g. using tf.cond) or via the corresponding Python constructs.

The two options have subtle differences. TensorFlow constructs are managed by TensorFlow, and should be used when the underlying operation is meant to be executed and managed by TensorFlow automatically, e.g. when executing on a cluster with some replication or data-parallelization. Python constructs, in contrast, execute in the context of the running Python code, which is not managed by TensorFlow.

In [3]:
t = tf.Variable(5.0, name='var')
print(t) # note the value is there, no need to initialize

c = tf.constant(4.0)
cmp = t > c
print(cmp) # comparison is executed eagerly

# Perform checks using TensorFlow's API (tf.cond)
print('')
for i in range(3):
    t = tf.cond(t > c, true_fn=lambda: t.assign(0), false_fn=lambda: t.assign(5))
    print(t, t.numpy())

# This also works (but not in general): a Python if on a numpy value
print('')
for i in range(3):
    cmp = (t > c).numpy()
    if cmp:
        t = t.assign(0)
    else:
        t = t.assign(5)
    print(t, t.numpy(), cmp)


<tf.Variable 'var:0' shape=() dtype=float32, numpy=5.0>
tf.Tensor(True, shape=(), dtype=bool)

<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=0.0> 0.0
<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=5.0> 5.0
<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=0.0> 0.0

<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=5.0> 5.0 False
<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=0.0> 0.0 True
<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=5.0> 5.0 False


Note that sessions and placeholders are now obsolete. There is no need to leave an explicit placeholder for values that will be filled in at run-time, since graph-construction-time and run-time are now combined. Similarly, there is no need to use a session to manage what is run. Instead, the statements are run immediately.

### TensorFlow Functions

Eager execution simplifies things a lot. But there are still advantages to having static graphs pre-constructed and then executed on demand, primarily for efficiency: this allows TensorFlow to apply optimizations to the constructed graph, and find opportunities for parallelization.

TensorFlow v2 supports this via TensorFlow functions. A TensorFlow function is turned into a graph automatically. A call to the function is similar to running the underlying graph in a session. Parameters to a function are equivalent to placeholders, since they will only be given concrete values at run-time when the function is called.

In [4]:
@tf.function
def test(p):
    print('function called')
    v = p + 1
    if v > 3:
        return 0
    return v

# the function decorator transforms our function and uses it to construct a graph
print(test)
print('')

# test can be called without using sessions
r = test(1)
print(r, r.numpy())
r = test(3)
print(r, r.numpy())

<tensorflow.python.eager.def_function.Function object at 0x7fa80052d080>

function called
tf.Tensor(2, shape=(), dtype=int32) 2
function called
tf.Tensor(0, shape=(), dtype=int32) 0


TensorFlow transformed test into a TensorFlow function. This function manages when a graph is created from the Python function. Because the graph may depend on the type and shape of the input parameters, calls that provide different parameters may create several graphs. For tensor inputs, TensorFlow guarantees that calls passing inputs of the same type and shape (i.e. dimensionality) will not re-create the graph. Plain Python values such as 1 and 3 are instead treated as part of the signature, so each new value triggers a new trace, which is why "function called" is printed twice above.
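
The caching behavior for tensor inputs can be sketched with a small decorator in plain Python. This is a toy model under stated assumptions, not tf.function itself: ToyTensor, toy_function, and the trace_count attribute are hypothetical, and "tracing" is reduced to remembering the Python function. The sketch only shows the cache being keyed by type and shape.

```python
import functools

class ToyTensor:
    """Hypothetical tensor: a flat list of values plus a dtype name."""
    def __init__(self, values, dtype='float32'):
        self.values = list(values)
        self.dtype = dtype
        self.shape = (len(self.values),)

def toy_function(fn):
    """Keep one cached entry per (dtype, shape) input signature."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(tensor):
        key = (tensor.dtype, tensor.shape)
        if key not in cache:
            # Stand-in for tracing: we simply remember fn for this signature.
            wrapper.trace_count += 1
            cache[key] = fn
        return cache[key](tensor)

    wrapper.trace_count = 0
    return wrapper

@toy_function
def double(t):
    return [2 * v for v in t.values]

double(ToyTensor([1.0, 2.0]))
double(ToyTensor([3.0, 4.0]))             # same dtype and shape: cache hit
double(ToyTensor([1.0, 2.0, 3.0]))        # new shape: new entry
double(ToyTensor([1, 2], dtype='int32'))  # new dtype: new entry
print(double.trace_count)
```

Four calls produce three cache entries, since the first two calls share a signature.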

Under the hood, tf.function uses AutoGraph, a new feature in v2 that can transform Python entities into graphs automatically.

To use proper terminology, tf.function returns a **Polymorphic Python Function**, which encapsulates graph creation. The polymorphic function manages a cache: every time it is called with parameters that have either a new type or a new shape, it produces a concrete graph matching that type and shape, and stores it in the cache for future use.

This concrete graph is represented via a **Concrete Function**. We rarely need to manage or access these constructs ourselves, since TensorFlow can manage them automatically. TensorFlow provides a way to get the underlying graph from a function, mainly for exporting and storing. We will use this feature to inspect the resulting graph and understand the implementation.

In [5]:
cf = test.get_concrete_function(tf.constant(3))
cf.graph.get_operations()

function called


[<tf.Operation 'p' type=Placeholder>,
 <tf.Operation 'add/y' type=Const>,
 <tf.Operation 'add' type=AddV2>,
 <tf.Operation 'Greater/y' type=Const>,
 <tf.Operation 'Greater' type=Greater>,
 <tf.Operation 'cond' type=StatelessIf>,
 <tf.Operation 'cond/Identity' type=Identity>,
 <tf.Operation 'cond/Identity_1' type=Identity>,
 <tf.Operation 'Identity' type=Identity>]

Note that the function gets called when we attempt to concretize it. The function is not passed a concrete value, however: it receives a symbolic stand-in (the Placeholder operation named p in the listing above) that matches the type and shape of the parameter we specified. Operations on such symbolic tensors are recorded instead of being executed, which lets TensorFlow trace the body of the function and produce the abstract graph.

This mechanism is very similar to the bonus part in project 2!
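
The tracing idea described above can be sketched in plain Python. This is a simplified illustration, not AutoGraph or tf.function: the Tracer class and the trace helper are hypothetical. The Python function is called once with a symbolic argument, and every operation applied to that argument is recorded rather than computed.

```python
class Tracer:
    """Hypothetical symbolic value that records operations applied to it."""
    def __init__(self, expr, ops):
        self.expr = expr
        self.ops = ops

    def __add__(self, other):
        self.ops.append(('add', self.expr, other))
        return Tracer('(%s + %r)' % (self.expr, other), self.ops)

    def __mul__(self, other):
        self.ops.append(('mul', self.expr, other))
        return Tracer('(%s * %r)' % (self.expr, other), self.ops)

def trace(fn, arg_name):
    """Call fn once with a symbolic argument and return what was recorded."""
    ops = []
    result = fn(Tracer(arg_name, ops))
    return ops, result.expr

def f(p):
    print('function called')  # runs during tracing, as in the cell above
    return (p + 1) * 2

ops, expr = trace(f, 'p')
print(ops)
print(expr)
```

The recorded list of operations plays the role of the graph: it can be inspected, optimized, or replayed later with concrete inputs, without calling the Python function again.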


### Functions with Side Effects

A nice property of v1 sessions is that their effects can survive between runs. This gives the programming model extra richness, since it allows state.

This can be achieved with TensorFlow functions as well. Global variables (i.e. variables defined outside the function) survive function execution, and changes to them are visible to future function calls. Using global variables can therefore give us state.

However, implementing such side effects must be done very carefully. TensorFlow functions may be executed in parallel and are generally managed by TensorFlow. The function may run in a different context, may execute operations out of order, and may change certain operations for optimizations.

To ensure correctness, side effects must therefore also be managed by TensorFlow. Any modified global variable must be a Tensor itself, such as a TensorFlow variable. Below is an example.

In [6]:
v = tf.Variable(0, name='v')

@tf.function
def func(p):
    v.assign_add(p)
    return v

print(func(5).numpy())
print(func(10).numpy())
print(func(3).numpy())

5
15
18
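
Why side effects need to be managed can be sketched with a toy model in plain Python. All names here (ToyVariable, toy_function, assign_add, python_counter) are hypothetical, and the model is much simpler than tf.function: the Python body runs once to record operations, and later calls only replay the recorded operations. A plain Python side effect therefore happens once, while the recorded update to managed state happens on every call.

```python
class ToyVariable:
    """Hypothetical managed state, playing the role of a tf.Variable."""
    def __init__(self, value):
        self.value = value

def toy_function(fn):
    """Run fn's Python body once to record ops; later calls replay the ops."""
    recorded = []
    traced = False

    def wrapper(p):
        nonlocal traced
        if not traced:
            fn(recorded)        # the Python body runs here, exactly once
            traced = True
        result = None
        for op in recorded:     # every call replays the recorded ops
            result = op(p)
        return result

    return wrapper

python_counter = []             # plain Python state, not part of the "graph"
v = ToyVariable(0)

def assign_add(p):
    v.value += p
    return v.value

@toy_function
def func(ops):
    python_counter.append(1)    # Python side effect: happens at trace time only
    ops.append(assign_add)      # managed side effect: recorded, runs every call

results = [func(5), func(10), func(3)]
print(results, len(python_counter))
```

The managed variable accumulates across the three calls, matching the 5, 15, 18 output above, while the plain Python list is modified only once.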


# Additional Resources

To expand on some of the programming abstractions described above:
1. Eager execution: [https://www.tensorflow.org/guide/eager](https://www.tensorflow.org/guide/eager).
2. TensorFlow Functions: [https://www.tensorflow.org/guide/function](https://www.tensorflow.org/guide/function).
3. Distributed Training and parallelization strategies: [https://www.tensorflow.org/guide/distributed_training](https://www.tensorflow.org/guide/distributed_training).

For some behind-the-scenes and design explanations:
1. Concrete and polymorphic functions: [https://www.tensorflow.org/lite/convert/concrete_function](https://www.tensorflow.org/lite/convert/concrete_function).
2. Autograph: [https://www.tensorflow.org/api_docs/python/tf/autograph](https://www.tensorflow.org/api_docs/python/tf/autograph).
3. This [Medium Article](https://towardsdatascience.com/tensorflow-control-flow-tf-cond-903e020e722a) on how cond and while_loop work behind the scenes.

For complete tutorials on using TensorFlow:
1. [10 minutes tutorial](https://cv-tricks.com/artificial-intelligence/deep-learning/deep-learning-frameworks/tensorflow-tutorial/) for TensorFlow V1.
2. [Complete Examples](https://github.com/aymericdamien/TensorFlow-Examples/tree/master/tensorflow_v2) of using TensorFlow 2 for machine learning.
