<a href="https://colab.research.google.com/github/victorviro/Deep_learning_python/blob/master/TensorFlow_Eager_vs_Graph_execution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# TensorFlow Eager vs Graph execution

# Table of contents

1. [Introduction](#1)
2. [Eager execution](#2)
3. [Graph execution](#3)
    1. [Graphs](#3.1)
    2. [The benefits of graphs](#3.2)
    3. [tf.function](#3.3)
    4. [Polymorphism](#3.4)
    5. [Graph execution vs. eager execution](#3.5)
    6. [Speed-up](#3.6)
4. [References and further reading](#4)


# Introduction <a name="1"></a>

In this notebook, we will talk about the two execution modes in TensorFlow, eager execution and graph execution, as well as their pros and cons.

In [None]:
import tensorflow as tf

# Eager execution <a name="2"></a>

TensorFlow's eager execution is an imperative programming environment that evaluates operations immediately, without building graphs: **operations return concrete values instead of constructing a computational graph to run later**. This makes it **easy to get started with TensorFlow and debug models**, but it is **not necessarily the best choice for large-scale training or production**.

Eager execution supports most TensorFlow operations and GPU acceleration. In TensorFlow 2.x, eager execution is enabled by default:

In [None]:
tf.executing_eagerly()

True

We can run TensorFlow operations and the results are returned immediately. This is what we usually expect when running Python code: it is executed line by line, and each computation returns its result right away.

In [None]:
x = [[2.]]
m = tf.matmul(x, x)
print(m)

tf.Tensor([[4.]], shape=(1, 1), dtype=float32)


With eager execution, TensorFlow operations are evaluated immediately and return their values to Python. `tf.Tensor` objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using `print` statements or a debugger.
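As a small illustration of this point, the sketch below mixes eager tensors with ordinary Python control flow. The variable names (`a`, `b`, `total`) are made up for this example; it assumes TensorFlow 2.x with eager execution left at its default.

```python
import tensorflow as tf

# Hypothetical example values; every operation below runs immediately.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.add(a, 1.0)  # concrete result is available right away

total = 0.0
for row in b:  # eager tensors can be iterated like Python sequences
    row_sum = tf.reduce_sum(row)
    if row_sum > 6:  # a concrete boolean, usable in a plain Python `if`
        total += float(row_sum.numpy())

print(b.numpy())
print(total)
```

Because each tensor holds a concrete value, `.numpy()` and plain Python conditionals work without any session or graph-building step.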

# Graph execution <a name="3"></a>

While eager execution has several unique advantages, graph execution enables portability outside Python and tends to offer better performance. Graph execution means that tensor computations are executed as a **TensorFlow graph**, sometimes referred to as a [`tf.Graph`](https://www.tensorflow.org/api_docs/python/tf/Graph) or simply a "graph."

## Graphs <a name="3.1"></a>

Graphs are data structures that contain a set of [`tf.Operation`](https://www.tensorflow.org/api_docs/python/tf/Operation) objects, which represent units of computation or nodes in the graph; and [`tf.Tensor`](https://www.tensorflow.org/api_docs/python/tf/Tensor) objects, which represent the units of data that flow between operations. They are defined in a `tf.Graph` context. Since these graphs are data structures, they can be saved, run, and restored all without the original Python code.

This is what a TensorFlow graph representing a two-layer neural network looks like when visualized in TensorBoard.

![](https://i.ibb.co/7pZ6qmc/tf-graph.png)
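To make the data-structure view concrete, here is a minimal sketch that builds a tiny graph by hand inside a `tf.Graph` context and lists its operations. The names `g`, `a`, `b`, and `product` are chosen only for this illustration; in practice, graphs are usually created indirectly rather than like this.

```python
import tensorflow as tf

# Build a small graph explicitly (illustrative only).
g = tf.Graph()
with g.as_default():
    a = tf.constant([[1.0, 2.0]], name="a")
    b = tf.constant([[2.0], [3.0]], name="b")
    c = tf.matmul(a, b, name="product")  # symbolic tensor, no value computed yet

# The graph is a data structure: we can enumerate its tf.Operation nodes.
op_types = [op.type for op in g.get_operations()]
print(op_types)
print(c.op.name, c.shape)
```

Here `c` is a `tf.Tensor` that only describes the output of the `product` operation; nothing is computed until the graph is run.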

## The benefits of graphs <a name="3.2"></a>

- **Speed**: Graphs are easily [optimized](https://www.tensorflow.org/guide/graph_optimization) to improve execution performance.

- **Portability/Deployability**: Models can run efficiently on multiple devices that don't have a Python interpreter (like mobile applications, embedded devices, and backend servers). In fact, TensorFlow uses graphs as the format for saved models when it exports them from Python. This portability is a great advantage in production deployment. By exporting a `SavedModel` that includes data preprocessing, we eliminate possible mistakes in re-creating the data preprocessing logic in production.
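The portability point can be sketched with a round trip through the `SavedModel` format. The `Scaler` module, its `factor` variable, and the temporary export directory are hypothetical names introduced for this example; the sketch assumes `tf.saved_model.save` and `tf.saved_model.load` as available in TensorFlow 2.x.

```python
import tempfile
import tensorflow as tf

class Scaler(tf.Module):
    """Hypothetical preprocessing step: multiplies its input by a stored factor."""

    def __init__(self):
        super().__init__()
        self.factor = tf.Variable(2.0)

    # A fixed input signature tells TensorFlow which graph to export.
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return self.factor * x

export_dir = tempfile.mkdtemp()
tf.saved_model.save(Scaler(), export_dir)

# The restored object runs the saved graph; the Scaler class is not needed here.
restored = tf.saved_model.load(export_dir)
result = restored(tf.constant([1.0, 2.0, 3.0]))
print(result.numpy())
```

The restored object is backed by the exported graph rather than by the original Python class, which is what allows the same computation to be served outside Python.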

However, we still want to define our machine learning models (or other computations) in Python for convenience, and then automatically construct graphs when we need them.

## `tf.function` <a name="3.3"></a>

We create and run a graph in TensorFlow by using [`tf.function`](https://www.tensorflow.org/api_docs/python/tf/function), either as a direct call or as a decorator. `tf.function` takes a regular function as input and returns a TensorFlow `Function`, which is a Python callable that builds TensorFlow graphs from the Python function.

In [None]:
# Define a Python function
def a_regular_function(x, y):
  x = tf.matmul(x, y)
  return x
# Create a TensorFlow `Function`
a_function_that_uses_a_graph = tf.function(a_regular_function)

x = tf.constant([[1.0, 2.0]])
y = tf.constant([[2.0], [3.0]])

print(a_regular_function(x, y).numpy())
print(a_function_that_uses_a_graph(x, y).numpy())

[[8.]]
[[8.]]


On the outside, a `Function` looks like a regular function we write using TensorFlow operations. Underneath, however, it's different. A `Function` encapsulates several `tf.Graph`s behind one API.

**Note**: `tf.function` applies to a function and all other functions it calls.
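A short sketch of the note above: only the outer function is decorated, yet the operations of the function it calls also end up in the graph. The names `inner_function` and `outer_function` are illustrative.

```python
import tensorflow as tf

def inner_function(x, y, b):
    # Not decorated, but traced into the caller's graph.
    x = tf.matmul(x, y)
    return x + b

@tf.function
def outer_function(x):
    y = tf.constant([[2.0], [3.0]])
    b = tf.constant(4.0)
    return inner_function(x, y, b)

result = outer_function(tf.constant([[1.0, 2.0]])).numpy()
print(result)
```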

The functions we write typically contain a mixture of built-in TensorFlow operations and Python logic, such as `if-then` clauses, loops, `return`, etc. While TensorFlow operations are easily captured by a `tf.Graph`, Python-specific logic needs to undergo an extra step in order to become part of the graph. `tf.function` uses a library, [`tf.autograph`](https://www.tensorflow.org/api_docs/python/tf/autograph), to convert Python code into graph-generating code. Though it is unlikely that we'll need to view graphs directly, we can inspect the outputs:

In [None]:
# Graph-generating output of AutoGraph
print(tf.autograph.to_code(a_regular_function))
# The graph itself
print(a_function_that_uses_a_graph.get_concrete_function(x, y).graph.as_graph_def())
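To see why the AutoGraph conversion step matters, consider a function whose branch depends on a tensor value. The sketch below uses hypothetical names (`simple_relu`, `trace_log`) and assumes default AutoGraph behaviour, where a Python `if` on a tensor becomes a graph-level conditional.

```python
import tensorflow as tf

trace_log = []  # filled by a Python side effect, so it records tracing only

@tf.function
def simple_relu(x):
    trace_log.append("traced")
    # `x > 0` is a tensor, so AutoGraph rewrites this `if` into a graph conditional.
    if x > 0:
        result = x
    else:
        result = tf.zeros_like(x)
    return result

positive = simple_relu(tf.constant(3.0)).numpy()
negative = simple_relu(tf.constant(-2.0)).numpy()
print(positive, negative, len(trace_log))
```

Both calls share one graph: the branch is chosen when the graph runs, not when the Python code is traced.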

It is advisable to **include as much computation as possible under a `tf.function` to maximize the performance gain**. For example, decorate a whole **training step** or the entire training loop.
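As a rough sketch of what decorating a whole training step can look like, the example below fits a single weight with plain gradient descent. The toy model, the data, and the learning rate are assumptions made for illustration, not part of the original notebook.

```python
import tensorflow as tf

w = tf.Variable(0.0)  # toy model: y_hat = w * x

@tf.function
def train_step(x, y, learning_rate):
    # Forward pass, gradient computation, and update all live in one graph.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(w * x - y))
    grad = tape.gradient(loss, w)
    w.assign_sub(learning_rate * grad)
    return loss

x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([2.0, 4.0, 6.0])  # generated with w = 2
lr = tf.constant(0.05)

losses = [float(train_step(x, y, lr).numpy()) for _ in range(50)]
print(losses[0], losses[-1], w.numpy())
```

Passing the learning rate as a tensor rather than a Python float keeps the step from being retraced if the value changes later.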

## Polymorphism <a name="3.4"></a>

A `tf.Graph` is specialized to a specific type of inputs (for example, tensors with a specific `dtype`). Each time we invoke a `Function` with new `dtypes` and shapes in its arguments, `Function` creates a new `tf.Graph` for the new arguments. The `dtypes` and shapes of a `tf.Graph`'s inputs are known as an **input signature** or just a signature.

The `Function` stores the `tf.Graph` corresponding to that signature in a `ConcreteFunction`, which is a wrapper around a `tf.Graph`.

In [None]:
@tf.function
def my_relu(x):
  return tf.maximum(0., x)

# `my_relu` creates new graphs as it observes more signatures
print(my_relu(tf.constant(5.5)))
print(my_relu([1, -1]))

tf.Tensor(5.5, shape=(), dtype=float32)
tf.Tensor([1. 0.], shape=(2,), dtype=float32)


In [None]:
# There are two `ConcreteFunction`s (one for each graph) in `my_relu`
# The `ConcreteFunction` also knows the return type and shape!
print(my_relu.pretty_printed_concrete_signatures())

my_relu(x)
  Args:
    x: float32 Tensor, shape=()
  Returns:
    float32 Tensor, shape=()

my_relu(x=[1, -1])
  Returns:
    float32 Tensor, shape=(2,)


If the `Function` has already been called with that signature, `Function` does not create a new `tf.Graph`.

Because it's backed by **multiple graphs**, a `Function` is **polymorphic**. That enables it to support more input types than a single `tf.Graph` could represent, as well as to optimize each `tf.Graph` for better performance.
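The caching behaviour described in this section can be observed with a Python side effect that runs only during tracing. In this sketch, `double` and `trace_log` are hypothetical names.

```python
import tensorflow as tf

trace_log = []

@tf.function
def double(x):
    # Runs only when a new graph is traced, so it records one entry per signature.
    trace_log.append((x.dtype.name, tuple(x.shape)))
    return x * 2

double(tf.constant(1.0))         # float32 scalar: new graph
double(tf.constant(3.0))         # same signature: existing graph is reused
double(tf.constant(1))           # int32 scalar: new graph
double(tf.constant([1.0, 2.0]))  # float32 with shape (2,): new graph

print(trace_log)
```

Four calls produce three graphs, one per distinct combination of `dtype` and shape.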

## Graph execution vs. eager execution <a name="3.5"></a>

The code in a `Function` can be executed both eagerly and as a graph. By default, `Function` executes its code as a graph. To verify that our `Function`'s graph is doing the same computation as its equivalent Python function, we can make it execute eagerly with `tf.config.run_functions_eagerly(True)`. This is a switch that turns off `Function`'s ability to create and run graphs, so that it executes the code eagerly instead.

In [None]:
tf.config.run_functions_eagerly(True)
print(my_relu([1, -1]))
tf.config.run_functions_eagerly(False)

tf.Tensor([1. 0.], shape=(2,), dtype=float32)


**`Function` can behave differently under graph and eager execution**. The Python `print` function is one example. Let's check out what happens when we insert a `print` statement into our function and call it repeatedly.

In [None]:
@tf.function
def my_relu(x):
    print("Applying relu!")
    return tf.maximum(0., x)

activation = my_relu([1, -1])
activation = my_relu([1, -1])

Applying relu!


`my_relu` printed only once even though it was called twice. This is because the `print` statement is executed when `Function` runs the original code to create the graph, in a process known as ["tracing"](https://www.tensorflow.org/guide/function#tracing). Tracing captures the TensorFlow operations into a graph, and `print` is not captured in the graph. That graph is then executed for both calls without ever running the Python code again.

As a sanity check, let's turn off graph execution to compare:

In [None]:
tf.config.run_functions_eagerly(True)
activation = my_relu([1, -1])
activation = my_relu([1, -1])
tf.config.run_functions_eagerly(False)

Applying relu!
Applying relu!


`print` is a *Python side effect*, and there are [other differences](https://www.tensorflow.org/guide/function#executing_python_side_effects) to be aware of when converting a function into a `Function`.
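The same distinction applies to any Python side effect, not just `print`. The sketch below contrasts a Python list append with a TensorFlow variable update; `python_calls`, `graph_calls`, and `tracked_relu` are names invented for this example.

```python
import tensorflow as tf

python_calls = []             # mutated by plain Python code
graph_calls = tf.Variable(0)  # mutated by a TensorFlow operation

@tf.function
def tracked_relu(x):
    python_calls.append(1)     # Python side effect: runs only during tracing
    graph_calls.assign_add(1)  # TensorFlow op: captured in the graph, runs on every call
    return tf.maximum(0., x)

for _ in range(3):
    tracked_relu(tf.constant([1.0, -1.0]))

print(len(python_calls), int(graph_calls.numpy()))
```

When an effect must happen on every call, it should be expressed with TensorFlow operations (for example a `tf.Variable` update or `tf.print`) rather than Python code.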

The TensorFlow documentation suggests that first-time users experiment with decorating toy functions with `@tf.function` to gain experience moving from eager to graph execution. [Here](https://www.tensorflow.org/guide/intro_to_graphs#tffunction_best_practices) are some tips.

## Speed-up <a name="3.6"></a>

Graphs can speed up our code, but the process of creating them has some overhead. For some functions, the creation of the graph takes more time than the execution of the graph. This investment is usually quickly paid back with the performance boost of subsequent executions. The first few steps of any large model training can be slower due to tracing.

No matter how large our model is, we want to avoid tracing frequently. To figure out when our `Function` is tracing, we can add a `print` statement to its code. An illustration is available [here](https://www.tensorflow.org/guide/intro_to_graphs#when_is_a_function_tracing).
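One common source of frequent tracing is passing Python scalars as arguments, since each distinct Python value is treated as a new signature. The sketch below, with the hypothetical names `square` and `trace_log`, counts traces instead of printing so the difference is easy to check.

```python
import tensorflow as tf

trace_log = []

@tf.function
def square(x):
    trace_log.append(type(x).__name__)  # recorded once per trace
    return tf.square(x)

# Python ints: every distinct value triggers a new trace.
for i in range(3):
    square(i)
python_arg_traces = len(trace_log)

# Scalar int32 tensors: all values share a single trace.
for i in range(3):
    square(tf.constant(i))
tensor_arg_traces = len(trace_log) - python_arg_traces

print(python_arg_traces, tensor_arg_traces)
```

Wrapping varying numeric arguments in tensors is one way to keep the number of traces small.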

# References and further reading <a name="4"></a>

- [TensorFlow docs: Eager execution](https://www.tensorflow.org/guide/eager)

- [Introduction to graphs and tf.function](https://www.tensorflow.org/guide/intro_to_graphs)

- [Better performance with tf.function](https://www.tensorflow.org/guide/function)

- [AutoGraph reference](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/index.md)

- [When is a `Function` tracing?](https://www.tensorflow.org/guide/intro_to_graphs#when_is_a_function_tracing)