Graphs and tf.functions ref: https://www.tensorflow.org/guide/intro_to_graphs

Running TensorFlow eagerly means that TensorFlow operations are executed by Python, operation by operation, with results returned to Python. Eager TensorFlow can still take advantage of accelerators, allowing you to place variables, tensors, and even operations on GPUs and TPUs. It is also easy to debug.

Some users may never need or want to leave Python.

However, running TensorFlow op-by-op in Python prevents a host of accelerations otherwise available. If you can extract tensor computations from Python, you can make them into a graph.

**Graphs are data structures that contain a set of `tf.Operation` objects, which represent units of computation; and `tf.Tensor` objects, which represent the units of data that flow between operations.** They are defined in a `tf.Graph` context. Since these graphs are data structures, they can be saved, run, and restored all without the original Python code.
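
As a minimal sketch of what "a graph is a data structure" means in practice, the snippet below traces a small function with `tf.function` (introduced in the next section) and then inspects the resulting graph. The function name `scale_and_shift` is hypothetical and not part of the original guide, and the exact operation types listed may vary between TensorFlow versions.

```python
import tensorflow as tf

def scale_and_shift(x):
  # Hypothetical example function, used only for illustration.
  return x * 2.0 + 1.0

# Trace the function for one specific input signature.
concrete = tf.function(scale_and_shift).get_concrete_function(
    tf.TensorSpec(shape=[None], dtype=tf.float32))

# The traced graph is a `tf.Graph`; its nodes are `tf.Operation` objects.
graph = concrete.graph
op_types = [op.type for op in graph.get_operations()]
print(op_types)

# Because the graph is plain data, it can be serialized to bytes
# without any reference to the Python source of `scale_and_shift`.
serialized = graph.as_graph_def().SerializeToString()
print(len(serialized), "bytes")
```

The serialized form is what makes it possible to store a graph and run it later without the original Python code.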

**Tracing graphs**

You create a graph in TensorFlow with `tf.function`, used either as a direct call or as a decorator.

In [None]:
import tensorflow as tf
import timeit
from datetime import datetime

In [None]:
# Define a Python function
def function_to_get_faster(x, y, b):
  x = tf.matmul(x, y)
  x = x + b
  return x

# Create a `Function` object that contains a graph
a_function_that_uses_a_graph = tf.function(function_to_get_faster)

# Make some tensors
x1 = tf.constant([[1.0, 2.0]])
y1 = tf.constant([[2.0], [3.0]])
b1 = tf.constant(4.0)

# It just works!
a_function_that_uses_a_graph(x1, y1, b1).numpy()

array([[12.]], dtype=float32)

Functions wrapped in `tf.function` are Python callables that work the same as their Python equivalents. They have a particular class (`python.eager.def_function.Function`), but to you they act just like the non-traced version.
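
One way to convince yourself of this is to call the same Python function both eagerly and through `tf.function` and compare the results. The sketch below does that; the name `affine` is a hypothetical stand-in for the function defined above.

```python
import numpy as np
import tensorflow as tf

def affine(x, y, b):
  # Hypothetical stand-in for the function traced earlier.
  return tf.matmul(x, y) + b

graph_affine = tf.function(affine)

x = tf.constant([[1.0, 2.0]])
y = tf.constant([[2.0], [3.0]])
b = tf.constant(4.0)

eager_result = affine(x, y, b).numpy()        # executed op by op
graph_result = graph_affine(x, y, b).numpy()  # executed as a graph

# Both call styles accept the same arguments and return the same values.
print(np.allclose(eager_result, graph_result))
```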

tf.function recursively traces any Python function it calls.

In [None]:
def inner_function(x, y, b):
  x = tf.matmul(x, y)
  x = x + b
  return x

# Use the decorator
@tf.function
def outer_function(x):
  y = tf.constant([[2.0], [3.0]])
  b = tf.constant(4.0)

  return inner_function(x, y, b)

# Note that the callable will create a graph that
# includes inner_function() as well as outer_function()
outer_function(tf.constant([[1.0, 2.0]])).numpy()

array([[12.]], dtype=float32)

**Flow control and side effects**

Flow control and loops are converted to TensorFlow via [tf.autograph](https://www.tensorflow.org/api_docs/python/tf/autograph) by default. Autograph uses a combination of methods, including standardizing loop constructs, unrolling, and [AST](https://docs.python.org/3/library/ast.html) manipulation.

In [None]:
def my_function(x):
  if tf.reduce_sum(x) <= 1:
    return x * x
  else:
    return x-1

a_function = tf.function(my_function)

print("First branch, with graph:", a_function(tf.constant(1.0)).numpy())
print("Second branch, with graph:", a_function(tf.constant([5.0, 5.0])).numpy())

First branch, with graph: 1.0
Second branch, with graph: [4. 4.]
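
To make "AST manipulation" a little more concrete, the sketch below uses only Python's standard `ast` module to parse the source of `my_function` and look at the `if` statement as a tree node. This is not Autograph's implementation, only an illustration of the kind of structure a source-to-source converter has to work with.

```python
import ast

# The source of `my_function`, held as a string so it can be parsed
# without importing TensorFlow.
source = """
def my_function(x):
  if tf.reduce_sum(x) <= 1:
    return x * x
  else:
    return x - 1
"""

tree = ast.parse(source)
func_def = tree.body[0]     # the `def my_function` node
if_node = func_def.body[0]  # the `if ... else ...` statement

print(type(func_def).__name__, type(if_node).__name__)
# The condition is itself a subtree: a comparison whose left side is a call.
print(ast.dump(if_node.test))
```

A converter can walk nodes like `if_node` and replace them with calls to functional equivalents, which is broadly what the generated code in the next cell looks like.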


You can call the Autograph conversion directly to see how Python is converted into TensorFlow ops. The output is mostly unreadable, but you can see the transformation.

In [5]:
# Don't read the output too carefully.
print(tf.autograph.to_code(my_function))

def tf__my_function(x):
    with ag__.FunctionScope('my_function', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
        do_return = False
        retval_ = ag__.UndefinedReturnValue()

        def get_state():
            return (do_return, retval_)

        def set_state(vars_):
            nonlocal do_return, retval_
            (do_return, retval_) = vars_

        def if_body():
            nonlocal do_return, retval_
            try:
                do_return = True
                retval_ = (ag__.ld(x) * ag__.ld(x))
            except:
                do_return = False
                raise

        def else_body():
            nonlocal do_return, retval_
            try:
                do_return = True
                retval_ = (ag__.ld(x) - 1)
            except:
                do_return = False
                raise
        ag__.if_stmt((ag__.converted_call(ag__.ld(tf).reduce_sum, (ag

Autograph automatically converts if-then clauses, loops, break, return, continue, and more.
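
As a small illustration of loop conversion, the sketch below uses a Python `while` loop whose condition depends on a tensor. Inside `tf.function`, Autograph is expected to turn this into graph control flow rather than running the loop in Python. The function `count_halvings` is hypothetical and chosen only to keep the example short.

```python
import tensorflow as tf

@tf.function
def count_halvings(x):
  # Hypothetical example: count how many halvings bring x down to 1.0 or below.
  steps = tf.constant(0)
  while x > 1.0:  # the condition depends on a tensor value
    x = x / 2.0
    steps += 1
  return steps

result = count_halvings(tf.constant(8.0)).numpy()
print(result)  # 8 -> 4 -> 2 -> 1, so 3 halvings
```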

Most of the time, Autograph will work without special considerations. However, there are some caveats, and the [tf.function guide](https://www.tensorflow.org/guide/function) can help here, as well as the complete [autograph reference](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/index.md).
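
One caveat worth seeing directly concerns Python side effects. The body of a `tf.function` runs as ordinary Python only while the function is being traced; later calls with a compatible input signature reuse the graph. The sketch below records each time the Python body runs, with hypothetical names `trace_log` and `add_one`.

```python
import tensorflow as tf

trace_log = []  # records every time the Python body actually executes

@tf.function
def add_one(x):
  # Python side effect: this runs during tracing, not on every call.
  trace_log.append("traced")
  return x + 1

# Three calls with the same dtype and shape share a single trace.
results = [add_one(tf.constant(v)).numpy() for v in (1, 2, 3)]

print(results)
print(len(trace_log))
```

If a side effect must happen on every call, it generally needs to be expressed as a TensorFlow op (for example `tf.print`) rather than as plain Python.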

**Seeing the speed up**

Just wrapping a tensor-using function in [tf.function](https://www.tensorflow.org/api_docs/python/tf/function) does not automatically speed up your code. For small functions called a few times on a single machine, the overhead of calling a graph or graph fragment may dominate runtime. Also, if most of the computation was already happening on an accelerator, such as stacks of GPU-heavy convolutions, the graph speedup won't be large.

For complicated computations, graphs can provide a significant speedup. This is because graphs reduce Python-to-device communication and let TensorFlow apply some graph-level optimizations.

The speedup is most obvious when running many small layers, as in the example below:

In [6]:
# Create an override model to classify pictures
class SequentialModel(tf.keras.Model):
  def __init__(self, **kwargs):
    super(SequentialModel, self).__init__(**kwargs)
    self.flatten = tf.keras.layers.Flatten(input_shape=(28, 28))
    # Add a lot of small layers
    num_layers = 100
    self.my_layers = [tf.keras.layers.Dense(64, activation="relu")
                      for n in range(num_layers)]
    self.dropout = tf.keras.layers.Dropout(0.2)
    self.dense_2 = tf.keras.layers.Dense(10)

  def call(self, x):
    x = self.flatten(x)
    for layer in self.my_layers:
      x = layer(x)
    x = self.dropout(x)
    x = self.dense_2(x)
    return x

In [7]:
input_data = tf.random.uniform([20, 28, 28])

In [8]:
eager_model = SequentialModel()

# Don't count the time for the initial build.
eager_model(input_data)
print("Eager time:", timeit.timeit(lambda: eager_model(input_data), number=100))

Eager time: 1.7778153839999504


In [9]:
# Wrap the call method in a `tf.function`
graph_model = SequentialModel()
graph_model.call = tf.function(graph_model.call)

# Don't count the time for the initial build and trace.
graph_model(input_data)
print("Graph time:", timeit.timeit(lambda: graph_model(input_data), number=100))

Graph time: 0.17238710800029367


[Polymorphic Functions](https://www.tensorflow.org/guide/intro_to_graphs#polymorphic_functions)