# Graphs and TF Functions

Resource: [TensorFlow Guide](https://www.tensorflow.org/guide/intro_to_graphs)

In TensorFlow, eager execution is a mode that allows you to run TensorFlow operations immediately as they are called from Python. This is in contrast to graph execution where tensor computations are executed as a TensorFlow graph, sometimes referred to as a tf.Graph or simply a “graph.”

Graph execution enables portability outside Python and tends to offer better performance. Eager execution simplifies the model building experience in TensorFlow, whereas graph execution can provide optimizations that make models run faster with better memory efficiency.

Graphs are data structures that contain a set of tf.Operation objects, which represent units of computation; and tf.Tensor objects, which represent the units of data that flow between operations. They are defined in a tf.Graph context. Since these graphs are data structures, they can be saved, run, and restored all without the original Python code.
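The description above can be made concrete with a minimal sketch: trace a function, list the operation types its graph holds, then save and restore it. The `Affine` module and its `apply` method are hypothetical names invented for this illustration, and the export directory is a throwaway temporary folder.

```python
import tempfile

import tensorflow as tf


class Affine(tf.Module):
  """Hypothetical module used only for this illustration."""

  @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
  def apply(self, x):
    return 2.0 * x + 1.0


module = Affine()

# Trace the function once and inspect the tf.Operation objects in its graph.
graph = module.apply.get_concrete_function().graph
op_types = [op.type for op in graph.get_operations()]
print(op_types)

# Save the traced graph, then restore and run it from the saved files alone.
export_dir = tempfile.mkdtemp()
tf.saved_model.save(module, export_dir)
restored = tf.saved_model.load(export_dir)
restored_output = restored.apply(tf.constant([1.0, 2.0])).numpy()
print(restored_output)
```

The restored object is rebuilt from the saved graph rather than from the `Affine` class, which is what makes this kind of export usable outside the original program.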

In [1]:
import tensorflow as tf
import timeit
from datetime import datetime

## Using Graphs
### Creating a graph with tf.function

In [2]:
def regular_function(x, y, b):
  x = tf.matmul(x, y)
  x = x + b
  return x

tf_function = tf.function(regular_function)

In [3]:
print(type(regular_function))
print(type(tf_function))

<class 'function'>
<class 'tensorflow.python.eager.polymorphic_function.polymorphic_function.Function'>


In [4]:
x1 = tf.constant([[1.0, 2.0]])
y1 = tf.constant([[2.0], [3.0]])
b1 = tf.constant(4.0)

In [5]:
res1 = regular_function(x1, y1, b1).numpy()
res2 = tf_function(x1, y1, b1).numpy()

assert res1 == res2

print(res1)
print(res2)

[[12.]]
[[12.]]


In [6]:
# graph-generating output of AutoGraph.
print(tf.autograph.to_code(regular_function))

def tf__regular_function(x, y, b):
    with ag__.FunctionScope('regular_function', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
        do_return = False
        retval_ = ag__.UndefinedReturnValue()
        x = ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(x), ag__.ld(y)), None, fscope)
        x = ag__.ld(x) + ag__.ld(b)
        try:
            do_return = True
            retval_ = ag__.ld(x)
        except:
            do_return = False
            raise
        return fscope.ret(retval_, do_return)



### Converting Python functions to graphs

While TensorFlow operations are easily captured by a tf.Graph, Python-specific logic (e.g., if-then clauses, loops, break, return, continue, and more) needs to undergo an extra step to become part of the graph. tf.function uses a library called AutoGraph (tf.autograph) to convert Python code into graph-generating code.

In [7]:
@tf.function
def tf_simple_relu(x):
  if tf.greater(x, 0):
    return x
  else:
    return 0

print("First branch, with graph:", tf_simple_relu(tf.constant(1)).numpy())
print("Second branch, with graph:", tf_simple_relu(tf.constant(-1)).numpy())

First branch, with graph: 1
Second branch, with graph: 0
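To make the conversion step less abstract, here is a sketch that compares the AutoGraph-converted `if` with a hand-written graph conditional built from `tf.cond`. The function names are hypothetical, and the `tf.cond` version is only an assumed rough equivalent of what AutoGraph generates, not its exact output.

```python
import tensorflow as tf


@tf.function
def relu_with_autograph(x):
  # AutoGraph rewrites this tensor-dependent `if` into a graph conditional.
  if tf.greater(x, 0):
    return x
  else:
    return 0


@tf.function
def relu_with_cond(x):
  # Hand-written conditional: both branches must return matching dtypes.
  return tf.cond(tf.greater(x, 0), lambda: x, lambda: tf.zeros_like(x))


results = []
for value in (1, -1):
  a = relu_with_autograph(tf.constant(value)).numpy()
  b = relu_with_cond(tf.constant(value)).numpy()
  results.append((int(a), int(b)))
print(results)
```

Both versions should agree on each input; the AutoGraph form simply lets you keep ordinary Python syntax.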


In [8]:
def simple_relu(x):
  if tf.greater(x, 0):
    return x
  else:
    return 0

# graph-generating output of AutoGraph.
print(tf.autograph.to_code(simple_relu))

def tf__simple_relu(x):
    with ag__.FunctionScope('simple_relu', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
        do_return = False
        retval_ = ag__.UndefinedReturnValue()

        def get_state():
            return (do_return, retval_)

        def set_state(vars_):
            nonlocal retval_, do_return
            (do_return, retval_) = vars_

        def if_body():
            nonlocal retval_, do_return
            try:
                do_return = True
                retval_ = ag__.ld(x)
            except:
                do_return = False
                raise

        def else_body():
            nonlocal retval_, do_return
            try:
                do_return = True
                retval_ = 0
            except:
                do_return = False
                raise
        ag__.if_stmt(ag__.converted_call(ag__.ld(tf).greater, (ag__.ld(x), 0), None, fscope), if_bo

### Polymorphism: one Function, many graphs

A tf.Graph is specialized to a specific type of inputs (for example, tensors with a specific dtype or objects with the same id()).

Each time you invoke a Function with a set of arguments that can't be handled by any of its existing graphs (such as arguments with new dtypes or incompatible shapes), Function creates a new tf.Graph specialized to those new arguments. The type specification of a tf.Graph's inputs is known as its input signature or just a signature.

In [9]:
@tf.function
def my_relu(x):
  return tf.maximum(0., x)

print(my_relu(tf.constant(5.5)))
print(my_relu([1, -2]))
print(my_relu(tf.constant([3., -3.])))

tf.Tensor(5.5, shape=(), dtype=float32)
tf.Tensor([1. 0.], shape=(2,), dtype=float32)
tf.Tensor([3. 0.], shape=(2,), dtype=float32)


Because it's backed by multiple graphs, a Function is polymorphic. That enables it to support more input types than a single tf.Graph could represent, and to optimize each tf.Graph for better performance.

In [10]:
# There are three `ConcreteFunction`s (one for each graph) in `my_relu`.
# The `ConcreteFunction` also knows the return type and shape!
print(my_relu.pretty_printed_concrete_signatures())

my_relu(x)
  Args:
    x: float32 Tensor, shape=()
  Returns:
    float32 Tensor, shape=()

my_relu(x=[1, -2])
  Returns:
    float32 Tensor, shape=(2,)

my_relu(x)
  Args:
    x: float32 Tensor, shape=(2,)
  Returns:
    float32 Tensor, shape=(2,)


In [11]:
print(my_relu(tf.constant([-3., 3., -3.])))
print(my_relu(tf.Variable([[1., 2.], [4., 2.]])))



tf.Tensor([0. 3. 0.], shape=(3,), dtype=float32)
tf.Tensor(
[[1. 2.]
 [4. 2.]], shape=(2, 2), dtype=float32)


In [12]:
print(my_relu.pretty_printed_concrete_signatures())

my_relu(x)
  Args:
    x: float32 Tensor, shape=()
  Returns:
    float32 Tensor, shape=()

my_relu(x=[1, -2])
  Returns:
    float32 Tensor, shape=(2,)

my_relu(x)
  Args:
    x: float32 Tensor, shape=(2,)
  Returns:
    float32 Tensor, shape=(2,)

my_relu(x)
  Args:
    x: float32 Tensor, shape=(3,)
  Returns:
    float32 Tensor, shape=(3,)

my_relu(x)
  Args:
    x: VariableSpec(shape=(2, 2), dtype=tf.float32, trainable=True, alias_id=0)
  Returns:
    float32 Tensor, shape=(2, 2)


## Using tf.function Correctly
### Graph vs. eager execution

The code in a Function can be executed both eagerly and as a graph. By default, a Function executes its code as a graph.

In [13]:
@tf.function
def get_MSE(y_true, y_pred):
  sq_diff = tf.pow(y_true - y_pred, 2)
  return tf.reduce_mean(sq_diff)

In [14]:
y_true = tf.random.uniform([5], maxval=10, dtype=tf.int32)
y_pred = tf.random.uniform([5], maxval=10, dtype=tf.int32)

print(get_MSE(y_true, y_pred))

tf.Tensor(7, shape=(), dtype=int32)


In [15]:
# running the function eagerly
tf.config.run_functions_eagerly(True)
print(get_MSE(y_true, y_pred))

tf.config.run_functions_eagerly(False)

tf.Tensor(7, shape=(), dtype=int32)


A Function can behave differently under graph and eager execution. The Python print function is one example of how the two modes differ. Let's see what happens when you insert a print statement into your function and call it repeatedly.

In [16]:
@tf.function
def get_MSE(y_true, y_pred):
  print("Calculating MSE!")
  sq_diff = tf.pow(y_true - y_pred, 2)
  return tf.reduce_mean(sq_diff)

In [17]:
error1 = get_MSE(y_true, y_pred)
error2 = get_MSE(y_true, y_pred)
error3 = get_MSE(y_true, y_pred)

Calculating MSE!


In [18]:
print(error1)
print(error2)
print(error3)

tf.Tensor(7, shape=(), dtype=int32)
tf.Tensor(7, shape=(), dtype=int32)
tf.Tensor(7, shape=(), dtype=int32)


get_MSE only printed once even though it was called three times.

The print statement is executed when Function runs the original code to create the graph, in a process known as "tracing".

Tracing captures the TensorFlow operations into a graph, and print is not captured in the graph. That graph is then executed for all three calls without ever running the Python code again.
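The difference between trace-time Python code and graph operations can be counted directly. In this sketch (the names `python_side_effects`, `graph_calls`, and `traced_once` are hypothetical), a plain Python list records tracing, while a tf.Variable is updated by an op that lives in the graph.

```python
import tensorflow as tf

python_side_effects = []      # appended to only while the function is traced
graph_calls = tf.Variable(0)  # updated by an op captured in the graph


@tf.function
def traced_once(x):
  python_side_effects.append("traced")  # plain Python: runs during tracing only
  graph_calls.assign_add(1)             # TensorFlow op: runs on every call
  return x * 2


# Three calls with the same dtype and shape reuse a single graph.
for value in (1, 2, 3):
  traced_once(tf.constant(value))

print(len(python_side_effects))
print(graph_calls.numpy())
```

After the three calls the list should hold a single entry while the variable has been incremented three times, mirroring the single "Calculating MSE!" line above.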

## Non-strict Execution


### Eager execution steps through all program operations, even those that are not needed

In [19]:
try:
  # this raises an error
  tf.gather(tf.constant([0.0]), [1])
except tf.errors.InvalidArgumentError as e:
  # All operations are run during eager execution so an error is raised.
  print(f'{type(e).__name__}: {e}')

InvalidArgumentError: {{function_node __wrapped__GatherV2_device_/job:localhost/replica:0/task:0/device:CPU:0}} indices[0] = 1 is not in [0, 1) [Op:GatherV2]


In [20]:
def unused_return_eager(x):
  # Gathering index 1 fails when `len(x) == 1`
  tf.gather(x, [1]) # unused
  return x

try:
  print(unused_return_eager(tf.constant([0.0])))
except tf.errors.InvalidArgumentError as e:
  # All operations are run during eager execution so an error is raised.
  print(f'{type(e).__name__}: {e}')

InvalidArgumentError: {{function_node __wrapped__GatherV2_device_/job:localhost/replica:0/task:0/device:CPU:0}} indices[0] = 1 is not in [0, 1) [Op:GatherV2]


### Graph execution ignores unnecessary operations

In [21]:
@tf.function
def unused_return_graph(x):
  tf.gather(x, [1]) # unused
  return x

# Only needed operations are run during graph execution. The error is not raised.
print(unused_return_graph(tf.constant([0.0])))

tf.Tensor([0.], shape=(1,), dtype=float32)


In [22]:
def unused_return_graph(x):
  tf.print("this is the unused return graph function")
  tf.gather(x, [1]) # unused
  return x

tf_unused_return_graph = tf.function(unused_return_graph)

# tf.print is a stateful op, so it still runs in the graph.
# The unused tf.gather is pruned, so the error is not raised.
print(tf_unused_return_graph(tf.constant([0.0])))

this is the unused return graph function
tf.Tensor([0.], shape=(1,), dtype=float32)


## tf.function Best Practices

Designing for tf.function may be your best bet for writing graph-compatible TensorFlow programs. Here are some tips:
* Toggle between eager and graph execution early and often with tf.config.run_functions_eagerly to pinpoint if and when the two modes diverge.
* Create tf.Variables outside the Python function and modify them on the inside. The same goes for objects that use tf.Variable, like tf.keras.layers, tf.keras.Models and tf.keras.optimizers.
* Avoid writing functions that depend on outer Python variables, excluding tf.Variables and Keras objects. Learn more in the "Depending on Python global and free variables" section of the tf.function guide.
* Prefer to write functions that take tensors and other TensorFlow types as input. You can pass in other object types, but be careful! Learn more in the "Depending on Python objects" section of the tf.function guide.
* Include as much computation as possible under a tf.function to maximize the performance gain. For example, decorate a whole training step or the entire training loop.
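Several of these tips can be combined in one small sketch: the variable is created outside the function, all inputs (including the learning rate) are passed as tensors, and the whole training step is traced. The names `weight` and `train_step` and the toy data are assumptions made for illustration.

```python
import tensorflow as tf

# Created once, outside the traced function, and modified inside it.
weight = tf.Variable(0.0)


@tf.function
def train_step(x, y, learning_rate):
  # Forward pass, gradient, and update all live in one graph.
  with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(weight * x - y))
  gradient = tape.gradient(loss, weight)
  weight.assign_sub(learning_rate * gradient)
  return loss


# Toy data for y = 2 * x; every argument is a tensor, not a Python value.
x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([2.0, 4.0, 6.0])
learning_rate = tf.constant(0.1)

first_loss = train_step(x, y, learning_rate).numpy()
for _ in range(50):
  last_loss = train_step(x, y, learning_rate).numpy()

print(first_loss, last_loss, weight.numpy())
```

Because the arguments keep the same dtype and shape on every call, the step should be traced once and then reused for all later iterations.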

## Speeding up

tf.function usually improves the performance of your code, but the amount of speed-up depends on the kind of computation you run. Small computations can be dominated by the overhead of calling a graph.

tf.function is commonly used to speed up training loops.

You can also try tf.function(jit_compile=True) for a more significant performance boost, especially if your code is heavy on TensorFlow control flow and uses many small tensors.

In [23]:
def power(x, y):
  result = tf.eye(10, dtype=tf.dtypes.int32)
  for _ in range(y):
    result = tf.matmul(x, result)
  return result

In [24]:
x = tf.random.uniform(shape=[10, 10], minval=-1, maxval=2, dtype=tf.dtypes.int32)

In [25]:
execution_time_eager = timeit.timeit(lambda: power(x, 100), number=1000)
print("Eager execution:",
      execution_time_eager,
      "seconds")

Eager execution: 14.120101007000017 seconds


In [26]:
tf_power = tf.function(power)

In [27]:
execution_time_graph = timeit.timeit(lambda: tf_power(x, 100), number=1000)
print("Graph execution:",
      execution_time_graph,
      "seconds")

Graph execution: 1.0472153770000432 seconds


In [28]:
execution_time_eager / execution_time_graph

13.48347371239893

## Tracing

Graphs can speed up your code, but the process of creating them has some overhead. For some functions, the creation of the graph takes more time than the execution of the graph. This investment is usually quickly paid back with the performance boost of subsequent executions, but it's important to be aware that the first few steps of any large model training can be slower due to tracing.
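One rough way to observe this overhead is to time the first call, which includes tracing, against a later call that reuses the graph. This is a sketch only: `chained_matmul` is a hypothetical function, and the measured times depend on hardware and TensorFlow version.

```python
import time

import tensorflow as tf


@tf.function
def chained_matmul(x):
  result = x
  # A Python loop is unrolled at trace time into 50 separate MatMul ops.
  for _ in range(50):
    result = tf.matmul(result, x)
  return result


x = tf.eye(4)

# First call: traces the function, builds the graph, then runs it.
start = time.perf_counter()
first = chained_matmul(x)
first_call_seconds = time.perf_counter() - start

# Second call: reuses the graph built during the first call.
start = time.perf_counter()
second = chained_matmul(x)
second_call_seconds = time.perf_counter() - start

print("First call (with tracing):", first_call_seconds, "seconds")
print("Second call (graph reuse):", second_call_seconds, "seconds")
```

A single measurement like this is noisy; use timeit with many repetitions, as in the earlier cells, for anything beyond a quick check.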

No matter how large your model is, you want to avoid tracing frequently. The "Controlling retracing" section of the tf.function guide discusses how to set input specifications and use tensor arguments to avoid retracing. If you see unusually poor performance, check whether you are retracing accidentally.

To figure out when your Function is tracing, add a print statement to its code. As a rule of thumb, Function will execute the print statement every time it traces.

In [29]:
@tf.function
def a_function_with_python_side_effect(x):
  print("Tracing!") # An eager-only side effect.
  return x * x + tf.constant(2)

# This is traced the first time.
print(a_function_with_python_side_effect(tf.constant(2)))
# The second time through, you won't see the side effect.
print(a_function_with_python_side_effect(tf.constant(3)))

Tracing!
tf.Tensor(6, shape=(), dtype=int32)
tf.Tensor(11, shape=(), dtype=int32)


In [30]:
# This retraces each time the Python argument changes,
# as a Python argument could be an epoch count or other
# hyperparameter.
print(a_function_with_python_side_effect(2))
print(a_function_with_python_side_effect(3))

Tracing!
tf.Tensor(6, shape=(), dtype=int32)
Tracing!
tf.Tensor(11, shape=(), dtype=int32)


In [31]:
print(a_function_with_python_side_effect(tf.constant([3])))
print(a_function_with_python_side_effect(tf.constant([3, 3])))



Tracing!
tf.Tensor([11], shape=(1,), dtype=int32)
Tracing!
tf.Tensor([11 11], shape=(2,), dtype=int32)
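The shape-driven retracing above can be limited by giving the function an input specification. The sketch below assumes 1-D int32 inputs of any length and uses a hypothetical `trace_log` list in place of print to record each trace.

```python
import tensorflow as tf

trace_log = []  # grows only when the function is traced


# shape=[None] accepts 1-D tensors of any length with a single graph.
@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.int32)])
def square_plus_two(x):
  trace_log.append("Tracing!")  # Python side effect: runs during tracing only
  return x * x + tf.constant(2)


out_a = square_plus_two(tf.constant([3]))
out_b = square_plus_two(tf.constant([3, 3]))
out_c = square_plus_two(tf.constant([1, 2, 3]))

print(len(trace_log))
print(out_c.numpy())
```

With the signature in place, all three calls should share one trace. The trade-off is that inputs outside the declared signature, such as a scalar or a float tensor, are rejected instead of triggering a new graph.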
