# TensorFlow AutoGraph

This notebook introduces AutoGraph, which converts a Python function into a TensorFlow computational graph so that the computation can run more efficiently.

In [1]:
import numpy as np
import tensorflow as tf
from pprint import pprint

## 1 tf.function  

`tf.function` converts a Python function into a callable that executes a TensorFlow computational graph when called.

In [2]:
def func(x):
    return (2 + 1) * x + 2 + 1

tf_func = tf.function(func)
tf_func

<tensorflow.python.eager.def_function.Function at 0x7f114c4fea58>

In [3]:
tf_func(2)

<tf.Tensor: id=6, shape=(), dtype=int32, numpy=9>

In [4]:
tf_func(tf.constant([2, 3]))

<tf.Tensor: id=18, shape=(2,), dtype=int32, numpy=array([ 9, 12], dtype=int32)>

`tf.function` can also be used as a decorator.

In [5]:
@tf.function
def func(x):
    for i in tf.range(3):
        x = x ** (1 + 1)
    return x

func

<tensorflow.python.eager.def_function.Function at 0x7f11400c1e80>

In [6]:
func(2)

<tf.Tensor: id=69, shape=(), dtype=int32, numpy=256>

In [7]:
func(tf.constant([2, 3]))

<tf.Tensor: id=120, shape=(2,), dtype=int32, numpy=array([ 256, 6561], dtype=int32)>

If you wish, you can take a sneak peek at the graph-compatible Python code that AutoGraph generates under the hood. TensorFlow builds the computation graph from this generated code.

In [8]:
pprint(tf.autograph.to_code(func.python_function))

('def tf__func(x):\n'
 '  do_return = False\n'
 '  retval_ = ag__.UndefinedReturnValue()\n'
 "  with ag__.FunctionScope('func', 'func_scope', "
 'ag__.ConversionOptions(recursive=True, user_requested=True, '
 'optional_features=(), internal_convert_user_code=True)) as func_scope:\n'
 '\n'
 '    def get_state():\n'
 '      return ()\n'
 '\n'
 '    def set_state(_):\n'
 '      pass\n'
 '\n'
 '    def loop_body(iterates, x):\n'
 '      i = iterates\n'
 '      x = x ** (1 + 1)\n'
 '      return x,\n'
 '    x, = ag__.for_stmt(ag__.converted_call(tf.range, func_scope.callopts, '
 '(3,), None, func_scope), None, loop_body, get_state, set_state, (x,), '
 "('x',), ())\n"
 '    do_return = True\n'
 '    retval_ = func_scope.mark_return_value(x)\n'
 '  do_return,\n'
 '  return ag__.retval(retval_)\n')
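
The output above is generated Python source rather than the graph itself. To look at the traced graph, one option is to request a concrete function and list its operations. The following is a minimal sketch under that assumption; `power_loop` is a hypothetical stand-in for `func` so that the block is self-contained, and the exact operation types printed may differ between TensorFlow versions.

```python
import tensorflow as tf

@tf.function
def power_loop(x):
    # Same loop as `func` above: square x three times.
    for i in tf.range(3):
        x = x ** 2
    return x

# Trace the function for a scalar int32 input to obtain a concrete graph.
concrete = power_loop.get_concrete_function(
    tf.TensorSpec(shape=(), dtype=tf.int32)
)

# Each node of the traced graph is a tf.Operation with a name and a type.
op_types = [op.type for op in concrete.graph.get_operations()]
print(op_types)

# The concrete function can be called directly with a matching tensor.
result = concrete(tf.constant(2))
print(result.numpy())  # 2 -> 4 -> 16 -> 256
```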


## 2 Caveats when using tf.function

Without going into detail about how `tf.function` works, one can guess that TensorFlow needs to run the wrapped Python function once (`tracing`) to figure out how to build and optimize the graph. Tracing happens on the first call and again only when the function is called with a new input signature; other calls reuse the existing graph. Also, not all Python code can be converted into a graph. So here are a few caveats when working with `tf.function`.
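
To make the tracing behaviour concrete, the sketch below counts how often the Python body of a function actually runs. The names `double` and `trace_log` are illustrative only. It relies on the documented behaviour that tensors with the same dtype and shape share one graph, while each distinct Python scalar gets its own trace.

```python
import tensorflow as tf

trace_log = []  # plain Python list, appended to only while tracing

@tf.function
def double(x):
    trace_log.append("trace")  # Python code: runs during tracing only
    return x * 2

double(tf.constant(1))
double(tf.constant(2))      # same dtype and shape: the existing graph is reused
traces_after_int_tensors = len(trace_log)

double(tf.constant(1.0))    # new dtype: the function is traced again
traces_after_float_tensor = len(trace_log)

double(3)
double(4)                   # Python scalars are keyed by value: one trace each
traces_after_python_ints = len(trace_log)

print(traces_after_int_tensors,
      traces_after_float_tensor,
      traces_after_python_ints)  # 1 2 4
```

Passing tensors rather than Python scalars is therefore one way to avoid repeated tracing when the argument values change often.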

### Only use TensorFlow operations inside tf.function when possible

In [9]:
@tf.function
def add_noise(x):
    e = np.random.random(1)
    return x + e

for i in range(5):
    print(add_noise(1))

tf.Tensor([1.8052909], shape=(1,), dtype=float64)
tf.Tensor([1.8052909], shape=(1,), dtype=float64)
tf.Tensor([1.8052909], shape=(1,), dtype=float64)
tf.Tensor([1.8052909], shape=(1,), dtype=float64)
tf.Tensor([1.8052909], shape=(1,), dtype=float64)


In [10]:
@tf.function
def add_noise(x):
    e = tf.random.uniform((1,))
    return x + e

for i in range(5):
    print(add_noise(1))

tf.Tensor([1.6505606], shape=(1,), dtype=float32)
tf.Tensor([1.7381033], shape=(1,), dtype=float32)
tf.Tensor([1.0657276], shape=(1,), dtype=float32)
tf.Tensor([1.635862], shape=(1,), dtype=float32)
tf.Tensor([1.8456978], shape=(1,), dtype=float32)


So in the first `add_noise` function, the NumPy call runs only once, during `tracing`, and the resulting random noise is recorded as a constant in the computation graph.

If it is indeed necessary to have non-TensorFlow operations inside `tf.function`, we can wrap them in `tf.py_function`. The wrapped code runs as ordinary Python on every call, so TensorFlow cannot optimize it as part of the graph.

In [11]:
@tf.function
def add_noise(x):
    e = tf.py_function(np.random.random, [1], tf.float32) 
    return x + e

for i in range(5):
    print(add_noise(1))

tf.Tensor([1.9542024], shape=(1,), dtype=float32)
tf.Tensor([1.1382076], shape=(1,), dtype=float32)
tf.Tensor([1.8553867], shape=(1,), dtype=float32)
tf.Tensor([1.0616622], shape=(1,), dtype=float32)
tf.Tensor([1.3642895], shape=(1,), dtype=float32)
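
For code that only needs NumPy arrays, `tf.numpy_function` is a closely related wrapper. The sketch below is an alternative to the cell above rather than the method used in this notebook. The helper `sample_noise` is a hypothetical name, and the explicit cast to `float32` is there so that the returned array matches the declared output type.

```python
import numpy as np
import tensorflow as tf

def sample_noise(n):
    # Runs as ordinary NumPy code each time the graph executes.
    # `n` arrives as a NumPy integer; the result must match the declared Tout.
    return np.random.random(int(n)).astype(np.float32)

@tf.function
def add_noise(x):
    # Wrap the NumPy call so it becomes an op in the graph
    # instead of a constant captured during tracing.
    e = tf.numpy_function(sample_noise, [1], tf.float32)
    return x + e

samples = [float(add_noise(1.0).numpy()[0]) for _ in range(5)]
print(samples)  # five different values between 1 and 2
```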


### Watch out for side effects

In [12]:
@tf.function
def square(x):
    squared = tf.square(x)
    print(squared)
    return squared

x = tf.Variable(0)
for i in range(5):
    x.assign(i)
    square(x)

Tensor("Square:0", shape=(), dtype=int32)


Notice that there is only one printout across the 5 function calls. The Python `print` statement runs only during tracing, which happens on the first call, and it prints a symbolic tensor rather than a value.

A workaround, again, is to wrap the arbitrary Python code in `tf.py_function`.

In [13]:
@tf.function
def square(x):
    squared = tf.square(x)
    tf.py_function(print, [squared], [])
    return squared

x = tf.Variable(0)
for i in range(5):
    x.assign(i)
    square(x)

tf.Tensor(0, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(4, shape=(), dtype=int32)
tf.Tensor(9, shape=(), dtype=int32)
tf.Tensor(16, shape=(), dtype=int32)


Alternatively, you can use `tf.print`, which is a TensorFlow operation and therefore runs on every call.

In [14]:
@tf.function
def square(x):
    squared = tf.square(x)
    tf.print(squared)
    return squared

x = tf.Variable(0)
for i in range(5):
    x.assign(i)
    square(x)

0
1
4
9
16
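
The same distinction applies to state, not only to printing. The sketch below, with illustrative names `python_calls`, `counter` and `step`, compares a Python list with a `tf.Variable`: the list is mutated only while tracing, whereas the variable update is part of the graph and runs on every call.

```python
import tensorflow as tf

python_calls = []         # Python list: appended to only while tracing
counter = tf.Variable(0)  # TensorFlow variable: updated on every call

@tf.function
def step(x):
    python_calls.append("traced")  # Python side effect, trace time only
    counter.assign_add(1)          # TensorFlow side effect, part of the graph
    return x + 1

for i in range(5):
    step(tf.constant(i))

print(len(python_calls), int(counter.numpy()))  # 1 5
```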


### Not always faster!

Note that even for a function that performs pure TensorFlow operations on tensors, converting it into a graph does not necessarily give a speedup. For small and simple operations, graph mode can be slightly slower, presumably because the overhead of calling the graph outweighs any savings.

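But with more complex computations, building a graph with `tf.function` can be noticeably faster. In the benchmarks below, the version with a 1000-step loop takes about 25 ms in graph mode versus 39 ms eagerly, while the single matrix multiplication shows little difference even on the larger input.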

In [15]:
def forward_pass(x, y, weights):
    y_hat = tf.matmul(x, weights)
    mse = tf.reduce_mean(tf.square(y - y_hat))
    return mse

@tf.function
def tf_forward_pass(x, y, weights):
    y_hat = tf.matmul(x, weights)
    mse = tf.reduce_mean(tf.square(y - y_hat))
    return mse

x = tf.constant(tf.random.uniform((1000, 100)))
y = tf.constant(tf.random.uniform((1000, 1)))
weights = tf.constant(tf.random.uniform((100, 1)))

assert forward_pass(x, y, weights) == tf_forward_pass(x, y, weights)

In [16]:
%timeit forward_pass(x, y, weights)

186 µs ± 1 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [17]:
%timeit tf_forward_pass(x, y, weights)

215 µs ± 3.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [18]:
x = tf.constant(tf.random.uniform((1000000, 100)))
y = tf.constant(tf.random.uniform((1000000, 1)))
weights = tf.constant(tf.random.uniform((100, 1)))

assert forward_pass(x, y, weights) == tf_forward_pass(x, y, weights)

In [19]:
%timeit forward_pass(x, y, weights)

1.03 ms ± 17.7 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [20]:
%timeit tf_forward_pass(x, y, weights)

1.01 ms ± 71.4 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [21]:
def forward_pass(x, y, weights):
    for i in range(1000):
        weights = weights * weights
    y_hat = tf.matmul(x, weights)
    mse = tf.reduce_mean(tf.square(y - y_hat))
    return mse

@tf.function
def tf_forward_pass(x, y, weights):
    for i in tf.range(1000):
        weights = tf.square(weights)
    y_hat = tf.matmul(x, weights)
    mse = tf.reduce_mean(tf.square(y - y_hat))
    return mse

x = tf.constant(tf.random.uniform((1000, 100)))
y = tf.constant(tf.random.uniform((1000, 1)))
weights = tf.constant(tf.random.uniform((100, 1)))

assert forward_pass(x, y, weights) == tf_forward_pass(x, y, weights)

In [22]:
%timeit forward_pass(x, y, weights)

39 ms ± 152 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [23]:
%timeit tf_forward_pass(x, y, weights)

25 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
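
`%timeit` is an IPython magic, so the cells above only run inside a notebook. As a rough sketch of the same comparison in a plain script, the standard `timeit` module can be used instead. The warm-up call and the `.numpy()` conversion are choices made here, not part of the original benchmark: the warm-up keeps the one-time tracing cost out of the measurement, and `.numpy()` makes sure each result is actually computed. Timings depend heavily on hardware and TensorFlow version.

```python
import timeit
import tensorflow as tf

def forward_pass(x, y, weights):
    y_hat = tf.matmul(x, weights)
    return tf.reduce_mean(tf.square(y - y_hat))

# Graph version of the same function.
tf_forward_pass = tf.function(forward_pass)

x = tf.random.uniform((1000, 100))
y = tf.random.uniform((1000, 1))
weights = tf.random.uniform((100, 1))

# Warm-up call: triggers tracing so it is not included in the timing.
tf_forward_pass(x, y, weights)

n_calls = 100
eager_s = timeit.timeit(lambda: forward_pass(x, y, weights).numpy(), number=n_calls)
graph_s = timeit.timeit(lambda: tf_forward_pass(x, y, weights).numpy(), number=n_calls)

print(f"eager: {eager_s / n_calls * 1e6:.1f} µs per call")
print(f"graph: {graph_s / n_calls * 1e6:.1f} µs per call")
```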
