# AutoGraph

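* There are three ways to build a computation graph: static graphs, dynamic graphs, and AutoGraph.
* Dynamic graphs are easy to debug and quick to write, but run relatively slowly.
* Static graphs run very efficiently, but are harder to debug.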

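In TensorFlow 2.0, AutoGraph is exposed through the `@tf.function` decorator, which converts a Python function written in dynamic-graph style into a static graph.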
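As a minimal sketch of what this conversion means in practice, the function below uses an ordinary Python `if` on a tensor value; under `@tf.function`, AutoGraph rewrites it into graph control flow. The name `clipped_sum` is hypothetical and chosen only for illustration.

```python
import tensorflow as tf


@tf.function
def clipped_sum(x):
    # `total` is a tensor, so AutoGraph turns this Python `if`
    # into a graph-level conditional when the function is traced.
    total = tf.reduce_sum(x)
    if total > 0:
        result = total
    else:
        result = tf.zeros_like(total)
    return result


positive = clipped_sum(tf.constant([1.0, 2.0]))
negative = clipped_sum(tf.constant([-1.0, -2.0]))
tf.print(positive, negative)
```

Both calls share one traced graph because the arguments have the same dtype and shape; only the branch taken at run time differs.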

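AutoGraph usage guidelines:
* Inside a function decorated with @tf.function, prefer TensorFlow functions over other Python functions.
* Avoid defining a tf.Variable inside a function decorated with @tf.function.
* A function decorated with @tf.function must not modify Python structured variables, such as lists or dictionaries, that are defined outside the function.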

Inside a function decorated with @tf.function, prefer TensorFlow functions over other Python functions

In [6]:
import numpy as np
import tensorflow as tf

@tf.function
def np_random():
    a = np.random.randn(3, 3)
    tf.print(a)
    
@tf.function
def tf_random():
    a = tf.random.normal((3, 3))
    tf.print(a)

In [5]:
# Expected to print the same values on every call: np.random.randn runs only once, at trace time
np_random()
np_random()

array([[ 0.43121224, -0.84528546,  1.5944339 ],
       [-1.78089967,  0.31259289,  0.49794423],
       [ 0.95762154, -1.68846565,  0.97653595]])
array([[ 0.76924844,  1.66113809,  1.19434118],
       [ 1.1960669 ,  1.15551208, -0.62338377],
       [-0.51450426,  0.72633133, -1.72224607]])


In [3]:
tf_random()
tf_random()

[[-1.57202888 0.329783231 1.00998175]
 [0.562117875 0.14763622 0.156688228]
 [-0.681684375 1.07216585 0.377230406]]
[[-0.964217484 -0.670570135 2.49252486]
 [1.60961854 -0.121011332 -0.628118038]
 [-0.938082755 -1.69788432 -1.6997689]]
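The difference between the two functions can also be checked programmatically. The sketch below returns the sampled value instead of printing it; the names `np_value` and `tf_value` are hypothetical. The NumPy call runs once while the graph is traced and its result is baked in as a constant, whereas `tf.random.normal` is a graph operation that runs on every call.

```python
import numpy as np
import tensorflow as tf


@tf.function
def np_value():
    # Evaluated by Python during tracing; the graph only stores the constant.
    return tf.constant(np.random.randn(), dtype=tf.float32)


@tf.function
def tf_value():
    # A graph op: a fresh sample is drawn each time the graph executes.
    return tf.random.normal(())


np_first, np_second = np_value(), np_value()
tf_first, tf_second = tf_value(), tf_value()

print("NumPy values equal:", float(np_first) == float(np_second))
print("TensorFlow values equal:", float(tf_first) == float(tf_second))
```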


Avoid defining a tf.Variable inside a function decorated with @tf.function

In [7]:
x = tf.Variable(1.0, dtype=tf.float32)
@tf.function
def outer_var():
    x.assign_add(1.0)
    tf.print(x)
    return(x)

outer_var()
outer_var()

2
3


<tf.Tensor: shape=(), dtype=float32, numpy=3.0>

In [None]:
@tf.function
def inner_var():
    x = tf.Variable(1.0, dtype=tf.float32)
    x.assign_add(1.0)
    tf.print(x)
    return(x)
# Expected to raise a ValueError: the variable would be created again on every trace
inner_var()
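When a variable really has to be owned by the decorated code, one common workaround is to create it only once and keep a reference to it outside the traced function. The sketch below first confirms that the pattern above fails, then shows a hypothetical `Counter` module that guards the creation with a `None` check. It assumes TensorFlow 2.x behavior, where the failure surfaces as a `ValueError`.

```python
import tensorflow as tf


@tf.function
def inner_var():
    x = tf.Variable(1.0, dtype=tf.float32)  # re-created on each trace
    x.assign_add(1.0)
    return x


try:
    inner_var()
    raised = False
except ValueError:
    raised = True


class Counter(tf.Module):
    def __init__(self):
        super().__init__()
        self.count = None

    @tf.function
    def __call__(self):
        # The variable is created during the first trace only;
        # later traces see that it already exists and reuse it.
        if self.count is None:
            self.count = tf.Variable(0)
        return self.count.assign_add(1)


counter = Counter()
first = counter()
second = counter()
print(raised, int(first), int(second))
```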

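A function decorated with @tf.function must not modify Python structured variables, such as lists or dictionaries, that are defined outside the function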

In [12]:
tensor_list = []

@tf.function
def append_tensor(x):
    tensor_list.append(x)
    return tensor_list

append_tensor(tf.constant(5.0))
append_tensor(tf.constant(6.0))
print(tensor_list)

[<tf.Tensor 'x:0' shape=() dtype=float32>]
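The list above ends up holding a single symbolic tensor because `append` ran only while the graph was traced. If values need to be accumulated inside the graph, one option is `tf.TensorArray`, which is a graph-aware container. The following is a minimal sketch with a hypothetical function name, not a drop-in replacement for the example above.

```python
import tensorflow as tf


@tf.function
def collect_squares(n):
    # A TensorArray lives inside the graph, so writes happen at run time.
    squares = tf.TensorArray(tf.int32, size=0, dynamic_size=True)
    for i in tf.range(n):
        # write() returns the updated array, which must be reassigned.
        squares = squares.write(i, i * i)
    return squares.stack()


result = collect_squares(tf.constant(4))
print(result.numpy().tolist())
```

Another simple option is to return the tensor from the decorated function and append it to the Python list in ordinary eager code.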


# How AutoGraph Works

In [17]:
import tensorflow as tf
import numpy as np

@tf.function(autograph=True)
def myadd(a, b):
    for i in tf.range(3):
        tf.print(i)
    c = a+b
    print("tracing")
    return c

In [18]:
myadd(tf.constant('hello'), tf.constant("world"))

tracing
0
1
2


<tf.Tensor: shape=(), dtype=string, numpy=b'helloworld'>

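The first call does two things:
* It creates the computation graph (tracing). The Python statement `print("tracing")` runs at this point.
* It executes the computation graph. The `tf.print` calls emit 0, 1, 2 at this point.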
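The two phases, creating the graph and then executing it, can also be triggered separately. The sketch below uses `get_concrete_function` to trace a hypothetical `add` function for a given signature without running it, lists the operation types recorded in the graph, and only then executes the graph. The exact operation names depend on the TensorFlow version, so they are printed rather than assumed.

```python
import tensorflow as tf


@tf.function
def add(a, b):
    return a + b


# Phase 1: trace the Python function into a graph for float32 scalars.
concrete = add.get_concrete_function(
    tf.TensorSpec(shape=[], dtype=tf.float32),
    tf.TensorSpec(shape=[], dtype=tf.float32),
)
op_types = [op.type for op in concrete.graph.get_operations()]
print(op_types)

# Phase 2: execute the already-built graph.
result = concrete(tf.constant(1.0), tf.constant(2.0))
print(float(result))
```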

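Calling it again with arguments of the same type reuses the existing graph, so only the execution step runs: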

In [15]:
myadd(tf.constant("great"), tf.constant("day"))

0
1
2


<tf.Tensor: shape=(), dtype=string, numpy=b'greatday'>

Calling it with a different argument type creates a new graph:

In [19]:
myadd(tf.constant(1), tf.constant(2))

tracing
0
1
2


<tf.Tensor: shape=(), dtype=int32, numpy=3>

If the arguments are not Tensors, a new graph is created for every new argument value:

In [25]:
myadd("hello", "world")
myadd("good", "morning")

0
1
2
tracing
0
1
2


<tf.Tensor: shape=(), dtype=string, numpy=b'goodmorning'>
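The retracing rules shown in the cells above can be counted directly. In the sketch below, a Python list records each trace as a side effect; the names `trace_log` and `add` are hypothetical. Tensor arguments are matched by dtype and shape, while plain Python values are matched by value.

```python
import tensorflow as tf

trace_log = []


@tf.function
def add(a, b):
    trace_log.append("traced")  # Python side effect: runs only during tracing
    return a + b


add(tf.constant(1), tf.constant(2))
add(tf.constant(3), tf.constant(4))      # same dtype and shape: graph reused
traces_same_type = len(trace_log)

add(tf.constant(1.0), tf.constant(2.0))  # new dtype: one more trace
traces_new_dtype = len(trace_log)

add(1, 2)
add(3, 4)                                # Python ints: one trace per new value
traces_python_args = len(trace_log)

print(traces_same_type, traces_new_dtype, traces_python_args)
```

This is why passing Python scalars that change on every call can be costly: each new value builds another graph.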

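1. Ordinary Python functions (such as `print`) run only while the graph is being created (traced), not when the graph is executed.
2. A tf.Variable can be created only during the first trace.
3. The static computation graph is executed by the TensorFlow runtime in C++. Python data structures such as lists and dictionaries cannot be embedded in the graph; they can only be read at trace time and cannot be modified when the graph runs.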

# AutoGraph Usage Examples

Define a simple function

In [26]:
import tensorflow as tf
x = tf.Variable(1.0, dtype=tf.float32)

@tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
def add_print(a):
    x.assign_add(a)
    tf.print(x)
    return(x)

In [27]:
add_print(tf.constant(3.0))

4


<tf.Tensor: shape=(), dtype=float32, numpy=4.0>
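An `input_signature` pins the function to a single graph and rejects arguments that do not match it. The sketch below uses a hypothetical `double` function with the same scalar float32 signature. The exception type for a mismatch has differed between TensorFlow releases, so both `ValueError` and `TypeError` are caught here as an assumption.

```python
import tensorflow as tf


@tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
def double(a):
    return a * 2.0


accepted = double(tf.constant(3.0))  # matches the scalar float32 signature

try:
    double(tf.constant([1.0, 2.0]))  # shape [2] does not match shape []
    rejected = False
except (ValueError, TypeError):
    rejected = True

print(float(accepted), rejected)
```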

Wrap it in a subclass of tf.Module.

In [35]:
class DemoModule(tf.Module):
    def __init__(self, init_value = tf.constant(0.0), name = None):
        super(DemoModule, self).__init__(name=name)
        with self.name_scope:
            self.x = tf.Variable(init_value, dtype = tf.float32, trainable=True)
            
    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def addprint(self, a):
        with self.name_scope:
            self.x.assign_add(a)
            tf.print(self.x)
            return(self.x)

In [36]:
demo = DemoModule(init_value = tf.constant(1.0))
result = demo.addprint(tf.constant(5.0))

6


In [37]:
print(demo.variables)
print(demo.trainable_variables)

(<tf.Variable 'demo_module/Variable:0' shape=() dtype=float32, numpy=6.0>,)
(<tf.Variable 'demo_module/Variable:0' shape=() dtype=float32, numpy=6.0>,)


In [38]:
demo.submodules

()

In [39]:
tf.saved_model.save(demo, "./data/", signatures={"serving_default":demo.addprint})

INFO:tensorflow:Assets written to: ./data/assets


In [40]:
demo2 = tf.saved_model.load("./data/")
demo2.addprint(tf.constant(5.0))

11


<tf.Tensor: shape=(), dtype=float32, numpy=11.0>
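Besides calling the restored method directly, the exported signature can be looked up by name, which is closer to how a serving system would invoke the model. The following is a self-contained sketch using a temporary directory and a hypothetical `Accumulator` module; signature functions take keyword arguments named after the original parameters and return a dictionary of outputs.

```python
import tempfile

import tensorflow as tf


class Accumulator(tf.Module):
    def __init__(self, init_value=0.0, name=None):
        super().__init__(name=name)
        self.x = tf.Variable(init_value, dtype=tf.float32)

    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def add(self, a):
        self.x.assign_add(a)
        return self.x.read_value()


module = Accumulator(init_value=1.0)

with tempfile.TemporaryDirectory() as export_dir:
    tf.saved_model.save(
        module, export_dir, signatures={"serving_default": module.add}
    )
    restored = tf.saved_model.load(export_dir)
    signature_names = list(restored.signatures.keys())
    serving_fn = restored.signatures["serving_default"]
    outputs = serving_fn(a=tf.constant(2.0))  # dict of output tensors
    restored_value = float(list(outputs.values())[0])

print(signature_names, restored_value)
```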

In [41]:
!saved_model_cli show --dir ./data/ --all

Traceback (most recent call last):
  File "/Users/fatu/venv/tensorflow2py37/bin/saved_model_cli", line 5, in <module>
    from tensorflow.python.tools.saved_model_cli import main
ModuleNotFoundError: No module named 'tensorflow'


A slightly more complex example

In [46]:
import numpy as np
class MyModel(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.num_classes = num_classes
        self.dense_1 = tf.keras.layers.Dense(32, activation='relu')
        self.dense_2 = tf.keras.layers.Dense(num_classes)
            
    @tf.function(input_signature=[tf.TensorSpec([None, 32], tf.float32)])
    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

In [47]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))


optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)

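# dense_2 has no activation, so the model outputs logits; from_logits=True is the usual
# setting in that case. The default is kept here to match the recorded output below.
loss_fn = tf.keras.losses.CategoricalCrossentropy()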

batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((data, labels))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

In [48]:
model = MyModel(num_classes=10)

epochs = 3
for epoch in range(epochs):
    print('Start of epoch %d' % (epoch,))
    
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        
        with tf.GradientTape() as tape:
            
            logits = model(x_batch_train, training=True)
            
            loss_value = loss_fn(y_batch_train, logits)
            
        grads = tape.gradient(loss_value, model.trainable_weights)
        
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
        
        if step % 200 == 0:
            print('Training loss (for one batch) at step %s: %s' % (step, float(loss_value)))
            print('Seen so far: %s samples' % ((step + 1) * batch_size))

Start of epoch 0


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

Training loss (for one batch) at step 0: 24.87108612060547
Seen so far: 64 samples
Start of epoch 1
Training loss (for one batch) at step 0: 12.615829467773438
Seen so far: 64 samples
Start of epoch 2
Training loss (for one batch) at step 0: 12.293771743774414
Seen so far: 64 samples
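In the loop above only the model's `call` runs as a graph; the gradient computation and the optimizer update still run eagerly. A common variation is to wrap the whole training step in `@tf.function`. The sketch below is a self-contained illustration under stated assumptions: it uses a small `Sequential` model, one-hot labels, and `from_logits=True` instead of the exact setup above.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10),  # no activation: outputs are logits
])
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)


@tf.function
def train_step(x, y):
    # Forward pass, loss, gradients, and update all run inside one graph.
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


features = np.random.random((64, 32)).astype("float32")
labels = tf.one_hot(np.random.randint(0, 10, size=64), depth=10)

losses = [float(train_step(features, labels)) for _ in range(3)]
print(losses)
```

Because the batch has the same dtype and shape on every call, `train_step` is traced once and the resulting graph is reused for the remaining steps.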
