## Types of Optimizations Applied for Inference
* Remove training-only operations (checkpoint saving, dropout)
* Strip out unreachable nodes
* Remove debug operations (CheckNumerics)
* Fold batch normalization ops into the pre-calculated weights of the preceding layer (super cool)
* Fuse adjacent operators
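
The batch-normalization fold from the list above can be sketched in plain Python. At inference time batch norm is a fixed affine map, `gamma * (z - mean) / sqrt(var + eps) + beta`, so it can be merged into the weights and bias of the layer that feeds it. The function names, the dense-layer setup, and the toy values below are illustrative assumptions, not code from the Graph Transform Tool.

```python
import math

def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-3):
    """Fold per-output-channel batch norm into a dense layer.

    weights[i][j] maps input i to output channel j; the other arguments
    are per-channel lists. Returns (folded_weights, folded_bias).
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    folded_w = [[w * s for w, s in zip(row, scale)] for row in weights]
    folded_b = [(b - m) * s + be
                for b, m, s, be in zip(bias, mean, scale, beta)]
    return folded_w, folded_b

def dense(x, weights, bias):
    """y[j] = sum_i x[i] * weights[i][j] + bias[j]"""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]

def batch_norm(z, gamma, beta, mean, var, eps=1e-3):
    """Inference-mode batch norm with fixed statistics."""
    return [g * (zi - m) / math.sqrt(v + eps) + be
            for zi, g, be, m, v in zip(z, gamma, beta, mean, var)]

# Toy values, made up for illustration.
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.4, -0.1], [2.0, 0.5]
x = [1.0, 3.0]

unfused = batch_norm(dense(x, W, b), gamma, beta, mean, var)  # two ops
W_f, b_f = fold_batch_norm(W, b, gamma, beta, mean, var)
fused = dense(x, W_f, b_f)                                    # one op
```

`fused` and `unfused` agree up to floating-point error, but the folded version runs a single layer instead of two.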

## Graph Transform Tool

https://petewarden.com/2016/12/30/rewriting-tensorflow-graphs-with-the-gtt/

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms

## Summarize Model

In [59]:
!which summarize_graph

/root/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph


In [60]:
!ls -l /root/models/optimize_me/

total 116
-rw-r--r-- 1 root root  7341 Mar 30 04:45 obfuscated_names_cpu.pb
-rw-r--r-- 1 root root   307 Mar 30 04:44 optimized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:45 quantized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:45 rounded_weights_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 03:47 unoptimized_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 04:09 unoptimized_gpu.pb


In [61]:
!summarize_graph --in_graph=/root/models/optimize_me/unoptimized_cpu.pb

Found 2 possible inputs: (name=x_observed, type=float(1), shape=[]) (name=y_observed, type=float(1), shape=[]) 
Found 2 variables: (name=weights, type=float(1), shape=[]) (name=bias, type=float(1), shape=[]) 
Found 4 possible outputs: (name=gradients/sub_grad/tuple/control_dependency_1, op=Identity) (name=gradients/mul_grad/tuple/control_dependency_1, op=Identity) (name=Merge/MergeSummary, op=MergeSummary) (name=save/Identity, op=Identity) 
2017-03-30 04:50:36.965244: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodeweights
2017-03-30 04:50:36.965378: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodebias
Found 25 (25) const parameters, 0 (0) variable parameters, and 24 control_edges
92 nodes assigned to device '/device:CPU:0'
Op types used: 29 Const, 10 Identity, 7 NoOp, 7 Mul, 7 Reshape, 6 Sum, 6 Shape, 4 Assign, 3 Add, 3 Sub, 3 BroadcastGradientArgs, 2 RestoreV2, 2 VariableV2, 2 RandomU

In [62]:
!benchmark_model --graph=/root/models/optimize_me/unoptimized_cpu.pb --show_flops --input_layer=x_observed,y_observed,bias,weights --input_layer_type=float,float,float,float --input_layer_shape=::: --output_layer=add

/bin/sh: 1: benchmark_model: not found


In [63]:
!transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_cpu.pb \
--out_graph=/root/models/optimize_me/optimized_cpu.pb \
--inputs='x_observed,y_observed,weights,bias' \
--outputs='add' \
--transforms='strip_unused_nodes(type=float, shape="1,299,299,3") \
remove_nodes(op=Identity, op=CheckNumerics) \
fold_constants(ignore_errors=true) \
fold_batch_norms \
fold_old_batch_norms'

2017-03-30 04:50:37.510825: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying strip_unused_nodes
2017-03-30 04:50:37.511205: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying remove_nodes
2017-03-30 04:50:37.511708: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_constants
2017-03-30 04:50:37.526511: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_batch_norms
2017-03-30 04:50:37.526713: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_old_batch_norms
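
What `strip_unused_nodes` does conceptually can be sketched as a backward walk from the requested outputs that stops at the declared inputs. This is a toy model under stated assumptions: the graph is a plain dict of node name to input names, and the node names `mul` and `sub` are guesses at this notebook's linear model rather than names read from the `.pb` file.

```python
def strip_unused_nodes(graph, inputs, outputs):
    """Keep only the nodes needed to compute `outputs` from `inputs`.

    graph maps node name -> list of input node names.
    """
    keep = set()
    stack = list(outputs)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        if name in inputs:
            continue  # declared inputs are fed at run time; stop here
        stack.extend(graph[name])
    # Declared inputs lose their own dependencies (they become feed points).
    return {name: ([] if name in inputs else deps)
            for name, deps in graph.items() if name in keep}

# Toy graph loosely modeled on the linear model in this notebook.
toy_graph = {
    "x_observed": [],
    "y_observed": [],
    "weights": [],
    "bias": [],
    "mul": ["weights", "x_observed"],
    "add": ["mul", "bias"],
    "sub": ["add", "y_observed"],   # training-only loss term (hypothetical)
    "save/Identity": ["weights"],   # checkpointing
}

pruned = strip_unused_nodes(
    toy_graph,
    inputs={"x_observed", "y_observed", "weights", "bias"},
    outputs=["add"],
)
```

In this sketch `y_observed`, `sub`, and `save/Identity` are dropped because nothing on the path to `add` needs them.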


In [64]:
!summarize_graph --in_graph=/root/models/optimize_me/optimized_cpu.pb

Found 3 possible inputs: (name=bias, type=float(1), shape=[1,299,299,3]) (name=weights, type=float(1), shape=[1,299,299,3]) (name=x_observed, type=float(1), shape=[]) 
No variables spotted.
Found 1 possible outputs: (name=add, op=Add) 
Found 0 (0) const parameters, 0 (0) variable parameters, and 0 control_edges
3 nodes assigned to device '/device:CPU:0'
Op types used: 3 Placeholder, 1 Add, 1 Mul
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/root/models/optimize_me/optimized_cpu.pb --show_flops --logtostderr --input_layer=bias,weights,x_observed --input_layer_type=float,float,float --input_layer_shape=1,299,299,3:1,299,299,3: --output_layer=add

Note: `weights` and `bias` show up here as `Placeholder`s with shape `[1,299,299,3]` because they were listed in `--inputs`, and `strip_unused_nodes` swaps non-placeholder inputs for placeholders of the given `type` and `shape`. That shape is the image-model value from the tool's examples, not this scalar model's, and the trained values are no longer stored in the graph (0 const parameters), which likely explains why `optimized_cpu.pb` is only 307 bytes.


In [65]:
!benchmark_model --graph=/root/models/optimize_me/optimized_cpu.pb --show_flops --input_layer=bias,weights,x_observed --input_layer_type=float,float,float --input_layer_shape=1,299,299,3:1,299,299,3: --output_layer=add

/bin/sh: 1: benchmark_model: not found


In [66]:
!transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_cpu.pb \
--out_graph=/root/models/optimize_me/rounded_weights_cpu.pb \
--inputs='x_observed,y_observed,weights,bias' \
--outputs='add' \
--transforms=' \
round_weights(num_steps=256)'

2017-03-30 04:50:38.362927: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying round_weights
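
As a rough picture of what `round_weights(num_steps=256)` is doing, the sketch below snaps every value in a buffer to one of 256 evenly spaced levels between that buffer's minimum and maximum. The values are still stored as floats, so the payoff is better compressibility rather than a smaller uncompressed buffer. The exact level placement is an assumption here and may differ from the tool's.

```python
import math

def round_weights(values, num_steps=256):
    """Snap each float to one of num_steps evenly spaced levels in [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # constant buffer: nothing to round
    step = (hi - lo) / (num_steps - 1)
    return [lo + round((v - lo) / step) * step for v in values]

# Toy weight buffer, made up for illustration.
weights = [math.sin(i) for i in range(1000)]
rounded = round_weights(weights, num_steps=256)

step = (max(weights) - min(weights)) / 255
max_error = max(abs(w - r) for w, r in zip(weights, rounded))
distinct = len(set(rounded))  # at most 256 unique values remain
```

The rounding error per weight is bounded by half a step, which is the accuracy cost traded for the repeated values.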


In [67]:
!summarize_graph --in_graph=/root/models/optimize_me/rounded_weights_cpu.pb

Found 2 possible inputs: (name=x_observed, type=float(1), shape=[]) (name=y_observed, type=float(1), shape=[]) 
Found 2 variables: (name=weights, type=float(1), shape=[]) (name=bias, type=float(1), shape=[]) 
Found 4 possible outputs: (name=gradients/sub_grad/tuple/control_dependency_1, op=Identity) (name=gradients/mul_grad/tuple/control_dependency_1, op=Identity) (name=Merge/MergeSummary, op=MergeSummary) (name=save/Identity, op=Identity) 
2017-03-30 04:50:38.641961: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodeweights
2017-03-30 04:50:38.642051: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodebias
Found 25 (25) const parameters, 0 (0) variable parameters, and 24 control_edges
92 nodes assigned to device '/device:CPU:0'
Op types used: 29 Const, 10 Identity, 7 NoOp, 7 Mul, 7 Reshape, 6 Sum, 6 Shape, 4 Assign, 3 Add, 3 Sub, 3 BroadcastGradientArgs, 2 RestoreV2, 2 VariableV2, 2 RandomU

In [68]:
!ls -l /root/models/optimize_me/

total 116
-rw-r--r-- 1 root root  7341 Mar 30 04:45 obfuscated_names_cpu.pb
-rw-r--r-- 1 root root   307 Mar 30 04:50 optimized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:45 quantized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:50 rounded_weights_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 03:47 unoptimized_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 04:09 unoptimized_gpu.pb


In [69]:
!transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_cpu.pb \
--out_graph=/root/models/optimize_me/quantized_cpu.pb \
--inputs='x_observed,y_observed,weights,bias' \
--outputs='add' \
--transforms=' \
quantize_weights'

2017-03-30 04:50:39.221469: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying quantize_weights
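
Unlike rounding, `quantize_weights` changes the storage type: floats become eight-bit codes plus a min/max range, and are converted back to floats at inference time. The sketch below illustrates that idea on a toy buffer. The function names and the linear min/max scheme are assumptions for illustration, not the tool's actual implementation.

```python
import struct

def quantize_weights(values):
    """Encode floats as one byte each, plus the (lo, hi) range."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # avoid dividing by zero for constant buffers
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, hi

def dequantize(codes, lo, hi):
    """Map byte codes back to approximate float values."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]

# Toy weight buffer, made up for illustration.
weights = [0.01 * i - 2.0 for i in range(500)]
codes, lo, hi = quantize_weights(weights)
restored = dequantize(codes, lo, hi)

float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
quantized_bytes = len(codes)  # one byte per weight instead of four
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The buffer shrinks to a quarter of its float32 size, at the cost of an error of at most half a quantization step per weight.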


In [70]:
!ls -l /root/models/optimize_me/

total 116
-rw-r--r-- 1 root root  7341 Mar 30 04:45 obfuscated_names_cpu.pb
-rw-r--r-- 1 root root   307 Mar 30 04:50 optimized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:50 quantized_cpu.pb
-rw-r--r-- 1 root root 13361 Mar 30 04:50 rounded_weights_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 03:47 unoptimized_cpu.pb
-rw-r--r-- 1 root root 34062 Mar 30 04:09 unoptimized_gpu.pb


In [71]:
!transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_cpu.pb \
--out_graph=/root/models/optimize_me/obfuscated_names_cpu.pb \
--inputs='x_observed,y_observed,weights,bias' \
--outputs='add' \
--transforms=' \
obfuscate_names'

2017-03-30 04:50:39.776747: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying obfuscate_names
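
The effect of `obfuscate_names` can be pictured as renaming every node to a short opaque name while leaving the declared inputs and outputs alone, so callers can still feed and fetch them. The hex counter below is a stand-in naming scheme chosen for this sketch; the real tool uses its own scheme, and the toy graph is hypothetical.

```python
def short_names():
    """Yield '0', '1', ..., 'a', 'b', ... (hex counter; illustrative only)."""
    n = 0
    while True:
        yield format(n, "x")
        n += 1

def obfuscate_names(graph, inputs, outputs):
    """Rename all nodes except declared inputs and outputs.

    graph maps node name -> list of input node names.
    Returns (renamed_graph, mapping from old name to new name).
    """
    protected = set(inputs) | set(outputs)
    # Skip generated names that would collide with a protected name
    # (for example, hex 0xadd formats as 'add').
    fresh = (s for s in short_names() if s not in protected)
    mapping = {name: (name if name in protected else next(fresh))
               for name in graph}
    renamed = {mapping[name]: [mapping[dep] for dep in deps]
               for name, deps in graph.items()}
    return renamed, mapping

# Toy graph, loosely modeled on this notebook's model.
toy_graph = {
    "x_observed": [],
    "weights": [],
    "bias": [],
    "mul": ["weights", "x_observed"],
    "add": ["mul", "bias"],
    "save/Identity": ["weights"],
}

renamed, mapping = obfuscate_names(
    toy_graph,
    inputs=["x_observed", "y_observed", "weights", "bias"],
    outputs=["add"],
)
```

Only names passed as inputs or outputs survive, which matches the summary that follows: `save/Identity` and the other undeclared outputs come back with short names, while `x_observed` and `weights` keep theirs.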


In [72]:
!summarize_graph --in_graph=/root/models/optimize_me/obfuscated_names_cpu.pb

Found 2 possible inputs: (name=x_observed, type=float(1), shape=[]) (name=y_observed, type=float(1), shape=[]) 
Found 2 variables: (name=weights, type=float(1), shape=[]) (name=bias, type=float(1), shape=[]) 
Found 4 possible outputs: (name=S, op=Identity) (name=1e, op=Identity) (name=1m, op=MergeSummary) (name=1z, op=Identity) 
2017-03-30 04:50:40.052945: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodeweights
2017-03-30 04:50:40.053027: W tensorflow/tools/graph_transforms/summarize_graph_main.cc:183] Decoding Tensor failed for nodebias
Found 25 (25) const parameters, 0 (0) variable parameters, and 24 control_edges
92 nodes assigned to device '/device:CPU:0'
Op types used: 29 Const, 10 Identity, 7 NoOp, 7 Mul, 7 Reshape, 6 Sum, 6 Shape, 4 Assign, 3 Add, 3 Sub, 3 BroadcastGradientArgs, 2 RestoreV2, 2 VariableV2, 2 RandomUniform, 2 Prod, 2 ApplyGradientDescent, 2 Placeholder, 1 Pack, 1 Neg, 1 MergeV2Checkpoints, 1 RealDiv, 1 MergeSumm