## Types of Optimizations Applied for Inference
* Remove training-only operations (checkpoint saving, dropout)
* Strip out unreachable nodes
* Remove debug operations (CheckNumerics)
* Fold batch normalization ops into the preceding layer's pre-calculated weights (super cool)
* Fuse adjacent operators
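
To make the batch-normalization folding item concrete, here is a minimal sketch of the arithmetic. It assumes a single scalar weight and bias rather than a real convolution kernel, and the values and the `eps` default are hypothetical, chosen only for illustration. The idea is that at inference time the batch-norm statistics are constants, so they can be multiplied into the weights ahead of time.

```python
import math

def batch_norm(x, gamma, beta, mean, var, eps=1e-3):
    """Inference-time batch normalization of a scalar activation."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

def fold_batch_norm(w, b, gamma, beta, mean, var, eps=1e-3):
    """Return (w_folded, b_folded) so that
    w_folded * x + b_folded == batch_norm(w * x + b, ...)."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta

# Hypothetical layer parameters and batch-norm statistics.
w, b = 0.5, 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0

w_f, b_f = fold_batch_norm(w, b, gamma, beta, mean, var)

x = 2.0
unfolded = batch_norm(w * x + b, gamma, beta, mean, var)  # two ops at runtime
folded = w_f * x + b_f                                    # one op at runtime
print(abs(unfolded - folded) < 1e-9)
```

The folded layer does the same work with one multiply-add instead of a multiply-add followed by a separate normalization op.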

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/python/tools

## Graph Transform Tool

https://petewarden.com/2016/12/30/rewriting-tensorflow-graphs-with-the-gtt/

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms
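
The tool is driven from the command line with an input graph, an output path, the input and output node names, and a space-separated list of transforms. As a rough sketch, the invocation used later in this notebook could be assembled from Python like this. The helper name `build_transform_command` is hypothetical; only the flags and transform names come from the cells below.

```python
import shlex

def build_transform_command(in_graph, out_graph, inputs, outputs, transforms):
    """Assemble the argument list for a transform_graph invocation.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "transform_graph",
        "--in_graph=" + in_graph,
        "--out_graph=" + out_graph,
        "--inputs=" + ",".join(inputs),
        "--outputs=" + ",".join(outputs),
        # Transforms are applied in the order listed, separated by spaces.
        "--transforms=" + " ".join(transforms),
    ]

cmd = build_transform_command(
    in_graph="/root/models/optimize_me/unoptimized_cpu.pb",
    out_graph="/root/models/optimize_me/optimized_cpu.pb",
    inputs=["x_observed", "y_observed"],
    outputs=["Merge/MergeSummary"],
    transforms=[
        "remove_nodes(op=Identity, op=CheckNumerics)",
        "fold_constants(ignore_errors=true)",
        "fold_batch_norms",
        "fold_old_batch_norms",
    ],
)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it, assuming `transform_graph` is on the `PATH`.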

## Summarize Model

In [32]:
!which summarize_graph

/root/serving/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph


In [33]:
!ls -l /root/models/optimize_me/

total 136
-rw-r--r-- 1 root root  3010 Mar 30 01:12 optimized_for_inference.pb
-rw-r--r-- 1 root root  3010 Mar 30 01:22 optimized_for_inference_cpu.pb
-rw-r--r-- 1 root root   898 Mar 30 01:27 strip_unused_gtt.pb
-rw-r--r-- 1 root root  1043 Mar 30 01:12 stripped_unused.pb
-rw-r--r-- 1 root root  1043 Mar 30 01:22 stripped_unused_cpu.pb
-rw-r--r-- 1 root root 43202 Mar 30 01:16 unoptimized.pb
-rw-r--r-- 1 root root 34046 Mar 30 01:22 unoptimized_cpu.pb
-rw-r--r-- 1 root root 34046 Mar 30 01:18 unoptimized_gpu.pb


In [34]:
!summarize_graph --in_graph=/root/models/optimize_me/unoptimized_cpu.pb

Found 2 possible inputs: (name=x_observed, type=float(1), shape=[]) (name=y_observed, type=float(1), shape=[]) 
No variables spotted.
Found 4 possible outputs: (name=gradients/sub_grad/tuple/control_dependency_1, op=Identity) (name=gradients/mul_grad/tuple/control_dependency_1, op=Identity) (name=Merge/MergeSummary, op=MergeSummary) (name=save/Identity, op=Identity) 
Found 25 (25) const parameters, 0 (0) variable parameters, and 24 control_edges
92 nodes assigned to device '/device:CPU:0'
Op types used: 29 Const, 10 Identity, 7 NoOp, 7 Mul, 7 Reshape, 6 Sum, 6 Shape, 4 Assign, 3 Add, 3 Sub, 3 BroadcastGradientArgs, 2 RestoreV2, 2 VariableV2, 2 RandomUniform, 2 Prod, 2 ApplyGradientDescent, 2 Placeholder, 1 Pack, 1 Neg, 1 MergeV2Checkpoints, 1 RealDiv, 1 MergeSummary, 1 Mean, 1 SaveV2, 1 ScalarSummary, 1 Maximum, 1 ShardedFilename, 1 Square, 1 StringJoin, 1 FloorDiv, 1 Fill, 1 Tile, 1 Cast
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorf

In [35]:
!transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_cpu.pb \
--out_graph=/root/models/optimize_me/optimized_cpu.pb \
--inputs='x_observed,y_observed' \
--outputs='Merge/MergeSummary' \
--transforms='\
strip_unused_nodes(type=float, shape="1,299,299,3") \
remove_nodes(op=Identity, op=CheckNumerics) \
fold_constants(ignore_errors=true) \
fold_batch_norms \
fold_old_batch_norms\
'

2017-03-30 01:29:32.630149: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying strip_unused_nodes
2017-03-30 01:29:32.630583: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying remove_nodes
2017-03-30 01:29:32.631498: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_constants
2017-03-30 01:29:32.646387: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_batch_norms
2017-03-30 01:29:32.646735: I tensorflow/tools/graph_transforms/transform_graph.cc:257] Applying fold_old_batch_norms
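
Conceptually, the `strip_unused_nodes` step keeps only the nodes that the requested outputs depend on. The following is a minimal sketch of that idea on a hypothetical toy graph, represented as a plain dict from node name to input names. It does not use TensorFlow's `GraphDef` API, and it ignores how the real transform treats the `--inputs` boundary.

```python
def reachable_from(outputs, graph):
    """Return the set of nodes that the given outputs depend on."""
    keep = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        # Walk backwards along the node's inputs.
        stack.extend(graph.get(node, []))
    return keep

# Hypothetical toy graph: node name -> names of its input nodes.
graph = {
    "x_observed": [],
    "y_observed": [],
    "weight": [],
    "mul": ["x_observed", "weight"],
    "sub": ["mul", "y_observed"],
    "loss": ["sub"],
    "save": ["weight"],    # training-only branch
    "train_op": ["loss"],  # training-only branch
}

kept = reachable_from(["loss"], graph)
stripped = {name: ins for name, ins in graph.items() if name in kept}
print(sorted(set(graph) - kept))
```

Nothing requested as an output depends on `save` or `train_op`, so both are dropped from the stripped graph.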


In [36]:
!summarize_graph --in_graph=/root/models/optimize_me/optimized_cpu.pb

Found 2 possible inputs: (name=x_observed, type=float(1), shape=[]) (name=y_observed, type=float(1), shape=[]) 
No variables spotted.
Found 1 possible outputs: (name=Merge/MergeSummary, op=MergeSummary) 
Found 2 (2) const parameters, 0 (0) variable parameters, and 0 control_edges
10 nodes assigned to device '/device:CPU:0'
Op types used: 2 Const, 2 Placeholder, 2 VariableV2, 1 Add, 1 Mean, 1 MergeSummary, 1 Mul, 1 ScalarSummary, 1 Square, 1 Sub
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/root/models/optimize_me/optimized_cpu.pb --show_flops --logtostderr --input_layer=x_observed,y_observed --input_layer_type=float,float --input_layer_shape=: --output_layer=Merge/MergeSummary
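
To see what the transforms removed, the two "Op types used" lists can be compared programmatically. This is a small sketch that parses the lists printed above, pasted in as string literals. The helper name `parse_op_counts` is hypothetical.

```python
from collections import Counter

def parse_op_counts(op_list):
    """Parse '29 Const, 10 Identity, ...' into a Counter of op type -> count."""
    counts = Counter()
    for item in op_list.split(","):
        number, op_type = item.split()
        counts[op_type] = int(number)
    return counts

# "Op types used" from unoptimized_cpu.pb (copied from the output above).
before = parse_op_counts(
    "29 Const, 10 Identity, 7 NoOp, 7 Mul, 7 Reshape, 6 Sum, 6 Shape, 4 Assign, "
    "3 Add, 3 Sub, 3 BroadcastGradientArgs, 2 RestoreV2, 2 VariableV2, "
    "2 RandomUniform, 2 Prod, 2 ApplyGradientDescent, 2 Placeholder, 1 Pack, "
    "1 Neg, 1 MergeV2Checkpoints, 1 RealDiv, 1 MergeSummary, 1 Mean, 1 SaveV2, "
    "1 ScalarSummary, 1 Maximum, 1 ShardedFilename, 1 Square, 1 StringJoin, "
    "1 FloorDiv, 1 Fill, 1 Tile, 1 Cast"
)

# "Op types used" from optimized_cpu.pb (copied from the output above).
after = parse_op_counts(
    "2 Const, 2 Placeholder, 2 VariableV2, 1 Add, 1 Mean, 1 MergeSummary, "
    "1 Mul, 1 ScalarSummary, 1 Square, 1 Sub"
)

removed_types = sorted(set(before) - set(after))
print("total ops:", sum(before.values()), "->", sum(after.values()))
print("op types removed entirely:", removed_types)
```

Training and checkpointing ops such as `ApplyGradientDescent`, `SaveV2`, and `RestoreV2` appear only in the unoptimized list.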


In [None]:
!transform_graph \
--in_graph=/root/models/optimize_me/optimized_cpu.pb \
--out_graph=quantized_cpu.pb \
--inputs='x_observed,y_observed' \
--outputs='Merge/MergeSummary' \
--transforms='\
add_default_attributes \
strip_unused_nodes(type=float, shape="1,299,299,3") \
remove_nodes(op=Identity, op=CheckNumerics) \
fold_old_batch_norms \
quantize_weights \
quantize_nodes \
strip_unused_nodes \
sort_by_execution_order\
'
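
This notebook does not show how `quantize_weights` encodes values internally. As an intuition for what 8-bit weight quantization buys and costs, here is a minimal sketch that assumes simple min/max linear quantization to 256 levels. The function names and the sample weights are hypothetical.

```python
def quantize(weights, levels=256):
    """Map floats to integer codes in [0, levels - 1], plus the (min, max) range."""
    lo, hi = min(weights), max(weights)
    # Fall back to a step of 1.0 when all weights are equal (hi == lo).
    step = (hi - lo) / (levels - 1) or 1.0
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, hi

def dequantize(codes, lo, hi, levels=256):
    """Recover approximate floats from integer codes and the stored range."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]

# Hypothetical weights, for illustration only.
weights = [-1.0, -0.25, 0.0, 0.4, 1.0]

codes, lo, hi = quantize(weights)
restored = dequantize(codes, lo, hi)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(codes)
print(max_error <= (hi - lo) / 255 / 2 + 1e-12)
```

Each weight is stored in one byte instead of four, at the price of a rounding error of at most half a quantization step.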