# Optimize Trained Models for Inference

## [Graph Transform Tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms)
See this great [blog post](https://petewarden.com/2016/12/30/rewriting-tensorflow-graphs-with-the-gtt/) by [Pete Warden](https://www.linkedin.com/in/petewarden) of Google.

## Types of Optimizations
* Quantize nodes (activations)

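## Quantize Activations (i.e., Quantize Nodes)
Prerequisite: the weights must already be quantized, so `quantize_weights` runs before `quantize_nodes` in the transform list below.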
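
To make the idea of quantizing activations concrete, here is a minimal pure-Python sketch of linear 8-bit quantization. It is an illustration only, not the code that `quantize_nodes` runs: the helper names `quantize` and `dequantize`, the sample `activations`, and the choice of 256 levels are assumptions made for this example.

```python
def quantize(values, min_val, max_val, levels=256):
    """Map floats in [min_val, max_val] onto integer codes 0..levels-1."""
    if max_val <= min_val:
        raise ValueError("max_val must be greater than min_val")
    # Width of one quantization step in float units.
    scale = (max_val - min_val) / (levels - 1)
    codes = []
    for v in values:
        # Clip to the range first so out-of-range values saturate.
        clipped = min(max(v, min_val), max_val)
        codes.append(int(round((clipped - min_val) / scale)))
    return codes, scale


def dequantize(codes, min_val, scale):
    """Recover approximate floats from integer codes."""
    return [min_val + c * scale for c in codes]


# Hypothetical activation values, used only for illustration.
activations = [-1.0, -0.25, 0.0, 0.4, 1.0]
lo, hi = min(activations), max(activations)

codes, scale = quantize(activations, lo, hi)
restored = dequantize(codes, lo, scale)

# The round-trip error is bounded by half a quantization step.
max_error = max(abs(a - r) for a, r in zip(activations, restored))
print("codes:", codes)
print("max round-trip error:", max_error, "step size:", scale)
```

The sketch shows why the range matters: every value is stored relative to `min_val` and `max_val`, so a range that is too wide wastes precision and a range that is too narrow clips values.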

In [None]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/linear/cpu/unoptimized_model_cpu.pb \
--out_graph=/root/models/optimize_me/linear/cpu/quantize_nodes_optimized_cpu.pb \
--inputs='x_observed' \
--outputs='add' \
--transforms='
strip_unused_nodes
remove_nodes(op=Identity, op=CheckNumerics)
fold_constants(ignore_errors=true)
fold_batch_norms
fold_old_batch_norms
quantize_weights
quantize_nodes'

In [None]:
%%bash

ls -l /root/models/optimize_me/linear/cpu/

In [None]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/linear/cpu/quantize_nodes_optimized_cpu.pb

In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import re
from google.protobuf import text_format
from tensorflow.core.framework import graph_pb2

def convert_graph_to_dot(input_graph, output_dot, is_input_graph_binary):
    """Write the GraphDef stored at `input_graph` to `output_dot` in Graphviz DOT format."""
    graph = graph_pb2.GraphDef()
    with open(input_graph, "rb") as fh:
        if is_input_graph_binary:
            graph.ParseFromString(fh.read())
        else:
            text_format.Merge(fh.read(), graph)
    with open(output_dot, "wt") as fh:
        print("digraph graphname {", file=fh)
        for node in graph.node:
            output_name = node.name
            print("  \"" + output_name + "\" [label=\"" + node.op + "\"];", file=fh)
            for input_full_name in node.input:
                # Drop the ":<output index>" suffix and the "^" control-dependency prefix.
                parts = input_full_name.split(":")
                input_name = re.sub(r"^\^", "", parts[0])
                print("  \"" + input_name + "\" -> \"" + output_name + "\";", file=fh)
        print("}", file=fh)
    print("Created dot file '%s' for graph '%s'." % (output_dot, input_graph))

In [None]:
input_graph='/root/models/optimize_me/linear/cpu/quantize_nodes_optimized_cpu.pb'
output_dot='/root/notebooks/quantize_nodes_optimized_cpu.dot'
convert_graph_to_dot(input_graph=input_graph, output_dot=output_dot, is_input_graph_binary=True)

In [None]:
%%bash

dot -T png /root/notebooks/quantize_nodes_optimized_cpu.dot \
    -o /root/notebooks/quantize_nodes_optimized_cpu.png > /tmp/a.out

In [None]:
from IPython.display import Image

Image('/root/notebooks/quantize_nodes_optimized_cpu.png')

## Requires Calibration and [`freeze_requantization_ranges`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md#freeze_requantization_ranges)
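
Calibration here means running the quantized graph on representative inputs, recording the activation ranges it actually produces, and then replacing the ranges computed at runtime with fixed constants. The sketch below illustrates only that last aggregation step in plain Python. It is not the implementation of `freeze_requantization_ranges`: the function name `calibrate_range`, the sample batches, and the choice to average the per-batch minimum and maximum are all assumptions made for illustration.

```python
def calibrate_range(batches):
    """Collapse per-batch activation ranges into one fixed (min, max) pair.

    Averaging is an assumed aggregation rule for this sketch; taking the
    overall min/max or a percentile would be equally valid alternatives.
    """
    if not batches:
        raise ValueError("need at least one calibration batch")
    mins = [min(batch) for batch in batches]
    maxs = [max(batch) for batch in batches]
    return sum(mins) / len(mins), sum(maxs) / len(maxs)


# Hypothetical activations observed on three representative input batches.
calibration_batches = [
    [-0.9, 0.1, 0.8],
    [-1.1, 0.0, 1.2],
    [-1.0, 0.3, 1.0],
]

frozen_min, frozen_max = calibrate_range(calibration_batches)
print("frozen range:", frozen_min, frozen_max)
```

Once a range is frozen, inference no longer has to scan each activation tensor for its minimum and maximum, at the cost of clipping any value that falls outside the calibrated range.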