
Getting started: using the new features of MIGraphX 0.3

New Features in MIGraphX 0.3

MIGraphX 0.3 supports the following new features:

  • Tensorflow support
  • Quantization support, part 1
  • Horizontal fusion

This page provides examples of how to use these new features.

Quantization

The initial release of quantization support quantizes graph weights and values from float32 to float16. A new "quantize" API has been added to the MIGraphX library. The function can be called from both the Python and C++ interfaces. The examples below illustrate the C++ API calls.

#include <migraphx/quantization.hpp>
void quantize(program& prog, const std::vector<std::string>& ins_names);
void quantize(program& prog);

The quantization function should be called after the program is loaded and before it is compiled, e.g.

prog = parse_onnx(model_filename);
quantize(prog);
prog.compile(migraphx::gpu::target{});

When called with one argument, the quantize function will change all operations to float16. To quantize only particular operations, one provides a vector of operation names. For example, the following will quantize addition and subtraction:

quantize(prog, {"add", "sub"});

Quantization from float32 to float16 can speed up programs, both by using faster GPU operations and by reducing the amount of data that must be copied between layers.
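
Putting the calls above together, a self-contained C++ sketch might look like the following. The model filename is illustrative, and the header paths for parse_onnx and the GPU target are assumptions based on the other headers shown on this page.

#include <migraphx/onnx.hpp>
#include <migraphx/quantization.hpp>
#include <migraphx/gpu/target.hpp>

int main()
{
    // Load an ONNX model, quantize only the "add" and "sub" operations to
    // float16, then compile the program for the GPU target.
    migraphx::program prog = migraphx::parse_onnx("model.onnx");
    migraphx::quantize(prog, {"add", "sub"});
    prog.compile(migraphx::gpu::target{});
    return 0;
}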

Tensorflow Support

Freezing Tensorflow Graphs, an example

MIGraphX can now read models frozen for inference from the Tensorflow framework. A frozen Tensorflow model is prepared in several steps that (a) remove training operators and (b) freeze the graph, including the model weights, into a *.pb file. These steps are illustrated below with an example from the Tensorflow Research Slim model library. We have verified MIGraphX using image models from this library.

The Slim model library includes a script named "export_inference_graph.py" that saves just the model definition to a file. Our first step is to call this script:

prompt% python3 ${TENSORFLOW_MODELS}/research/slim/export_inference_graph.py --model_name=inception_v4 --output_file=${INCEPTIONDIR}/inception_v4_model.pb --batch_size=1

A few items to note about this particular invocation:

  • The script has an option to save either a training-focused or an inference-focused version of the model. By default, it saves an inference graph, from which certain operators, e.g. dropout, are removed. MIGraphX does not support such operators, so if you are saving your own graph, you may need to save one specifically focused on inference.
  • We also pass a parameter to freeze a particular batch size into the graph. MIGraphX does not currently support variable batch sizes.

The next step is to freeze the graph itself. Freezing pulls the trained weights together with the saved model and saves the combination as a frozen model. To freeze the graph, one needs to identify the "output nodes" of the last computation. We can find the names of the output nodes with the following command:

prompt% ${TENSORFLOWDIR}/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=${INCEPTIONDIR}/inception_v4_model.pb

As an input argument, we pass the model file we saved in the previous step. This produces output that includes the following line:

Found 2 possible outputs: (name=InceptionV4/AuxLogits/Aux_logits/BiasAdd, op=BiasAdd) (name=InceptionV4/Logits/Predictions, op=Softmax)

We take this information and pass it to the freeze_graph script as follows:

prompt% ${TENSORFLOWDIR}/bazel-bin/tensorflow/python/tools/freeze_graph \
    --input_graph=${INCEPTIONDIR}/inception_v4_model.pb \
    --input_binary=true \
    --input_checkpoint=${INCEPTIONDIR}/inception_v4.ckpt \
    --output_node_names=InceptionV4/Logits/Predictions \
    --output_graph=${INCEPTIONDIR}/inception_v4_i1.pb

This script combines the input checkpoint (which contains the trained weights) with the saved model, and outputs a frozen Tensorflow model, "inception_v4_i1.pb", that we can use with MIGraphX.

Using frozen Tensorflow models

MIGraphX provides the following API to load a frozen Tensorflow model:

#include <migraphx/tf.hpp>
/// Create a program from a tf pb file (default is nhwc format)
program parse_tf(const std::string& name, bool is_nhwc);
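
For example, the frozen graph produced in the previous section could be loaded in NHWC format and compiled for the GPU with a sketch like the one below; the header path for the GPU target is an assumption, and the rest follows the signature above.

#include <migraphx/tf.hpp>
#include <migraphx/gpu/target.hpp>

int main()
{
    // Load the frozen Tensorflow model in NHWC layout and compile it for the GPU.
    migraphx::program prog = migraphx::parse_tf("inception_v4_i1.pb", true);
    prog.compile(migraphx::gpu::target{});
    return 0;
}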

This API is similar to the parse_onnx() routine previously available in MIGraphX, except that it supports both NHWC and NCHW formats. The API is also available through the Python interface, with the current limitation that the Python API supports either TF or ONNX but not both simultaneously. A cmake variable

cmake -DMIGRAPHX_ENABLE_TF=On ..

can be set at build time to enable this Python API for Tensorflow.