Getting started: using the new features of MIGraphX 0.3
MIGraphX 0.3 supports the following new features:
- Tensorflow support
- Quantization support, part 1
- Horizontal fusion
- Performance optimizations
This page provides examples of how to use these new features.
The initial release of quantization support quantizes graph weights and values from float32 to float16. A new "quantize" API has been added to the MIGraphX library. The function can be called from both the Python and C++ interfaces; the examples below illustrate the C++ API.
#include <migraphx/quantization.hpp>
void quantize(program& prog, const std::vector<std::string>& ins_names);
void quantize(program& prog);
The quantization function should be called after the program is loaded and before it is compiled, for example:
prog = parse_onnx(model_filename);
quantize(prog);
prog.compile(migraphx::gpu::target{});
When called with one argument, the quantize function converts all operations to float16. To quantize only particular operations, pass a vector of operation names. For example, the following quantizes only addition and subtraction:
quantize(prog,{"add","sub"});
Quantization from float32 to float16 can speed up programs, both by using faster GPU operations and by reducing the amount of data that must be copied between layers.