Getting started: using the new features of MIGraphX 0.3
MIGraphX 0.3 supports the following new features:
- Tensorflow support
- Quantization support, part 1
- Horizontal fusion
- Performance optimizations
This page provides examples of how to use these new features.
The initial release of quantization support quantizes graph weights and values from float32 to float16. A new "quantize" API has been added to the MIGraphX library. The function can be called from both the Python and C++ interfaces; the examples below illustrate the C++ API.
#include <migraphx/quantization.hpp>
void quantize(program& prog, const std::vector<std::string>& ins_names);
void quantize(program& prog);
The quantization function should be called after the program is loaded and before it is compiled, for example:
prog = parse_onnx(model_filename);
quantize(prog);
prog.compile(migraphx::gpu::target{});
When called with one argument, the quantize function converts all operations to float16. To quantize only particular operations, pass a vector of operation names. For example, the following quantizes only addition and subtraction:
quantize(prog,{"add","sub"});
Quantization from float32 to float16 can speed up programs, both by using faster GPU operations and by reducing the amount of data that must be copied between layers.