This page tracks the ongoing development of Glow. It documents the goals for upcoming development iterations, the status of some high-level tasks, and relevant information that can help people join the ongoing efforts.
Load additional quantized neural networks
Quantization is the process of converting neural networks from 32-bit floating-point arithmetic to 8-bit integer arithmetic. Glow can quantize existing floating-point networks using profile-guided quantization and then run the quantized model for inference. Glow has started to support loading quantized Caffe2/ONNX models directly. The goal of this top-level task is to extend the loaders to handle additional quantized Caffe2 operators (https://github.com/pytorch/pytorch/tree/master/caffe2/quantization/server) and ONNX operators.
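The core idea behind 8-bit quantization can be sketched with a small example. This is an illustration of generic affine (scale/offset) quantization, not Glow's actual API; the function names and the int8 range below are assumptions for the sketch.

```python
# Affine quantization sketch: map a float range onto int8 via a scale and
# offset, then recover approximate floats. Names are illustrative only.

def choose_params(min_val, max_val, qmin=-128, qmax=127):
    """Pick scale/offset so [min_val, max_val] maps onto [qmin, qmax]."""
    scale = (max_val - min_val) / (qmax - qmin)
    offset = round(qmin - min_val / scale)
    return scale, offset

def quantize(values, scale, offset, qmin=-128, qmax=127):
    # Round to the nearest int8 code, clamping to the representable range.
    return [max(qmin, min(qmax, round(v / scale) + offset)) for v in values]

def dequantize(qvalues, scale, offset):
    # Invert the mapping; the result differs from the input by at most ~scale.
    return [(q - offset) * scale for q in qvalues]

vals = [-1.0, 0.0, 0.5, 1.0]
scale, offset = choose_params(min(vals), max(vals))
q = quantize(vals, scale, offset)
restored = dequantize(q, scale, offset)
```

In profile-guided quantization, the `min_val`/`max_val` range would come from histograms captured while running the float network on representative inputs, rather than from the tensor itself.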
Asynchronous Model Execution
Glow is designed as a compiler and execution engine for neural network hardware accelerators. The current implementation of the execution engine is very basic and exposes a simple single-device synchronous run method. The goal of this top-level task is to rewrite the execution engine and implement an asynchronous execution mechanism that can be extended to run code on multiple accelerators concurrently. The execution engine will need to manage the state of multiple accelerator cards, queue incoming requests, and track buffers on both the host and the devices.
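The shape of such an asynchronous engine can be sketched as follows. This is a minimal illustration under assumed names (`DeviceManager`, `ExecutionEngine`, round-robin scheduling); Glow's actual design may differ.

```python
# Sketch of an asynchronous, multi-device run interface: each device owns a
# request queue drained by a worker thread, and callers receive futures.
import queue
import threading
from concurrent.futures import Future

class DeviceManager:
    """Owns one accelerator; a worker thread drains its request queue."""
    def __init__(self, device_id):
        self.device_id = device_id
        self.requests = queue.Queue()
        self.worker = threading.Thread(target=self._run_loop, daemon=True)
        self.worker.start()

    def _run_loop(self):
        while True:
            item = self.requests.get()
            if item is None:  # shutdown sentinel
                break
            compiled_fn, inputs, future = item
            try:
                # Stand-in for launching a compiled function on the device.
                future.set_result(compiled_fn(inputs))
            except Exception as exc:
                future.set_exception(exc)

    def run_async(self, compiled_fn, inputs):
        future = Future()
        self.requests.put((compiled_fn, inputs, future))
        return future

    def stop(self):
        self.requests.put(None)
        self.worker.join()

class ExecutionEngine:
    """Round-robins run requests across devices; callers block on futures."""
    def __init__(self, num_devices=2):
        self.devices = [DeviceManager(i) for i in range(num_devices)]
        self._next = 0

    def run(self, compiled_fn, inputs):
        dev = self.devices[self._next % len(self.devices)]
        self._next += 1
        return dev.run_async(compiled_fn, inputs)

engine = ExecutionEngine(num_devices=2)
futures = [engine.run(lambda xs: sum(xs), [i, i + 1]) for i in range(4)]
results = [f.result() for f in futures]  # -> [1, 3, 5, 7]
for dev in engine.devices:
    dev.stop()
```

The key design point the paragraph describes is decoupling request submission from completion: callers enqueue work and continue, which is what lets multiple accelerators be kept busy concurrently.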
ONNXIFI Support
Glow integrates into PyTorch using the ONNXIFI interface, which offloads the compute graph from PyTorch onto Glow. This top-level task tracks the work to fully implement the ONNXIFI specification and to qualify the compiler using the ONNXIFI test suite.
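The offload decision that an interface like ONNXIFI enables can be sketched abstractly: operators the backend reports as supported run on the accelerator, and everything else falls back to the host. The capability set and operator names below are purely illustrative, not Glow's or ONNXIFI's actual API.

```python
# Hypothetical sketch of graph offload: split a linear list of operators into
# maximal contiguous segments assigned to the backend or left on the host.
SUPPORTED_ON_BACKEND = {"Conv", "Relu", "MatMul"}  # illustrative capability set

def partition(graph):
    """Group consecutive ops by where they will run."""
    segments = []
    for op in graph:
        target = "backend" if op in SUPPORTED_ON_BACKEND else "host"
        if segments and segments[-1][0] == target:
            segments[-1][1].append(op)  # extend the current segment
        else:
            segments.append((target, [op]))  # start a new segment
    return segments

graph = ["Conv", "Relu", "TopK", "MatMul"]
plan = partition(graph)
# -> [('backend', ['Conv', 'Relu']), ('host', ['TopK']), ('backend', ['MatMul'])]
```

In the real interface the framework queries the backend's capabilities per operator and hands each supported subgraph over for compilation, which is why fully implementing the specification matters for correct fallback behavior.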