Commit
[ONNXModelLoader] Enabling operator instance based mixed precision support
1 parent
d015e12
commit b5a17e5
Showing 20 changed files with 1,105 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
## GLOW Model loader precision configuration | ||
|
||
This document describes the mixed precision feature, which lets a network run | ||
with a combination of operators in float, float16_t and int8 precision. | ||
|
||
### Overview | ||
|
||
Glow has the following two options, used along with quantization, to set the precision of operators: | ||
|
||
- `convert-to-fp16`: runs all floating point operations in fp16. | ||
- `keep-original-precision-for-nodes`: keeps certain node kinds from being quantized, | ||
  so they run in their original precision. | ||
|
||
Note that the above two options are based on node kind. To run specific instances | ||
of operators in fp16 precision, use the `-node-precision-info` option. The nodes to run | ||
in fp16 are specified in a yaml file by the name of their first output. | ||
|
||
The `-node-precision-info` option can be passed along with the `-load-profile` option when | ||
building quantized models. In that case, operators not listed in `-node-precision-info` run | ||
in quantized precision (if supported by the backend). | ||
|
||
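The rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Precision` enum, the `choosePrecision` name and its parameters are hypothetical and are not part of Glow, and the FP32 fallback when no quantization profile applies is an assumption rather than something this document states.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical precision labels used only in this sketch (not Glow's ElemKind).
enum class Precision { FP32, FP16, Int8 };

// Decide the precision of one node instance, identified by the name of its
// first output. All names and parameters here are illustrative assumptions.
Precision choosePrecision(const std::string &firstOutputName,
                          const std::unordered_set<std::string> &fp16Names,
                          bool quantProfileLoaded,
                          bool backendSupportsQuantized) {
  // Instances listed in the precision info file always run in FP16.
  if (fp16Names.count(firstOutputName) != 0) {
    return Precision::FP16;
  }
  // Everything else is quantized when a profile was loaded and the backend
  // supports the quantized form of the operator.
  if (quantProfileLoaded && backendSupportsQuantized) {
    return Precision::Int8;
  }
  // Assumed fallback for this sketch: keep the node in float.
  return Precision::FP32;
}
```

With `fp16Names = {"109", "237"}` and a loaded profile, a node whose first output is `109` would be kept in FP16, while any unlisted node would be quantized if the backend supports it.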
### Design details | ||
|
||
#### `-node-precision-info` yaml schema | ||
|
||
A precision profile is a yaml file listing the output names of the nodes that must run | ||
in fp16, as shown below. | ||
|
||
``` | ||
FP16NodeInstanceNames: [109, 110, 111, 112, 237] | ||
``` | ||
|
||
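To make the schema concrete, here is a minimal sketch of how the single `FP16NodeInstanceNames` line maps to a list of strings. It is a standard-library stand-in written for illustration, not the LLVM YAML I/O code the loader itself uses, and the helper names `trimToken` and `parseFp16NodeInstanceNames` are hypothetical. It handles only a one-line flow sequence like the one above. Note that numeric-looking entries such as `109` are treated as names (strings), not integers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Strip surrounding whitespace and quotes from one list entry.
static std::string trimToken(const std::string &s) {
  const char *skip = " \t\r\n\"'";
  std::string::size_type b = s.find_first_not_of(skip);
  if (b == std::string::npos) {
    return "";
  }
  std::string::size_type e = s.find_last_not_of(skip);
  return s.substr(b, e - b + 1);
}

// Parse a line such as `FP16NodeInstanceNames: [109, 110]` into names.
// Returns false if the key or the brackets are missing.
bool parseFp16NodeInstanceNames(const std::string &line,
                                std::vector<std::string> &names) {
  const std::string key = "FP16NodeInstanceNames:";
  std::string::size_type k = line.find(key);
  if (k == std::string::npos) {
    return false;
  }
  std::string::size_type open = line.find('[', k + key.size());
  if (open == std::string::npos) {
    return false;
  }
  std::string::size_type close = line.find(']', open + 1);
  if (close == std::string::npos) {
    return false;
  }
  names.clear();
  std::string::size_type pos = open + 1;
  while (pos <= close) {
    // Each entry ends at the next comma, or at the closing bracket.
    std::string::size_type end = line.find(',', pos);
    if (end == std::string::npos || end > close) {
      end = close;
    }
    std::string item = trimToken(line.substr(pos, end - pos));
    if (!item.empty()) {
      names.push_back(item);
    }
    pos = end + 1;
  }
  return true;
}
```

A real precision file should still be read through a proper YAML parser; this sketch only shows the shape of the data that ends up in the list of instance names.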
#### How to use mixed precision feature | ||
Generate the quantization profile using the following command: | ||
``` | ||
./bin/image-classifier tests/images/imagenet/*.png -image-mode=0to1 -m=resnet50 -model-input-name=gpu_0/data -dump-profile="profile.yaml" -node-precision-info="precision_profile.yaml" | ||
``` | ||
|
||
Use the quantization profile generated by the above command, along with `-node-precision-info`, to run the network in mixed precision: | ||
``` | ||
./bin/image-classifier tests/images/imagenet/*.png -image-mode=0to1 -m=resnet50 -model-input-name=gpu_0/data -load-profile="profile.yaml" -node-precision-info="precision_profile.yaml" | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
#ifndef GLOW_IMPORTER_MODELLOADERPRECISIONCONFIGURATION_H | ||
#define GLOW_IMPORTER_MODELLOADERPRECISIONCONFIGURATION_H | ||
|
||
#include "glow/Support/Error.h" | ||
|
||
#include "llvm/ADT/APInt.h" | ||
|
||
#include <string> | ||
#include <vector> | ||
|
||
namespace glow { | ||
/// Holds mixed precision details that can be shared across model | ||
/// loaders. | ||
struct ModelLoaderPrecisionConfiguration { | ||
/// Used during operator loading, while constructing the Glow graph, to keep | ||
/// the specified operator instances in FP16 (i.e. quantization conversion | ||
/// is skipped and FP16 conversion is done for any node instance named | ||
/// here). This creates a graph where some nodes execute in quantized or | ||
/// FP32 precision and the remaining ones in FP16 precision. If the kind of | ||
/// a node specified by name is unsupported by the backend in FP16 | ||
/// precision, it will throw an exception. Node instances intended to run | ||
/// in FP16 are listed in a yaml file, which maps directly to a vector of | ||
/// strings, so parsing is fast. | ||
std::vector<std::string> fp16OpInstanceNames; | ||
}; | ||
|
||
/// Sets the model loader precision profile option to the YAML file \p fileName. | ||
void setModelLoaderPrecisionOpt(llvm::StringRef fileName); | ||
|
||
/// \returns true if a node precision info file was provided. | ||
bool modelLoaderPrecisionOptEnabled(); | ||
|
||
/// Deserializes the model loader precision info from the YAML file set via | ||
/// the node precision info option. | ||
Expected<ModelLoaderPrecisionConfiguration> | ||
deserializeModelLoaderPrecisionInfosFromYaml(); | ||
} // namespace glow | ||
|
||
#endif // GLOW_IMPORTER_MODELLOADERPRECISIONCONFIGURATION_H |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
#include "glow/Importer/ModelLoaderPrecisionConfiguration.h" | ||
#include "glow/Support/Support.h" | ||
|
||
#include "llvm/Support/CommandLine.h" | ||
#include "llvm/Support/FileSystem.h" | ||
#include "llvm/Support/YAMLParser.h" | ||
#include "llvm/Support/YAMLTraits.h" | ||
#include "llvm/Support/raw_ostream.h" | ||
|
||
namespace llvm { | ||
namespace yaml { | ||
|
||
/// Mapping for ModelLoaderPrecisionConfiguration yaml serializer. | ||
template <> struct MappingTraits<glow::ModelLoaderPrecisionConfiguration> { | ||
static void mapping(IO &io, glow::ModelLoaderPrecisionConfiguration &info) { | ||
io.mapRequired("FP16NodeInstanceNames", info.fp16OpInstanceNames); | ||
} | ||
}; | ||
|
||
} // end namespace yaml | ||
} // end namespace llvm | ||
|
||
namespace glow { | ||
|
||
llvm::cl::OptionCategory loaderPrecisionCat("ModelLoader Precision Options"); | ||
|
||
llvm::cl::opt<std::string> loadModelLoaderPrecisionFileOpt( | ||
"node-precision-info", | ||
llvm::cl::desc("Load the model loader precision file, which contains\n" | ||
"the output names of node instances to be executed in\n" | ||
"FP16. Currently supported only for ONNX models."), | ||
llvm::cl::value_desc("precision_info.yaml"), | ||
llvm::cl::cat(loaderPrecisionCat)); | ||
|
||
void setModelLoaderPrecisionOpt(llvm::StringRef fileName) { | ||
loadModelLoaderPrecisionFileOpt = fileName.str(); | ||
} | ||
|
||
bool modelLoaderPrecisionOptEnabled() { | ||
return !loadModelLoaderPrecisionFileOpt.empty(); | ||
} | ||
|
||
Expected<ModelLoaderPrecisionConfiguration> | ||
deserializeModelLoaderPrecisionInfosFromYaml() { | ||
ModelLoaderPrecisionConfiguration modelLoaderPrecsionConfig; | ||
|
||
llvm::StringRef fileName = loadModelLoaderPrecisionFileOpt; | ||
|
||
RETURN_ERR_IF_NOT(llvm::sys::fs::exists(fileName), | ||
"Could not find file with name: " + fileName.str()); | ||
|
||
// Open YAML input stream. | ||
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> text = | ||
llvm::MemoryBuffer::getFileAsStream(fileName); | ||
|
||
RETURN_ERR_IF_NOT(!text.getError(), | ||
"Unable to open file with name: " + fileName.str()); | ||
|
||
std::unique_ptr<llvm::MemoryBuffer> buffer = std::move(*text); | ||
llvm::yaml::Input yin(buffer->getBuffer()); | ||
|
||
// Error message in case of incorrect precision info format. Use str() because | ||
// StringRef::data() is not guaranteed to be null-terminated. | ||
std::string ErrMsg = | ||
strFormat("Error reading YAML file '%s'!", fileName.str().c_str()); | ||
|
||
// Read precision info. | ||
yin >> modelLoaderPrecsionConfig; | ||
RETURN_ERR_IF_NOT(!yin.error(), ErrMsg); | ||
return modelLoaderPrecsionConfig; | ||
} | ||
|
||
} // namespace glow |