[Quantization][Refactor] Make profile include only min/max #4326
Until now, the profile generated after the profiling phase included the quantization parameters scale and offset, which were computed for a given quantization configuration (schema, precision, precisionBias). This approach was extremely error prone (I ran into this problem multiple times) because, for the flow to be consistent, the same configuration had to be provided during profiling (e.g. when using the image-classifier or model-profiler) and during quantization (e.g. when using the model-compiler).
With this PR the profile includes only the profiling-specific information (min and max), while the actual quantization parameters are computed during the quantization phase. This makes the profile independent of the quantization configuration: the same profile can be reused with any schema, precision, or precisionBias without risk of a mismatch between the two phases.
The functionality of serializing the tensor quantization parameters (scale & offset) is still there and might be useful for debugging (although the scale and offset are also readable in the dumped graph).