Avoid loss of precision from using reciprocal #55310
Adding @penpornk
Thank you for the PR!
@@ -88,7 +88,7 @@ struct FakeQuantWithMinMaxArgsFunctor {
     Nudge(min, max, quant_min, quant_max, &nudged_min, &nudged_max,
           &nudged_scale);

-    const float inv_nudged_scale = 1.0f / nudged_scale;
+    const float inv_nudged_scale = (quant_max - quant_min) / (max - min);
Computing this here risks falling out of sync with nudged_scale if its computation is updated in the future. Please modify the Nudge function to also return inv_nudged_scale instead. Please also add a comment there that inv_nudged_scale is computed separately to preserve precision.
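To illustrate the request above, here is a minimal standalone sketch of a Nudge-style helper that returns the scale and its inverse together. It is not the TensorFlow source: the names `NudgeResult` and `NudgeSketch` are hypothetical, and the zero-point clamping and rounding are an assumption about the usual fake-quant nudging logic.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for this sketch; the real Nudge function uses
// output pointers instead.
struct NudgeResult {
  float nudged_min;
  float nudged_max;
  float scale;
  float inv_scale;
};

// Hypothetical standalone version of a Nudge-style helper.
inline NudgeResult NudgeSketch(float min, float max, int quant_min,
                               int quant_max) {
  assert(max > min && quant_max > quant_min);
  const float qmin = static_cast<float>(quant_min);
  const float qmax = static_cast<float>(quant_max);

  NudgeResult r;
  r.scale = (max - min) / (qmax - qmin);
  // Computed separately from the inputs, rather than as 1.0f / r.scale,
  // so that the rounding error already present in r.scale is not carried
  // into the inverse.
  r.inv_scale = (qmax - qmin) / (max - min);

  // Assumed nudging step: clamp the zero point to the quantized range and
  // round it to the nearest integer.
  const float zero_point_from_min = qmin - min / r.scale;
  float nudged_zero_point;
  if (zero_point_from_min < qmin) {
    nudged_zero_point = qmin;
  } else if (zero_point_from_min > qmax) {
    nudged_zero_point = qmax;
  } else {
    nudged_zero_point = std::round(zero_point_from_min);
  }

  r.nudged_min = (qmin - nudged_zero_point) * r.scale;
  r.nudged_max = (qmax - nudged_zero_point) * r.scale;
  return r;
}
```

Keeping both formulas side by side in one function means a future change to how the scale is derived has to touch the inverse in the same place.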
Force-pushed from 0d7dd53 to a80736f.
@penpornk Updated as requested.
Thank you for the changes!
PiperOrigin-RevId: 437683252
@penpornk What happened here? Why the rollback?
@elfringham It broke some internal tests which use fixed golden values. I'll do a more thorough test before bringing this PR back. You don't need to do anything.
Imported from GitHub PR #55310

Taking the reciprocal of the calculated value results in a loss of precision. This causes the unit test prepare-tf.mlir.test to fail on AARCH64. So instead of taking the reciprocal of the calculated nudged_scale to get the inv_nudged_scale, calculate this value from the input values.

Copybara import of the project:

--
a80736f by Andrew Goodbody <andrew.goodbody@linaro.org>:

Avoid loss of precision from using reciprocal

PiperOrigin-RevId: 440188576
Taking the reciprocal of the calculated value results in a loss of precision. This causes the unit test prepare-tf.mlir.test to fail on AARCH64. So instead of taking the reciprocal of the calculated nudged_scale to get the inv_nudged_scale, calculate this value from the input values.
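The effect described above can be reproduced in isolation. The following is a small sketch, not code from the PR: the helper names `InvScaleViaReciprocal` and `InvScaleDirect` are hypothetical, and it assumes IEEE-754 single-precision arithmetic with no fast-math style optimizations.

```cpp
#include <cassert>

// Old approach: compute the scale, round it to float, then take the
// reciprocal. The result is rounded twice.
inline float InvScaleViaReciprocal(float min, float max, int quant_min,
                                   int quant_max) {
  assert(max > min && quant_max > quant_min);
  const float scale = (max - min) / static_cast<float>(quant_max - quant_min);
  return 1.0f / scale;
}

// New approach: compute the inverse directly from the inputs, so the result
// is rounded only once.
inline float InvScaleDirect(float min, float max, int quant_min,
                            int quant_max) {
  assert(max > min && quant_max > quant_min);
  return static_cast<float>(quant_max - quant_min) / (max - min);
}
```

For the range [-1, 1] with quantized values 0 to 255, the direct form yields exactly 127.5, because 255 / 2 is representable in float. The scale 2 / 255 is not representable, and the reciprocal of its rounded value lands one float step below 127.5. A difference of that size can be enough to change a rounded quantized value and so break tests that compare against fixed expected output.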