More numerically stable lerp #18871
Conversation
The method proposed in https://math.stackexchange.com/a/1798323 could be better.
@ssnl I rewrote it based on the proposal you shared. However, I'm worried that this may not perform well. If t is fixed, there's no warp divergence; but if t depends on the data, as it does in, say, bilinear upsampling, the branch may cause warp divergence and therefore hurt performance. The counter-argument is that this would stall for only a few instructions, in a problem that's bandwidth-bound anyway.
@mkolod Maybe do a quick benchmark, if you're concerned? :)
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ezyang It's fine. Looks like one CI run out of 76 is having issues, but it seems to have nothing to do with the commit (it's a multiprocessing/gloo issue). I hope that's not a blocker for the merge, since it's not the lerp test that's affected, and I didn't change anything else in the code.
Summary: The C++ and CUDA implementations of the lerp are not numerically stable. This is discussed on Wikipedia [here](https://en.wikipedia.org/wiki/Linear_interpolation#Programming_language_support). I checked the GPU SASS output and there's no overhead from using the more precise implementation, from Kepler all the way to Turing. I haven't looked at CPU ASM though. Pull Request resolved: pytorch/pytorch#18871 Differential Revision: D14793438 Pulled By: ezyang fbshipit-source-id: 2ddc2e026c5285466cae7d1b4101174253100445