Imprecise floating point operations (fast-math) #21690

Open · noctune opened this issue Jan 27, 2015 · 13 comments

noctune (Contributor) commented Jan 27, 2015

There should be a way to use imprecise floating-point operations like GCC's and Clang's -ffast-math. The simplest way would be to follow GCC and Clang and add a command-line flag, but I think a better approach would be to add f32fast and f64fast types that call the fast LLVM math functions. That way you could easily mix fast and "slow" floating-point operations.

I think this could be implemented as a library if LLVM assembly could be used in the asm macro.
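
Roughly, something like this (a sketch only, assuming the unstable core_intrinsics feature on nightly; F32Fast is just an illustrative name, not an existing type or crate):

```rust
#![feature(core_intrinsics)]
// Sketch of the "implement it as a library" idea: a newtype whose operators
// forward to the unstable fast-math intrinsics, so fast and strict code can
// be mixed freely in one function.
use std::intrinsics::{fadd_fast, fmul_fast};
use std::ops::{Add, Mul};

#[derive(Clone, Copy)]
struct F32Fast(f32);

impl Add for F32Fast {
    type Output = F32Fast;
    fn add(self, rhs: F32Fast) -> F32Fast {
        // SAFETY: the fast intrinsics may assume their operands are finite.
        F32Fast(unsafe { fadd_fast(self.0, rhs.0) })
    }
}

impl Mul for F32Fast {
    type Output = F32Fast;
    fn mul(self, rhs: F32Fast) -> F32Fast {
        F32Fast(unsafe { fmul_fast(self.0, rhs.0) })
    }
}

fn main() {
    // Accumulate with relaxed (fast-math) semantics...
    let mut acc = F32Fast(0.0);
    for &x in &[1.0f32, 2.0, 3.0] {
        acc = acc + F32Fast(x) * F32Fast(0.5);
    }
    // ...then continue with ordinary, strictly IEEE f32 arithmetic.
    let strict: f32 = acc.0 + 1.0;
    println!("{}", strict);
}
```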

@kmcallister kmcallister added the A-LLVM label Jan 28, 2015

kmcallister (Contributor) commented Jan 28, 2015

Inline IR was discussed in #15180. Another option is extern "llvm-intrinsic" { ... }, which I vaguely recall we had at some point. If we added more intrinsics to std::intrinsics, would that be sufficient?

@huonw huonw added the I-slow label Jan 28, 2015

noctune (Contributor, Author) commented Jan 28, 2015

Yeah, adding it as a function in std::intrinsics could definitely work as well.

There are a few different fast-math flags, but the fast flag is probably the most important since it implies all the others. Adding an intrinsic for every flag would be unreasonable, but I don't think all of them are necessary.

@cmr cmr self-assigned this Mar 25, 2015

bluss (Contributor) commented Aug 17, 2015

This forum thread has examples of loops that LLVM vectorizes well for integers but not for floats (a dot product).
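
The shape of the loop is roughly this (a sketch, not the exact code from the thread): the integer version vectorizes, but strict IEEE semantics forbid reassociating the f32 additions, which blocks the same transformation for floats:

```rust
// Straightforward dot product. The i32 version of this reduction vectorizes,
// but with default IEEE semantics the compiler may not reorder the f32
// additions, so the float version is usually left scalar.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn dot_i32(a: &[i32], b: &[i32]) -> i32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```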

@bluss bluss changed the title Imprecise floating point operations Imprecise floating point operations (fast-math) Dec 20, 2015

@cmr cmr removed their assignment Jan 5, 2016

kornelski (Contributor) commented Jun 8, 2017

That would be super useful for me.

I've prototyped it using a newtype: https://play.rust-lang.org/?gist=d516771d1d002f740cc9bf6eb5cacdf0&version=nightly&backtrace=0

It works in simple cases, but the newtype solution is insufficient:

  • it doesn't work with floating-point literals, which is a huge pain when converting programs to the newtype.
  • it doesn't work with the as operator, and a trait to make that possible has been rejected before (both of these points are shown in the sketch below).
  • the wrapper type and extra level of indirection affect inlining of code that uses it. I've found some large functions where the newtype was slower than a regular float, not because of the float math itself but because the surrounding structs and calls weren't optimized as well. I wasn't able to reproduce this in simple cases, so I'm not sure exactly what's going on.

So I'm very keen on seeing it supported natively in Rust.
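
To make the first two points concrete, here is a stripped-down sketch (Fast stands in for the playground wrapper, with the intrinsics omitted since only the ergonomics matter here):

```rust
use std::ops::Add;

// Stand-in wrapper; the real prototype forwards to the fast-math intrinsics.
#[derive(Clone, Copy)]
struct Fast(f32);

impl Add for Fast {
    type Output = Fast;
    fn add(self, rhs: Fast) -> Fast {
        Fast(self.0 + rhs.0)
    }
}

fn main() {
    let x = Fast(2.0);
    // let y = x + 1.0;       // doesn't compile: no literal support
    let y = x + Fast(1.0);    // every literal has to be wrapped by hand
    // let i = y as i32;      // doesn't compile: `as` only works on primitives
    let i = y.0 as i32;       // the inner value has to be extracted first
    println!("{}", i);
}
```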

bluss (Contributor) commented Jun 8, 2017

@pornel Issue #24963 had a test case where a newtype impacted vectorization. That example was fixed (great!), but it sounds like the bug is probably still visible in similar code.

pedrocr commented Jun 8, 2017

I've tried -ffast-math in my C vs Rust benchmark of some graphics code:

https://github.com/pedrocr/rustc-math-bench

In the C code it's a ~20% improvement with clang but no benefit with GCC. In both cases it returns a wrong result, and the math is extremely simple (multiplying a vector by a matrix). According to this:

https://stackoverflow.com/questions/38978951/can-ffast-math-be-safely-used-on-a-typical-project#38981307

-ffast-math is generally too unsafe for normal usage since it implies some surprising things (e.g., NaN checks always return false). So it seems sensible to have a way to opt in to only the more benign optimizations.
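
One way to keep such checks meaningful is to do the strict IEEE validation outside the fast-math code. A sketch using the unstable nightly intrinsics (scaled_sum is just an illustrative helper, not an existing API):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::{fadd_fast, fmul_fast};

// Validate with strict IEEE checks first, then do the arithmetic with relaxed
// semantics. Inside fast-math code the compiler may assume NaN/inf never
// appear (so guards like is_nan() placed there can be optimized away), and the
// Rust fast intrinsics document that they may assume their inputs are finite.
fn scaled_sum(values: &[f32], scale: f32) -> Option<f32> {
    if !scale.is_finite() || values.iter().any(|v| !v.is_finite()) {
        return None; // strict check, before any fast-math operation
    }
    let mut sum = 0.0f32;
    for &v in values {
        // SAFETY: inputs were checked to be finite above.
        sum = unsafe { fadd_fast(sum, fmul_fast(v, scale)) };
    }
    Some(sum)
}

fn main() {
    println!("{:?}", scaled_sum(&[1.0, 2.0, f32::NAN], 0.5)); // None
    println!("{:?}", scaled_sum(&[1.0, 2.0, 3.0], 0.5));      // Some(3.0)
}
```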

kornelski (Contributor) commented Jun 8, 2017

@pedrocr Your benchmark has a loss of precision in the sum regardless of fast-math mode. Both the slow and fast versions give a wrong result compared to summation with a double accumulator.

With double for the sum you'll get the correct result, even with -ffast-math.

You get a significantly different sum with a float accumulator because fast-math introduces a small systematic rounding error, which accumulates over 100 million additions.

All values from the matrix multiplication are the same to at least 6 digits (I've diffed printf("%f", out[i]) for all values and they're identical).
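
A standalone illustration of the accumulator point (not the benchmark code itself):

```rust
// Summing ~100 million small f32 values: the f32 accumulator loses precision
// on its own, fast math or not, while an f64 accumulator stays close to the
// expected total.
fn main() {
    let n = 100_000_000u32;
    let (mut sum_f32, mut sum_f64) = (0.0f32, 0.0f64);
    for _ in 0..n {
        sum_f32 += 0.1;
        sum_f64 += 0.1f32 as f64;
    }
    println!("f32 accumulator: {}", sum_f32); // stalls far below 10 million
    println!("f64 accumulator: {}", sum_f64); // close to 10 million
}
```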

pedrocr commented Jun 8, 2017

@pornel thanks, fixed here:

pedrocr/rustc-math-bench@8169fa3

The benchmark results are fine though; the sum is only used as a checksum. Here are the averages of three runs, in ms/megapixel:

Compiler                   | -O3 -march=native | -O3 -march=native -ffast-math
clang 3.8.0-2ubuntu4       | 6.91              | 5.40 (-22%)
gcc 5.4.0-6ubuntu1~16.04.4 | 5.71              | 5.85 (+2%)

So, as I mentioned before, clang/llvm gets a good benefit from -ffast-math but gcc doesn't. I'd say making sure things like is_normal() still work is very important, but at least on llvm there's a real benefit to being able to enable -ffast-math.

pedrocr commented Jun 8, 2017

I've suggested it would make sense to expose -ffast-math using the target-feature mechanisms:

https://internals.rust-lang.org/t/pre-rfc-stabilization-of-target-feature/5176/23

kornelski (Contributor) commented Jun 8, 2017

Rust has fast-math intrinsics, so the fast-math behavior could be limited to a specific type or to selected functions, without forcing the whole program into it.
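
For example, on nightly a single function can opt in via the unstable intrinsics while the rest of the program keeps strict IEEE semantics (a sketch, not a settled API):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::{fadd_fast, fmul_fast};

// Only this function gets relaxed float semantics. The relaxed flags let LLVM
// reassociate the additions, which is what enables vectorizing the reduction.
fn dot_fast(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0f32;
    for (&x, &y) in a.iter().zip(b) {
        // SAFETY: callers must keep the inputs (and the running sum) finite.
        sum = unsafe { fadd_fast(sum, fmul_fast(x, y)) };
    }
    sum
}

fn main() {
    let a = [1.0f32, 2.0, 3.0];
    let b = [4.0f32, 5.0, 6.0];
    println!("{}", dot_fast(&a, &b)); // 32
}
```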

pedrocr commented Jun 9, 2017

A usable solution for my use cases would probably be to have the vector types in the simd crate be the ones that allow opting in to -ffast-math. That way there's only one type I need to consciously convert the code to for speedups. As a general solution, though, having to swap types in normal code seems cumbersome. But maybe just doing return val as f32 when val is an f32fast type isn't that bad.

pedrocr commented Aug 10, 2017

Created a pre-RFC thread on internals to try to get a discussion going on the best way to do this:

https://internals.rust-lang.org/t/pre-rfc-whats-the-best-way-to-implement-ffast-math/5740

robsmith11 commented Jan 20, 2019

Is there a currently recommended approach to using fast-math optimizations in Rust nightly?
