floating point to integer casts can cause undefined behaviour #10184

Open
thestinger opened this Issue Oct 31, 2013 · 137 comments

@thestinger
Contributor

thestinger commented Oct 31, 2013

Current status (2018-11-05)

A flag has been implemented in the compiler, -Zsaturating-float-casts, which causes all float-to-integer casts to have "saturating" behavior: a value that is out of bounds is clamped to the nearest bound. A call for benchmarking of this change went out a while ago. Results, while positive for many projects, are quite negative for some and indicate that we're not done here.
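
A minimal sketch of the clamping semantics described above, written by hand for a single target type. The helper name `saturating_f64_to_u8` is hypothetical, and mapping NaN to 0 is an assumption of this sketch rather than something stated in the status text.

```rust
/// Hypothetical helper: converts an f64 to u8 with "saturating" semantics.
/// Out-of-range inputs clamp to the nearest bound. Mapping NaN to 0 is an
/// assumption of this sketch.
pub fn saturating_f64_to_u8(x: f64) -> u8 {
    if x.is_nan() {
        0
    } else if x <= u8::MIN as f64 {
        u8::MIN
    } else if x >= u8::MAX as f64 {
        u8::MAX
    } else {
        // x is strictly between 0.0 and 255.0 here, so the truncating cast
        // is always in range.
        x as u8
    }
}
```

With this helper, the value from the original report (1.04E+17) converts to 255 instead of an undefined result.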

The next steps are figuring out how to recover performance for these cases:

  • One option is to take today's `as` cast behavior (which is UB in some cases) and expose it through unsafe functions on the relevant types.
  • Another is to wait for LLVM to add a freeze concept, which would give us a garbage bit pattern on overflow, but at least not UB.
  • Another is to implement casts via inline assembly in LLVM IR, as the current codegen is not heavily optimized.

Old status

UPDATE (by @nikomatsakis): After much discussion, we've got the rudiments of a plan for how to address this problem. But we need some help with actually investigating the performance impact and working out the final details!


ORIGINAL ISSUE FOLLOWS:

If the value cannot fit in ty2, the results are undefined.

1.04E+17 as u8
@brson

Contributor

brson commented Oct 31, 2013

Nominating

@pnkfelix

Member

pnkfelix commented Nov 7, 2013

accepted for P-high, same reasoning as #10183

@pcwalton

Contributor

pcwalton commented Nov 22, 2013

I don't think this is backwards incompatible at a language level. It will not cause code that was working OK to stop working. Nominating.

@pnkfelix

Member

pnkfelix commented Dec 19, 2013

changing to P-high, same reasoning as #10183

@nrc

Member

nrc commented Sep 12, 2014

How do we propose to solve this and #10185? Since whether behaviour is defined or not depends on the dynamic value of the number being cast, it seems the only solution is to insert dynamic checks. We seem to agree we do not want to do that for arithmetic overflow, are we happy to do it for cast overflow?
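
A minimal sketch of what such a dynamic check could look like for one pair of types. The helper name `checked_f64_to_i32` and the choice to return an `Option` are assumptions made for illustration, not a proposal from this thread.

```rust
/// Hypothetical helper: returns None instead of performing an out-of-range
/// cast when the truncated value does not fit in i32.
pub fn checked_f64_to_i32(x: f64) -> Option<i32> {
    // Truncate toward zero first, as the cast itself would.
    let t = x.trunc();
    // Both i32 bounds are exactly representable in f64. NaN fails both
    // comparisons, and each infinity fails one of them.
    if t >= i32::MIN as f64 && t <= i32::MAX as f64 {
        Some(t as i32)
    } else {
        None
    }
}
```

The cost is two floating-point comparisons and a branch per cast, which is the overhead the question above is weighing.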

@pcwalton

Contributor

pcwalton commented Sep 12, 2014

We could add an intrinsic to LLVM that performs a "safe conversion". @zwarich may have other ideas.

@zwarich

zwarich commented Sep 12, 2014

AFAIK the only solution at the moment is to use the target-specific intrinsics. That's what JavaScriptCore does, at least according to someone I asked.

@pcwalton

Contributor

pcwalton commented Sep 12, 2014

Oh, that's easy enough then.

@nrc

Member

nrc commented Apr 23, 2015

ping @pnkfelix is this covered by the new overflow checking stuff?

@bluss

Contributor

bluss commented May 7, 2015

These casts are not checked by rustc with debug assertions.

@Aatch

Contributor

Aatch commented Sep 13, 2015

I'm happy to handle this, but I need a concrete solution. I personally think that it should be checked along with overflowing integer arithmetic, as it's a very similar issue. I don't really mind what we do though.

Note that this issue is currently causing an ICE when used in certain constant expressions.

@bluss

Contributor

bluss commented Sep 13, 2015

This allows violating memory safety in safe rust, example from this forum post:

Undefs, huh? Undefs are fun. They tend to propagate. After a few minutes of wrangling..

#[inline(never)]
pub fn f(ary: &[u8; 5]) -> &[u8] {
    let idx = 1e100f64 as usize;
    &ary[idx..]
}

fn main() {
    println!("{}", f(&[1; 5])[0xdeadbeef]);
}

segfaults on my system (latest nightly) with -O.
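
For contrast, a sketch of the same indexing once the cast has a defined result. It assumes saturating semantics for `as` (as far as I know, the default in Rust 1.45 and later), under which the index becomes usize::MAX and an ordinary bounds check rejects it. The function name `f_checked` is hypothetical, and this does nothing to help a compiler where the cast itself is still undefined.

```rust
/// Hypothetical variant of `f` above: the index comes from the same cast, but
/// the slice is taken with `get`, so an out-of-range index yields None.
pub fn f_checked(ary: &[u8; 5]) -> Option<&[u8]> {
    // Assumption: the cast produces a defined value (usize::MAX under
    // saturating semantics) rather than an undefined one.
    let idx = 1e100f64 as usize;
    ary.get(idx..)
}
```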

@steveklabnik

Member

steveklabnik commented Oct 8, 2015

Marking with I-unsound given the violation of memory safety in safe rust.

@steveklabnik

Member

steveklabnik commented Oct 29, 2015

@bluss, this does not segfault for me, it just gives an assertion error. Untagging, since I was the one who added it.

@steveklabnik

Member

steveklabnik commented Oct 29, 2015

Sigh, I forgot the -O, re-tagging.

@nagisa

Contributor

nagisa commented Feb 21, 2016

re-nominating for P-high. Apparently this was at some point P-high but got lower over time. This seems pretty important for correctness.

EDIT: didn’t react to triage comment, adding label manually.

@nikomatsakis

Contributor

nikomatsakis commented Feb 25, 2016

It seems like the precedent from the overflow stuff (e.g. for shifting) is to just settle on some behavior. Java seems to produce the result modulo the range, which seems not unreasonable; I'm not sure just what kind of LLVM code we'd need to handle that.
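
A minimal sketch of "the result modulo the range" for an 8-bit target, done in floating point before the final cast. The helper name `wrapping_f64_to_u8` is hypothetical, and mapping NaN and infinities to 0 is an assumption of this sketch. To my understanding, Java's own float-to-int and float-to-long conversions clamp, and wrap-around only appears when the resulting int is narrowed further to byte or short, so this shows the modulo idea itself rather than Java's exact rules.

```rust
/// Hypothetical helper: reduces the truncated value modulo 2^8 instead of
/// clamping it. Mapping NaN and infinities to 0 is an assumption of this
/// sketch.
pub fn wrapping_f64_to_u8(x: f64) -> u8 {
    if !x.is_finite() {
        return 0;
    }
    // After truncation the value is an integer, so rem_euclid(256.0) yields
    // an integer in [0, 256) and the final cast is always in range.
    x.trunc().rem_euclid(256.0) as u8
}
```

Under this rule 257.0 becomes 1 and -1.0 becomes 255, whereas a saturating cast would give 255 and 0 respectively.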

vks added a commit to vks/rand that referenced this issue Apr 23, 2018

Use generated floats in Bernoulli::sample
This is slower, but uses 52 random bits instead of 32 bits. Using the
previous approach with 64 bits is not feasible due to a [Rust bug].

[Rust bug]: rust-lang/rust#10184

@kennytm

Member

kennytm commented May 19, 2018

Minor update, as of rustc 1.28.0-nightly (952f344cd 2018-05-18), the -Zsaturating-float-casts flag still causes the code in #10184 (comment) to be ~20% slower on x86_64. Which means LLVM 6 hasn't changed anything.

| Flags | Timing |
|---|---|
| -Copt-level=3 -Ctarget-cpu=native | 325,699 ns/iter (+/- 7,607) |
| -Copt-level=3 -Ctarget-cpu=native -Zsaturating-float-casts | 386,962 ns/iter (+/- 11,601) (19% slower) |
| -Copt-level=3 | 331,521 ns/iter (+/- 14,096) |
| -Copt-level=3 -Zsaturating-float-casts | 413,572 ns/iter (+/- 19,183) (25% slower) |

@bstrie

Contributor

bstrie commented Jun 21, 2018

@kennytm Did we expect LLVM 6 to change something? Are they discussing a particular enhancement that would benefit this use case? If so, what's the ticket number?

@shepmaster

Member

shepmaster commented Jul 20, 2018

@insanitybit It... appears to still be open...?

@insanitybit

insanitybit commented Jul 20, 2018

Welp, no clue what I was looking at. Thanks!

@nagisa

Contributor

nagisa commented Jul 20, 2018

@SimonSapin

Contributor

SimonSapin commented Jul 20, 2018

@nagisa Maybe you’re thinking of f32::from_bits(v: u32) -> f32 (and similarly f64)? It used to do some normalization of NaNs but now is just transmute.

This issue is about `as` conversions, which try to approximate the numerical value.
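
A short sketch of the distinction: `to_bits` and `from_bits` reinterpret the bit pattern, while `as` converts the numerical value. The wrapper names `reinterpret` and `convert` are hypothetical and exist only to put the two side by side.

```rust
/// Reinterprets the IEEE 754 bit pattern. No numeric conversion takes place,
/// so every input has a well-defined result.
pub fn reinterpret(x: f32) -> u32 {
    x.to_bits()
}

/// Converts the numerical value, truncating toward zero. This is the kind of
/// cast the issue is about.
pub fn convert(x: f32) -> u32 {
    x as u32
}
```

For 1.0f32, `reinterpret` returns 0x3f800000 while `convert` returns 1.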

@rkruppe

Contributor

rkruppe commented Jul 20, 2018

@nagisa You might be thinking of float->float casts, see #15536 and rust-lang-nursery/nomicon#65.

@nagisa

Contributor

nagisa commented Jul 20, 2018

@bstrie

Contributor

bstrie commented Sep 19, 2018

LLVM 7 release notes mention something:

Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on the undefined behavior of overflowing casts. The optimization can be disabled by specifying a function attribute: "strict-float-cast-overflow"="false". This attribute may be created by the clang option -fno-strict-float-cast-overflow. Code sanitizers can be used to detect affected patterns. The clang option for detecting this problem alone is -fsanitize=float-cast-overflow:

Does that have any bearing on this issue?

@retep998

Member

retep998 commented Sep 19, 2018

We shouldn't care what LLVM does for overflowing casts, as long as it isn't unsafe undefined behavior. The result can be garbage as long as it can't cause unsound behavior.

@rkruppe

Contributor

rkruppe commented Sep 19, 2018

Does that have any bearing on this issue?

Not really. The UB did not change; LLVM just got even more aggressive about exploiting it, which makes it easier to be affected by it in practice, but the soundness issue is unchanged. In particular, the new attribute does not remove the UB or affect any optimizations that existed before LLVM 7.

@alexcrichton

Member

alexcrichton commented Oct 14, 2018

@rkruppe out of curiosity, has this sort of fallen by the wayside? It seems like https://internals.rust-lang.org/t/help-us-benchmark-saturating-float-casts/6231/14 went well enough and the implementation hasn't had too many bugs. It seems that a slight performance regression was always expected, but to compile correctly seems like a worthwhile tradeoff.

Is this just waiting to be pushed across the finish line? Or are there other known blockers?

@rkruppe

Contributor

rkruppe commented Oct 14, 2018

Mostly I've been distracted / busy with other things, but a x0.82 regression in RGB JPEG encoding seems more than "slight", a rather bitter pill to swallow (although it's reassuring that other kinds of workload don't seem affected). It's not severe enough that I would object to turning saturation on by default, but enough that I'm hesitant to push for it myself before we've tried the "also provide a conversion function that's faster than saturation but may generate (safe) garbage" option discussed before. I haven't gotten to that, and apparently nobody else has either, so this has fallen by the wayside.

@alexcrichton

Member

alexcrichton commented Oct 17, 2018

Ok cool thanks for the update @rkruppe! I'm curious though if there's actually an implementation of the safe garbage option? I could imagine us easily providing something like unsafe fn i32::unchecked_from_f32(...) and such, but it sounds like you're thinking that should be a safe function. Is that possible with LLVM today?
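
A sketch of what an unsafe conversion of that shape might look like. The name `unchecked_i32_from_f32` is hypothetical, modelled on the signature floated above. The body relies on `f32::to_int_unchecked`, which, as far as I know, is the standard-library method that later shipped with this kind of contract, so it postdates this comment.

```rust
/// Hypothetical name, modelled on the `unchecked_from_f32` idea above.
///
/// # Safety
/// `x` must be finite and, after truncation toward zero, representable as
/// an i32. Violating this is undefined behaviour.
pub unsafe fn unchecked_i32_from_f32(x: f32) -> i32 {
    // Catches NaN and infinities in debug builds only. It does not make the
    // function safe.
    debug_assert!(x.is_finite());
    // The caller has promised that the value is in range.
    unsafe { x.to_int_unchecked::<i32>() }
}
```

Callers keep the fast path but take on the proof obligation, for example `unsafe { unchecked_i32_from_f32(42.9) }` after validating the input range themselves.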

@rkruppe

Contributor

rkruppe commented Oct 17, 2018

There's no freeze yet, but it is possible to use inline assembly to access the target architecture's instruction for converting floats to integers (with a fallback to e.g. a saturating `as`). While this can inhibit some optimizations, it may be good enough to mostly fix the regression in some benchmarks.

An unsafe function that keeps the UB that this issue is about (and is codegen'd the same way `as` is today) is another option, but a much less attractive one; I'd prefer a safe function if it can get the job done.

@sunfishcode

Contributor

sunfishcode commented Oct 17, 2018

There's also significant room for improvement in the safe saturating float-to-int sequence. LLVM today doesn't have anything specifically for this, but if inline asm solutions are on the table, it wouldn't be difficult to do something like this:

     cvttsd2si %xmm0, %eax   # x86's cvttsd2si returns 0x80000000 on overflow and invalid cases
     cmp $1, %eax            # a compact way to test whether %eax is equal to 0x80000000
     jno ok
     ...  # slow path: check for and handle overflow and invalid cases
ok:

which should be significantly faster than what rustc currently does.
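
The same fast-path/slow-path idea can be sketched in Rust. This version swaps the inline assembly for the SSE2 intrinsic `_mm_cvttsd_si32` (which maps to cvttsd2si) so that the branch stays visible to the optimizer. The function names are hypothetical, mapping NaN to 0 is an assumption, and the non-x86_64 fallback simply uses the slow path.

```rust
/// Hypothetical helper: saturating f64 -> i32 with a fast path on x86_64.
#[cfg(target_arch = "x86_64")]
pub fn saturating_f64_to_i32(x: f64) -> i32 {
    use std::arch::x86_64::{_mm_cvttsd_si32, _mm_set_sd};
    // cvttsd2si yields 0x80000000 (i32::MIN) on overflow and for NaN.
    // SSE2 is part of the x86_64 baseline, so the intrinsics are available.
    let fast = unsafe { _mm_cvttsd_si32(_mm_set_sd(x)) };
    if fast != i32::MIN {
        return fast;
    }
    // Either the input really truncates to i32::MIN, or it was out of range.
    slow_path(x)
}

/// Fallback for other targets: always take the slow path.
#[cfg(not(target_arch = "x86_64"))]
pub fn saturating_f64_to_i32(x: f64) -> i32 {
    slow_path(x)
}

/// Slow path: handle NaN and clamp out-of-range values explicitly.
fn slow_path(x: f64) -> i32 {
    if x.is_nan() {
        0 // assumption of this sketch
    } else if x >= i32::MAX as f64 {
        i32::MAX
    } else if x <= i32::MIN as f64 {
        i32::MIN
    } else {
        x as i32
    }
}
```

In-range inputs other than those truncating to i32::MIN pay for one conversion and one compare, and everything else falls through to the explicit checks.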

@alexcrichton

Member

alexcrichton commented Oct 17, 2018

Ok I just wanted to be sure to clarify, thanks! I figured that inline asm solutions aren't workable as defaults as they'd inhibit other optimizations too much, but I haven't tried it out myself. I'd personally prefer that we close this unsound hole by defining some reasonable behavior (like exactly today's saturating casts). If necessary we can always preserve today's fast/unsound implementation as an unsafe function, and in the limit of time given infinite resources we can even drastically improve the default and/or add other specialized conversion functions (like a safe conversion where out of bounds isn't UB but just a garbage bit pattern).

Would others be opposed to such a strategy? Do we think that this isn't important enough to fix in the meantime?

@rkruppe

Contributor

rkruppe commented Oct 17, 2018

I think inline assembly should be tolerable for cvttsd2si (or similar instructions) specifically because that inline asm would not access memory or have side effects, so it's just an opaque black box that can be removed if unused and doesn't inhibit optimizations around it very much, LLVM just can't reason about the internals and result value of the inline asm. That last bit is why I would be skeptical about e.g. using inline asm for the code sequence @sunfishcode suggests for saturation: the checks introduced for saturation can occasionally be removed today if they're redundant, but the branches in an inline asm block can't be simplified.

Would others be opposed to such a strategy? Do we think that this isn't important enough to fix in the meantime?

I do not object to flipping on saturating now and possibly adding alternatives later, I just don't want to be the one who has to drum up the consensus for it and justify it to users whose code got slower 😅

@nikic

Contributor

nikic commented Nov 28, 2018

I've started some work to implement intrinsics for saturating float to int casts in LLVM: https://reviews.llvm.org/D54749

If that goes anywhere, it will provide a relatively low-overhead way of getting the saturating semantics.