New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix rounding mode check in SSE4.1 round functions #3124
Conversation
src/shims/x86/sse41.rs
Outdated
// determined by the SSE status register. We do not support that case, | ||
// since modifying it is unsupported in Miri (and Rust). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the SSE status register different from the float status register?
For the float status register, we just assume it is always in NearestTiesToEven mode... but that's the rounding mode for float-to-float operations, right? Is the integer rounding mode even controlled in a status register for regular floats?
So for the SSE status register, I assume we'd just say that it is always assumed to have a particular rounding mode. But I don't know which rounding mode e.g. LLVM will assume here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is the same register that specifies rounding mode for arithmetic operations, so I guess we can assume to be round-to-nearest (we already do that for float-to-int conversions).
It looks like LLVM does not const-fold it, even when the rounding mode does not depend on the status register.
https://godbolt.org/z/zdh3McGcb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, don't the normal float-to-int casts do a round-to-zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SSE has two version: one that rounds to zero (_mm_cvttss_si32
) and one that uses the rounding mode of the status register (_mm_cvtss_si32
). We assume round-to-nearest for the later.
Lines 168 to 176 in c8d4e83
let rnd = match unprefixed_name { | |
// "current SSE rounding mode", assume nearest | |
// https://www.felixcloutier.com/x86/cvtss2si | |
"cvtss2si" | "cvtss2si64" => rustc_apfloat::Round::NearestTiesToEven, | |
// always truncate | |
// https://www.felixcloutier.com/x86/cvttss2si | |
"cvttss2si" | "cvttss2si64" => rustc_apfloat::Round::TowardZero, | |
_ => unreachable!(), | |
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SSE has two version: one that rounds to zero (_mm_cvttss_si32) and one that uses the rounding mode of the status register (_mm_cvtss_si32). We assume round-to-nearest for the later.
Ah okay, let's also do that here then.
But independently of that -- f32 as i32
casts also always round to zero, so I supposed they also ignore the status register? IOW, that status register really only affects SSE operations, not scalar float operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Rust's built-in cast will use the same instruction as _mm_cvttss_si32
.
075e582
to
3884d15
Compare
src/shims/x86/sse41.rs
Outdated
0b011 => rustc_apfloat::Round::TowardZero, | ||
// When the third bit is 1, the rounding mode is determined by the | ||
// SSE status register. Since we do not support modifying it from | ||
// Miri (or Rust), we assume to be at its default mode (round-to-nearest). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Miri (or Rust), we assume to be at its default mode (round-to-nearest). | |
// Miri (or Rust), we assume it to be at its default mode (round-to-nearest). |
r=me with the last typo fixed. |
✌️ @eduardosm, you can now approve this pull request! If @RalfJung told you to " |
Now it masks out the correct bit and adds some explanatory comments. Also extends the tests.
3884d15
to
2a88ae4
Compare
☀️ Test successful - checks-actions |
Now it masks out the correct bit and adds some explanatory comments. Also extends the tests.