Document and update i686 triples #31632
ranma42 added some commits Feb 13, 2016
rust-highfive assigned alexcrichton Feb 13, 2016
(rust_highfive has picked a reviewer for you, use r? to override)
ranma42 referenced this pull request Feb 13, 2016
Closed: Do not use SIMD instructions on i686 #31110
ranma42
commented on src/librustc_back/target/i686_apple_darwin.rs in 8c840ee
Feb 13, 2016
I wonder if this is the "right" change, or if this actually was meant to be the
MagaTailor
commented
Feb 13, 2016
@dhuseby Either way, to avoid any potential user disappointment, it is worth noting that the current BSD stage0 snapshot was probably built with the old codegen settings and won't run on the newly supported "i686" systems.
Manishearth added a commit to Manishearth/rust that referenced this pull request Feb 14, 2016
bors added a commit that referenced this pull request Feb 14, 2016
I reproduced locally the failure in #31646. This comparison from

    if let Ok(mut x) = "3.1415".parse::<f64>() {
        assert_eq!(8.1415, { x += 5.0; x });
    }

fails because the

    1e17: dd 44 24 5c          fldl    0x5c(%esp)      ; load the result of parse
    1e1b: d8 86 a7 20 00 00    fadds   0x20a7(%esi)    ; add 5.0 from memory
    1e21: dd 54 24 10          fstl    0x10(%esp)      ; store the result
    1e25: 8d 44 24 10          lea     0x10(%esp),%eax
    1e29: 89 44 24 4c          mov     %eax,0x4c(%esp)
    1e2d: 8d 86 cf 20 00 00    lea     0x20cf(%esi),%eax
    1e33: 89 44 24 50          mov     %eax,0x50(%esp)
    1e37: dd 86 9f 20 00 00    fldl    0x209f(%esi)    ; load 8.1415
    1e3d: d9 c9                fxch    %st(1)          ; exchange the two top elements of the FP stack (why???)
    1e3f: da e9                fucompp                 ; compare the two top elements and pop both

It is possible to fix this in several ways:

Since it does not look like the purpose of the test was to check for the rounding behaviour, I think that the test should be fixed/made more robust.
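One way the test could be made more robust, sketched under the assumption that it only cares about parsing and not about exact rounding, is to compare within a small tolerance instead of using exact equality. The helper names `approx_eq` and `parse_and_add_five` and the tolerance value are hypothetical and do not come from the test suite.

```rust
/// Returns true when `a` and `b` differ by at most `tol` (absolute tolerance).
/// Hypothetical helper, not part of the Rust test suite.
fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() <= tol
}

/// Parses `s` as an f64 and adds 5.0, mirroring the failing check.
/// Returns None when the input is not a valid float literal.
fn parse_and_add_five(s: &str) -> Option<f64> {
    s.parse::<f64>().ok().map(|x| x + 5.0)
}
```

A check such as `approx_eq(parse_and_add_five("3.1415").unwrap(), 8.1415, 1e-12)` no longer depends on whether the intermediate sum was kept in an x87 register at extended precision or rounded to `f64` first.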
This shouldn't be the only test that fails. Somehow you need to deal with this if you want a non-SSE2 target. In fact, the discussion of this problem prompted some platforms being changed to a pentium4 base CPU (though I don't know if 32 bit Darwin was among them or if it was already "yonah"). You're gonna have to deal with this somehow. The simplest possibility would be to disable the fast path on 32 bit Darwin, though this is a significant performance regression (at least an order of magnitude or two IIRC, there are float parsing benchmarks in
bors added a commit that referenced this pull request Feb 14, 2016
nagisa reviewed Feb 14, 2016
@@ -12,7 +12,9 @@ use target::Target;

 pub fn target() -> Target {
     let mut base = super::apple_base::opts();
-    base.cpu = "yonah".to_string();
+    // Use i686 as default CPU. Clang uses the same default.
+    base.cpu = "i686".to_string();
nagisa
Feb 14, 2016
Contributor
According to the commit message that made this a yonah:

> Use more specific target CPUs on Darwin
>
> Macs don't come with anything older than a Yonah (32bit) or Core2 (64bit), so we can default to those targets. Clang does the same.

I’m not sure why clang would change their default, but macs not existing with pre-yonah hardware seems like a pretty good reason to just use a yonah.
cc @dotdash
ranma42
Feb 14, 2016
Author
Contributor
See my comment above: Clang defaults to yonah on i386-apple-darwin, but not on i686-apple-darwin.
nagisa
Feb 14, 2016
Contributor
@ranma42 but if there can’t possibly be such a combination of darwin+x86 which uses anything pre-yonah, why bother (EDIT: or, rather, restrict ourselves to) targeting a decade-older CPU?
dotdash
Feb 14, 2016
Contributor
@ranma42 Hm, where does clang make that distinction? Did you check the code or is there a command I could use to reproduce/check this (without owning a Mac that is ;-))
dotdash
Feb 14, 2016
Contributor
So if I'm reading the code correctly, there's some special handling for Darwin that disables the automatic CPU selection for any x86 target except for the i386 one. I wonder whether that's actually intentional.
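The behaviour described in this thread can be summarised as a small lookup. This is only a toy model of what the comments and the diff above claim (yonah for i386-apple-darwin, i686 for i686-apple-darwin); the function name `default_cpu_for_darwin` is hypothetical and the code is not taken from Clang or rustc.

```rust
/// Toy model of the Darwin x86 default CPUs as described in this thread.
/// Not Clang's actual selection logic; the name and shape are hypothetical.
fn default_cpu_for_darwin(arch: &str) -> Option<&'static str> {
    match arch {
        // Per the discussion: the i386 triple gets the automatic "yonah" default.
        "i386" => Some("yonah"),
        // Per the diff comment: the i686 triple keeps the generic "i686" CPU.
        "i686" => Some("i686"),
        // Other architectures are outside the scope of this sketch.
        _ => None,
    }
}
```

Writing it out this way makes the oddity visible: the triple with the older-sounding architecture name ends up with the newer CPU.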
ranma42
Feb 14, 2016
Author
Contributor
@nagisa I do not know why Clang restricts itself to a decade-older CPU; we are about to do it in order to be consistent with Clang. I agree with you that it is surprising to have a sub-optimal default on Mac and that is the reason why I was suggesting to provide an i386-apple-darwin triple as default target for 32-bit Mac.
alexcrichton
Feb 16, 2016
Member
Clang in this regard seems... somewhat inconsistent? At the very least it seems fine to leave this as-is and perhaps document the oddity (to allow this PR to land)
@rkruppe I confirm that
I'm of two minds regarding fiddling with the FPU control word. On the one hand, it's nice to not leave the performance on the table. On the other hand, it's a very low level trick that has clear disadvantages especially when the target does have SSE2 (among other things, it's slightly slower, might inhibit compiler optimizations, and might break if optimizations get better). On the gripping hand, if we had a way to guarantee that this code path is only taken on targets without SSE2, even as target specs evolve, then I'd feel a lot better about it.
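A minimal sketch of how such a guarantee might be expressed, assuming the `cfg(target_feature = "sse2")` predicate available in current stable Rust (it reflects the features enabled when the crate itself is compiled). The names `ParseStrategy`, `is_x86_without_sse2` and `choose_strategy` are hypothetical and this is not how libcore is written.

```rust
/// Which float-parsing strategy to use (hypothetical names).
#[derive(Debug, PartialEq, Clone, Copy)]
enum ParseStrategy {
    /// Rely on a single f64 multiply or divide being correctly rounded.
    FastPath,
    /// Avoid the fast path, since x87 arithmetic may round results twice.
    SlowPath,
}

/// True when this crate is compiled for 32-bit x86 without SSE2,
/// i.e. when f64 arithmetic is expected to go through the x87 FPU.
fn is_x86_without_sse2() -> bool {
    cfg!(all(target_arch = "x86", not(target_feature = "sse2")))
}

/// Picks the strategy at compile time; the unused branch is trivially dead code.
fn choose_strategy() -> ParseStrategy {
    if is_x86_without_sse2() {
        ParseStrategy::SlowPath
    } else {
        ParseStrategy::FastPath
    }
}
```

Because the decision is keyed on the enabled target features rather than on a target name, it would keep tracking the target spec if the base CPU of a triple changed later.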
It might be possible to only touch the FPU control word on
Wait,
I should mention that LLVM does a similar fiddling with the control word when casting floating point types to integer in order to ensure truncation (see the lines after 22872 in
Aw,
It does not work yet ;)
bors added a commit that referenced this pull request Feb 15, 2016
bors added a commit that referenced this pull request Feb 15, 2016
Note that, IMHO, claiming to follow “what Clang does” is a really brittle way forward. We should…
I think it would be convenient if the meaning of
Internals forum would be a good place.
Closing due to inactivity, but feel free to resubmit with the tests fixed!
ranma42 commented Feb 13, 2016
They now (should) match the behaviour of Clang and there is a brief comment in each documenting this.
As per discussion in #31110