println!() prevents optimization by capturing pointers #50519
Comments
sanxiyn added the I-slow label May 19, 2018
Aaronepower added the C-enhancement, A-codegen, and T-compiler labels Oct 2, 2018
In this reduced example, minss is generated for both cases since Rust 1.25: https://godbolt.org/z/wz8Kmk Replacing the return with println brings the problem back.
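The reduced example itself lives behind the Godbolt link, so the following is only a hypothetical sketch of the two shapes being compared: one function that returns the minimum and one that prints it. The function names, the slice parameter, and the loop form are assumptions made for illustration, not the code from the link.

```rust
// Hypothetical reconstruction: names and loop shape are assumptions,
// not the code behind the Godbolt link.

/// Variant 1: the minimum is returned, so nothing takes the address of `lowest`.
pub fn lowest_returned(values: &[f32]) -> f32 {
    let mut lowest = f32::MAX;
    for &v in values {
        if v < lowest {
            lowest = v;
        }
    }
    lowest
}

/// Variant 2: the minimum is printed instead; println! formats its
/// arguments by reference, so it takes a reference to `lowest`.
pub fn lowest_printed(values: &[f32]) {
    let mut lowest = f32::MAX;
    for &v in values {
        if v < lowest {
            lowest = v;
        }
    }
    println!("{}", lowest);
}
```

For example, lowest_returned(&[3.0, 1.0, 2.0]) evaluates to 1.0. Which instructions each variant compiles to depends on the compiler version and flags, so the generated assembly should be checked rather than assumed.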
nikic added the A-LLVM label Dec 15, 2018
Okay, the relevant difference that Taking the address prevents the conversion of lowest from an alloca into an SSA value, and that's going to inhibit lots of optimizations (including the select formation desired here). The good news is that this is probably not going to affect real code much, though I am concerned about cases where you have conditional debugging code that includes formatting. Two ways this could be fixed:
nikic changed the title from "Missed optimization: Compiler sometimes emits float compare + jump instead of MINSS/MAXSS" to "println!() prevents optimization by capturing pointers" Dec 23, 2018
@rkruppe @nagisa @eddyb Any idea what we can do here? I think it's pretty bad that It would be great if we could force a copy of the formatted value before taking the pointer, but I'm not sure how to do that on a technical level. We'd only want to do this for specific types (integers and floats), but println! is expanded long before this type information is available.
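Until something like that exists in the macro, the same copy can be made by hand at the call site. A minimal sketch, assuming a hypothetical helper named lowest_printed_via_copy: the value is copied into a fresh local and only that copy is passed to println!, so the loop variable itself is never borrowed. Whether this actually restores the desired codegen is not verified here and may vary between compiler versions.

```rust
/// Hypothetical helper (the name is an assumption for illustration):
/// finds the minimum, prints a copy of it, and returns it.
pub fn lowest_printed_via_copy(values: &[f32]) -> f32 {
    let mut lowest = f32::MAX;
    for &v in values {
        if v < lowest {
            lowest = v;
        }
    }
    // f32 is Copy, so this creates an independent value. println! borrows
    // `printed`, and `lowest` never has a reference taken to it.
    let printed = lowest;
    println!("{}", printed);
    lowest
}
```

This only works because the programmer knows the type is a cheap Copy scalar, which is exactly the information the macro lacks at expansion time.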
If changing how |
The formatting machinery has been specifically crafted to minimize the size rather than increase the speed (desired for panics), which will eventually come at some cost somewhere, which is what we are seeing here. If we can find ways to improve |
df5602 commented May 7, 2018
This weekend I ran some benchmarks on some of my code. After making a seemingly insignificant code change, I noticed a small but measurable performance regression. While investigating the generated assembly, I stumbled upon a case where the compiler emits suboptimal code.
This minimal example shows the same behaviour (Playground link):
When compiling with the --release flag, the compiler generates the following instructions for the marked block:
However, if I replace those lines with the following:
the compiler emits a strange series of float compare and jump instructions:
As a comparison, both gcc and clang can optimize a similar C++ example:
Both compilers generate minss instructions for both variants. (Godbolt)
I wasn't sure whether rustc or LLVM was responsible for this behaviour. However, after a quick glance at the generated LLVM IR, I'm tending towards rustc, since in the first case it emits fcmp and select instructions, while in the latter it generates fcmp and br.
What do you think?