Adding --emit=asm speeds up generated code #57235
Comments
AFAIK this forces rustc to use a single codegen unit. This generally makes code run faster because every function is available for inlining, although I don't see what might cause such a drastic difference. You can try to reproduce by using
Indeed, -Ccodegen-units=1 fixes the problem. It's pretty surprising/dangerous that --emit=asm changes the generated code. Why is a single codegen unit forced with --emit=asm?
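To check the effect of the codegen-unit count in isolation, something like the following sketch may help. It is a hypothetical stand-in workload, not the code from the original report: the module `helper`, the functions `mix` and `checksum`, and the file name `example.rs` are all made up for illustration. The helper lives in its own module on the assumption that this makes it more likely to end up in a different codegen unit than the loop that calls it; whether a measurable difference shows up depends on the rustc version and the workload.

```rust
// Hypothetical workload for comparing builds; not the code from the report.
// Build variants to compare (example.rs is an assumed file name):
//   rustc -O example.rs
//   rustc -O -C codegen-units=1 example.rs
//   rustc -O --emit=asm,link example.rs
use std::time::Instant;

mod helper {
    /// Small arithmetic step that is cheap only if it gets inlined
    /// into the calling loop.
    pub fn mix(x: u64) -> u64 {
        x.wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407)
    }
}

/// Hot loop that calls the helper once per iteration.
fn checksum(n: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..n {
        acc ^= helper::mix(acc.wrapping_add(i));
    }
    acc
}

fn main() {
    let start = Instant::now();
    let result = checksum(10_000_000);
    // Printing the result keeps the loop from being discarded as dead code.
    println!("checksum = {}, elapsed = {:?}", result, start.elapsed());
}
```

If the timings for the first two builds differ while the second and third match, that would be consistent with the single codegen unit being the cause rather than the asm output itself.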
This was done in #30208. Multiple codegen units would result in multiple compilation outputs, which is generally not expected when using
EDIT: Also see #30063, which is now obsolete since the build system changed, but the discussion there is still relevant.
@jrmuizel I think the answer is that nobody has implemented the necessary handling for that. If there are multiple codegen units and we want to produce a single artifact, we'd have to merge the LLVM modules prior to emitting IR/BC/asm (unless LTO, either thin or fat, already takes care of that). I agree that the current behavior is not great, as these outputs are often used for debugging performance issues, and changing the number of codegen units can impact optimization a lot.
One thing to note is that the difference in speed listed here is a bit artificial; this program doesn't actually do anything in the
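Independent of the specific program in this issue, one general way to make a micro-benchmark less artificial is to keep the optimizer from discarding or precomputing the measured work. A minimal sketch, assuming `std::hint::black_box` is available (stable since Rust 1.66); the function `sum_of_squares` and the iteration counts are hypothetical:

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical workload: wrapping sum of i*i for i in 0..n.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).fold(0u64, |acc, i| acc.wrapping_add(i.wrapping_mul(i)))
}

fn main() {
    let start = Instant::now();
    let mut total = 0u64;
    for _ in 0..1_000 {
        // black_box on the input discourages constant-folding the call.
        total = total.wrapping_add(sum_of_squares(black_box(10_000)));
    }
    // black_box on the result discourages removing the loop as dead code.
    black_box(total);
    println!("elapsed = {:?}", start.elapsed());
}
```

`black_box` is only a best-effort hint, so it is still worth confirming in the emitted assembly that the loop body survived.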
jrmuizel commented Dec 31, 2018
With the following Rust code:
I get: