Measure execution speed #133
Comments
As of b533f91:
Comparing only
cc @sunfishcode (You may be interested in the speed of cranelift. Please note that cg_clif itself is not yet optimized for output quality.) |
Currently in Cranelift the IR verifier is enabled by default, which can take a lot of time. Can you benchmark with the "enable_verifier" setting disabled? |
This is just execution speed. |
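Execution speed here means the wall-clock run time of the compiled binary, not the time spent compiling it. As a minimal sketch of that kind of measurement (a dedicated tool such as hyperfine, used later in this thread, is more robust), the following Rust code runs a program several times and summarizes the samples. The function names `time_runs` and `mean_and_stddev` are illustrative and not part of rustc_codegen_cranelift; any binary path passed in is up to the caller.

```rust
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Run `program` with `args` `runs` times and record the wall-clock time of each run.
/// (Illustrative helper; not part of cg_clif or hyperfine.)
fn time_runs(program: &str, args: &[&str], runs: usize) -> std::io::Result<Vec<Duration>> {
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        // Discard the program's output so terminal I/O does not skew the timing.
        let status = Command::new(program)
            .args(args)
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()?;
        let elapsed = start.elapsed();
        if !status.success() {
            return Err(std::io::Error::new(
                std::io::ErrorKind::Other,
                format!("{} exited with {}", program, status),
            ));
        }
        samples.push(elapsed);
    }
    Ok(samples)
}

/// Mean and sample standard deviation of the timings, in seconds.
/// Returns `None` when there are no samples.
fn mean_and_stddev(samples: &[Duration]) -> Option<(f64, f64)> {
    if samples.is_empty() {
        return None;
    }
    let secs: Vec<f64> = samples.iter().map(Duration::as_secs_f64).collect();
    let n = secs.len() as f64;
    let mean = secs.iter().sum::<f64>() / n;
    // With a single sample the spread is reported as zero.
    let variance = if secs.len() > 1 {
        secs.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / (n - 1.0)
    } else {
        0.0
    };
    Some((mean, variance.sqrt()))
}
```

To compare two backends, one would call `time_runs` once per compiled binary and compare the resulting means, keeping in mind that a few runs on a busy machine give noisy numbers.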
Ah, please update the issue title then :-). Also, you may want to try setting Cranelift's opt_level to best. |
bjorn3 changed the title from "Measure compilation speed" to "Measure execution speed" on Nov 3, 2018
Compilation speed is at a decent level already. Running hyperfine with opt_level set to best right now. Edit: doesn't seem to change much: |
Compilation speed of libcore without optimizations for cg_clif and cg_llvm:
Edit: wait, that's with cg_clif compiled in semi-debug mode (Cranelift itself is optimized, but cg_clif is not). Edit 2: using release mode for cg_clif:
|
At a high level, it's not too surprising that Cranelift's execution speed on Rust would be in the ballpark of LLVM's O0 on Rust, because it's not doing any inlining. The rough short-term plan is to enable the MIR inliner to help with this. There's probably a bunch of low-hanging fruit too, just making sure common Rust constructs are compiled well. |
lachlansneff commented Nov 8, 2018
Are you multithreading compilation? Cranelift is inherently very good at parallel compilation, |
@lachlansneff No, rustc's |
That's true, cranelift-codegen can be run with multiple instances in parallel, but cranelift-module doesn't yet make use of that. |
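As a rough illustration of what running independent codegen instances in parallel could look like, here is a sketch that splits a list of functions across worker threads using only the Rust standard library. `compile_function` is a hypothetical stand-in that just returns placeholder bytes; it is not a cranelift-codegen or cranelift-module API, and the real integration would also have to deal with declaring and linking the results.

```rust
use std::thread;

/// Hypothetical stand-in for compiling one function to machine code.
/// A real backend would create its own codegen context here; this placeholder
/// simply returns the function name's bytes.
fn compile_function(name: &str) -> Vec<u8> {
    name.as_bytes().to_vec()
}

/// Compile all `functions` using up to `workers` threads.
/// Results are returned in the same order as the input.
fn compile_all_parallel(functions: &[&str], workers: usize) -> Vec<Vec<u8>> {
    let workers = workers.max(1);
    // Ceiling division so every function lands in some chunk; never zero.
    let chunk_size = ((functions.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Each worker compiles one contiguous chunk independently of the others.
        let handles: Vec<_> = functions
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|name| compile_function(name))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order keeps the output order deterministic.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

The sketch relies on each function being compiled without shared mutable state, which is the property the comment above attributes to cranelift-codegen. Static chunking is the simplest split; a work queue would balance better when function sizes vary a lot.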
As of ef5d161
Note: this is using a different computer than the previous benchmark
|
I took a quick look at the code. Here are some notes:
Some of the code will get better once Cranelift has more support for i8 and the associated workarounds are removed.
Is -Zmir-opt-level=3 in use when building libcore? I'm seeing things like
If I'm reading this correctly, there's a small memmove in there, which the small memcpy/memmove/memset optimization should help with, once CraneStation/cranelift#586 is fixed.
There's a codegen abort when I enable opt_level=best. I'll investigate that. |
Yes, the whole sysroot.
I am currently using my own code for copying locals: rustc_codegen_cranelift/src/common.rs, lines 416 to 442 in 29b4c34.
So that memmove comes from the rustc_codegen_cranelift/src/intrinsics.rs, line 142 in e1fc9a5.
Which likely came from
|
Repro of that codegen abort:
|
lachlansneff commented Nov 11, 2018
As a side note, it's interesting to see functions like |
Another perf issue: CraneStation/cranelift#597 . |
Now that CraneStation/cranelift#598 is merged, commit 8233ade enables
|
lachlansneff commented Nov 16, 2018
@sunfishcode Are there any obvious optimizations that we're missing here? |
Compile-time measurements (somehow, -Copt-level=2/3 is faster than -Copt-level=0/1 for LLVM. The first two are faster than clif and the latter two are slower):
|
added a commit that referenced this issue Nov 16, 2018
lachlansneff commented Nov 16, 2018
@bjorn3 It looks like that's using the debug version of the codegen backend. Shouldn't that be the release version to maximize compilation speed? |
Here's a summary of the ideas from above for how we can improve performance from here:
|
Oops :) Benchmarking it in release mode atm.
And more importantly rustc_codegen_cranelift/src/common.rs Lines 418 to 419 in 0fa5c0f |
Now with --release:
|
lachlansneff commented Nov 16, 2018
Yay, we are now technically a faster debug backend for rustc! There are a couple of compile-time optimizations in the pipe that should hopefully improve this further. |
Yes, at least on this small benchmark. |
bstrie commented Nov 16, 2018
To help inform us as to how excited we ought to be, is there a document somewhere describing the path that would need to be taken to get Cranelift upstreamed into rustc for use with debug builds? As far as we random onlookers know, it could be anywhere from "oh, it's basically done, we just need to flip a switch" to "years and years away, don't hold your breath". :) |
No, getting this even close to upstreaming is blocked on at least rust-lang/rust#55627 and supporting libstd (#146). I haven't spoken to any Rust devs about this. I want to get an MVP first before making this more widely known.
This is not the case.
I hope not :) |
Minimized some outdated benchmark results, because they are long. |

bjorn3 commented Nov 2, 2018
No description provided.