New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Measure execution speed #133
Comments
This comment has been minimized.
This comment has been minimized.
Currently in Cranelift the IR verifier is enabled by default, which can take a lot of time. Can you benchmark with the "enable_verifier" setting disabled? |
This is just execution speed. |
Ah, please update the issue title then :-). Also, you may want to try setting Cranelift's opt_level to best. |
Compilation speed is at decent level already. Running hyperfine with opt_level set to best right now. Edit: doesn't seem to change much: |
This comment has been minimized.
This comment has been minimized.
At a high level, it's not too surprising that Cranelift's execution speed on Rust would be in the ballpark of LLVM's O0 on Rust, because it's not doing any inlining. The rough short-term plan is to enable the MIR inliner to help with this. There's probably a bunch of low-hanging fruit too, just making sure common Rust constructs are compiled well. |
Are you multithreading compilation? Cranelift is inherently very good at parallel compilation, |
@lachlansneff No, rustc's |
That's true, cranelift-codegen can be run with multiple instances in parallel, but cranelift-module doesn't yet make use of that. |
This comment has been minimized.
This comment has been minimized.
I took a quick look at the code. Here are some notes: Some of the code will get better once Cranelift has more support for i8 and the associated workarounds are removed. Is -Zmir-opt-level=3 in use when building libcore? I'm seeing things like If I'm reading this correctly, there's a small memmove in there, which the small memcpy/memmove/memset optimization should help with, once bytecodealliance/cranelift#586 is fixed. There's a codegen abort when I enable opt_level=best. I'll investigate that. |
Yes, the whole sysroot.
I am currently using my own code for copying locals: rustc_codegen_cranelift/src/common.rs Lines 416 to 442 in 29b4c34
So that memmove comes from the rustc_codegen_cranelift/src/intrinsics.rs Line 142 in e1fc9a5
Which likely came from
|
This comment has been minimized.
This comment has been minimized.
As a side note, it's interesting to see functions like |
Another perf issue: https://github.com/CraneStation/cranelift/issues/597 . |
Now that bytecodealliance/cranelift#598 is merged commit 8233ade enables
|
@sunfishcode Are there any obvious optimizations that we're missing here? |
This comment has been minimized.
This comment has been minimized.
@bjorn3 It looks like that's using the debug version of the codegen backend. Shouldn't that be the release version to maximize compilation speed? |
Here's a summary of the ideas from above for how we can improve performance from here:
|
Oops :) Benchmarking it in release mode atm.
And more importantly rustc_codegen_cranelift/src/common.rs Lines 418 to 419 in 0fa5c0f
|
Now with --release:
|
Yay, we are now technically a faster debug backend for rustc! There are a couple compile-time optimizations in the pipe, should improve this hopefully. |
Yes, at least on this small benchmark. |
To help inform us as to how excited we ought to be, is there a document somewhere describing the path that would need to be taken to get Cranelift upstreamed into rustc for use with debug builds? As far as we random onlookers know, it could be anywhere from "oh, it's basically done, we just need to flip a switch" to "years and years away, don't hold your breath". :) |
No, getting this even close to upstreaming is blocked on at least rust-lang/rust#55627 and supporting libstd (#146). Haven't spoken to any rust devs about this. I want to get a MVP first before making this more widely known.
This is not the case.
I hope not :) |
Minimized some outdated benchmark results, because they are long. |
I may open more focused issues in the future. |
bjorn3 commentedNov 2, 2018
No description provided.
The text was updated successfully, but these errors were encountered: