Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upPerformance regression in pattern matching #36283
Comments
pmarcelll
changed the title
Regression in pattern maching
Regression in pattern matching
Sep 5, 2016
pnkfelix
changed the title
Regression in pattern matching
Performance regression in pattern matching
Sep 5, 2016
sfackler
added
regression-from-stable-to-beta
T-compiler
labels
Sep 5, 2016
This comment has been minimized.
This comment has been minimized.
|
The timing suggests this might be an LLVM issue. We're getting more and more of these lately, which is concerning. |
brson
added
I-slow
A-LLVM
I-nominated
labels
Sep 13, 2016
This comment has been minimized.
This comment has been minimized.
|
Needs a P-tag. |
This comment has been minimized.
This comment has been minimized.
|
@nagisa also theorized that this was an LLVM problem (aliasing?). Do we have an easy way to test for that? |
This comment has been minimized.
This comment has been minimized.
|
Hmm. So I popped these examples into play and built the resulting assembly (on nightly, with optimizations enabled). They came out pretty similar:
|
This comment has been minimized.
This comment has been minimized.
|
Same results for beta: --- /home/nmatsakis/tmp/print-ok-beta 2016-09-15 12:22:36.604531321 -0400
+++ /home/nmatsakis/tmp/print-ref-beta 2016-09-15 12:23:04.984530867 -0400
@@ -10,6 +10,7 @@
.cfi_def_cfa_offset 80
cmpb $1, %dil
jne .LBB0_1
+ andb $3, %sil
cmpb $1, %sil
je .LBB0_7
cmpb $2, %sil |
This comment has been minimized.
This comment has been minimized.
|
Is it possible we fixed this? I know we did some LLVM fixing. cc @eddyb, I think you managed most of that. |
This comment has been minimized.
This comment has been minimized.
|
That said, stable builds are identical. Though I find it hard to imagine that the |
This comment has been minimized.
This comment has been minimized.
|
It could easily be something else, identical binaries are no guarantee, so the reduction could be unrelated. EDIT: as for LLVM, nothing I've seen would suggest any sort of improvements with |
This comment has been minimized.
This comment has been minimized.
|
Also, stable vs nightly look quite different, even if changing between the two versions of the arms makes |
This comment has been minimized.
This comment has been minimized.
|
@pmarcelll how confident are you that this reduction is accurate? can you make a version of the original benchmark available? |
This comment has been minimized.
This comment has been minimized.
|
Here's the code: link. |
This comment has been minimized.
This comment has been minimized.
|
The problematic part is in the hottest code path, so the slowdown is really noticable. |
This comment has been minimized.
This comment has been minimized.
|
Also, if the enum is not |
This comment has been minimized.
This comment has been minimized.
|
Discussed at compiler team meeting, but we didn't have a good answer for priority since investigation seems to be ongoing. P-high for now, feel free to downgrade if we get new info. |
nrc
added
P-high
and removed
I-nominated
labels
Sep 15, 2016
nrc
assigned
nikomatsakis
Sep 15, 2016
This comment has been minimized.
This comment has been minimized.
|
Results of some investigation: I can reproduce a regression for this benchmark at the same point in the history, though its not as severe a regression. Gist with the (slightly modified, self-contained) bf interpreter benchmark, and the transcript + summary of my runs: https://gist.github.com/pnkfelix/9e727884890b1ec867d58bd280cbbf82 Interestingly, other small perturbations also remove the regression. In particular, even just pulling the body of that match arm into a separate method and marking that method |
This comment has been minimized.
This comment has been minimized.
|
If anyone has access to LLVM 3.8 that they can build Rust with, it would help to confirm that this is yet another LLVM 3.9 perf regression. |
This comment has been minimized.
This comment has been minimized.
MagaTailor
commented
Sep 20, 2016
•
This comment has been minimized.
This comment has been minimized.
|
@petevine Really useful, thanks! And so LLVM 3.9 delivers, yet another perf regression. |
This comment has been minimized.
This comment has been minimized.
|
As a followup note regarding the gist I posted: I have done a little digging into the generated assembly for the
In particular: the slower version on the left has an extra Anyway, at this point I'm inclined to agree with @eddyb that this is probably an LLVM issue, not a Rust compiler one. We could investigate the order and number of passes, or we could try to file an LLVM bug, but either way, I think we should downgrade the priority of this ticket. |
This comment has been minimized.
This comment has been minimized.
|
Inclined to agree. triage: P-medium |
rust-highfive
added
P-medium
and removed
P-high
labels
Sep 21, 2016
This comment has been minimized.
This comment has been minimized.
I think the path forward is to file an LLVM bug. Next steps:
@eddyb notes that majnemer is often willing to do bisect against LLVM code base on small self-contained test case. |
This comment has been minimized.
This comment has been minimized.
pnkfelix
added
regression-from-stable-to-stable
and removed
regression-from-stable-to-beta
labels
Sep 29, 2016
nikomatsakis
removed their assignment
Oct 4, 2016
Mark-Simulacrum
added
C-bug
C-enhancement
and removed
C-bug
labels
Jul 26, 2017
This comment has been minimized.
This comment has been minimized.
Boscop
commented
Aug 30, 2017
|
Any update on this? :) |
This comment has been minimized.
This comment has been minimized.
|
It wasn't fixed in LLVM 4.0, but it might be fixed in 5.0, I'll check when the support is merged. The ergonomics improvements in pattern matching might also improve on this. |
nagisa
referenced this issue
Nov 6, 2017
Closed
Rust generates guard instructions around match statements #45792
This comment has been minimized.
This comment has been minimized.
Boscop
commented
Apr 22, 2018
|
Is it fixed now? :) |
This comment has been minimized.
This comment has been minimized.
|
I tried it a couple of days ago with the BF interpreter I linked earlier and it was slow no matter what compiler flag I used. As I wrote in the first comment, the benchmark completed just under 22 seconds, now it ran for 30 seconds with only a |
This comment has been minimized.
This comment has been minimized.
Boscop
commented
Sep 30, 2018
|
Any update on this? |
This comment has been minimized.
This comment has been minimized.
|
I tried it when All in all, |
This comment has been minimized.
This comment has been minimized.
|
I just checked it on nightly and it seems to be fixed (even on the latest beta). I don't know what changed, but I looked at the assembly on Godbolt and the previously slow part is now equivalent to the old version. I even managed to get the lowest runtime so far, only 19.543 seconds, with If this regression is unlikely to return and it was indeed fixed in LLVM then it can be closed. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
The LLVM issue has been fixed by llvm-mirror/llvm@238b886, which has tests and comments to prevent this from regressing without corresponding backend fixes. As such, I think we can consider this resolved... |
nikic
closed this
Dec 29, 2018
This comment has been minimized.
This comment has been minimized.
|
@nikic I was thinking about the bug report in #36283 (comment), it looks like that one can also be closed. |
pmarcelll commentedSep 5, 2016
•
edited
I discovered the cause of a fairly big performance regression (from ~22s to ~29s in a specific benchmark) in my small BF interpreter and I succesfully minified the code:
On stable, both version of this code compiles to the exact same binary (and also on 1.9.0 and 1.10.0, both old-trans and MIR on each version respectively). On beta and nightly, the commented out code becomes slower. My understanding is that this should compile to the same binary, so this is incorrect behavior.
The last correct version is
rustc 1.12.0-nightly (7333c4ac2 2016-07-31), right before the update to LLVM 3.9.EDIT: I noticed that the minified example does compile to the same binary (I somehow got confused) so I added the
#[inline(never)].