Description
| Bugzilla Link | 32548 |
| Version | trunk |
| OS | Linux |
| Attachments | reduced test case, LSR debug output, An even smaller test case with 3+1 stores that LSR fails with |
| CC | @atrick,@hfinkel,@JonPsson,@qcolombet,@uweigand |
Extended Description
In the spirit of healthy competition between gcc and llvm, I have found a regression on a benchmark. At least one of the problems is that LSR is not producing good code, while gcc does.
On SystemZ, some memory instructions only support a 12-bit immediate offset. This is the reason for the hook isFoldableMemAccessOffset(), which adds extra cost if the offset does not fit.
This hook isn't enough to help this test case. LSR produces big offsets even though extra costs are added for them. I tried adding experimental big costs, and even tried incrementing NumRegs in LSR when such an offset was present, but this did not help. I suspect that either the right formula isn't generated in the first place, or some other heuristic takes precedence.
On SystemZ, the cost of a memory access whose offset is too big is one extra instruction to add the offset.
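To make that cost concrete, here is a minimal sketch of the check and the penalty described above. It is not the LLVM hook or the LSR cost code: `fitsUnsigned12` and `extraInsnsForOffset` are hypothetical names, and treating the 12-bit field as unsigned (0 to 4095) is an assumption on my part.

```cpp
#include <cstdint>

// Hypothetical helper, NOT LLVM's isFoldableMemAccessOffset().
// Assumption: the 12-bit offset field is unsigned, so it holds 0..4095.
constexpr bool fitsUnsigned12(std::int64_t Offset) {
  return Offset >= 0 && Offset < (std::int64_t{1} << 12);
}

// Hypothetical cost model: an access whose offset does not fit needs one
// extra instruction to add the offset; an access that fits needs none.
constexpr unsigned extraInsnsForOffset(std::int64_t Offset) {
  return fitsUnsigned12(Offset) ? 0u : 1u;
}

// Compile-time sanity checks on the boundary of the 12-bit range.
static_assert(extraInsnsForOffset(4095) == 0, "largest offset that fits");
static_assert(extraInsnsForOffset(4096) == 1, "first offset that does not fit");
```

Under this model, every access that LSR leaves with an offset outside the range pays one extra instruction, which is what the hook is meant to discourage.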
I would appreciate any help on this,
Jonas
Run with:
bin/llc -mtriple=s390x-linux-gnu -mcpu=z13