improve copy propagation optimization #4906
|
This should probably come with a test attached. It is scary that codegen is not more tested. This is a deterministic process, and it's pretty damn hard to debug. |
|
It's already tested by the test suite, as an earlier version of my fix failed the test suite in multiple places. It is commonplace. There is a more general problem with the optimizer and code gen testing. It often passes unnoticed when an optimization doesn't happen - the tests don't prove negatives, only that the code works. One of the reasons I want to switch it to D is to get some unit testing in there. |
|
LLVM tests this kind of thing by running the output assembly (in textual form) through a checker. The checker can check that certain things are present or that certain things are not present (so as to verify optimizations). That is something worth considering. Anyway, I have no power over here, so what I'm saying is nothing more than a suggestion. |
|
Generating a text file and then comparing does sound like a convenient way to test. It's a good idea. |
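The text-checking approach suggested above could be sketched roughly as follows. This is a minimal illustration, not LLVM's checker or anything in the DMD test suite: the `check_asm` helper is hypothetical, and so is the pattern used to stand in for a redundant register-to-register copy. The sample listing is the one quoted later in this thread.

```python
import re

# Assembly listing quoted later in this thread, used here as sample input.
SAMPLE = """\
__D4test3fooFZv:
    sub esp, 12
    mov eax, dword ptr [_D4test3fooFZ2a1G2k]
    call _D4test3barFkZv
    mov eax, dword ptr [_D4test3fooFZ2a1G2k+4]
    call _D4test3barFkZv
    add esp, 12
    ret
"""

# Hypothetical pattern: a 32-bit register-to-register mov on a line of its own.
REG_TO_REG_MOV = r"^\s*mov\s+e[a-z]{2},\s*e[a-z]{2}\s*$"


def check_asm(asm_text, must_contain=(), must_not_contain=()):
    """Return a list of failure messages; an empty list means the text passed."""
    failures = []
    for pattern in must_contain:
        # Positive check: the generated code still does what it should.
        if not re.search(pattern, asm_text, re.MULTILINE):
            failures.append(f"expected pattern not found: {pattern}")
    for pattern in must_not_contain:
        # Negative check: the optimization removed what it was meant to remove.
        if re.search(pattern, asm_text, re.MULTILINE):
            failures.append(f"forbidden pattern found: {pattern}")
    return failures


if __name__ == "__main__":
    print(check_asm(SAMPLE,
                    must_contain=[r"call\s+_D4test3barFkZv"],
                    must_not_contain=[REG_TO_REG_MOV]))
```

The negative list is the part that addresses the "tests don't prove negatives" concern: a test fails when an instruction the optimizer should have eliminated is still in the output.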
|
For an idea of the effect of this, consider the program: Before: After: |
|
You could check that these extra mov instructions are not present. For reference, here is how LLVM does it: https://github.com/llvm-mirror/llvm/blob/master/test/Transforms/SROA/basictest.ll#L594 See how they check that the optimization pass removes the loads present in the IR.
Looks like something I've encountered in the past. Great that you caught it. |
|
Awesome. Good to know I'm not just spouting nonsense. ;-) |
This kind of thing is exactly why I created that thread. It's been paying off nicely. Low hanging fruit FTW. |
|
Meanwhile, using a modern compiler: ;)
__D4test3fooFZv:
    sub esp, 12
    mov eax, dword ptr [_D4test3fooFZ2a1G2k]
    call _D4test3barFkZv
    mov eax, dword ptr [_D4test3fooFZ2a1G2k+4]
    call _D4test3barFkZv
    add esp, 12
    ret
SCNR (Yes, I know that you know that I know that loop unrolling is not implemented in DMD at all.) |
|
@klickverbot see this one: #4909 |
|
@WalterBright: Looks better. ;) Can't really comment on the implementation, though. |
|
yah, the only difference is the loop isn't unrolled |
improve copy propagation optimization
Turns out H. S. Teoh was right. The copy propagation optimization was not working as well as it could for data that was larger than one register in size. Formerly, the sizes had to match exactly. Now the new size only has to be a subset of the original to trigger the copy propagation.
Some small informal tests showed significant gains from this.
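To illustrate the difference between the old and new conditions, here is a minimal sketch. It is not the DMD backend code: the `Access` type, the byte-offset model, and both function names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """A read or a copy of part of a variable (hypothetical model)."""
    offset: int  # byte offset within the variable
    size: int    # number of bytes accessed


def matches_exactly(copy: Access, use: Access) -> bool:
    # Old rule as described above: the sizes had to match exactly.
    return use.offset == copy.offset and use.size == copy.size


def is_subset(copy: Access, use: Access) -> bool:
    # New rule as described above: the use only has to lie within the copy.
    return (copy.offset <= use.offset
            and use.offset + use.size <= copy.offset + copy.size)


if __name__ == "__main__":
    copy = Access(offset=0, size=8)      # value wider than one 32-bit register
    upper_half = Access(offset=4, size=4)
    print(matches_exactly(copy, upper_half), is_subset(copy, upper_half))
```

Under this model, a 4-byte read of the upper half of an 8-byte copy is rejected by the exact-match rule but accepted by the subset rule, which is the kind of case the commit message describes.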