add backend/gslice.d: struct slice optimization #6176
Currently blocked by #6174: the optimizations will not be enabled until #6174 is pulled.
Given the code:
the inner loop currently compiles to:
with this PR:
Force-pushed from 3b35a15 to e3af5a9.
ok modulo some nits
else
{
sia[si].canSlice = false;
}
The complement is actually easier to write and read:
if (tysize(e->Ety) != REGSIZE || e->Eoffset != 0 && e->Eoffset != REGSIZE)
{
sia[si].canSlice = false;
}
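To check that the suggested condition really is the complement of a positive test, the two forms can be compared exhaustively over a small range. This is only a sketch: the positive form is an assumption (the original `if` is not quoted in this thread), `REGSIZE` is fixed at 8 purely for illustration, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: REGSIZE is 8 here purely for illustration.
const std::size_t REGSIZE = 8;

// Assumed positive form: the access is register sized and starts at
// either the first or the second register-sized half of the aggregate.
bool canSlicePositive(std::size_t size, std::size_t offset)
{
    return size == REGSIZE && (offset == 0 || offset == REGSIZE);
}

// The reviewer's complement, with explicit parentheses around the && part
// (&& already binds tighter than ||, so the meaning is unchanged).
bool cannotSliceComplement(std::size_t size, std::size_t offset)
{
    return size != REGSIZE || (offset != 0 && offset != REGSIZE);
}

// Returns true if the two forms disagree on every input in a small range,
// which is what "complement" means here.
bool formsAreComplementary()
{
    for (std::size_t size = 0; size <= 2 * REGSIZE; ++size)
        for (std::size_t offset = 0; offset <= 2 * REGSIZE; ++offset)
            if (canSlicePositive(size, offset) == cannotSliceComplement(size, offset))
                return false;
    return true;
}
```

The equivalence is just De Morgan's law applied to the assumed positive test, so either form can be used without changing behavior.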
}
return;
}
default:
switch with two branches is odd
I expect to add more in the future as I make this optimization more capable.
sliceStructs_Gather(sia, b->Belem);
}

{
So... why is this scope here? It looks suspect: remove it or explain it in a comment (I assume it's because of the goto and the variables).
ok
{
sia[si].canSlice = false;
}
}
The complement is actually easier to write and read:
if (tysize(e->Ety) != REGSIZE || e->Eoffset != 0 && e->Eoffset != REGSIZE)
{
sia[si].canSlice = false;
}
ok
for (int si = 0; si < sia_length; si++)
{
sia2[si + n].canSlice = false;
if (sia[si].canSlice)
if (!sia[si].canSlice)
{
continue;
}
... bunch of code ...
I like to avoid things like "not can slice"; the negations are cognitively harder.
Force-pushed from e3af5a9 to f98bcbe.
Auto-merge toggled on
A note to everyone: the new review flow allows one to pull their own request if it was approved. This gives people the opportunity to mind the review nits before merging.
{
if (debugc) printf("sliceStructs()\n");
size_t sia_length = globsym.top;
SymInfo *sia = (SymInfo *)malloc(3 * sia_length * sizeof(SymInfo));
Why is it 3? You only seem to be using 2.
Could you use a bitvector for the canSlice part at least?
3 is because it is used for two arrays, sia[] and sia2[]. sia2[] can grow to twice the size of sia[], as symbols can get split into two.
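The factor of 3 can be pictured as one allocation carved into two regions. The sketch below is an illustration under assumptions: the `SymInfo` stand-in only models the two fields mentioned in this review, and `SliceArrays`, `allocSliceArrays`, and the placement of sia2[] directly after sia[] are hypothetical, not quoted from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the PR's SymInfo; only the two fields
// mentioned in the review are modeled.
struct SymInfo
{
    bool canSlice;
    std::size_t si0;
};

// One malloc'd block: elements [0, n) serve as sia[] and elements
// [n, 3n) serve as sia2[], which may need up to 2n entries because
// every symbol can be split into two.
struct SliceArrays
{
    SymInfo *sia;
    SymInfo *sia2;
    std::size_t sia2_capacity;
};

SliceArrays allocSliceArrays(std::size_t sia_length)
{
    SliceArrays a;
    a.sia = static_cast<SymInfo *>(std::malloc(3 * sia_length * sizeof(SymInfo)));
    // Assumed layout: sia2[] starts right after the last element of sia[].
    a.sia2 = a.sia ? a.sia + sia_length : nullptr;
    a.sia2_capacity = 2 * sia_length;
    return a;
}
```

A single `free` of the `sia` pointer releases both regions, which is one practical reason to prefer one allocation over two.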
}

if (!anySlice)
    goto Ldone;
You're not even using the second part of the sia, nor the si0 in the first part. Looks like a bit vector to me.
I made it a struct so I can add more things to it later. It's very unlikely to get large, as it's the number of local symbols in a function, so size isn't a problem.
sia2[si + n].si0 = si + n;
++n;
any = true;
}
This body is complex, very low-level, does something unexpected (it seems to be replacing the symbol in the global symtab), and comes without explanation or comment. Please add at least a short sentence giving the reasoning (what and why), instead of having readers spend 5 minutes unpuzzling the low-level code and 10 minutes understanding it.
ok
snew->Ssymnum = si + n + 1;

sia2[si + n].canSlice = true;
sia2[si + n].si0 = si + n;
FWIW si0 is always the same as the index in sia, so it seems redundant.
It's different, off by n.
Well, this is the only place where si0 gets set, and it reads sia2[si + n].si0 = si + n;.
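Both remarks can hold at once: si0 equals its own index in sia2[], yet differs from the loop index si by n. The sketch below simulates only the index arithmetic from the quoted lines; `buildSia2` and the boolean input vector are hypothetical, and the real code also rewrites the symbol table, which is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the PR's SymInfo.
struct SymInfo
{
    bool canSlice;
    std::size_t si0;
};

// Simulates the quoted loop: n counts the symbols inserted so far, so a
// sliceable symbol at old index si lands at new index si + n, and its
// second slice occupies the next slot, si + n + 1.
std::vector<SymInfo> buildSia2(const std::vector<bool> &canSlice)
{
    std::vector<SymInfo> sia2(2 * canSlice.size(), SymInfo{false, 0});
    std::size_t n = 0;
    for (std::size_t si = 0; si < canSlice.size(); ++si)
    {
        sia2[si + n].canSlice = false;
        if (canSlice[si])
        {
            sia2[si + n].canSlice = true;
            sia2[si + n].si0 = si + n;   // same as the sia2 index, not as si
            ++n;                         // the slot after it holds the second slice
        }
    }
    return sia2;
}
```

For example, with flags `{false, true, true}` the old symbol 2 ends up at index 3 in sia2[], because one symbol was inserted before it.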
static char __file__[] = __FILE__; /* for tassert.h */
#include "tassert.h"
A brief overview of what this module does and how it does it would be really friendly.
ok
//printf("replaced with:\n");
//elem_print(e);
}
}
No idea what this Eoffset = 0 is trying to achieve.
added comment
}
else
{
Symbol *s1 = globsym.tab[sia[si].si0 + 1];
Why + 1?
The second slice is the next symbol.
Auto-merge toggled off
Force-pushed from f98bcbe to de21de2.
Force-pushed from de21de2 to 4b842de.
Auto-merge toggled on
Ah, I finally understood what this does. Now that I see you use the term "slice" not only for enregistering slices but for structs in general, I'd highly recommend using a different term.
Regarding names: Is this SROA?
Yes, it's SROA: http://digital.cs.usu.edu/~allan/AdvComp/Notes/earlyd/
According to Digger this caused Issue 17215 – ICE(cgcod.c:findreg) with SIMD and -O -inline.
This enables 'slicing' a two register wide aggregate into two register-sized variables, enabling much better enregistering. This is just a start for this. I'm doing this in smaller steps to make it easier to track down causes of any regressions.
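As a rough picture of what "slicing" means here, the hand-written sketch below contrasts a loop that works on a two-register-wide aggregate with one where the aggregate has been split into two register-sized scalars (scalar replacement of aggregates). It is an illustration only, not the compiler's output; the `Slice` struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// A two-register-wide aggregate, similar in shape to a D slice.
struct Slice
{
    std::size_t length;
    const int *ptr;
};

// Before: the loop updates the fields of the aggregate itself, which a
// simple code generator may keep in memory rather than in registers.
long sumAggregate(Slice s)
{
    long total = 0;
    while (s.length != 0)
    {
        total += *s.ptr;
        ++s.ptr;
        --s.length;
    }
    return total;
}

// After: the aggregate is split by hand into two independent scalars,
// each of which is an ordinary candidate for enregistering.
long sumSliced(Slice s)
{
    std::size_t length = s.length;
    const int *ptr = s.ptr;
    long total = 0;
    while (length != 0)
    {
        total += *ptr;
        ++ptr;
        --length;
    }
    return total;
}
```

Both functions compute the same result; the difference is only in how easy it is for a register allocator to treat the two halves as separate variables.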