Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add backend/gslice.d: struct slice optimization #6176

Merged
merged 1 commit into from Oct 7, 2016

Conversation

WalterBright
Copy link
Member

This enables 'slicing' a two register wide aggregate into two register-sized variables, enabling much better enregistering. This is just a start for this. I'm doing this in smaller steps to make it easier to track down causes of any regressions.

@WalterBright
Copy link
Member Author

WalterBright commented Oct 6, 2016

Currently blocked by #6174 in that the optimizations will not be enabled until 6174 is pulled.

@WalterBright
Copy link
Member Author

Given the code:

void foo(int[] a, int[] b, int[] c) {
    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

the inner loop currently compiles to:

LA:             mov     EAX,018h[ESP]
                mov     EDX,010h[ESP]
                mov     ECX,[EBX*4][EAX]
                add     ECX,[EBX*4][EDX]
                mov     ESI,020h[ESP]
                mov     [EBX*4][ESI],ECX
                inc     EBX
                cmp     EBX,01Ch[ESP]
                jb      LA

with this PR:

L1A:            mov     ECX,[EBX*4][EDI]
                add     ECX,[EBX*4][ESI]
                mov     0[EBX*4][EBP],ECX
                inc     EBX
                cmp     EBX,EDX
                jb      L1A

@WalterBright WalterBright changed the title add gslice optimization add backend/gslice.d: struct slice optimization Oct 6, 2016
Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok modulo some nits

else
{
sia[si].canSlice = false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complement is actually easier to write and read:

if (tysize(e->Ety) != REGSIZE || e->Eoffset != 0 && e->Eoffset != REGSIZE)
{
    sia[si].canSlice = false;
}

}
return;
}
default:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch with two branches is odd

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect to add more in the future as I make this optimization more capable.

sliceStructs_Gather(sia, b->Belem);
}

{
Copy link
Member

@andralex andralex Oct 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so... why this scope here? looks suspect - remove or explain in a comment (I assume it's because of the goto and the variables)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

{
sia[si].canSlice = false;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complement is actually easier to write and read:

if (tysize(e->Ety) != REGSIZE || e->Eoffset != 0 && e->Eoffset != REGSIZE)
{
    sia[si].canSlice = false;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

for (int si = 0; si < sia_length; si++)
{
sia2[si + n].canSlice = false;
if (sia[si].canSlice)
Copy link
Member

@andralex andralex Oct 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!sia[si].canSlice)
{
    continue;
}
... bunch of code ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to avoid things like "not can slice", the negations are cognitively harder.

@WalterBright
Copy link
Member Author

Auto-merge toggled on

@andralex
Copy link
Member

andralex commented Oct 7, 2016

A note to everyone: the new review flow allows one to pull their own request if it was approved. This gives people the opportunity to mind the review nits before merging.

{
if (debugc) printf("sliceStructs()\n");
size_t sia_length = globsym.top;
SymInfo *sia = (SymInfo *)malloc(3 * sia_length * sizeof(SymInfo));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it 3? You only seem to be using 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use a bitvector for the canSlice part at least.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 is because it is used for two arrays, sia[] and sia2[]. sia2[] can grow to twice the size of sia[], as symbols can get split into two.

}

if (!anySlice)
goto Ldone;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not even using the second part of the sia, nor the si0 in the first part. Looks like a bit vector to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it a struct so I can add more things to it later. It's very unlikely to get large, as it's the number of local symbols in a function, so size isn't a problem.

sia2[si + n].si0 = si + n;
++n;
any = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This body is complex, very low-level, does sth. unexpected (seems like it's replacing the symbol in the global symtab), and comes without explanation/comment. Please add at least a short sentence for the reasoning (what and why), instead of having readers spent 5min. to unpuzzle the low level code, and 10min. on understanding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

snew->Ssymnum = si + n + 1;

sia2[si + n].canSlice = true;
sia2[si + n].si0 = si + n;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW si0 is always the same as the index in sia, so it seems redundant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's different, off by n.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is the only place where si0 gets set and it reads sia2[si + n].si0 = si + n;.


static char __file__[] = __FILE__; /* for tassert.h */
#include "tassert.h"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A brief overview of what this module does and how it does it would be really friendly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

//printf("replaced with:\n");
//elem_print(e);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea what this Eoffset = 0 is trying to achieve.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

}
else
{
Symbol *s1 = globsym.tab[sia[si].si0 + 1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why + 1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second slice is the next symbol.

@WalterBright
Copy link
Member Author

Auto-merge toggled off

@WalterBright
Copy link
Member Author

Auto-merge toggled on

@WalterBright WalterBright merged commit 68c35a6 into dlang:master Oct 7, 2016
@WalterBright WalterBright deleted the structslice1 branch October 7, 2016 06:07
@MartinNowak
Copy link
Member

MartinNowak commented Oct 10, 2016

Ah finally understood what this does, now that I read you use the term slice not only for enregistering slices but for structs in general, I'd highly recommend to use a different term.
With the wide use of the term slice in D, reading this code is just confusing.
How about splitStructs and gsplitstructs.c instead?

@dnadlinger
Copy link
Member

Regarding names: Is this SROA?

@WalterBright
Copy link
Member Author

@MartinNowak
Copy link
Member

According to Digger this caused Issue 17215 – ICE(cgcod.c:findreg) with SIMD and -O -inline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants