Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Posix x86[_64]: Pass non-POD arguments indirectly by value, not just for extern(C++), and get rid of extra blits for temporaries #3204

Merged
merged 4 commits into from
Nov 3, 2019

Conversation

kinke
Copy link
Member

@kinke kinke commented Oct 23, 2019

Streamlining the D ABI with the C++ one and preventing extra blits for rvalue arguments. Non-POD arguments passed by value aren't blitted into the callee params stack any longer, but passed by reference, either directly (rvalue args) or as reference to a bitcopy on the caller stack (lvalue args).

This is also a prerequisite for a potential DIP1014 (opPostMove) implementation, as LLVM (via its byval attribute) has been performing the move/blit implicitly as part of the call, with no way for us to invoke the post-move hook afterwards in the callee with the address of the original argument.

@kinke kinke added the B-abi label Oct 23, 2019
@kinke
Copy link
Member Author

kinke commented Oct 26, 2019

I've added an experimental in-place-construction extension which gets rid of more blits, for all targets.

}
}

void func(Container c) { c.check(); } // error
Copy link
Member Author

@kinke kinke Oct 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check requires the ABI change in the 1st commit to prevent blitting the copy-constructed temporary argument in line 65.

auto rhsLVal = DtoLVal(rhs);
rhsLVal->replaceAllUsesWith(lhsLVal);
return true;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works with Posix but Win32 will be more interesting, since you have to construct the arguments directly on their push area.

Copy link
Member Author

@kinke kinke Oct 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've looked with horror at our 32-bit x86 ABI implementation yesterday and hoped it was just a bug that non-PODs are passed on the stack for Win32. But that's indeed what should happen, and the way to implement that in LLVM IR is going to be a pain (allocating the full params stack on the caller, as an anonymous struct); clang produces this.

Copy link
Member

@dnadlinger dnadlinger Oct 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also https://llvm.org/docs/InAlloca.html. (This is pretty ugly.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugly and unsuited for the current ABI-related LDC code architecture. I definitely don't care enough about Win32 to implement this. Right now, the compiler is free to blit around as much as it wants, so this is just codegen optimization. For extern(D), we could simply violate Walter's ABI and pass non-PODs by ref.

kinke added 3 commits October 26, 2019 16:43
…r extern(C++)

Streamlining the D ABI with the C++ one and preventing extra blits for
rvalue arguments. Non-POD arguments passed by value aren't blitted into
the callee params stack any longer, but passed by reference, either
directly (rvalue args) or as reference to a bitcopy on the caller stack
(lvalue args).

This is also a prerequisite for a potential DIP1014 (opPostMove)
implementation, as LLVM (via its `byval` attribute) has been performing
the move/blit implicitly as part of the call, with no way for us to
invoke the post-move hook afterwards in the callee with the address of
the original argument.
@kinke
Copy link
Member Author

kinke commented Oct 26, 2019

[I've verified the 3rd commit in my Linux VM, as CI doesn't run the 32-bit LIT tests on Posix.]

@kinke kinke changed the title Posix x86_64: Pass non-POD arguments indirectly by value, not just for extern(C++) Posix x86[_64]: Pass non-POD arguments indirectly by value, not just for extern(C++), and get rid of extra blits for temporaries Oct 26, 2019
As the replacement most likely wouldn't work if the temporary lvalue was
a bitcast alloca, for example. We want to replace all usages of the
alloca to make sure all writes are redirected to the final lvalue.

auto b = B(v);
b.m.check(); // error

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't solve the following test case in DMD yet, it seems this PR doesn't solve it for LDC either.

Container[] darr;
darr ~= v;
darr[0].check(); // error

This is the machinery work

ldc/gen/arrays.cpp

Lines 825 to 844 in 058202a

// Evaluate the expression to be appended first; it may affect the array.
DValue *expVal = toElem(exp);
// The druntime function extends the slice in-place (length += 1, ptr
// potentially moved to a new block).
LLFunction *fn = getRuntimeFunction(loc, gIR->module, "_d_arrayappendcTX");
gIR->CreateCallOrInvoke(
fn, DtoTypeInfoOf(arrayType),
DtoBitCast(DtoLVal(array), fn->getFunctionType()->getParamType(1)),
DtoConstSize_t(1), ".appendedArray");
// Assign to the new last element.
LLValue *newLength = DtoArrayLen(array);
LLValue *ptr = DtoArrayPtr(array);
LLValue *lastIndex =
gIR->ir->CreateSub(newLength, DtoConstSize_t(1), ".lastIndex");
LLValue *lastElemPtr = DtoGEP1(ptr, lastIndex, ".lastElem");
DLValue lastElem(arrayType->nextOf(), lastElemPtr);
DtoAssign(loc, &lastElem, expVal, TOKblit);
callPostblit(loc, exp, lastElemPtr);

DMD does the same thing. First evaluate into a temporary, expand the array, then assign to darr[$-1].

Any idea on how to solve it?

Copy link
Member Author

@kinke kinke Oct 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it might be better to let the frontend lower the CatAssignExp to the druntime call for GC expansion and then copy-constructing the last element (copy ctor or blit+postblit). IIRC, there are a few more places where the glue layer needs to insert postblit calls (and where it should now call the copy ctor if available, and in the future possibly the move ctor...), and ideally we'd get rid of all of them.

@kinke kinke merged commit d45027d into ldc-developers:master Nov 3, 2019
@kinke kinke deleted the sysv branch November 3, 2019 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants