Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the uncontrolled blitting which defeats the copy contructor #10510

Closed
wants to merge 8 commits into from

Conversation

SSoulaimane
Copy link
Member

@SSoulaimane SSoulaimane commented Oct 26, 2019

Issue 20321.

LDC and GDC are also affected.

I chose this way of fixing it which keeps the initialization at the semantic phase then just adapts it in the codegen, but I realized that the backend creates a lot of temporaries and copies into them. There has to be some means to do elaborate copy in the backend, I'm afraid there is no workaround.

CC: @WalterBright
CC: @kinke @ibuclaw are you going to solve this for LDC and GDC? How?

@SSoulaimane SSoulaimane changed the title Fix the uncontrolled bltting which defeats the copy contructor Fix the uncontrolled blitting which defeats the copy contructor Oct 26, 2019
@ibuclaw
Copy link
Member

ibuclaw commented Oct 26, 2019

What's the frontend representational dump of this? By and large the code gen just does what it is told, and only creates temporaries when a side effect needs to be evaluated multiple times.

@SSoulaimane
Copy link
Member Author

SSoulaimane commented Oct 26, 2019

Here is an example which illustrates the need to call the copy constructor in the codegen.
compiling this snippet

Container[] darr;
darr ~= v;

hits the following lines

dmd/src/dmd/e2ir.d

Lines 3315 to 3326 in 37a0434

if (e2.Eoper != OPvar && e2.Eoper != OPconst)
{
// Evaluate e2 and assign result to temporary s2.
// Do this because of:
// a ~= a[$-1]
// because $ changes its value
type* tx = Type_toCtype(tb2);
Symbol *s2 = symbol_genauto(tx);
e2x = elAssign(el_var(s2), e2, tb1n, tx);
e2 = el_var(s2);
}

The codegen evaluates v into a temporary first then moves it into the array. It should call the copy constructor or ideally the move constructor when it moves from the temporary into the array.

@SSoulaimane
Copy link
Member Author

SSoulaimane commented Oct 26, 2019

@ibuclaw That's true, the copying is the problem, it doesn't call the copy constructor when it happens.
I hand-picked those test cases by following usage of doCopyOrMove() you can start with the low hanging fruit which is struct function arguments for Posix and Win64. For Win32 calling conventions you need more backend intervention to get it right, but on the bright side GCC and Clang already do it correctly with C++.

union
{
Symbol* Edtor; // OPctor: destructor
Symbol* Edecl2; // VarDeclaration being constructed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is Edecl2 valid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edecl2 holds the temporary symbol being constructed. It's just used to find which this nodes correspond to which ctor node without assuming much about the tree. Nodes which share the same Edecl2 symbol are related together.

@WalterBright
Copy link
Member

Issue 20321

Needs to be in the comment message and the subject, or the bot won't work.

src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
src/dmd/e2ir.d Outdated Show resolved Hide resolved
version (MARS)
{
case OPstrctor:
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this block of code doing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same thing it does for DMC, see line 4172.

Copy link
Member Author

@SSoulaimane SSoulaimane Oct 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does the work for the Win32 ABI. It fills OPstrthis nodes with the offset of the stack arguments, we need this to be able to call the copy constructor directly on the area where arguments are pushed. DMC does this with non-POD C++ arguments and it works.

@WalterBright
Copy link
Member

Note this is quite a bit more complex than the ldc solution: ldc-developers/ldc#3204

Which one is better?

@ibuclaw
Copy link
Member

ibuclaw commented Oct 26, 2019

Some observations:

  1. gdc-9 doesn't have the new cpctor feature, so of course things will work differently. As the frontend doesn't report the struct as non-POD, it will be copied around in the backend.

  2. It's the frontend that's adding in the temporaries, this is a diff between gdc-9 and gdc-master.

 Container get()
 {
 	Container a = a = 0 , a.this(1);
-	Container b = a;
+	Container b = b = 0 , b.this(a);
 	a.check();
 	b.check();
 	if (1)
-		return a;
+		return (Container __copytmp2 = __copytmp2 = 0 , __copytmp2.this(a);) , __copytmp2;
 	else
-		return b;
+		return (Container __copytmp3 = __copytmp3 = 0 , __copytmp3.this(b);) , __copytmp3;
 }
 void main()
 {
 	Container v = v = 0 , v.this(1);
 	v.check();
-	func(v);
+	func(((Container __copytmp5 = __copytmp5 = 0 , __copytmp5.this(v);) , __copytmp5));
 	Container r = get();
 	r.check();
-	Container[1] slit = [v];
+	Container[1] slit = [((Container __copytmp6 = __copytmp6 = 0 , __copytmp6.this(v);) , __copytmp6)];
 	slit[0].check();
-	Container[] dlit = [v];
+	Container[] dlit = [((Container __copytmp7 = __copytmp7 = 0 , __copytmp7.this(v);) , __copytmp7)];
 	dlit[0].check();
-	B b = B(v);
+	B b = B(((Container __copytmp8 = __copytmp8 = 0 , __copytmp8.this(v);) , __copytmp8));
 	b.m.check();
 	return 0;
 }
  1. In gdc-master, the frontend knows about, and reports cpctors as making a struct non-POD, and so these are not copied when passed around directly. The error on line 20 therefore doesn't happen.

@ibuclaw
Copy link
Member

ibuclaw commented Oct 26, 2019

Regarding the other failures (from the bug report)

test.d(53): error
test.d(56): error

get() fails because the frontend insists on returning a __copytmp. I think it would be much better if the frontend recognized that any struct with a cpctor returned is always nrvo, even if there are multiple possible variables.


test.d(59): error
test.d(62): error

Arrays fail because again of the __copytmps, it would be better if the frontend recognized and passed the address of all indexes to the cpctor instead. i.e:

// Not this
Container[1] slit = [((Container __copytmp6 = __copytmp6 = 0 , __copytmp6.this(v);) , __copytmp6)]

// But this
Container[1] slit;
slit[0].this(v);
// slit[1].this(v) ... repeat for other elements

test.d(65): error

Structs with non-POD members are again failing because of __copytmp, frontend should recognize and pass address of each relevant field to the cpctor. i.e:

// Not this
B b = B(((Container __copytmp8 = __copytmp8 = 0 , __copytmp8.this(v);) , __copytmp8));

// But this
B b;
b.f1.this(v);
// b.f2.this(v) ... repeat for other fields.

@kinke
Copy link
Contributor

kinke commented Oct 26, 2019

are you going to solve this for LDC and GDC? How?

It can be solved for LDC, pretty easily as Walter noted, by

  1. changing the extern(D) ABI to pass non-POD value arguments by reference under the hood (just like the C++ ABI, except on Win32), and
  2. by detecting constructions from temporaries (T var = (T __copytmp = … , … , __copytmp)) and replacing the temporary's lvalue/alloca by the final lvalue (replacing &__copytmp by &var), after toElem()enting the CommaExp as usual; that's a neat feature of LLVM's IR (rhsLVal->replaceAllUsesWith(lhsLVal)).

@SSoulaimane
Copy link
Member Author

@WalterBright LDC doesn't solve Win32 ABI yet, this solves if for DMD. For the rest it's similar in concept to how LDC does it, LDC has this function https://github.com/ldc-developers/ldc/blob/baaef17472bdeae90f80fb5c4f3b0212971db7a5/gen/toir.cpp#L2767 which is called a number of times. This is a DMD variant using OPstrctor. Since we don't have in the backend a handy container for holding and representing values that is used ubiquitously like LDC uses DLValue, then trying to mimic LDC here will involve building something like DLvalue and spreading it across the backend.

@SSoulaimane
Copy link
Member Author

@ibuclaw if you don't have the copy constructor try with the postblit it's the same.

@SSoulaimane
Copy link
Member Author

@ibuclaw In summary you're proposing to move much of the work done in the glue layer into the semantic phase. I'm not sure that would end the story though unless a no copying rule is imposed on the glue and the codegen.

@WalterBright
Copy link
Member

I'm uncomfortable with changing the back end like this. For one thing, it's effect on the optimizer is unknown. By doing it in the glue layer, these concerns would vanish. I'm wondering if the code added to cgelem.d could be done in e2ir.d.

@SSoulaimane
Copy link
Member Author

@WalterBright it was literally doing the exact same thing in e2ir.d inside elAssign(), it just didn't blend there naturally. It's like moving something which naturally fits in cgelem.d to e2ir, it will work for sure but it would raise questions why is tree shifting in e2ir.d. These additions only affect the OPstrctor and OPstrthis nodes which were only used in DMC. This is as clean as I can think of right now. Other than this I can do some manual work in a dozen locations, it wouldn't be as clean though.

@SSoulaimane SSoulaimane force-pushed the copy0 branch 3 times, most recently from d612b21 to 1f8be0e Compare October 30, 2019 16:20
@ibuclaw
Copy link
Member

ibuclaw commented Oct 30, 2019

@SSoulaimane, of course if dmd is creating extra temporaries under the hood, then those need addressing independently too.

From my end, having an unambiguous way of representing copy constructor assignments is all that's required. I know what the correct codegen to fix in gdc looks like - just generate the same trees as g++, and gcc backend will take care of the rest.

@SSoulaimane
Copy link
Member Author

It still has to use a temporary variable and you have to replace it, or does it need to be some other special AST node instead of a variable?

@ibuclaw
Copy link
Member

ibuclaw commented Oct 30, 2019

Only in the case of return statements though?

Correct me if I remember wrong, in any step here.

In the most simplistic case, assignments can be expressed in the following AST.

a = b; // AssignExp(a, b)
auto a = b; // ConstructExp(a, b)

Therefore one can express

a = c; // CopyConstructExp(a, c, __cpctor)

This also works fine if a = c is an array type, as the codegen can generate a call to the cpctor on each element in a.

In the case of return expressions

return c; // ReturnStatement(CopyConstructExp(tmp, c, __cpctor))

If there's a way to mark temporary var decls as being the value of a return statement (STCtemp | STCreturnvalue) then we can set-up tmp as just being an alias to the return slot <retval>. Being a decl with a cpctor, the type is marked as addressable, which tells the backend to not create temporaries/elide copies or else ICE (because we generated code wrong/in a way that couldn't avoid a temporary).

Leaving the complex case for last, objects with cpctor/postblit fields.

There's already recognition of:

a = A(1); // ConstructExp(a, A().__ctor(1))

Which sets a as the target of A(), and so gets written directly to the variable instead of a temporary, and subsequently calls a.__ctor(1).

Going from there to the following:

a = A(c, 1); // ConstructExp(a, A(c, 1))

Is not that big a step, albeit the difference is in the above you must expand the constructor to individual direct assignments.

a = 0; // memset()
a.f1.__cpctor(c);
a.f2 = 1;

If you are going this far, then might as well have this expansion in the frontend.

All that I mention above though goes on the assumption that the frontend won't insert magical __tmp vars, which only serves to over complicate the matter.

@SSoulaimane
Copy link
Member Author

SSoulaimane commented Oct 30, 2019

The selling point of the current way doCopyOrMove() does it, is the frontend takes care of selecting whether there is a copy constructor or a postblit and checks whether to actually call it or not, but it's not much work really it can fit in a small function in the glue, or some helpers defined in the frontend and called from the glue. If the expression is a function call which returns on the stack you use the hidden pointer directly no need to call the copy contructor, likewise with struct literals, for the rest there is default initialization followed by copy construction.

Only in the case of return statements though?

Everywhere you find doCopyOrMove() being called, the destination is unknown to the frontend, this includes function arguments, array and struct literals, concatenation expressions, and the hidden parameter for function returns.

Your solution simply sets a function to be called and lets the glue do the logic. In this case you don't really need a new expression AST node, you only need to know the function to be called which can be set on each aggregate type during type semantic, the correct overload of the copy constructor will be picked and the glue decides when and if to call it.

@SSoulaimane
Copy link
Member Author

SSoulaimane commented Oct 31, 2019

@ibuclaw semaphore failure is caused by this GDC bug https://issues.dlang.org/show_bug.cgi?id=20342.

@ibuclaw
Copy link
Member

ibuclaw commented Oct 31, 2019

The codegen is still better placed to know when a temporary value is needed or not. The current behaviour of the frontend generating ast that creates a temporary only for the codegen to say actually you didn't mean that and then proceed to disentangle the mess that it has consumed is not a desirable solution.

As far as I can tell, the only reason that the frontend creates the whole (tmp = 0, tmp.__cpctor(a), tmp or (tmp = a, tmp.__postblit(), tmp) is for want of a better way to represent the action of constructing without copying. Both could be simplified into one ast node. Then the glue will provide the ultimate target of the expression, i.e

memset(&<retval>, 0, 16);
__cptor(&<retval>, a);
return <retval>;

@SSoulaimane
Copy link
Member Author

SSoulaimane commented Oct 31, 2019

@ibuclaw This approach works. But in general I can see the benefit of moving things to the fronted or to a pre glue phase which is shared between compilers. This would allow GDC and LDC to catchup much quicker with DMD and focus more on the specificities of each backend rather than the repetitive stuff.

@SSoulaimane SSoulaimane force-pushed the copy0 branch 3 times, most recently from 8c418a1 to b81057c Compare November 1, 2019 14:53
This applies to test case `test7516b()` from `test/runnable/sdtor.d`.
non POD struct arguments are passed by ref, static arrays of the same should receive the same treatment since the postblit applies to them too.
This issue affects this work too. non POD structs have to be passed by ref for function arguments.
@SSoulaimane SSoulaimane force-pushed the copy0 branch 2 times, most recently from e564a39 to d9c9f41 Compare November 2, 2019 00:54
@RazvanN7
Copy link
Contributor

This seems to have been abandoned. The issue is referencing this PR, therefore anyone that picks it up, feel free to reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants