Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stable] Fix Issue 19134 - [C++] static const y = new Derived(); ->pointer cast from const(Derived) to immutable(void*)** is not supported at compile time #8533

Merged
merged 1 commit into from Aug 24, 2018

Conversation

kinke
Copy link
Contributor

@kinke kinke commented Aug 2, 2018

No description provided.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @kinke! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
19134 regression [C++] static const y = new Derived(); ->pointer cast from const(Derived) to immutable(void*)** is not supported at compile time

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "stable + dmd#8533"

@kinke kinke changed the title Fix Issue 19134 [stable] Fix Issue 19134 Aug 2, 2018
@wilzbach wilzbach changed the title [stable] Fix Issue 19134 [stable] Fix Issue 19134 - [C++] static const y = new Derived(); ->pointer cast from const(Derived) to immutable(void*)** is not supported at compile time Aug 2, 2018
// skip the vptr assignment during CTFE; it's not supported and superfluous,
// as an external super in C++ is non-CTFE-able anyway
// => __ctfe || cast(void) (this.__vptr = __vtbl)
ate = new LogicalExp(loc, TOK.orOr, new IdentifierExp(loc, Id.ctfe), new CastExp(loc, ate, Type.tvoid));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a potentially superior implementation would be to detect if super() is extern, and then skip the assignment on that basis. That way, even outside of CTFE, the redundant work to assign the vtable will be skipped in cases when it doesn't need to.

{
int a = 123;
this() { a += 42; }
void foo() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make foo return a and call it from the tests ?

@TurkeyMan
Copy link
Contributor

Possible to tweak this in the way I suggest to gain the runtime benefit also?

@kinke
Copy link
Contributor Author

kinke commented Aug 3, 2018

Thanks Manu, that's the better approach indeed. Implemented now and vptr test added.

@TurkeyMan
Copy link
Contributor

TurkeyMan commented Aug 3, 2018

This LGTM, given that we agree on the theory as I suggested...
But I'm actually having second thoughts.

In regular D, during the entire constructor chain, the vtable is set to the most-derived vtable.
with extern(C++), the most-derived vtable is assigned as usual, prior to entry of the constructors... but any call to a C++ base constructor will execute the C++ semantic of assigning the vtable for that level of construction prior to running the C++ constructor, and that's what leads to this issue here.

It occurs to me that, any extern(C++) constructor won't be able to determine whether the most-derived vtable is applied, of some super-class vtable is applied from a C++ base constructor.
I think that's a bad situation, it just feels unreliable to me.

So, what I'm actually thinking would be better is to revert this to your former solution, and instead implement logic that matches the C++ semantic; that is, any extern(C++) constructor will assign the vtable pointer for that type on function entry.
The effect being, extern(C++) constructors can not call more-derived virtuals (like C++, and unlike D).

I think the most reasonable thing to do here with vtable assignment semantics is to match the C++ rules, since we can't affect that behaviour when linking externally.

That said; that puts us back into the exact same situation where the vtable assignment on function entry may not CTFE? Why is that an issue? Can that be fixed some other way?

@TurkeyMan
Copy link
Contributor

...or just merge this patch :) ... it's probably fine!

@TurkeyMan
Copy link
Contributor

Well, let's say, do merge this patch (it's superior to the existing code), and then the logic I detail above may be implemented in a follow-up, if someone wants to do it.
I don't believe it's likely to cause any trouble in the meantime.

@kinke
Copy link
Contributor Author

kinke commented Aug 3, 2018

I think what happens during extern(C++) construction is this:

  • memory is allocated and initialized with init symbol (TypeInfo.initializer(), so with final vptr), either by GC or compiler
  • ctor is called
  • it calls its super() at some point, either implicitly or explicitly (=> recursively)

So by restoring the vptr after each external super() call (possibly implemented in C++), we should be fine.

@TurkeyMan
Copy link
Contributor

That's correct, but we don't restore it to the most-derived pointer (ie, the original pointer), we restore it to the vptr for the type at the level of the constructor being executed...

In fact... this won't work! With this logic, the most-derived vtable will never be restored if there is a D super-class. >_<

Is it possible to restore it to the most-derived ptr? I think we would need to take a local copy prior to calling an extern-super, and then restore it after.

@kinke
Copy link
Contributor Author

kinke commented Aug 3, 2018

Yep you're right, a copy seems necessary.

@TurkeyMan
Copy link
Contributor

Yeah. Let's do that.
The logic in this patch requires that we restore a copy of the vptr from before the call to super() to work correctly.
It's still correct that this work is only necessary if the super ctor is extern.

@kinke
Copy link
Contributor Author

kinke commented Aug 3, 2018

And it only works because C++ sets the vptr after calling its super and right before the actual ctor body, right? Otherwise, extern(C++) base classes above an actual C++ one may not have the final vptr set when entering their ctor.

@kinke
Copy link
Contributor Author

kinke commented Aug 3, 2018

So #8362 introduced this breaking change for ctors in extern(C++) classes. After the super call, the vptr is set to the class' of the current ctor, so virtual function calls inside the ctor are directed to the current class, like in C++, and not to the final implementations, D semantics. Was that intended?!

@TurkeyMan
Copy link
Contributor

And it only works because C++ sets the vptr after calling its super and right before the actual ctor body, right?

Yes.

Otherwise, extern(C++) base classes above an actual C++ one may not have the final vptr set when entering their ctor.

Right. If there's an actual C++ ctor in the stack, that call enters with the most-derived vptr, then before it assigns its local vptr, it will call the further super(). If THAT super() is a D function again, it will be entered still with the most-derived vptr.

The callstack will always reach the base ctor with the most-derived vptr assigned.

[...] Was that intended?!

I think we actually got confused somewhere along the way.

Our original intent was to preserve D semantics in the D constructors (which depends on this fix that we record and restore the vptr around calls to an extern super())

But IIRC, there was something that convinced us we need to go with C++ semantics all the way down the tree, and so we implemented it the way it is now.
But that was an imperfect implementation; to match C++ semantics, we need to ALSO assign the vptr on ctor entry, such that any code that executes prior to the call to super() is also using typeof(this) vtbl.

I think the reason we decided that the D ctor should match C++ semantics was that, in C++, override functions can be written to EXPECT that the class is constructed when they're called.
In D, override functions may not expect to be called on a fully-constructed class. (a terrible design if you ask me! but it is what it is)

Imagine a D base class calls a virtual implemented in C++, which expects that it is constructed, but because it was called from a D base class with the most-derived vtbl assigned, that assumption that the C++ code may make is not reliable.
To rephrase; D semantic allows that a C++ override may be called from a base class, which is impossible in C++, and so we figured we probably needed to defensively preserve C++ semantics throughout the ctor callstack to avoid that case, and I forgot that arbitrary D code may be executed prior to super() call. Too much C++ in the brain! :)

@TurkeyMan
Copy link
Contributor

Of course, that's a very unlikely case, and it's a value-judgement. It may be a better call to preserve D semantics such that D code works as expected. It's strongly discouraged to call virtuals from ctors in D for this exact reason... so calls to unsuspecting C++ overrides are likely to be mitigated by that recommendation.

@kinke
Copy link
Contributor Author

kinke commented Aug 4, 2018

It may be a better call to preserve D semantics such that D code works as expected. It's strongly discouraged to call virtuals from ctors in D for this exact reason... so calls to unsuspecting C++ overrides are likely to be mitigated by that recommendation.

I tend to agree, otherwise the language spec would get too hairy. It's also clear that calling the final overrides in a C++ base ctor would be a very bad idea, as the fields of more derived types are most likely not initialized yet (garbage). In an extern(C++) one implemented in D, all fields are/should at least be initialized according to the init symbol (and possibly already mutated earlier in more derived ctors, by delaying the super() call in those).
So restoring the previous vptr after an external super call should do to preserve D semantics in extern(C++) ctors.

Related to this is the question of how to deal with this init-symbol-blit dependency of the extern(C++) ctor chain. In order to make extern(C++) classes seamlessly usable in C++ (e.g., allocate+construct via C++ new without ending up with garbage due to untouched fields normally initialized by the blit), we could blit the current class' fields from the init symbol (of the current class; these initializer fields should perfectly match the most derived class' init symbol) upon ctor entry. [This blit would need to be skipped during CTFE too.]
Edit: Erm nope, doing so would leave the vptr uninitialized with C++ new. To overcome this, each ctor would need to set it on function entry too, and we'd be in the exact same boat as C++ wrt. used overrides for virtual calls...

@TurkeyMan
Copy link
Contributor

Related to this is the question of how to deal with this init-symbol-blit dependency of the extern(C++) ctor chain. In order to make extern(C++) classes seamlessly usable in C++ (e.g., allocate+construct via C++ new without ending up with garbage due to untouched fields normally initialized by the blit), we could blit the current class' fields from the init symbol (of the current class; these initializer fields should perfectly match the most derived class' init symbol) upon ctor entry.

I agree, we need to init when calling D ctor from C++... [thoughts below]

[This blit would need to be skipped during CTFE too.]

No, in CTFE, this is already current behaviour for construction of any class type. No change is required for CTFE. I don't think ANY of this C++ logic is involved in CTFE, because it's all related to external linkage, and CTFE can't resolve if there's any extern calls in the chain anyway.

Edit: Erm nope, doing so would leave the vptr uninitialized with C++ new. To overcome this, each ctor would need to set it on function entry too, and we'd be in the exact same boat as C++ wrt. used overrides for virtual calls...

So, we're dealing with the situation when an extern(C++) ctor is defined in D, and NOT in C++; that is, when C++ constructs, it will call the D function.

In D, prior to construction, init will be assigned. C++ doesn't execute such logic.

I think the proper solution is this:

  1. ctor is compiled as usual (including the logic we've detailed in this thread, where post-super() has the vptr restored), no special sauce is required.
  2. the ctor symbol is NOT mangled for C++
  3. when an extern(C++) ctor is not external (is defined in local D code), an additional __cppctor() function is generated, which does assign init, and then calls the regular ctor
  4. this __cppctor() function is mangled for C++

When C++ constructs a class, the generated __cppctor() that is mangled for C++ first performs the init assignment, which properly prepares for construction, then calls into the regular ctor call tree.

I think this works elegantly.

So there are 2 pieces of special-sauce in the end:

  1. calls to external super() must keep and restore the vptr after C++ super() returns.
  2. extern(C++) ctors that are NOT extern must have a __cppctor() shim generated, which assigns init, then calls ctor(), the proper C++ mangling is applied to __cppctor() so that C++ calls the right function

I think that resolves all the edge cases?

@TurkeyMan
Copy link
Contributor

Note: of course, D would never call this magic __cppctor(), since the init assignment is handled by the language, which means there is no interaction with CTFE in this solution.

@kinke
Copy link
Contributor Author

kinke commented Aug 4, 2018

I think that resolves all the edge cases?

// D
extern(C++) class A
{
  int a = 1;
  this() {}
  void virtualFoo() {}
}

// C++
class B : public A
{
  int b = 2;
  B() : A() {} // => calls the 'actual' C++ ctor of A, performing the A init blit
};

// D
extern(C++) class C : B
{
  int c = 3;
  this()
  {
    a = 666;
    super();
  }
}

B::B() will call the C++-mangled A base ctor, which blits vptr and a, overwriting 666, but okay, this is a very theoretical issue. The vptr will afterwards be set to the B vtable when entering the B ctor body, and finally restored to C vtable after the super call; but virtual function calls in the A ctor and further base ctors will be directed to the A type, due to potentially multiple init-symbol-blits in the ctor chain of mixed C++/D class hierarchies...

@TurkeyMan
Copy link
Contributor

Good point. There is an edge case when C++ derives from D...
Hmm, it's hard to imagine a good resolution here. I feel like we might have to resolve something with 'documentation' :/

@kinke
Copy link
Contributor Author

kinke commented Aug 4, 2018

Another issue with this:

extern(C++) class A { int a = 123; }

This would now need a C++-mangled ctor (just blitting), although it has no regular ctor.

@TurkeyMan
Copy link
Contributor

Yeah, that __cppctor() symbol I described should probably be present for every ctor implemented in D (including the default ctor or default init).

@TurkeyMan
Copy link
Contributor

I think this motivates leaning back towards extern(C++) ctors matching C++ construction semantics with vptr assignment at entry.

@kinke
Copy link
Contributor Author

kinke commented Aug 4, 2018

leaning back towards extern(C++) ctors matching C++ construction semantics with vptr assignment at entry

Yeah that's what I was thinking too, but I'd go further and get rid of the D pre-init blit for extern(C++) classes altogether (and so no need for extra __cppctor complexity). On extern(C++) ctor entry, call super() (implicitly or check if first statement) and then blit vptr & class-specific fields from init symbol; disallow later explicit super() calls; no vptr restoring necessary. And add an implicit default ctor in case there's no explicit one.
CTFE would most likely need to be adapted, but otherwise it seems reasonably sound to me and basically does what C++ does.

@TurkeyMan
Copy link
Contributor

That sounds comprehensive, do you know how to implement all that?

I'd suggest that this PR should be merged for now to restore CTFE, and that complex patch you describe can be a follow up...

@kinke
Copy link
Contributor Author

kinke commented Aug 4, 2018

do you know how to implement all that?

Of course not, but it should be doable. @rainers might want to help out too. ;)

I'd suggest that this PR should be merged for now to restore CTFE, and that complex patch you describe can be a follow up...

Yeah, we need a short-term solution, but I should add the copy and so restore the proper vptr. With this in its current state, if we have a C++ base class A and two derived extern(C++) classes B : A and C : B, the vptr is only restored after the super call in the B ctor (to the B vtable), and never restored to the C vtable...

@TurkeyMan
Copy link
Contributor

but I should add the copy and so restore the proper vptr. With this in its current state[...]

Yes, exactly.

@kinke
Copy link
Contributor Author

kinke commented Aug 5, 2018

Oh man, just when I thought I had it (works fine here with LDC on Win64), the next problem arises - just like for dtors, there appear to be multiple C++ ctors as well (C[1-3], e.g., dedicated base ctors which gcc calls in the more derived ctor, see SO). The added test was supposed to handle the edge case above (C++ class deriving from D one, which leads to the linking error) and the case of 2 D classes below a C++ one.

@TurkeyMan
Copy link
Contributor

Do you think it's wise to test/prove that most-derived vptr is present like this, since the result of our conversation above, was that we're gonna change it to strictly conform with C++ semantics anyway throughout the stack? :)

@kinke
Copy link
Contributor Author

kinke commented Aug 6, 2018

The test has shown 2 Posix issues (extra ctors & C++ classes not constructible with D new). I added a comment wrt. possibly revised semantics (then we'll simply adapt the 2 1 spot and have the test for that).

@kinke kinke force-pushed the fix19134 branch 2 times, most recently from 14c4361 to ca2d895 Compare August 6, 2018 01:29

auto restoreVptr = new AssignExp(loc, vptr.syntaxCopy(), new VarExp(loc, vptrTmpDecl));

Expression e = new CommaExp(loc, declareTmps, new CommaExp(loc, restoreVptr, new VarExp(loc, superTmpDecl)));
Copy link
Contributor Author

@kinke kinke Aug 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using Expression.combine(a, b, c, d) here, creating equivalent ((a, b), (c, d)), all hell breaks loose, requiring additional expressionSemantic() calls; combine(lhs, rhs) additionally sets the new CommaExp.type to the rhs type, which seems to make expressionSemantic() not descend to both children (which then leads to forward-referencing errors later on...).

auto pte = new DotIdExp(loc, new ThisExp(loc), Id.__vptr);
auto ate = new AssignExp(loc, pte, ase);
auto vptr = new DotIdExp(loc, new ThisExp(loc), Id.__vptr);
auto vptrTmpDecl = copyToTemp(0, "__vptrTmp", vptr);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tmp names are uniqued anyway, so no need for the local superid counter (and there should be a single super() call only).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.


auto declareTmps = new CommaExp(loc, declareVptrTmp, declareSuperTmp);

auto restoreVptr = new AssignExp(loc, vptr.syntaxCopy(), new VarExp(loc, vptrTmpDecl));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if vptr.syntaxCopy() is required or whether vptr could be reused directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better safe than sorry, using the same expression instance twice might cause issues if the AST gets modified by lowerings.

kinke added a commit to kinke/ldc that referenced this pull request Aug 6, 2018
kinke added a commit to ldc-developers/dmd-testsuite that referenced this pull request Aug 6, 2018
kinke added a commit to ldc-developers/dmd-testsuite that referenced this pull request Aug 6, 2018
kinke added a commit to kinke/ldc that referenced this pull request Aug 6, 2018
sprintf(buf.ptr, "__super%d", superid++);
auto tmp = copyToTemp(0, buf.ptr, result);
// so we have to restore it, but still return 'this' from super() call:
// (auto __vptrTmp = this.__vptr, auto __superTmp = super()), (this.__vptr = __vptrTmp, __superTmp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed some comments, but is saving and restoring the vptr good enough? If the class instance is created in C++, the vptr might be garbage. Granted, other fields might not be initialized, too, but that can be solved in user code.

Can the CTFE issue be solved by preassigning some appropriate type to one of the expressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's fine for now and reverts to previous D vptr semantics in extern(C++) ctors. IIRC, I didn't find anything about this breaking change mentioned in the changelog.

but that can be solved in user code.

I don't see a good way of doing that (and don't want to burden the user with these details), and so suggested #8533 (comment) for fixing this preblit-dependency properly in the long run. [The implicit default C++ ctor would be useful for extern(C++) structs too btw.]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of initializing the fields explicitly with some foreach over C.tupleof, but that is actually putting the burden on the user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's fine for now and reverts to previous D vptr semantics in extern(C++) ctors.

Ok, allocating a class defined in D from C++ is still trouble anyway, and solving this for classes that have super() calls into C++-code is a very special case, only.


// TODO: Allocating + constructing a C++ class with the D GC is not
// supported on Posix. The returned pointer (probably from C++ ctor)
// seems to be an offset and not the actual object address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some platforms don't return the this pointer from the C++ constructor, see scopeAllocCpp above for a workaround.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already fixed this for LDC, by simply ignoring the returned pointer, see ldc-developers/ldc@25d7a79.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess dmd should do the same.


// TODO: Allocating + constructing a C++ class with the D GC is not
// supported on Posix. The returned pointer (probably from C++ ctor)
// seems to be an offset and not the actual object address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess dmd should do the same.


auto declareTmps = new CommaExp(loc, declareVptrTmp, declareSuperTmp);

auto restoreVptr = new AssignExp(loc, vptr.syntaxCopy(), new VarExp(loc, vptrTmpDecl));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better safe than sorry, using the same expression instance twice might cause issues if the AST gets modified by lowerings.

auto pte = new DotIdExp(loc, new ThisExp(loc), Id.__vptr);
auto ate = new AssignExp(loc, pte, ase);
auto vptr = new DotIdExp(loc, new ThisExp(loc), Id.__vptr);
auto vptrTmpDecl = copyToTemp(0, "__vptrTmp", vptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

sprintf(buf.ptr, "__super%d", superid++);
auto tmp = copyToTemp(0, buf.ptr, result);
// so we have to restore it, but still return 'this' from super() call:
// (auto __vptrTmp = this.__vptr, auto __superTmp = super()), (this.__vptr = __vptrTmp, __superTmp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of initializing the fields explicitly with some foreach over C.tupleof, but that is actually putting the burden on the user.

sprintf(buf.ptr, "__super%d", superid++);
auto tmp = copyToTemp(0, buf.ptr, result);
// so we have to restore it, but still return 'this' from super() call:
// (auto __vptrTmp = this.__vptr, auto __superTmp = super()), (this.__vptr = __vptrTmp, __superTmp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's fine for now and reverts to previous D vptr semantics in extern(C++) ctors.

Ok, allocating a class defined in D from C++ is still trouble anyway, and solving this for classes that have super() calls into C++-code is a very special case, only.

@kinke kinke force-pushed the fix19134 branch 2 times, most recently from b0dca8e to 44d70c2 Compare August 23, 2018 20:45
And revert to D vptr semantics in constructors of extern(C++) classes,
i.e., call the most derived override in base constructors. 2.081
introduced a breaking change here, using the most derived override
before the super() call and the current class' afterwards.

The proper mid-term solution is probably to emit fully C++-compatible
constructors for extern(C++) classes, calling the super ctor right at
the start and then blitting vptr and class-specific fields from the
init symbol. extern(C++) classes (and structs) should get an implicit
default ctor (only if there's no other ctor in the class case).
extern(C++) classes could then be allocated and properly constructed via
C++ `new`; struct declarations in C++ headers for extern(C++) structs
wouldn't need to duplicate the field initializers and could simply
declare the default ctor (emitted by the D compiler and blitting the
init symbol).
@kinke
Copy link
Contributor Author

kinke commented Aug 23, 2018

Rebased to retrigger the tests, squashed and commit msg added, incl. a sketch for a proper solution. - This should make it into 2.082 IMO, it's a regression fix after all.

@TurkeyMan
Copy link
Contributor

You're super awesome for wrangling this beast!! Thanks again so much!
LGTM!

@dlang-bot dlang-bot merged commit 50833f0 into dlang:stable Aug 24, 2018
@kinke kinke deleted the fix19134 branch August 24, 2018 09:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants