Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 20607 - Module constructors are visible as regular function #10834

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Geod24
Copy link
Member

@Geod24 Geod24 commented Feb 25, 2020

Contain two small refactorings, and an actual fix.

Side note: __traits(allMembers) is quite inconsistent: It shows the ctors / dtors, some hidden symbols like _Dmain and the object import, but not __invariant1 (when used on Bar) or __unittest_L126_C1. And honestly I can see more reasons to call the unittest function directly than to call the [shared] static [~]this...

@dlang-bot
Copy link
Contributor

dlang-bot commented Feb 25, 2020

Thanks for your pull request, @Geod24!

Bugzilla references

Auto-close Bugzilla Severity Description
20607 normal [shared] static constructor & co can be called as regular function

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#10834"

@Geod24
Copy link
Member Author

Geod24 commented Feb 25, 2020

@atilaneves : Looks like you guys rely on being able to do mixin(__traits(identifier, __traits(parent, {}))); to get the unittest. Why is that ? If you use __traits(parent, ...) you already get a reference to the symbol. I understand this is probably a synthetic use case, but interested to know if you require the ability to access the unittest using the identifier in unit-threaded (or anywhere else) ?

@@ -3733,12 +3735,14 @@ extern (C++) class StaticDtorDeclaration : FuncDeclaration

extern (D) this(const ref Loc loc, const ref Loc endloc, StorageClass stc)
{
super(loc, endloc, Identifier.generateIdWithLoc("_staticDtor", loc), STC.static_ | stc, null);
this(loc, endloc, "_staticDtor", stc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens now with multiple module constructors in the same file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing changes, because it forwards the call to the other ctor overload, which still uses generateIdWithLoc

@@ -7,6 +7,6 @@ unittest // this first unittest is needed to trigger the bug

unittest // second unittest
{
auto a = &mixin(__traits(identifier, __traits(parent, { })));
auto a = __traits(identifier, __traits(parent, { }));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't even remotely the same. Before it was a function pointer, now it's a string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have added a comment that this was just to see if the other test passed. Indeed that change shouldn't go in as-is.

@atilaneves
Copy link
Contributor

atilaneves commented Feb 25, 2020

@atilaneves : Looks like you guys rely on being able to do mixin(__traits(identifier, __traits(parent, {}))); to get the unittest. Why is that ? If you use __traits(parent, ...) you already get a reference to the symbol. I understand this is probably a synthetic use case, but interested to know if you require the ability to access the unittest using the identifier in unit-threaded (or anywhere else) ?

The reason was presented in the issue itself:

https://issues.dlang.org/show_bug.cgi?id=18097

I wrote the test so as to make sure I wouldn't break code that users had written that had previously worked when I change the unittest naming scheme. The test is now broken after this PR however.

@Geod24
Copy link
Member Author

Geod24 commented Feb 25, 2020

I wrote the test so as to make sure I wouldn't break code that users had writtern that had previously worked when I change the unittest naming scheme. The test is now broken after this PR however.

True. I was more interested in the reason one would use &mixin(__traits(identifier, __traits(parent, {}))) over __traits(parent, {}) to get a function pointer to the unittest. I would imagine that this comes from a mixin that is used / injected into unittest ? My main concern is to know if there's a use case that can only be served by using the identifier.

@JohanEngelen : Can you tell us more about the reasoning for this ? Perhaps some code example ? And would you be able to replace your reliance on identifier.

In any case, it is clear that this would break code, so can't go in as is.
I will make it a deprecation, so that no code is broken. But I'd still like to know the rationale to be sure there's a replacement.

@JohanEngelen
Copy link
Contributor

If &__traits(parent, {}) also works to get the address of the unittest function then I think we are happy. I'll need to test and report back later.

@JohanEngelen
Copy link
Contributor

If &__traits(parent, {}) also works to get the address of the unittest function then I think we are happy. I'll need to test and report back later.

Looks like it worked; unittest functions are no longer used by identifier by Weka.

@Geod24
Copy link
Member Author

Geod24 commented Mar 2, 2020

This is somehow taking longer than expected, as there's no really good way to do it.
Also found https://issues.dlang.org/show_bug.cgi?id=20626 in the process...

@Geod24 Geod24 force-pushed the ctor-visibility branch 2 times, most recently from 809f7c4 to 1289ff1 Compare March 2, 2020 19:23
@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Mar 2, 2020

FYI, I don’t think this is the only issue. I think basically all compiler generated symbols have the same issue. Another typical example is unnamed function parameters.

@Geod24
Copy link
Member Author

Geod24 commented Mar 3, 2020

Yep, it's quite a big undertaking... So I decided to restrict this PR to disabling [shared] static {c,d}tors access, because they are the only ones I can think of that actually break the type system. Need to fix dmangle.d tho.

@Geod24
Copy link
Member Author

Geod24 commented Mar 3, 2020

So apparently we have some module ctors in druntime which have extern(C) linkage, which means they end up with a very predictable and conflict-prone name in the executable:

$ nm /usr/local/lib/libphobos2.a | grep -E '__[sS].*tatic.*_L.*_C'
0000000000000784 S __sharedStaticDtor_L479_C1
00000000000002a8 S __sharedStaticDtor_L479_C1.eh
00000000000003c4 S __staticDtor_L376_C1
00000000000001a8 S __staticDtor_L376_C1.eh

Yeah that's pretty broken... I guess we should also deprecate this 🤦‍♂

@Geod24
Copy link
Member Author

Geod24 commented Mar 3, 2020

Blocked on #10858

@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Mar 3, 2020

If you want to be really thorough you should make the symbols have, what's called in LLVM (at least), private linkage [1]. But I'm not sure if that backend supports that kind of linkage.

[1] https://github.com/dlang/dmd/blob/master/docs/objective-c_abi.md#private-linkage

@Geod24
Copy link
Member Author

Geod24 commented Mar 3, 2020

My original approach was to avoid putting things in symtab for Prot.Kind.none.
However visible symbols like unit tests and invariant need to be deprecated first. The only way to do this was to do a check in search when a symbol was found, which is a rather heavy price to pay because of how often this function is used.

But then I found that some places in the compiler use the identifier to access the symbol and so would trigger the depreciation message. Not to mention the other bug with typeof.

So I will probably go to them one by one, make sure nothing uses the identifier, deprecate accessing the identifier, and then make it not accessible. Whether or not we use Prot.Kind.none or anonymous symbol is up for discussion. At the moment those symbols semantic will forward to funcDeclarationSemantic which override the linkage with the scope's linkage, so the anonymous way seemed simpler and more robust.

@Geod24 Geod24 added the Bug Fix label Mar 5, 2020
@Geod24 Geod24 force-pushed the ctor-visibility branch 2 times, most recently from 6916d5e to db23067 Compare March 6, 2020 03:20
@Geod24
Copy link
Member Author

Geod24 commented Mar 6, 2020

Alright, this fails because without an identifier, duplicate manglings are generated for childs of [shared] static [~]this. Obviously not what we want.

This is luckily very obvious, because the compiler generates a gate variable for those symbols when they are inside a template, to prevent them being called multiple time for the same type (most of the time)

Any way I have attempted to fix it, I either found bugs, special cases, or just bloat (hello static foreach). So reverted to the old symtab approach.

This breaks the type system pretty easily.
If the user *really* want to do something funky,
they can still declare those as `extern(C)` and there isn't
much we can (or will) do about it.

But having those functions available in the symtab
means they show up in `impHint` and can be called by
accident (e.g. a bug in code generation that use `__allMembers`).

There are many other instances in DMD where this happens,
however we currently rely on magic identifiers too much
to fix everything.
The test case contains a few other instances which we'll
hopefully gradually fix.
@MoonlightSentinel
Copy link
Contributor

This would fix the windows debug info generation:

diff --git a/src/dmd/tocvdebug.d b/src/dmd/tocvdebug.d
index 57aaca5e0..b9aea9622 100644
--- a/src/dmd/tocvdebug.d
+++ b/src/dmd/tocvdebug.d
@@ -73,6 +73,7 @@ uint PROTtoATTR(Prot.Kind prot) pure nothrow @safe @nogc
 
     final switch (prot)
     {
+        case Prot.Kind.none: // No explicit mapping, treat as private
         case Prot.Kind.private_:       attribute = 1;  break;
         case Prot.Kind.package_:       attribute = 2;  break;
         case Prot.Kind.protected_:     attribute = 2;  break;
@@ -80,7 +81,6 @@ uint PROTtoATTR(Prot.Kind prot) pure nothrow @safe @nogc
         case Prot.Kind.export_:        attribute = 3;  break;
 
         case Prot.Kind.undefined:
-        case Prot.Kind.none:
             //printf("prot = %d\n", prot);
             assert(0);
     }

@Geod24
Copy link
Member Author

Geod24 commented Jul 2, 2020

Ah thanks for the info! I need to put some time into this. I tried a bit last week, but as you can see from the history of comments, the issue is non-trivial... Starting with something simpler (like _value in out contracts) might be the best path at this point.

Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the issue, I think this PR is moving in the wrong direction. Introspection in D and access to compiler/runtime internals is one of D's strengths as a system programming language with strong metaprogramming. Instead, it would be better to fix the problem and only the problem, without introducing breaking changes that affect the ecosystem:

  1. Ensure that the type system breaking functionality is not accessible is @safe code
  2. Explicitly label non-function module members as not functions

Removing them from the allMembers list is the wrong thing to do. They are members like any other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants