Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Solve most spurious forward/circular referencing errors #7018

Closed
wants to merge 33 commits into from

Conversation

Syniurge
Copy link
Contributor

@Syniurge Syniurge commented Jul 24, 2017

This is the implementation of the solution discussed here. This PR:

  • splits the semantic() functions into the part where the symtab of a ScopeDsymbol gets determined, and the rest. Although instead of adding another function that part is added to importAll() since its reason of being has been if I'm not mistaken to get the ScopeDsymbol ready for search().
  • makes overrides of Dsymbol.importAll callable multiple times recursively each time resuming from where the previous ongoing call left off.
  • (todo) renames importAll to determineMembers to clarify its new role.
  • makes TemplateInstance.semantic3 call semantic2, and TemplateInstance.semantic2 call semantic if needed. Every split semantic function also calls importAll so no additional importAll call is needed alongside existing semantic calls.

However the re-ordering of determineMembers/semantic() calls broke existing code. The reason is that in some cases, deferring lookups cannot be avoided, which I didn't realize initially. So a large part of this PR is also to implement deferring (instead of error gagging). (will write more details soon).

This addresses the two issues from the forum thread:

https://issues.dlang.org/show_bug.cgi?id=17656
https://issues.dlang.org/show_bug.cgi?id=17194

and probably a lot more cases.

Even @tgehr's code now compiles!:

enum x = "enum xx = q{int y = 123;};";

struct SS{
    mixin(xx);
    mixin(x);
}

void main()
{
    import std.stdio : writeln;
    SS s; writeln(s.y); // prints 123
}

To do:

  • fix the remaining breakages
  • special aggregate methods currently generated by clone.d should be added to .members during importAll(), and their body should be generated not during Struct/ClassDeclaration.semantic() but during FuncDeclaration.semantic()

Along the way I've spotted possible further improvements.

Further enhancements:

  • Merge addMember into determineMembers?

  • Is passing sc to semantic1.2.3 really useful once setScope has been called? (Was setScope historically an afterthought?) Passing sc only seems required if the symbol:

    • is declared inside a function through a DeclarationExp
    • is inside a typeof()
    • was added to a .members without a setScope call. Only possible case as far as I know: TemplateInstance appended to the members of the instantiating or root module.

     
    The scope has no reason to change and should stay the same between importAll and semantic1,2,3. Shouldn't setScope get called in these few special cases? Ensuring that every symbol about to be semantic'd has its scope set would also avoid having to re-create one or more Scope objects during AggregateDeclaration/TemplateInstance/AttribDeclaration... .semantic2,3. (However currently _scope is also used in several semantic() functions as a safeguard against recursive semantic calls, but I don't think it should be used that way, checking semanticRun would do too.)

  • Symbols could retrieve basic info from sc during setScope(), so __traits(getProtection/Linkage/...) may work before determineMembers.

@dlang-bot
Copy link
Contributor

dlang-bot commented Jul 24, 2017

Thanks for your pull request, @Syniurge! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Auto-close Bugzilla Description
17194 [scope] Fwd reference error with nested struct
17656 Enum member circular reference error

⚠️⚠️⚠️ Warnings ⚠️⚠️⚠️

@dlang-bot dlang-bot added the WIP Work In Progress - not ready for review or pulling label Jul 24, 2017
@Syniurge Syniurge changed the title [WIP] Solve most spurious forward/circular referencing errors [WIP] Fixes issues 17656, 17194 - Solve most spurious forward/circular referencing errors Jul 24, 2017
@Syniurge Syniurge changed the title [WIP] Fixes issues 17656, 17194 - Solve most spurious forward/circular referencing errors [WIP] Solve most spurious forward/circular referencing errors Jul 24, 2017
…to avoid calling .importAll() too soon.

setScope() was being run more than once on some symbols because of it.
@rainers
Copy link
Member

rainers commented Jul 24, 2017

Not sure how it compares to your changes, but I could compile Timon code with these changes: rainers@acbe99d, though that failed a couple of other tests.

One of the major issues was that an aggregate is marked as semantically analyzed, even if a (template-)mixin had forward reference errors. Instead the analysis needs to be deferred to be tried agin later.

@Syniurge
Copy link
Contributor Author

Syniurge commented Jul 24, 2017

@rainers are you sure you don't have more code beyond this commit that for example gags errors during semantic? I built your tg_frontend branch and the error is still here:

Error: undefined identifier 'xx', did you mean variable 'x'?

which was expected because it's still going through the following: SS.semantic -> mixin(xx).semantic -> sc.search('xx') -> undefined identifier 'xx'


edit: oh nevermind sorry, I didn't realize you were talking about tgehr/d-compiler.

@rainers
Copy link
Member

rainers commented Jul 25, 2017

oh nevermind sorry, I didn't realize you were talking about tgehr/d-compiler.

Sorry, it wasn't very clear in my message. It probably won't fix every error but the approach might work for the example above, too, but should indeed gag symbol lookup errors (and others) if a scope with unresolved mixins is searched.

@Syniurge
Copy link
Contributor Author

Syniurge commented Jul 26, 2017

but should indeed gag symbol lookup errors (and others) if a scope with unresolved mixins is searched

That would definitively help in many cases. but at the cost of having to defer and re-run semantic() a lot.

The fragility of the current approach can be attributed to unnecessary symbol dependencies created by semantic() calls. By trimming down these dependencies this PR should solve all referencing errors for not too contrived code without having to defer anything. However it still can't handle a symtab entry popping up like this:

struct S {
    mixin("enum E1 = \"882255\"; enum A = \"" ~ B ~ "\";");
    mixin("enum E2 = " ~ E1 ~ ";");
    mixin("enum B = \"yyiioo\";");
}

the_unsolved.d(3): Error: undefined identifier E1

For this I don't think there's any way other than gagging lookup errors, and deferring as long as there is more than one symbol being expanded in the same ScopeDsymbol. I plan on adding an intermediate semantic pass value between PASSmembers and PASSmembersdone to tackle this, but this should probably go into another PR.

…ning STCfield into importAll(), and make AggregateDeclaration.determineFields call VarDeclaration.importAll() instead of semantic().
@Syniurge
Copy link
Contributor Author

Syniurge commented Jul 29, 2017

So I've hit a wall. The following code (reduced from druntime/import/core/sys/posix/sys/types.d):

static if (__WORDSIZE == 64)
    enum __SIZEOF_PTHREAD_BARRIER_T = 32;
else
    enum __SIZEOF_PTHREAD_BARRIER_T = 20;

enum __WORDSIZE = 64;

struct pthread_barrier_t
{
    byte[__SIZEOF_PTHREAD_BARRIER_T] __size;
    // Error: undefined identifier __SIZEOF_PTHREAD_BARRIER_T
}

fails with this PR for a similar reason the previous mixin example fails => because pthread_barrier_t.importAll() gets called while evaluating the static if condition, so before __SIZEOF_PTHREAD_BARRIER_T becomes available.

And it worked before in vanilla because StaticIfDeclaration.semantic relied on addMember and doesn't invoke importAll. But this for example cannot be built with current DMD:

static if (__WORDSIZE == 64) // Error: undefined identifier __WORDSIZE
    enum __SIZEOF_PTHREAD_BARRIER_T = 32;
else
    enum __SIZEOF_PTHREAD_BARRIER_T = 20;

static if (true) {
    enum __WORDSIZE = 64;
}

Whereas it compiles if we re-order the static if. Hence this isn't ideal either.

For this PR I initially thought that an option would be to temporarily stick to the current behavior and make search() try a first lookup without invoking importAll(). Meaning that this would remain fragile, and the order of declarations would still matter. If for example the condition relies on a template instantiation, later importAll() (or semantic() in current DMD) calls may introduce more specialized templates, so the evaluation of the condition will be silently incorrect.

But in any case, because the order of importAll()/semantic() calls has changed with the rest of the PR, the test case importing core.sys.posix.sys.types still fails with this hack.

So I guess the only option is to go for lookup error gagging and deferring, and do it straightaway in this PR. It'll take more importAll/semantic code reorganization but we'll have a robust solution instead of a half one.

@@ -64,8 +64,6 @@ extern (C++) abstract class AggregateDeclaration : ScopeDsymbol
uint structsize; // size of struct
uint alignsize; // size of struct for alignment purposes
VarDeclarations fields; // VarDeclaration fields
Sizeok sizeok; // set when structsize contains valid data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to update c++ headers for all field and function signature changes once you are done.

@@ -143,31 +143,14 @@ struct Prot
}
}

enum PASS : int
enum SemState : uint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks downstream compilers written in c++, can't we just keep the same PASS enum, just simplify it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this break of a lot of GDC code?

semantic1/2/3() are getting split into smaller functions, that may be called separately. For each of these functions I need to store an independent SemState value. SemState could be renamed to PASS, but won't the rest of the changes be heavy for GDC anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PASSobj is used to determine if code has been generated for the declaration. And semantic may be requested in the declaration symbol pass.

Keeping the naming convention would be a noop for downstream, just adjusting the enum as necessary to reflect D definitions.

You mention that each semantic has an independent SemState, how can you determine from the outside which stage a given symbol is currently at?

@ibuclaw
Copy link
Member

ibuclaw commented Sep 4, 2017

You are probably also in contention with #7119 and other semantic refactorings.

@Syniurge
Copy link
Contributor Author

Syniurge commented Sep 4, 2017

Odd looking comments will be removed and line indentation cleaned up before you're done, right?

Yes this PR is not ready at all. During the past three weeks I got distracted by other stuff but I'm getting back to it.

You are probably also in contention with #7119 and other semantic refactorings.
However it looks like there will be some rebase/squashing done.

About that, @RazvanN7's refactoring of Dsymbol.semantic() having been merged (#7060), I'm not really sure how my changes should be adapted to the visitor pattern once I'll rebase to current master. Should I add one visitor per new semantic function i.e semanticInitializer, semanticBody, semanticType, determineMembers, etc.?

Just as a preference that no individual commit in a PR should leave the compiler broken.

I'll try avoiding that when rebasing but this will probably result in huge commits since it's a redesign of semantic()?

@RazvanN7
Copy link
Contributor

RazvanN7 commented Sep 4, 2017

About that, @RazvanN7's refactoring of Dsymbol.semantic() having been merged (#7060), I'm not really sure how my changes should be adapted to the visitor pattern once I'll rebase to current master. Should I add one visitor per new semantic function i.e semanticInitializer, semanticBody, semanticType, determineMembers, etc.?

Ideally, if you want to split the semantic methods, you don't have to create visitors for them, you could just add them as private methods inside the existing visitor. We can talk on specific examples if you want to.

@JinShil
Copy link
Contributor

JinShil commented Nov 26, 2017

May also address https://issues.dlang.org/show_bug.cgi?id=17267

@Syniurge Do you still plan on moving this forward?

@ibuclaw
Copy link
Member

ibuclaw commented Nov 26, 2017

The best way to take this forward is probably one commit at a time, split across many PRs so that each iteration has been checked and doesn't break anything.

Having this mega patch series kept open is fine in the meantime if the finished result works.

Has there been any discussion with @WalterBright about the intended design? There's been some recent refactorings being done in the semantic passes already, notably to remove the reliance on the scope variable to determine the semantic state.

@ibuclaw
Copy link
Member

ibuclaw commented Jan 5, 2018

I think I might grit my teeth here and say this is absolutely ridiculous and should be closed.

From a brief look at the individual commits, I don't think there's anything that could be salvaged from the scuppered wrecks.

Changes in this regard should be kept to a minimal.

Each commit should be buildable and pass the testsuite, even if that means setting up extra scaffolding with the intent to eventually take it down.

@wilzbach
Copy link
Member

wilzbach commented Jan 9, 2018

Changes in this regard should be kept to a minimal.

Each commit should be buildable and pass the testsuite, even if that means setting up extra scaffolding with the intent to eventually take it down.

@Syniurge what Ian tries to say is that you should submit small PRs that can be reviewed, checked and merged individually. +3,430 −4,352 is just too much for a single PR. Tiny steps are the road to victory, not "Large diffs are not rendered by default."

I will close this now, but please don't see this as a rejection in general. You can always resubmit this, but in small reviewable PRs with as little churn (unrelated changes) as possible.

@wilzbach wilzbach closed this Jan 9, 2018
@ibuclaw ibuclaw removed their assignment Jan 10, 2018
@Syniurge
Copy link
Contributor Author

No problem, I thought I'd have the time to work on this, rebase everything to clean the commits and catch up with master, but that fell short.

I'll be back in a few days to restart working on this, from scratch (current master). This time I'll try to split it into smaller PRs, by coming up with something that can fallback to semantic/semantic2/semantic3 so the changes are gradual instead of one big all-or-nothing reorganization.

@ibuclaw
Copy link
Member

ibuclaw commented Jan 10, 2018

Thanks 👍

Much appreciated

@ibuclaw
Copy link
Member

ibuclaw commented Jan 10, 2018

Also bear in mind that there are a couple other patches that seek to reduce complexity made by Kenji that I will give a look over later.

They very likely conflict with your changes to design also. Or they might just make some parts simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Fix Needs Rebase Needs Work WIP Work In Progress - not ready for review or pulling
Projects
None yet
7 participants