Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement visitor for dsymbol and type (need for #2074) #2356

Closed
wants to merge 1 commit into from

Conversation

IgorStepanov
Copy link
Contributor

Please pull it faster if possible, because it have many dependences.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 17, 2013

Not needed for #2074 (as far as I can see none of that pull uses this code). But a baseline for C++<->D in DDMD @yebblies ?

@WalterBright
Copy link
Member

dmd already uses a different style of visitor pattern, see apply.c and sapply.c. I think we should stick to the same pattern.

@don-clugston-sociomantic
Copy link
Contributor

As far as I can tell, there are no non-trivial functions in this. Unlike expression and statement visitors, which have dozens of special cases. So there's a mountain of code here, but as far as I can tell, it isn't adding any real value?

@yebblies
Copy link
Member

@WalterBright That style has problems, as it requires casting and only allows one pattern of iteration. We need something flexible enough to implement any kind of visitor.

I'm hoping to replace the various visitor schemes with one, and this would be my preference. toDt, toCtype, toCBuffer, toElem, mangling etc can all be done with the proper visitor pattern without too much pain. This would allow the 'visit' functions to be removed from the class declarations, removing the need for backend types to be referenced at all in the ast headers. This also allows adding new passes without modifying the ast classes.

@donc There is no actual usage in here, it's just the base visitor.

I discussed this with Iain at dconf and it will help the other backends - they can add their own passes without modifying the frontend. It will also allow us to easily keep the one-pass-per-file organisation of dmd when it's ported to D.

I have used this pattern/implementation before in compiler frontends and have found it works very well.

@don-clugston-sociomantic
Copy link
Contributor

@yebblies I know this is only the base class, but it looks as though it is a no-op, there is no traversal involved, so there is actually no visiting. No visitor class is required. It's really just a virtual function call.

The few hundred lines of boilerplate code rings some loud alarm bells. I know it's what GoF does, but GoF has been heavily criticized for that code. From what I saw in a brief look at LLVM, they use a visitor like the one in apply.c.

Note that I also have a statement visitor in CTFE, which is fairly well tested now. Unfortunately I found that Walter's statement visitor was just not good enough. The GoF approach makes more sense there I think, because every kind of statement is very different, -- none of it is boilerplate.

@yebblies
Copy link
Member

@donc
I understand that this is overkill for visiting declarations, but my hope is to use the same pattern for everything: declarations, expressions, statements and types. Being consistent has some value, which I think offsets the overkill.

@andralex
Copy link
Member

This is a naive rendering of the GoF pattern; much better can be done in C++ with the Acyclic Visitor pattern that uses dynamic_cast (http://www.objectmentor.com/resources/articles/acv.pdf; generic implementation at http://loki-lib.sourceforge.net/html/a00685.html). Macros could be added to further reduce boilerplate, but that would need to get synchronized with @yebblies. One possibility is to just wait until we port to D (@yebblies any ETA?) and then implement visitor adequately.

I'll overstep my boundaries and close this. There is no way we should accept this ungodly amount of boilerplate and duplication.

@andralex andralex closed this Jul 18, 2013
@ibuclaw
Copy link
Member

ibuclaw commented Jul 18, 2013

@andralex not sure how yebblie's work is getting along, though I am about halfway done on cleaning up gdc-specific changes in the frontend. :o)

@IgorStepanov
Copy link
Contributor Author

@andralex Visitor need for C++ mangling, C++ mangling need for D port, D port need for Visitor. Catch 22? =)
Another trouble: this visitor will be used to communicate D and C++ parts of compilers and we should use common subset of C++ and D in visitor implementation. Does D allow mapping C++ classes with multiply inheritance (if classes without state as D interfaces).
I was thought about Acyclic Visitor or/and some generic solution when I generated tons of similar code, but this solutions used deep C++ features which can't mapped to D. Let's wait @yebblies word, but I suggest apply this naive implementation, and after moving to D refactor It with modern D techniques of generic programming (I think, portion of mixins will save us).

@andralex
Copy link
Member

The way I see it we plan to bring the dog to eat the cat so then we kill the dog. Something is off about this. We just should not bring this kind of duplication by the wheelbarrow.

I also think you're dismissing Acyclic Visitor in D rather hastily. The only C++ feature used in it is dynamic_cast, which D does have.

@IgorStepanov
Copy link
Contributor Author

The only C++ feature used in it is dynamic_cast, which D does have.

In Acyclic Visitor dynamic_cast used with multiply inheritance.

This is C++ code:

class Visitor
{
    virtual ~Visitor(){}
};

class TypeFunctionVisitor
{
    virtual void visit(TypeFunction&) = 0;
};

class TypeArrayVisitor
{
    virtual void visit(TypeArray&) = 0;
};

class MyMangler: public Visitor, public TypeFunctionVisitor, public TypeArrayVisitor
{
   void visit(TypeFunction&);
   void visit(TypeArray&);
};

D code, which accept visitor

//interface to C++
extern(C++)
interface Visitor
{
};
extern(C++)
interface TypeFunctionVisitor
{
    void visit(TypeFunction&);
};
extern(C++)
interface TypeArrayVisitor
{
    void visit(TypeArray&);
};

//D code
class TypeFunction: Type
{
   //We pass here instance of MyMangler:
   void acceptVisitor(Visitor v)
   {
       if(auto fv = cast(TypeFunctionVisitor)v) //HOW?
       {
           fv.visit(this);
       }
       else
       {
           //fail
       }
   } 
}

Is D cast know about C++ class layout and does cast can perform casting over C++ class hierarchy? And if answer - no, can we implement this functionality? Is it possible? What restrictions?

@ibuclaw
Copy link
Member

ibuclaw commented Jul 18, 2013

@andralex Visitor need for C++ mangling, C++ mangling need for D port, D port need for Visitor. Catch 22? =)

You don't need a visitor for C++ mangling, C++ mangling is done just fine as it is in cppmangle.c (ergo, there is no catch here). Though it looks like you would like one, so what benefits would this bring?

@IgorStepanov
Copy link
Contributor Author

You don't need a visitor for C++ mangling, C++ mangling is done just fine as it is in cppmangle.c. (Thus, there is no catch here). Though it looks like you would like one, so what benefits would this bring?

Ok, how you suggest to add visual c++ mangling? Add another one method to all Types? Like toCPPVSMangle?. Yes, we can create mangler without visitor, but this solution will be matters worse than this not best visitor.

@yebblies
Copy link
Member

I also think you're dismissing Acyclic Visitor in D rather hastily. The only C++ feature used in it is dynamic_cast, which D does have.

dynamic_cast won't work across the boundary between D and C++, not ever. This is the most important place too, so we can implement the glue layer(s) without modifying the frontend. I'm not seeing how acyclic visitor is going to reduce boilerplate either, it looks like it will significantly increase it with all those new visitor classes.

Macros are doable if they are not mixed in with other code (then the whole module can be replaced) or if they can be automatically translated to function calls.

eg. if visitor.c consisted of

VISIT(Type)
VISIT(TypeBasic)
VISIT(TypeNext)

I can replace the whole file with a mixin-based d version.
What does not work so well is requiring macro usage in the method bodies where the actual code lives.

One possibility is to just wait until we port to D (@yebblies any ETA?)

If I can't get some kind of movement on #2215, never.

On the other hand, the first 71 compilable tests are now passing. (with the backend stubbed out)

@WalterBright
Copy link
Member

There are really only two manglers necessary - Windows and Unix. There may be minor differences between 32 and 64 bit mangling on Windows, but that shouldn't merit a whole new scheme.

@IgorStepanov
Copy link
Contributor Author

There are really only two manglers necessary - Windows and Unix. There may be minor differences between 32 and 64 bit mangling on Windows, but that shouldn't merit a whole new scheme.

@WalterBright Yes I can implement mangling without visitor, but visitor anyway need for us and implementation with visitor is more clear and understandable.
BTW: I've asked on forum but you are not answer me: why dmc and visual c++ mangling of templates are so different? It is expectable or it is bug?

@andralex Please continue discussion:) Non-relative with mangling: visitor need for ddmd port and we should make it. How we should it is a subject of our discussion.
At first lets define requirements for visitor:
Visited hierarchy will be implemented in D (all those Type, Dsymbol, Expression et c.)
Concrete visitors can be implemented on D and on C++: visitor is planned as main mechanism of C++ <-> D communication but it can be used from D to implement some semantic analysis mechanism.

Now about implementation. You suggest implement Acyclic Visitor. It implementation uses dynamic casting between types which relation unknown at compile time. Eg:

interface A{}
interface B{}

A getA()
{
   class C: A, B{}
   return new C;
}

B objB = cast(B)getA();

Compiler replace this casting with special druntime call.
This druntime call cast interface A to Object, and use .classinfo of this object to check availability of casting to B. For more info see rt/cast_.d in druntime. This code use d-specific class: TypeInfo_Class which generated for each D class or interface.
However C++ classes have not classinfo. C++ have C++-specific typeinfo and C++ dynamic_cast use it for perform this casting. Now question for c++-guru: is C++ typeinfo is standardised mechanism which can be used from druntime (may be with some D compiler changes). If answer is "now" - we must forget about Acyclic Visitor. However in this case we can improve readability of code with using macros instead copy-paste.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 19, 2013

@andralex Please continue discussion:) Non-relative with mangling: visitor need for ddmd port and we should make it. How we should it is a subject of our discussion.
At first lets define requirements for visitor:
Visited hierarchy will be implemented in D (all those Type, Dsymbol, Expression et c.)
Concrete visitors can be implemented on D and on C++: visitor is planned as main mechanism of C++ <-> D communication but it can be used from D to implement some semantic analysis mechanism.

Why is a visitor required for ddmd? We (myself, andrei and yebblies) already discussed this at dconf, I'm just making sure that you understand it's purpose.

Now about implementation. You suggest implement Acyclic Visitor. It implementation uses dynamic casting between types which relation unknown at compile time. Eg:

These'll be extern(C++) interfaces/classes... assuming that we need extern(C++) class support...

@IgorStepanov
Copy link
Contributor Author

Why is a visitor required for ddmd? We (myself, andrei and yebblies) already discussed this at dconf, I'm just making sure that you understand it's purpose.

First task what I see - implementing of back-end independence writing to obj file. Now:

class Dsymbol
{
...
#ifdef IN_DMD
   void toDt(dt_t**);
#endif
#ifdef IN_LDC
   //something else
#endif
}

Need:

class Dsymbol
{
...
void acceptVisitor(Visitor*);
}

Back-end will pass visitors for all his task and front-end will be identically for all back-ends.
Am I right?:)

These'll be extern(C++) interfaces/classes... assuming that we need extern(C++) class support...

...And extern(C++) std.typeinfo for dynamic cast. I have no idea how do it properly now.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 19, 2013

Not exactly, partly because LDC's situation is a lot worse than what you describe, eg: Some classes have an added semantic3 pass, or extra members, which is all part of front-end, not back-end stuff.

Ignoring the hundreds of lines of frontend changes in source files done by gdc/ldc, and pretending that all changes in the headers really are for the backend, then this would be true.

Back-end will pass visitors for all his task and front-end will be identically for all back-ends.

Visitors are not there to solve all problems. Just a suggested common layer for dmd/gdc's toDt/toSymbol/toIR/toElem/toObjFile/toCtype/toCParamtype vs ldc's codegen/llvmDefine/toElem/toConstElem/toDt/toIR/toNakedIR.

Though I don't suppose anyone has really sat down and discussed what each party should do other than the brief discussion: "Hey, I'm working on getting code convergence between gdc/dmd. I'm wanting ldc guys to get on board so we can try to benefit each other in the process. One thing that will need to be addressed eventually are that gdc/dmd uses different types for backend/code generation routines" - "We could use a visitor interface to address that."

@IgorStepanov
Copy link
Contributor Author

@ibuclaw And your verdict: Should we implement visitor now or not? When I starting do #2074 I didn't plan to implement common visitor. I lowly call acceptVisitor as mangleX. After it @yebblies told that create and using in common visitor would be good.
If we need visitor now - lets implement it. goto discussion; :)

@yebblies
Copy link
Member

Yeah, we don't need this for mangling, but we do need this, and soon, if we want to have a unified frontend.

'apply' doesn't cut it, because you can't control the iteration order, and it is not typesafe.

Anything that needs dynamic_cast is out, that will not work across the language boundary.

Is it verbose? Yeah, it's pretty awful. Will we be able to replace the dozen visitor interafaces in dmd with one single interface!?! I think this will be a net win.

@WalterBright
Copy link
Member

why dmc and visual c++ mangling of templates are so different?

Because I implemented templates before Microsoft did.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 20, 2013

why dmc and visual c++ mangling of templates are so different?

Because I implemented templates before Microsoft did.

I would have thought that template mangling doesn't matter for C++ because they always end up in the executable, rather than say referencing a template from a shared library - this is of course assuming you follow good practices and put templates in headers... otherwise you'll run into other problems that don't relate at all to mangling.

@IgorStepanov
Copy link
Contributor Author

Because I implemented templates before Microsoft did++

Ah. Ok=)

I would have thought that template mangling doesn't matter for C++ because they always end up in the executable, rather than say referencing a template from a shared library - this is of course assuming you follow good practices and put templates in headers... otherwise you'll run into other problems that don't relate at all to mangling.

You can define template struct in C++, and port this template in D.
After that you can pass instances of this struct to extern(C++) functions. Typical example: QtD and Qt containers.
Some qt functions and methods can take QVector parameter. You can implement QVector in D (body of QVector is accessibly in header file). If D version have same layout with C++ version you can pass this object to C++ and receive it from C++. Of course, maintaining of consistency is a your (developers of QtD in my example) trouble.

Is it verbose? Yeah, it's pretty awful. Will we be able to replace the dozen visitor interafaces in dmd with one single interface!?! I think this will be a net win.

And what I need to do now? Do I need to modify this implementation of visitor or I should reimplement using some other pattern?

@don-clugston-sociomantic
Copy link
Contributor

Visitors are not there to solve all problems. Just a suggested common layer for dmd/gdc's toDt/toSymbol/toIR/toElem/toObjFile/toCtype/toCParamtype vs ldc's codegen/llvmDefine/toElem/toConstElem/toDt/toIR/toNakedIR.

What are the function signatures for those guys? Are they fundamentally different, or are the differences mostly superficial?

I just want to mention again that CTFE can be considered to be a kind of backend. It walks the tree in the same way as the DMD/GDC/LDC backends, and so it is a potential customer of any form of 'visitor' interface.

@ibuclaw
Copy link
Member

ibuclaw commented Jul 22, 2013

For GDC, it is all superficial, eg:

#ifdef IN_GCC
typedef union tree_node elem;
#else
struct elem;
#endif

elem *FuncExp::toElem(IRState *irs);

@yebblies
Copy link
Member

I'm reopening this because:

  • DDMD now passes most of the test suite, compiles druntime and phobos
  • We need a way to split the implementation across multiple files (that works in D)
  • This solution works in both C++ and D
  • It is usable across language boundaries

Acyclic visitor, along with anything else that requires dynamic cast, is out. Part of the compiler must be implemented in C++, and part in D, which means we cannot use D classes and D's rtti, or C++'s rtti.

@yebblies yebblies reopened this Jul 27, 2013
@AlexeyProkhin
Copy link
Contributor

My stab at implementing a type visitor that does not require acceptVisitor methods: https://gist.github.com/AlexeyProkhin/7bff68f700970e7280fb .

@yebblies
Copy link
Member

@AlexeyProkhin

https://gist.github.com/AlexeyProkhin/7bff68f700970e7280fb .

That feels wrong to me, but I can't think of any reason it wouldn't do the job.

@WalterBright @andralex Would this be acceptable?

@yebblies
Copy link
Member

@WalterBright @andralex Ping.

@IgorStepanov
Copy link
Contributor Author

Ping. Is this pull acceptable after all arguments? (I tell about general idea, not about minor details like replacing common code parts with macro).

https://gist.github.com/AlexeyProkhin/7bff68f700970e7280fb .

I dont see significant advantages of this way, but it has some disadvantages (like low performance and difficult generalization to other hierarhies: dsymbols, expressions et c.)

@MartinNowak
Copy link
Member

Please let's move forward with this, after all I'd like to see a D DMD at the next DConf.
Why do we need this? From my understanding the main reason is that DMD's organization of concern (semantic, toIR, toCBuffer) in different files can't currently be modeled in D.
We recently had the discussion about outlining methods DIP47 which would address this problem, but it wasn't too well received and might not come along soon.
Using the visitor patter would allow us to refactor the passes into separate classes.
It's an appealing solution because it simplifies to reuse DMD's AST for other purposes (GDC, LDC or other D tools).
The acyclic visitor pattern is clearly a nicer implementation because each visitor interface is declared next to the the visitee thus there is no import everything module.
@yebblies you raised the issue that dynamic cast probably wouldn't work across C++/D classes.
I'm not sure where this is needed, couldn't we simply wrap the C++ visitors in a D visitr or let the D visitor call some C++ functions?
Another issue is the amount/cost of traversal. I don't know how big the overhead is currently, but going from a single virtual call to double dispatch to dispatch + dynamic cast, which is O(N) in the number of base classes/interfaces, certainly takes more time.

@yebblies
Copy link
Member

@dawgfoto Dynamic cast is out completely. All classes involved will be extern(C++) D classes, so they won't have any D OR C++ rtti available. We can use the 'internal rtti' (Expression::op, Type::ty, etc) but this does not work for every set of ast classes...

Is double dispatch going to hurt performance that much? Is using the classic visitor pattern going to hurt maintainability? I really don't think so.

At this point we just need to move forward somehow. I'm open to anything that gets us going again and unblocks DDMD.

@MartinNowak
Copy link
Member

@dawgfoto Dynamic cast is out completely. All classes involved will be extern(C++) D classes, so they won't have any D OR C++ rtti available.

That's easy, define a virtual function in the Vistor interface that returns the needed ClassInfo and or additional metadata. Then any concrete visitor has to generate this info and you can use it to implement dynamic casting.

It's a bit convoluted but this is how far I got in 188 D LOC.
http://dpaste.dzfl.pl/1b61584c, https://gist.github.com/dawgfoto/6729906

@ibuclaw
Copy link
Member

ibuclaw commented Sep 27, 2013

Different calling conventions between D and C++? :)

@yebblies
Copy link
Member

... you're mixing in a virtual function into every single class... why not just mix in the 'accept' method and be done with it? Am I missing something?

@IgorStepanov
Copy link
Contributor Author

@dawgfoto Thanks for yout attention. About your example: What I need to do if I want to implement visitor in C++? Please show example with c++ code. And second question: why cyclic visitor is bad in our case? Hierarhy of compiler classes can't be increased too much and now recompile all sources after change in host hierarhy is not big cost IMO.

@ibuclaw
Copy link
Member

ibuclaw commented Sep 28, 2013

@IgorStepanov - I'm not sure anyone is asking you to do anything at the moment... yet.

@dawgfoto - since you added a new classinfo flag isCPPClass, maybe could roll that somehow into druntime's dynamic cast...

@MartinNowak
Copy link
Member

Different calling conventions between D and C++? :)

Well the order of arguments differs so you can't simply put one for the other in a vtable.

And second question: why cyclic visitor is bad in our case?

In the acyclic visitor pattern every visitee needs to know only two interfaces. The abstract visitor (VisitorBase) and the concrete visitor that belong to the visitee (Visitor!Visitee). The dynamic cross cast allows to get from the abstract to the concrete visitor without knowing anything else about the visitor. It's self contained and easily maintainable.

However in the GoF visitor pattern the visitor defines one method for every possible visitee, i.e. the visitor know them all. But at the same time every visitee needs to know the visitor ergo it also knows every other visitee. You can mitigate this with forward declarations and hidden implementations but if you'd do that with normal D module imports you arrive at whole program compilation. Another nasty thing, the visitor is kind of a central registry. You probably know these kind of codebases where you need to remember to put your class in some weird list in an unrelated file to make things work. That alone is an annoyance but leaking all visitees is a no-no.

... you're mixing in a virtual function into every single class... why not just mix in the 'accept' method and be done with it? Am I missing something?

I'm not sure I get what you're saying. Every visitee needs to override/implement the accept method. So far not surprising. Then a visitor needs to implement a visit method for every visitee he wants to visit, this is done by inheriting from a concrete visitor interface for each visitee and implementing it's visit method.

interface VisitorBase {}
interface Visitor(T) { void visit(T); }

class A : Visitable
{
  override void accept(VisitorBase v) { if(auto va = cast(Visitor!A)v) va.visit(this); }
}

class B : Visitable
{
  override void accept(VisitorBase v) { if(auto vb = cast(Visitor!B)v) vb.visit(this); } 
}

// dynamic cross casting from VisitorBase to Visitor!A or Visitor!B makes it work
class MyVisitor : VisitorBase, Visitor!A, Visitor!B
{
  void visit(A) {} // Visitor!A.visit
  void visit(B) {} // Visitor!B.visit
}

@dawgfoto - since you added a new classinfo flag isCPPClass, maybe could roll that somehow into druntime's dynamic cast...

From my understanding the problem is that we lack a path from extern(C++) classes to their ClassInfos.
And because this could be real C++ classes we can't use vtbl[-1], vtbl[-2] might work though ;).

@yebblies
Copy link
Member

@dawgfoto I don't see what that actually buys us over GoF visitor. And if it uses dynamic cast or MI, it will be very hard to get it working over the language boundary.

@WalterBright @andralex We need to do something if we want frontend unification. How can this be changed to make it acceptable?

@MartinNowak
Copy link
Member

I guess this can be closed now.
#2754

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants