Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go target - Interfaces instead of ptr structs - Listeners and Visitors #1843

Open
millergarym opened this issue Apr 27, 2017 · 13 comments
Open

Comments

@millergarym
Copy link
Contributor

Hey Antlr/Golang people,
(@cyberfox, @pboyer, @parrt, @willfaught, @sharwell)

I'm posting this to propose and discussion a breaking change to Listener of the Go target.
It seems opportune, based on @cyberfox's work I have a PR on a visitor implementation.
Based on this I would like to propose changes to the Listener interfaces, using interfaces instead of pointer to structures.

eg

func (s *MyListener) EnterStart(ctx parser.IStartContext) {}

instead of

func (s *MyListener) EnterStart(ctx *parser.StartContext) {}

As noted on a number of occasions, the Antlr Go code is not idiomatic Go.
The major manifestations of this is large verses small interfaces, and trying to fit Java/C# "OO" into Go. To explain the "OO" comment, Go isn't objected-oriented in the same sense as Java; no implementation inheritance, only interface sub-typing, interface sub-typing is 'structural' vs Java's 'nominal'. Using interfaces and specifically small interfaces are used instead of "OO".

A specific place the current Go runtime is a problem is in re-using Listeners and Visitors. Imagine you have two grammars (A and B) that are very similar. B is a copy of A with one added and one modified rule. Currently it is not possible to implement a BListener that delegates to AListener (without making new A-based structures for every call). Using interfaces the BListener just needs to do a type assertion (cast) in order to delegate to AListener methods.

One of the issues swapping to interface is that the current generated parser does not create interfaces for Alt contexts, only for rule contexts. Implementing Alt context interfaces required changes to StructDecl.java & ListenerFile.java.

The changes proposed are in
https://github.com/wxio/antlr4/tree/visit-runtimeimport-tokenstream
#1841

An example using these can be seen at
https://github.com/wxio/antlr4-go-examples/tree/interface-discussion/scratch
note: the base listener and base visitor are now empty, only contains an example implementations in comments.

Currently the A - B use-case from above only works for named Alt, but it can be made to work for un-alt-named rules.

Cheers
Gary

@cyberfox
Copy link

I have some concerns about the interface approach, mainly around the inability to implement only a subset of the visitor endpoints, for interpreter development.

I also think that the visitors returning interface{} is a very valuable aspect, and should be retained, as it simplifies the common case for expression interpretation.

The methods I'd added for ShouldVisitNextChild and AggregateResult were to provide feature parity with the Java code, as I can see those being useful in larger scale parsers. (Actually, I may use AggregateResult myself when handling comma-separated parameters to a method in the language I'm working on.) The type switch was a vastly better way of doing it than my cascading tests, though, thank you for that.

Isn't there a concern with the bParser.(A) cast approach that the aParser will call down into a child visitor method that bParser modified, but because Go doesn't support that, it'll call the A version instead of the B version? That was the problem that I attempted to work around with the SetSuper method (which is an awful name, I concede :) ).

@millergarym
Copy link
Contributor Author

@cyberfox

I have some concerns about the interface approach, mainly around the inability to implement only a subset of the visitor endpoints, for interpreter development.

You absolutely only have to implement a subset. The generated parser code does a type check before calling the listener or visitor. In the case of the visitor, VisitChildren is called if there is not implementation. the var _ TListener = &MyListener{} is not mandatory and is only used if you want the compiler to check that an interface is implemented. With small interfaces you could have a whole lot of var _ XContextVisttor = &MyListener{} for type checking.

I also think that the visitors returning interface{} is a very valuable aspect, and should be retained, as it simplifies the common case for expression interpretation.
... ShouldVisitNextChild and AggregateResult ...

The two methods removed are DefaultResult and AggregateResult. ShouldVisitNextChild is still there.

Can you give an example of of how default and aggregate work. I couldn't see how that could be put together to do anything. It is far easier to carry state around in the visitor struct. The delegator (vs "super") has the advantage of being able to swap the visitor during the iterator. This can be seen in the A-B example. If you can provide and example using the returned interface{}, I can see if it is awkward with my code.

Isn't there a concern with the bParser.(A) cast approach that the aParser will call down into a child visitor method that bParser modified, but because Go doesn't support that, it'll call the A version instead of the B version?

This is why my VisitChildren has an extra argument delegator.

That was the problem that I attempted to work around with the SetSuper method (which is an awful name, I concede :)

Yes, in Java land your "super" is actually the "child" class (the this implicit receiver), not the super. Actually implicitReceiver would be a much better name than super. Also without setsuper your code wouldn't do what was expected.

Have a look at https://github.com/wxio/antlr4/blob/visit-runtimeimport-tokenstream/doc/go-target.md the example should make things clear.

@millergarym
Copy link
Contributor Author

@cyberfox

Moving here from #1841

What should the Go interfaces look like?

Goals:

  • allow for moving all grammar processing into Visitors and Listeners
  • including complex processing. eg
    • lazy eval
    • arguments and returns semantics
    • context sensitive processing

The following discussion only relates to Visitors, but also applies to Listeners.

example grammar

start : a[false];

a[bool recursive] returns[Atom result]
  : b[recursive] { $result = ... }
  | ...
;

b[bool recursive] returns[Atom result] 
  : .. a[false] ...
  | c[true] c[false]*
;

c[bool First]
  : ...
;

Proposal 1 - Extra args

type ParseTree interface {
	SyntaxTree

	Accept(visitor ParseTreeVisitor, currentState interface{}) (result interface{})
	GetText() string

	ToStringTree([]string, Recognizer) string
}

type ParseTreeVisitor interface {
	VisitNext(next Tree, currentState interface{}) bool
	VisitTerminal(node TerminalNode, currentState interface{})  (result interface{})
	VisitErrorNode(node ErrorNode, currentState interface{})  (result interface{})
	VisitChildren(node RuleNode, delegate ParseTreeVisitor, currentState interface{})  (result interface{})
}

Proposal 2 - State in all tree nodes

type Tree interface {
	GetParent() Tree
	SetParent(Tree)
	GetPayload() interface{}
	GetChild(i int) Tree
	GetChildCount() int
	GetChildren() []Tree

       SetState(interface{})
       GetState()interface{}
}

type ParseTree interface {
	SyntaxTree

	Accept(visitor ParseTreeVisitor) (result interface{})
	GetText() string

	ToStringTree([]string, Recognizer) string
}

type ParseTreeVisitor interface {
	VisitNext(next Tree) bool
	VisitTerminal(node TerminalNode)  (result interface{})
	VisitErrorNode(node ErrorNode)  (result interface{})
	VisitChildren(node RuleNode, delegate ParseTreeVisitor)  (result interface{})
}

Proposal 3 - functional / var-args

Something more from functional programming? But what?
Maybe a hint can come from ...
https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis
https://commandcenter.blogspot.com.au/2014/01/self-referential-functions-and-design.html

Proposal 4 - Tree and Node

type Tree interface { .. }
type Node interface { 
       SetState(interface{})
       GetState()interface{}
}
type TreeNode interface {
   Tree
   Node
}
type ParseTree interface {
	SyntaxTree
        Node

	Accept(visitor ParseTreeVisitor) interface{}
	GetText() string

	ToStringTree([]string, Recognizer) string
}
type ParseTreeVisitor interface {
        Node

	VisitNext(next TreeNode) bool
	VisitTerminal(node TerminalNode) interface{}
	VisitErrorNode(node ErrorNode) interface{}
	VisitChildren(node RuleNode, delegate ParseTreeVisitor) interface{}
}

From PR #1807

// A complete Visitor for a parse tree produced by <file.parserName>.
BaseVisitorFile(file, header, namedActions) ::= <<
<fileHeader(file.grammarFileName, file.ANTLRVersion)>

<if(file.genPackage)>
package <file.genPackage> // <file.grammarName>
<else>
package parser // <file.grammarName>
<endif>

import "github.com/antlr/antlr4/runtime/Go/antlr"

type Base<file.grammarName>Visitor struct {
	*antlr.BaseParseTreeVisitor
	super <file.grammarName>Visitor
}

func (v *Base<file.grammarName>Visitor) SetSuper(newSuper <file.grammarName>Visitor) {
	v.super = newSuper
}

func (v *Base<file.grammarName>Visitor) DefaultResult() interface{} {
	return nil
}

func (v *Base<file.grammarName>Visitor) ShouldVisitNextChild(node antlr.RuleNode, resultSoFar interface{}) bool {
	return true
}

func (v *Base<file.grammarName>Visitor) AggregateResult(resultSoFar, childResult interface{}) interface{} {
	return childResult
}

func (v *Base<file.grammarName>Visitor) VisitChildren(node antlr.RuleNode) interface{} {

type Tree interface {
	GetParent() Tree
	SetParent(Tree)
	GetPayload() interface{}
	GetChild(i int) Tree
	GetChildCount() int
	GetChildren() []Tree
}

type SyntaxTree interface {
	Tree

	GetSourceInterval() *Interval
}

type ParseTree interface {
	SyntaxTree

	Accept(Visitor ParseTreeVisitor) interface{}
	GetText() string

	ToStringTree([]string, Recognizer) string
}

type RuleNode interface {
	ParseTree

	GetRuleContext() RuleContext
	GetBaseRuleContext() *BaseRuleContext
}

type TerminalNode interface {
	ParseTree

	GetSymbol() Token
}

type ErrorNode interface {
	TerminalNode

	errorNode()
}

type ParseTreeVisitor interface {
	Visit(tree ParseTree) interface{}
	VisitChildren(node RuleNode) interface{}
	VisitTerminal(node TerminalNode) interface{}
	VisitErrorNode(node ErrorNode) interface{}
}

type ParseTreeListener interface {
	VisitTerminal(node TerminalNode)
	VisitErrorNode(node ErrorNode)
	EnterEveryRule(ctx ParserRuleContext)
	ExitEveryRule(ctx ParserRuleContext)
}

@willfaught
Copy link
Contributor

One thing to keep in mind is that some users have experienced bad performance. Since this would use an interface where before it was just a pointer, it'd be great if we could verify with some benchmarks that there's no performance regression.

@cyberfox
Copy link

I've posted up the Expressions repo which gives an example of the use of partial implementation and return values, as well as a language-independent grammar file, and a sample (trivial) input.

I extracted it from working code, albeit with a few small tweaks. The original's evaluator has a resolver for variable lookup as well, but I didn't find it necessary for the example. I can add it back in as an example of (immutable during evaluation, in my case) data associated with the ExprVisitor if it's helpful.

(It's has the parser/lexer/visitor checked in, and is still using the awful type checking cascading ifs, but that's because I haven't applied the type switch to my repo yet.)

@millergarym
Copy link
Contributor Author

@willfaught me experiences optimising the Go code, specifically the antlr target indicates that interfaces really aren't a problem. If anything they are a solution, as interfaces are polymorphic, but structs aren't.

In the case of the Antlr target, the 12x improvement with the lexer and parser basically falls into two categories. 6x was a Java static being coded as a Go field instead of a package level var, there by undoing all atn caching. 2x was unnecessary memory allocation in the hashing code.

@millergarym
Copy link
Contributor Author

@cyberfox thanks for the example.

I need to take back what I said about calling Accept inside a visitor. I know GOF names the method Accept, but we we call it Visit? It make more sense to me (and is closer to the Java code). By the way the Smalltalk design patterns companion names it acceptVisit. I can see how mixing Visit and VisitChildren within the same visit method might even be useful for context sensitive visitations.

I would still like to be able to handle composing visitors. My A-B example is a simple case of composition. This is the one case the delegate implementation can handle that your "super" one can't.

I'll try find some time on the weekend to make some changes.

@millergarym
Copy link
Contributor Author

@davesisson
Copy link
Contributor

davesisson commented May 1, 2017 via email

@millergarym
Copy link
Contributor Author

There is simple iteration, and listener are generally fine for this. When it comes to complex iteration this is where visitors coming into play. The obvious reason for using a visitor is to control what gets visited. However there are inherently complex use-cases that visitors can help address with features like argument passing, return values and composing visitor. Visitors could even be generalised to support object algebras (see Extensibility for the Masses - aka Monads in OO).

Yes technically, the delegate can be dropped from the signature and placed in the the base, and the args can be replaced with a method. But this would significantly complicate complex visitations. It is far easier to maintain state in the call stack than external to it.

As for that Java version. It is quite feature poor, and imo not a particularly good exemplar of what a visitor api/implementation should look like. As a design option, there could be two visitor interfaces stateless/state-full and generate code to support both.

Currently moving some of my interpreters to use visitors, happy to share some grammars if you're interested.

@davesisson
Copy link
Contributor

davesisson commented May 2, 2017 via email

@millergarym
Copy link
Contributor Author

Java and that's already annoying

I generate commented out signatures in the base visitor file and just copy paste.

Why do we need to pass variable args in? Would a struct be sufficient enough?

Can you give an example, I can't see it.
The var args is so that manual Visit called doesn't need a nil.

storing context information about the tree (basically a symbol table)

A scoped symbol table requires a stack. This is trivial via args. Managing the stack external to the call stack is just a pain and unnecessary.

I have no great solution for tail recursion like A+A+A+A+A+A... a thousand times

I haven't thought it, but could simple continuation passing could be implemented in a VisitChildren or manually in a visitX method?

I've been playing with the interfaces and implementation. What do you think about the version below? I honestly think the delegate in every visit signature is preferable to putting it in the BaseParseTreeVisitor. It makes setup so much easier.

Note the current BaseParseTreeVisitor methods don't need a default receiver, therefore a nil one will do. No NewMyV() needed.

type ParseTree interface {
	SyntaxTree

	Visit(visitor ParseTreeVisitor, args ...interface{}) interface{}
	GetText() string

	ToStringTree([]string, Recognizer) string
}

type VisitNextCheck interface {
	VisitNext(next Tree, currentResult interface{}) bool
}
type VisitRestCheck interface {
	VisitRest(next RuleNode, currentResult interface{}) bool
}
type TerminalVisitor interface {
	VisitTerminal(node TerminalNode)
}
type ErrorNodeVisitor interface {
	VisitErrorNode(node ErrorNode)
}
type EnterEveryRuleVisitor interface {
	EnterEveryRule(ctx RuleNode)
}
type ExitEveryRuleVisitor interface {
	ExitEveryRule(ctx RuleNode)
}
type ParseTreeVisitor interface {
	VisitTerminal(node TerminalNode)
	VisitErrorNode(node ErrorNode)
	VisitChildren(node RuleNode, delegate ParseTreeVisitor, args ...interface{}) (result interface{})
}
type AggregateResultVisitor interface {
	AggregateResult(aggregate, nextResult interface{}) (result interface{})
}

// grammar Example;
// a : b ;
// b : 'c' # Y;

// in somefile.go
//go:generate java -jar path/antlr.jar -o parser -package parser -visitor Example.g4

// -- generate example_visitor.go
//
// type AContextVisitor interface {
// 	VisitA(ctx IAContext, delegate antlr.ParseTreeVisitor, args ...interface{}) (result interface{})
// }
// type YContextVisitor interface {
// 	VisitY(ctx IYContext, delegate antlr.ParseTreeVisitor, args ...interface{}) (result interface{})
// }

// -- implemented visitor
// type MyV struct {
//	*antlr.BaseParseTreeVisitor
//}
// func (v *MyV) VisitA(ctx parser.IAContext, delegate antlr.ParseTreeVisitor, args ...interface{}) (result interface{}) {
// 	return
// }
// func (v *MyV) VisitY(ctx parser.IYContext, delegate antlr.ParseTreeVisitor, args ...interface{}) (result interface{}) {
// 	return
// }

type BaseParseTreeVisitor struct{}

func (*BaseParseTreeVisitor) VisitTerminal(node TerminalNode) {}
func (*BaseParseTreeVisitor) VisitErrorNode(node ErrorNode)   {}

func (*BaseParseTreeVisitor) VisitChildren(node RuleNode, delegate ParseTreeVisitor, args ...interface{}) interface{} {
	next, isNextCk := delegate.(VisitNextCheck)
	rest, isRestCk := delegate.(VisitRestCheck)
	entryV, isEnterV := delegate.(EnterEveryRuleVisitor)
	exitV, isExitEV := delegate.(ExitEveryRuleVisitor)
	aggre, isAggre := delegate.(AggregateResultVisitor)
	var result interface{}
	for _, child := range node.GetChildren() {
		if isNextCk && !next.VisitNext(child, result) {
			continue
		}
		switch child := child.(type) {
		case TerminalNode:
			delegate.VisitTerminal(child)
		case ErrorNode:
			delegate.VisitErrorNode(child)
		case RuleNode:
			if isRestCk && !rest.VisitRest(child, result) {
				break
			}
			if isEnterV {
				entryV.EnterEveryRule(child)
			}
			r := child.Visit(delegate, args)
			if isExitEV {
				exitV.ExitEveryRule(child)
			}
			if isAggre {
				result = aggre.AggregateResult(result, r)
			} else {
				result = r
			}
		default:
			// can this happen??
		}
	}
	return result
}

@davesisson
Copy link
Contributor

I think I'm okay with this particular implementation. The implemented visitor example feels a lot like the Java implementation of visitors so that's a plus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants