Question: status of Go version of Textmapper? #6

mewmew · 2018-10-11T11:33:31Z

Hi Evgeny,

I just came across Textmapper, and having read the Language Reference and the motivation behind the project, it seems to be exactly what I was looking for: essentially an LR version of ANTLR for Go. I can tell that you have a lot of experience in this domain, as the architecture is well thought out. I still have to dive deep and examine the minute details of the implementation, but my initial reaction to Textmapper is very positive!

Now, of course, I'd like to take tm out for a spin! However, looking at the implementation of tm-go/cmd/textmapper/generate.go, I noticed a TODO in the generate function.

I noticed that you recently ported Tarjan's algorithm for detecting strongly connected components (in rev 78fc54e). My question is, how far is the Go version of Textmapper from being ready for use?

I'd love to try it out!

Cheerful regards,
Robin


inspirer · 2018-10-11T19:20:04Z

The Go version is very far from being complete. I think it will take me two more months to finish porting the lexer generator from Java, and then another two quarters for the parser generator. It is not that it is much work per se but rather my lack of time between work and family. I'm committed though. The main thing I want to get from this rewrite is the support of declarative (and transparent) nonterminal inlining, which should become the main tool in resolving grammar ambiguities. I'm also looking into better compression for generated tables. The compression scheme Textmapper currently uses is the same as in Bison, and it does not scale well to large templated grammars. The problem of generating performant static hash maps seems very interesting to me but I don't want to do this in Java.

Meanwhile, use the Java version. It is stable and generates very performant code. On real-world languages, generated parsers in Go gave me ~100-230MB/sec of lexing throughput and 20-60MB/sec of parsing throughput. It gets slightly better with each Go release, mostly because of improved register allocation within the Go compiler.

I will refresh the documentation in the upcoming weeks to better cover Textmapper advanced features, such as templates, grammar lookaheads, token sets, error recovery best practices, and the arrow notation for producing ASTs.

mewmew · 2018-10-12T00:05:47Z

Thanks a lot for the writeup! It's good to know roughly what stage the Go port is at, what your plans are for future releases, and in particular that you are committed to it!

Performance was actually why I started looking at Textmapper. The intention is to evaluate using Textmapper for parsing LLVM IR assembly, and thus switch from using Gocc to Textmapper in the upcoming release of https://github.com/llir/llvm.

There is still quite a bit to do, but I'd say about 80% of the grammar has been ported from Gocc to Textmapper https://github.com/mewmew/l-tm/blob/master/parser/ll.tm

There are still production actions to write, and that will take the other 80% of the project :)

Once more, thanks for releasing Textmapper to the public!

Cheers,
Robin

mewmew · 2018-10-13T01:11:03Z

> There is still quite a bit to do, but I'd say about 80% of the grammar has been ported from Gocc to Textmapper https://github.com/mewmew/l-tm/blob/master/parser/ll.tm

The port is now done. And the performance looks very promising.

> On real-world languages, generated parsers in Go gave me ~100-230MB/sec of lexing throughput and 20-60MB/sec of parsing throughput.

I can validate this claim, as I get a parsing throughput of roughly 45 MB/s. I have not yet written the semantic actions for constructing the AST though, so I hope that won't bring the performance down too much.

Extract from mewspring/mewmew-l#6 (comment):

> Parsing 1,733,842 lines and 135 MB of LLVM IR assembly, as contained in the 107 source files at decomp/testdata took ~3 seconds; thus ~30ms was used per file, or ~45 MB/s.

mewmew · 2018-10-14T19:41:52Z

Just a note: the more I use Textmapper, the more remarkable I think it is. Evgeny, what you have managed to do is quite an achievement! I've never before come across a parser generator where the grammar ends up being as readable as the one in Textmapper. I'm quite amazed at how well the LLVM IR grammar seems to have turned out.

Simply wanted to extend a thank you!

Hats off and with respect.
Robin

inspirer · 2019-02-09T22:28:53Z

Thanks for the kind words, Robin!

A quick update from me: the Go version reached feature parity with its Java counterpart in lexer generation. It produces byte-for-byte identical output for most grammars, and I'm now working on porting the parser generator. I believe I'm past the midpoint of the rewrite.

mewmew · 2019-02-09T23:03:31Z

> A quick update from me: the Go version reached feature parity with its Java counterpart in lexer generation. It produces byte-for-byte identical output for most grammars, and I'm now working on porting the parser generator. I believe I'm past the midpoint of the rewrite.

That is really wonderful to hear! Thanks for the update.

Wish you the best of springs and happy coding ahead :)

tmm1 · 2021-08-02T23:44:53Z

I'm trying to start using the golang textmapper, and I'm not sure if there's a feature missing or I'm doing something wrong.

I started by simply trying to regenerate the simple parser, but the parser.go and listener.go files are not being generated:

$ cd tm-go/parsers/simple
$ rm *.go
$ ../../cmd/textmapper/textmapper generate simple.tm
$ git status
Changes not staged for commit:
	deleted:    listener.go
	deleted:    parser.go

What am I missing?

EDIT: I found an example with the correct commands here: https://github.com/llir/grammar/blob/5291534192d972964c2745b7c18ac47208dc6be5/Makefile#L5-L7

inspirer · 2023-08-27T22:30:15Z

Textmapper is fully rewritten in Go.

Run go install github.com/inspirer/textmapper/cmd/textmapper@latest to install it locally.

In most cases the rewrite is a drop-in replacement for the Java version, but there are a few places where the new tool produces slightly different output (mostly in identifiers) or is stricter about grammar errors. Expect the following differences:

- similar names in the grammar (capitalization, camel vs snake case, etc.) cause a grammar compilation error to avoid confusion and actual compilation errors down the road
- declarative lookaheads are properly checked to be mutually exclusive (the previous implementation was too lenient)
- unused patterns get reported
- syntax sugar is processed in a slightly different order, which in some cases produces a different output
- (label? -> Foo) is now correctly reported as an empty node when 'label' is missing. Rewrite it as (label -> Foo)?.
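
To illustrate the first point (similar names), here is a rough sketch of how such a check could work. It is not Textmapper's actual implementation: the canonical and collisions functions and the normalization rule (lowercase, underscores dropped) are assumptions made for this example. The sample names 'noUnwind' and 'nounwind' are taken from the linked issue #70.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonical is a hypothetical normalization: it drops underscores and
// case so that camelCase, snake_case and lowercase spellings collide.
func canonical(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, "_", ""))
}

// collisions returns groups of distinct names that share a canonical form.
func collisions(names []string) [][]string {
	groups := map[string][]string{}
	seen := map[string]bool{}
	for _, n := range names {
		if seen[n] {
			continue // ignore exact duplicates
		}
		seen[n] = true
		key := canonical(n)
		groups[key] = append(groups[key], n)
	}
	var out [][]string
	for _, g := range groups {
		if len(g) > 1 {
			sort.Strings(g)
			out = append(out, g)
		}
	}
	// Sort the groups so the result is deterministic.
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out
}

func main() {
	names := []string{"noUnwind", "nounwind", "no_unwind", "label", "Foo"}
	for _, g := range collisions(names) {
		fmt.Printf("similar names: %s\n", strings.Join(g, ", "))
	}
}
```

Reporting the whole group at once, rather than one pair at a time, makes it easier to decide which spelling to rename.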

There is a new flag --compat which tries to reduce the variation in the generated code between the versions.

Important: the new version uses https://pkg.go.dev/text/template as the templating language. If you override any templates in your grammar, you'll have to update them. With the --compat flag, Textmapper tries to translate old templates into new ones, but this breaks down pretty quickly on advanced grammars.
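
For readers who have not used text/template before, here is a small standalone sketch of its syntax. The template text, the tokenData type, and the render function are invented for illustration and do not correspond to any template that ships with Textmapper; only the text/template calls themselves are real standard library API.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// tokenData is a hypothetical data model, used only for this example.
type tokenData struct {
	Package string
	Tokens  []string
}

// tokenTmpl shows field access, range with index, and whitespace trimming ({{- ...}}).
const tokenTmpl = `package {{.Package}}

const (
{{- range $i, $name := .Tokens}}
	{{$name}} = {{$i}}
{{- end}}
)
`

// render parses the template and executes it against data.
func render(data tokenData) (string, error) {
	t, err := template.New("tokens").Parse(tokenTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(tokenData{Package: "simple", Tokens: []string{"EOI", "IDENT", "INT"}})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

Overridden templates from the Java version would need to be rewritten in this style; the exact variables available inside Textmapper's templates are not covered here.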

As the first step during the migration, run textmapper generate --diff --compat to see any new errors and the difference in generated code (compares generated code vs the on-disk state).

Bonus: a new grammar option optimizeTables = true speeds up large grammars by 30-80%.

I've successfully migrated dozens of grammars recently and the new version is handling them well. Please let me know if you run into any issues.

mewmew mentioned this issue Oct 12, 2018

Parsing is slow relative to LLVM mewspring/mewmew-l#6

Closed

mewmew referenced this issue Jun 17, 2019

Compute final parser states.

96177df

mewmew mentioned this issue Dec 2, 2021

cmd/go: track tool dependencies in go.mod golang/go#48429

Open

inspirer closed this as completed Aug 27, 2023

mewmew mentioned this issue Aug 28, 2023

Go version of textmapper: 'noUnwind' and 'nounwind' get the same ID in generated code (potential workaround?) #70

Open
