Rewriter needs to account for "dead code" #2938

retailcoder · 2017-03-30T14:39:55Z

As of recently, all module-rewriting operations are carried through a TokenStreamRewriter. That's a huge leap forward and greatly improves undo-ability and overall module-rewriting accuracy. However, there's a problem that wasn't addressed:

TokenStreamRewriter works off the lexer token stream, and that stream is "cleansed" so that the parser can work around a number of limitations:

Line numbers are seen as WS tokens
Precompiler blocks that aren't "live" are seen as WS tokens

Therefore, piping all module changes to TSRs has also introduced a nasty side-effect: we're turning the user's modules into how Rubberduck sees them. And that doesn't include line numbers and "dead" precompiler code paths.
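(Editorial sketch, not part of the original post.) The side-effect can be reproduced in a few lines of Python. Everything here is a hypothetical stand-in: `cleanse` only mimics the idea of blanking line numbers and non-live lines before lexing, and the caller supplies `live_lines` instead of evaluating the `#If` condition. It is not Rubberduck's actual preprocessor.

```python
import re

SOURCE = (
    '#If Win64 Then\n'
    '10 Debug.Print "64-bit"\n'
    '#Else\n'
    '20 Debug.Print "32-bit"\n'
    '#End If\n'
)

def cleanse(source, live_lines):
    """Blank out line numbers and every line not in live_lines (hypothetical helper).

    Characters are replaced by spaces one-for-one, so offsets stay stable.
    """
    out = []
    for index, line in enumerate(source.splitlines(keepends=True)):
        body = line.rstrip("\r\n")
        newline = line[len(body):]
        if index in live_lines:
            # Live line: only the leading line number becomes whitespace.
            body = re.sub(r"^\d+", lambda m: " " * len(m.group()), body)
        else:
            # Directive or dead branch: the whole line becomes whitespace.
            body = " " * len(body)
        out.append(body + newline)
    return "".join(out)

# Assume the first branch is the live one.
CLEANSED = cleanse(SOURCE, live_lines={1})
```

Anything that writes the module back from `CLEANSED` keeps only the live `Debug.Print` statement; the line number, the directives and the `#Else` branch are gone from the user's code.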

We can't have all precompiler code paths in the token streams, and despite all our efforts anything that tries to get line numbers picked up by a parser rule breaks the grammar.

We can't release anything until we've found a solution to this problem - how do we go about it?


retailcoder · 2017-03-30T14:53:51Z

One possibility would be, as soon as we have the token stream for a module (well, after it's given to the parser for processing, I suppose), to get its rewriter and immediately replace the WS tokens we introduced with the line numbers and dead preprocessor code paths that we stripped from the module content string.
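(Editorial sketch, not part of the original comment.) A rough model of that idea, assuming the cleansing step replaced characters one-for-one with whitespace so that token offsets still line up with the original module string. `Token`, `tokenize` and `restore` are hypothetical names, not ANTLR or Rubberduck APIs.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    start: int   # offset of the token in the module string
    text: str
    is_ws: bool

def tokenize(cleansed):
    """Split the cleansed text into alternating whitespace / non-whitespace runs."""
    return [
        Token(m.start(), m.group(), m.group().isspace())
        for m in re.finditer(r"\s+|\S+", cleansed)
    ]

def restore(tokens, original):
    """Emit the tokens, but fill every whitespace token from the original text."""
    return "".join(
        original[t.start:t.start + len(t.text)] if t.is_ws else t.text
        for t in tokens
    )

original = "10 x = 1\n"
cleansed = "   x = 1\n"          # line number blanked before lexing
tokens = tokenize(cleansed)
tokens[-2].text = "2"            # a rewrite applied to a live token
```

With these inputs, `restore(tokens, original)` gives back `"10 x = 2\n"`: the edit to the live token survives and the line number comes back from the original string.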

Hosch250 · 2017-03-30T16:14:40Z

I'll take a crack at getting line numbers in the grammar. Can you provide different use cases I would need to account for?

MDoerner · 2017-03-30T18:59:28Z

Might we be able to solve the problem with the preprocessing directives by sending the parts to ignore to another channel instead of replacing them? Obviously, that would need some external variables in the lexer.

retailcoder · 2017-03-30T20:16:59Z

@MDoerner that's absolutely worth considering! ...except, wouldn't it mean we need to embed the preprocessing logic into the lexer itself?

comintern · 2017-03-31T02:53:54Z

@Hosch250 - When I was looking at adding line numbers and labels to the grammar, I determined that it shouldn't be horribly difficult but it will probably require a bit of work. I think the key is to add something like an endOfLine token - basically any newline that doesn't have a line continuation. The main problem is that ANTLR uses greedy matching, so any lineNumber/label token would need to either back-reference to an endOfLine or be explicitly endOfLine (lineNumber | lineLabel). Like I said, shouldn't be difficult - just requires a sh!tload of adjustments to other contexts to make it work.
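(Editorial sketch, not part of the original comment.) The "endOfLine, then lineNumber or lineLabel" idea can be approximated outside a grammar with two regular expressions. This is a simplified model under stated assumptions: a line continuation is taken to be a space and underscore right before the newline, a label is an identifier followed by a colon, and `logical_lines` and `split_prefix` are hypothetical helpers.

```python
import re

# A newline preceded by " _" continues the logical line instead of ending it.
LINE_CONTINUATION = re.compile(r" _\r?\n")

# Optional line number or label at the very start of a logical line.
LINE_START = re.compile(
    r"^(?:(?P<number>\d+)|(?P<label>[A-Za-z]\w*):)?\s*(?P<rest>.*)$"
)

def logical_lines(source):
    """Join continued lines, then split on the remaining (real) ends of line."""
    return LINE_CONTINUATION.sub(" ", source).splitlines()

def split_prefix(line):
    """Return (line number or label, remainder) for one logical line."""
    m = LINE_START.match(line)
    return (m.group("number") or m.group("label"), m.group("rest"))

source = "10 x = 1 + _\n20\nDone: Exit Sub\ny = 2\n"
parsed = [split_prefix(line) for line in logical_lines(source)]
```

The `20` that follows the continuation is not treated as a line number, because no real end of line precedes it; only `10` and `Done` are picked up as prefixes.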

Re preprocessor directives, would it work if we just sent that token stream to a different ANTLR channel?

MDoerner · 2017-04-28T18:40:57Z

@retailcoder I think we can put them on another channel outside the lexer. The CommonToken has a settable field for the channel. So I think we can rewrite the channel during preprocessing.
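(Editorial sketch, not part of the original comment.) A toy model of rewriting the channel during preprocessing instead of replacing text. The `Token` class is a hypothetical stand-in for a token with a settable channel; the constants mirror ANTLR's default channel (0) and hidden channel (1). Which tokens count as dead is assumed to be decided elsewhere.

```python
from dataclasses import dataclass

DEFAULT_CHANNEL = 0   # what the parser consumes
HIDDEN_CHANNEL = 1    # kept in the stream, skipped by the parser

@dataclass
class Token:
    text: str
    channel: int = DEFAULT_CHANNEL

def hide(tokens, dead_indexes):
    """Move dead tokens to the hidden channel without touching their text."""
    for index in dead_indexes:
        tokens[index].channel = HIDDEN_CHANNEL

def parser_view(tokens):
    """Only default-channel tokens reach the parser."""
    return [t.text for t in tokens if t.channel == DEFAULT_CHANNEL]

def full_text(tokens):
    """A rewriter working on the whole stream still sees every token."""
    return "".join(t.text for t in tokens)

tokens = [
    Token("#If Win64 Then\n"),
    Token("x = 1\n"),
    Token("#Else\n"),
    Token("x = 2\n"),
    Token("#End If\n"),
]
original = full_text(tokens)
hide(tokens, dead_indexes=[0, 2, 3, 4])   # assume the first branch is live
```

After `hide`, `parser_view(tokens)` contains only `"x = 1\n"`, while `full_text(tokens)` still equals `original`, so nothing the user wrote is lost when the module is written back.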

retailcoder added the antlr, critical, discussion, and enhancement labels Mar 30, 2017

MDoerner added a commit to MDoerner/Rubberduck that referenced this issue May 5, 2017

Added a test covering issue rubberduck-vba#2938

56ac3da

MDoerner added a commit to MDoerner/Rubberduck that referenced this issue May 5, 2017

Added a test covering issue rubberduck-vba#2938

3b22c0a

MDoerner mentioned this issue May 7, 2017

Preprocessing Without Changing the Code #3001

Merged

retailcoder assigned MDoerner May 10, 2017

MDoerner mentioned this issue May 11, 2017

Line Numbers #3014

Merged

MDoerner closed this as completed May 13, 2017
