New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] [cpd] Generalize ANTLR tokens preparing support for ANTLR token filter #1338

Merged
merged 7 commits into from Oct 15, 2018

Conversation

Projects
None yet
4 participants
@matifraga
Contributor

matifraga commented Sep 10, 2018

Please, prefix the PR title with the language it applies to within brackets, such as [java] or [apex]. If not specific to a language, you can use [core]

Before submitting a PR, please check that:

  • The PR is submitted against master. The PMD team will merge back to support branches as needed.
  • ./mvnw test passes.
  • ./mvnw pmd:check passes.
  • ./mvnw checkstyle:check passes. Check this for more info

PR Description:

Team Raptor again, we are submitting another PR preparing ANTLR CPD support for using the TokenFilter feature. This PR depends on #1309 to only see the changes we introduce here please refer to this diff

In order to provide full support for this feature we need to generalize our ANTLR Tokens, so we added GenericAntlrToken into the project. We have a concern here, we found out that GenericToken has a getEndLine method and we didn't quite get the use case of that method.

Another thing we are not proud about, is the need to hardcode the comment token on the tokenizer but we didn't want to add a dependency on the SwiftParser either. Any feedback about enhancing this will be appreciated.

@pmd-test

This comment has been minimized.

pmd-test commented Sep 10, 2018

1 Message
📖 This changeset introduces 0 new violations and 0 new errors,
removes 0 violations and 0 errors. Full report
Please check the regression diff report to make sure that everything is expected

Generated by 🚫 Danger

@jsotuyod

#1309 has been merged. If this gets rebased, auditing the PR would be much easier

public Object getNextToken() {
final Token token = lexer.nextToken();
if (isCommentToken(token.getText())) {
previousComment = new GenericAntlrToken(token, previousComment);

This comment has been minimized.

@jsotuyod

jsotuyod Sep 23, 2018

Member

this variable is never being reset. All comments end up chained even if not being one immediately after the other

}
private boolean isCommentToken(final String text) {
return commentToken != null && text != null && text.startsWith(commentToken);

This comment has been minimized.

@jsotuyod

jsotuyod Sep 23, 2018

Member

this is not gonna cut it. comments can have multiple forms (ie: single-line vs multi-line)

On antlr, the common approach is to use a channel for comments (normally HIDDEN), so you could just identify comments by which channel they come from.

An alternative would be to not use the raw lexer, but wrap it on BufferedTokenStream, which allows for getHiddenTokensToLeft(). (here hidden means, not on the main channel, do not confuse with actually on the HIDDEN channel)

This would be equivalent to what JavaCC's Token class assumes: comments are special tokens, not considered in the AST

tokenEntries.add(TokenEntry.getEOF());
}
}
private static final String COMMENT_TOKEN = "//";

This comment has been minimized.

@jsotuyod

jsotuyod Sep 23, 2018

Member

effectively, this ignores multiline comments

/**
* Generic Antlr representation of a token.
*/
public class GenericAntlrToken implements GenericToken {

This comment has been minimized.

@jsotuyod

jsotuyod Sep 23, 2018

Member

just AntlrToken would do

@jsotuyod jsotuyod added the is:WIP label Sep 25, 2018

import net.sourceforge.pmd.lang.ast.GenericToken;
import com.beust.jcommander.internal.Nullable;

This comment has been minimized.

@jsotuyod

jsotuyod Oct 15, 2018

Member

refrain from using this annotation, as it's internal from a dependency used just for the CLI

@jsotuyod jsotuyod removed the is:WIP label Oct 15, 2018

@jsotuyod jsotuyod added this to the 6.9.0 milestone Oct 15, 2018

@jsotuyod jsotuyod changed the title from [Core] [CPD] Generalize ANTLR tokens preparing support for ANTLR token filter to [core] [cpd] Generalize ANTLR tokens preparing support for ANTLR token filter Oct 15, 2018

@jsotuyod jsotuyod merged commit 921e004 into pmd:master Oct 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

jsotuyod added a commit that referenced this pull request Oct 15, 2018

@matifraga matifraga deleted the teamraptor:generic-token-extended branch Oct 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment