New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Javadoc is not parsed correctly in files with only CR newlines #2329

Closed
robertwhitebit opened this Issue Oct 12, 2015 · 4 comments

Comments

@robertwhitebit

robertwhitebit commented Oct 12, 2015

Printing the javadoc tree by using the JavadocTreePrinter from the contribution repository for the checkstyle test file InputNewlineCrAtEndOfFile.java will print following tree:

|--TEXT[7x0] : [\r * Test case for detection of an existing CR newline at EOF, using the \r * NewlineAtEndOfFileCheck.\r * @author Martin Steiger\r *]
|--EOF[7x129] : [<EOF>]

If a file is using just CR line endings, newlines or tokens in the javadoc won't be detected.
Replacing all CR by LF in the file will print the correct javacdoc tree:

|--NEWLINE[7x0] : [\n]
|--LEADING_ASTERISK[8x0] : [ *]
|--TEXT[8x2] : [ Test case for detection of an existing CR newline at EOF, using the ]
|--NEWLINE[8x71] : [\n]
|--LEADING_ASTERISK[9x0] : [ *]
|--TEXT[9x2] : [ NewlineAtEndOfFileCheck.]
|--NEWLINE[9x27] : [\n]
|--LEADING_ASTERISK[10x0] : [ *]
|--WS[10x2] : [ ]
|--JAVADOC_TAG[10x3] : [@author Martin Steiger\n *]
    |--AUTHOR_LITERAL[10x3] : [@author]
    |--WS[10x10] : [ ]
    |--DESCRIPTION[10x11] : [Martin Steiger\n *]
        |--TEXT[10x11] : [Martin Steiger]
        |--NEWLINE[10x25] : [\n]
        |--LEADING_ASTERISK[11x0] : [ *]
|--EOF[11x2] : [<EOF>]

Replacing all CR by Windows line endings CR+LF also does work correctly:

|--NEWLINE[7x0] : [\r\n]
|--LEADING_ASTERISK[8x0] : [ *]
...

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Oct 12, 2015

Member

parsing should be the same no matter what line-ending is in file, the same as for java files.

@baratali , please comment on this issue

Member

romani commented Oct 12, 2015

parsing should be the same no matter what line-ending is in file, the same as for java files.

@baratali , please comment on this issue

@baratali

This comment has been minimized.

Show comment
Hide comment
@baratali

baratali Oct 13, 2015

Contributor

@romani , according to the Javadoc Antlr grammar a newline symbol is '\n' or '\r\n'
https://github.com/checkstyle/checkstyle/blob/master/src/main/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/JavadocLexer.g4#L62
So it is a bug. I can update grammar.

Contributor

baratali commented Oct 13, 2015

@romani , according to the Javadoc Antlr grammar a newline symbol is '\n' or '\r\n'
https://github.com/checkstyle/checkstyle/blob/master/src/main/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/JavadocLexer.g4#L62
So it is a bug. I can update grammar.

@rnveach rnveach added easy and removed easy labels Mar 2, 2017

@Vladlis Vladlis added the GSoC2017 label May 10, 2017

@Vladlis Vladlis added the javadoc label May 16, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue May 29, 2017

@Vladlis Vladlis moved this from To Do to In Progress in Javadoc style coverage and parser optimization May 29, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 5, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 8, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 14, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 14, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 15, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 15, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jun 25, 2017

romani added a commit that referenced this issue Jun 27, 2017

@romani romani added the bug label Jun 27, 2017

@romani romani added this to the 8.0 milestone Jun 27, 2017

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Jun 27, 2017

Member

fix is merged

Member

romani commented Jun 27, 2017

fix is merged

@romani romani closed this Jun 27, 2017

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Jun 27, 2017

Member

Fix was merged but line and column numbers are still incorrect because we rely on ANTLR and ANTLR doesn't and won't support \r line returns correctly by themselves.

Example:
https://github.com/checkstyle/checkstyle/blob/439d094e6ac95d31abc49eea71d2e934b9dde6a2/src/test/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/expectedJavadocWithCrAsNewlineAst.txt
All lines are line 0.

ANTLR issue: antlr/antlr4#1890

Member

rnveach commented Jun 27, 2017

Fix was merged but line and column numbers are still incorrect because we rely on ANTLR and ANTLR doesn't and won't support \r line returns correctly by themselves.

Example:
https://github.com/checkstyle/checkstyle/blob/439d094e6ac95d31abc49eea71d2e934b9dde6a2/src/test/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/expectedJavadocWithCrAsNewlineAst.txt
All lines are line 0.

ANTLR issue: antlr/antlr4#1890

@romani romani changed the title from Javadoc isn't parsed correctly in files with only CR newlines to Javadoc is not parsed correctly in files with only CR newlines Jul 2, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment