JDK-8307184: Incorrect/inconsistent specification and implementation for Elements.getDocComment #15062

Closed
wants to merge 12 commits into from

jddarcy
Member

jddarcy commented Jul 28, 2023

Start by just reformatting the existing specs to highlight subsequent spec changes.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8313343 to be approved

Issues

  • JDK-8307184: Incorrect/inconsistent specification and implementation for Elements.getDocComment (Bug - P3)
  • JDK-8313343: Incorrect/inconsistent specification and implementation for Elements.getDocComment (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15062/head:pull/15062
$ git checkout pull/15062

Update a local copy of the PR:
$ git checkout pull/15062
$ git pull https://git.openjdk.org/jdk.git pull/15062/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15062

View PR using the GUI difftool:
$ git pr show -t 15062

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15062.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jul 28, 2023

👋 Welcome back darcy! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 28, 2023
@openjdk

openjdk bot commented Jul 28, 2023

@jddarcy The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the compiler compiler-dev@openjdk.org label Jul 28, 2023
@jddarcy
Member Author

jddarcy commented Jul 28, 2023

/csr needed

@mlbridge

mlbridge bot commented Jul 28, 2023

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Jul 28, 2023
@openjdk

openjdk bot commented Jul 28, 2023

@jddarcy has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@jddarcy please create a CSR request for issue JDK-8307184 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jul 28, 2023
@jddarcy
Member Author

jddarcy commented Jul 28, 2023

Once we've settled on the spec changes, I'll populate the CSR accordingly.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 28, 2023
@@ -283,15 +283,32 @@ default Set<? extends ModuleElement> getAllModuleElements() {
* <p> A documentation comment of an element is a comment that
* begins with "{@code /**}", ends with a separate
* "<code>*&#47;</code>", and immediately precedes the element,
* ignoring white space. Therefore, a documentation comment
* ignoring white space and annotations and end-of-line-comments ({@code "//"} comments).
Contributor

(minor/grammar)
Instead of "A and B and C", consider using "A, B, and C".

* of the doc comment starting after the initial "{@code /**}",
* if the lines start with <em>zero</em> or more white space characters followed by
* <em>one</em> or more "{@code *}" characters,
* those leading white space characters are discarded as are any
Contributor

FWIW, I checked javac and it allows form-feed in the leading whitespace characters

As an adjective "white space" is normally a single word, at least in JDK.

Member Author

Hmm. java.lang.Character contains both "whitespace" and "white space" in its textual comments.

Contributor

Yes, the difference is typically "whitespace" (adjective) versus "white space" (noun). But no matter.
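
To make the rule in the spec fragment above concrete, here is a minimal sketch of the per-line prefix stripping it describes: optional leading white space followed by one or more "*" characters is discarded. This is not javac's implementation. The class and method names are hypothetical, treating space, tab, and form feed as the leading white space is an assumption based on the review note about form feed, and keeping a non-matching line unchanged is also an assumption because the quoted spec text is truncated.

```java
import java.util.stream.Collectors;

public class DocCommentPrefix {
    /** Strips optional leading white space plus one or more '*' from a single line. */
    static String stripLine(String line) {
        int i = 0;
        // Skip leading white space (assumed here: space, tab, form feed).
        while (i < line.length() && isLeadingWhitespace(line.charAt(i))) {
            i++;
        }
        int stars = i;
        // Skip the run of consecutive '*' characters, if any.
        while (stars < line.length() && line.charAt(stars) == '*') {
            stars++;
        }
        // Without at least one '*' the prefix does not match: keep the line as-is (assumption).
        return (stars > i) ? line.substring(stars) : line;
    }

    static boolean isLeadingWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\f';
    }

    /** Applies stripLine to every line and rejoins the lines with "\n". */
    static String stripAll(String body) {
        return body.lines()
                   .map(DocCommentPrefix::stripLine)
                   .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(stripAll("   * First line\n ** Second line\nNo prefix here"));
    }
}
```

Note that the space after the asterisks is kept in this sketch; only the white space before them and the asterisks themselves are dropped.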

Comment on lines 76 to 80
void stringDiffer(String actual, String expected) {
if (actual.length() != expected.length()) {
System.out.println("Strings have different lengths");
}
}
Contributor

Not critical, but for bonus points, you could identify the first line that is different.

Member Author

Do you know of an existing utility usable in the JDK that does this?
(I didn't want to write a utility like that for the purpose of this bug.)

Contributor

ToolBox.checkEqual
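
For readers without the test library at hand, the "first line that is different" idea from this thread could look roughly like the sketch below. It is a hypothetical stand-alone helper, not the code of ToolBox.checkEqual; the class name and method name are invented for illustration.

```java
import java.util.List;

public class FirstDifference {
    /**
     * Returns the 1-based number of the first line at which the two strings
     * differ, or -1 if they have the same lines.
     */
    static int firstDifferingLine(String actual, String expected) {
        List<String> a = actual.lines().toList();
        List<String> e = expected.lines().toList();
        int common = Math.min(a.size(), e.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(e.get(i))) {
                return i + 1;
            }
        }
        // All shared lines match; a difference remains only if one side has extra lines.
        return (a.size() == e.size()) ? -1 : common + 1;
    }

    public static void main(String[] args) {
        int line = firstDifferingLine("a\nb\nc", "a\nx\nc");
        System.out.println(line < 0 ? "Strings match" : "First difference at line " + line);
    }
}
```

Because the comparison goes through String.lines(), two strings that differ only in their line terminators are reported as matching.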

@jonathan-gibbons
Contributor

General feedback: while the text is good at describing when non-newline characters are removed, it is less good at describing the treatment of newlines ... are they line-terminators or line-separators; how do such characters in the result of getDocComment relate to the characters in the source file? For example, does getDocComment "copy" the newline characters found in the source file or are they always normalized to \n ?

* Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
* eiusmod tempor incididunt ut labore et dolore magna aliqua.
*/
@ExpectedComment( // Cannot used a text block here since leading spaces are removed
Contributor

You could use a text block, by carefully adjusting the indentation of the trailing """
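
As an illustration of that suggestion: in a text block, the closing delimiter's own line takes part in computing the incidental indentation, so placing it to the left of the content keeps the extra leading spaces. The sketch below uses hypothetical names and sample text and is not the test code from this PR.

```java
public class TextBlockIndent {
    // The closing delimiter sits four columns to the left of the content,
    // so four leading spaces survive on every content line.
    static final String INDENTED = """
            Lorem ipsum
            dolor sit amet
        """;

    // Here the closing delimiter is aligned with the content,
    // so all leading spaces are treated as incidental and removed.
    static final String FLUSH = """
        Lorem ipsum
        """;

    public static void main(String[] args) {
        System.out.print(INDENTED);
        System.out.print(FLUSH);
    }
}
```

Both strings end with a newline because the closing delimiter is on its own line.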

Contributor

@vicente-romero-oracle left a comment

looks good to me

@jddarcy
Member Author

jddarcy commented Aug 4, 2023

General feedback: while the text is good at describing when non-newline characters are removed, it is less good at describing the treatment of newlines ... are they line-terminators or line-separators; how do such characters in the result of getDocComment relate to the characters in the source file? For example, does getDocComment "copy" the newline characters found in the source file or are they always normalized to \n ?

Hmm. Does the precise handling of line terminators matter enough to specify?

@mlbridge

mlbridge bot commented Aug 5, 2023

Mailing list message from Alex Buckley on compiler-dev:

On 8/4/2023 1:33 PM, Joe Darcy wrote:

src/java.compiler/share/classes/javax/lang/model/util/Elements.java line 300:

298: * if the lines start with <em>zero</em> or more white space characters followed by
299: * <em>one</em> or more "{@code *}" characters,
300: * those leading white space characters are discarded as are any

FWIW, I checked `javac` and it allows form-feed in the leading whitespace characters

As an adjective "white space" is normally a single word, at least in JDK.

Hmm. java.lang.Character contains both "whitespace" and "white space" in its textual comments.

The JLS is careful to speak only of "white space" as a noun, never as an
adjective for "character". But I think that when the adjective is
needed, it's "whitespace". java.lang.Character follows this rule, e.g.,
"ISO control characters that are not whitespace" (i.e., are not
whitespace characters).

Alex

@jddarcy
Member Author

jddarcy commented Aug 6, 2023

General feedback: while the text is good at describing when non-newline characters are removed, it is less good at describing the treatment of newlines ... are they line-terminators or line-separators; how do such characters in the result of getDocComment relate to the characters in the source file? For example, does getDocComment "copy" the newline characters found in the source file or are they always normalized to \n ?

Hmm. Does the precise handling of line terminators matter enough to specify?

PS FWIW, the javac implementation does normalize line terminators, test case added.

@jonathan-gibbons
Contributor

General feedback: while the text is good at describing when non-newline characters are removed, it is less good at describing the treatment of newlines ... are they line-terminators or line-separators; how do such characters in the result of getDocComment relate to the characters in the source file? For example, does getDocComment "copy" the newline characters found in the source file or are they always normalized to \n ?

Hmm. Does the precise handling of line terminators matter enough to specify?

Medium yes. For anyone parsing the contents of a doc comment, it helps to know whether the terminators are "as in the source file" or "always \n". I guess if unspecified, users will have to assume the worst case (may be as in the source file) even if that is not typically the case.

System.out.println("Expected");
System.out.println(expectedCommentStr);
stringDiffer(actualComment, expectedCommentStr);
(new ToolBox()).checkEqual(expectedCommentStr.lines().toList(),
Contributor

:-)

@jonathan-gibbons
Contributor

General feedback: while the text is good at describing when non-newline characters are removed, it is less good at describing the treatment of newlines ... are they line-terminators or line-separators; how do such characters in the result of getDocComment relate to the characters in the source file? For example, does getDocComment "copy" the newline characters found in the source file or are they always normalized to \n ?

Hmm. Does the precise handling of line terminators matter enough to specify?

Medium yes. For anyone parsing the contents of a doc comment, it helps to know whether the terminators are "as in the source file" or "always \n". I guess if unspecified, users will have to assume the worst case (may be as in the source file) even if that is not typically the case.

I guess for anyone doing character-based parsing, handling line-ending sequences is no biggie. For folk doing line-based parsing String.lines() comes to mind ...
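
A small sketch of the two client-side options mentioned in this exchange, for code that does not want to depend on which terminators a returned comment contains. String.lines() splits on "\n", "\r", and "\r\n" alike; the explicit normalization shown next to it is one possible approach and is not taken from javac. Class and method names are hypothetical.

```java
import java.util.List;

public class LineTerminators {
    /** Line-based view: String.lines() accepts \n, \r, and \r\n as terminators. */
    static List<String> splitLines(String comment) {
        return comment.lines().toList();
    }

    /** Character-based view: rewrite \r\n and lone \r to \n before parsing. */
    static String normalize(String comment) {
        return comment.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String mixed = "first\r\nsecond\rthird\nfourth";
        System.out.println(splitLines(mixed));
        System.out.println(normalize(mixed).equals("first\nsecond\nthird\nfourth"));
    }
}
```

The order of the two replace calls matters: handling "\r\n" first avoids turning one Windows-style terminator into two newlines.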

@openjdk

openjdk bot commented Aug 9, 2023

@jddarcy This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8307184: Incorrect/inconsistent specification and implementation for Elements.getDocComment

Reviewed-by: vromero, jjg

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 43 new commits pushed to the master branch:

  • 593ba2f: 8313693: Introduce an internal utility for the Damerau–Levenshtein distance calculation
  • 360f65d: 8314022: Problem-list tests failing with jtreg 7.3
  • 0eb0997: 8288936: Wrong lock ordering writing G1HeapRegionTypeChange JFR event
  • 19ae62a: 8311170: Simplify and modernize equals and hashCode in security area
  • e9f751a: 8311247: Some cpp files are compiled with -std:c11 flag
  • 213d3c4: 8313891: JFR: Incorrect exception message for RecordedObject::getInt
  • 0e2c72d: 8313796: AsyncGetCallTrace crash on unreadable interpreter method pointer
  • 52ec4bc: 8303056: Improve support for Unicode characters and digits in JavaDoc search
  • 9cf12bb: 8313922: Remove unused WorkerPolicy::_debug_perturbation
  • 6e3cc13: 8312467: relax the builddir check in make/autoconf/basic.m4
  • ... and 33 more: https://git.openjdk.org/jdk/compare/90d795abf10bf8b8b53079c1afd19fee7b4cb6cf...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed csr Pull request needs approved CSR before integration labels Aug 9, 2023
@jddarcy
Member Author

jddarcy commented Aug 9, 2023

/integrate

@openjdk

openjdk bot commented Aug 9, 2023

Going to push as commit c307391.
Since your change was applied there have been 43 commits pushed to the master branch:

  • 593ba2f: 8313693: Introduce an internal utility for the Damerau–Levenshtein distance calculation
  • 360f65d: 8314022: Problem-list tests failing with jtreg 7.3
  • 0eb0997: 8288936: Wrong lock ordering writing G1HeapRegionTypeChange JFR event
  • 19ae62a: 8311170: Simplify and modernize equals and hashCode in security area
  • e9f751a: 8311247: Some cpp files are compiled with -std:c11 flag
  • 213d3c4: 8313891: JFR: Incorrect exception message for RecordedObject::getInt
  • 0e2c72d: 8313796: AsyncGetCallTrace crash on unreadable interpreter method pointer
  • 52ec4bc: 8303056: Improve support for Unicode characters and digits in JavaDoc search
  • 9cf12bb: 8313922: Remove unused WorkerPolicy::_debug_perturbation
  • 6e3cc13: 8312467: relax the builddir check in make/autoconf/basic.m4
  • ... and 33 more: https://git.openjdk.org/jdk/compare/90d795abf10bf8b8b53079c1afd19fee7b4cb6cf...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 9, 2023
@openjdk openjdk bot closed this Aug 9, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 9, 2023
@openjdk

openjdk bot commented Aug 9, 2023

@jddarcy Pushed as commit c307391.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@jddarcy jddarcy deleted the JDK-8307184 branch October 17, 2024 16:52
Labels
compiler compiler-dev@openjdk.org integrated Pull request has been integrated
3 participants