Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8027682: javac wrongly accepts semicolons in package and import decls #12448

Closed
wants to merge 9 commits into from

Conversation

archiecobbs
Copy link
Contributor

@archiecobbs archiecobbs commented Feb 6, 2023

JLS §7.3 specifies that while a lone semicolon is a valid TopLevelClassOrInterfaceDeclaration, it is not a valid ImportDeclaration. Therefore, if we see a lone semicolon while looking for the next import statement we have to advance and accept a class declaration, and we can therefore no longer accept any further import statements.

However, the compiler was allowing this, for example:

package p; import X; ;;; import Y; class Foo {}

The bug is that the parser was switching out of "look for imports" mode after parsing a valid class declaration, but it was not switching out of "look for imports" mode after parsing a lone semicolon.

The fix to JavacParser.java is easy, however it also requires these adjustments to unit tests:

  • Test tree/T6963934.java must go away, because it is verifying a bug that only happens when the above bogus input is successfully parsed, and this can no longer happen.
  • A bug in lib/types/TypeHarness.java was uncovered and fixed; it was inserting an extra semicolon.
  • The following bugs, which check invalid syntax within import statements, now generate different parse errors and therefor needed their "golden output" updated:
    • annotations/typeAnnotations/failures/AnnotatedImport.java
    • annotations/typeAnnotations/failures/AnnotatedPackage1.java
    • annotations/typeAnnotations/failures/AnnotatedPackage2.java

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8301905 to be approved

Issues

  • JDK-8027682: javac wrongly accepts semicolons in package and import decls
  • JDK-8301905: javac wrongly accepts semicolons in package and import decls (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12448/head:pull/12448
$ git checkout pull/12448

Update a local copy of the PR:
$ git checkout pull/12448
$ git pull https://git.openjdk.org/jdk.git pull/12448/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12448

View PR using the GUI difftool:
$ git pr show -t 12448

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12448.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 6, 2023

👋 Welcome back archiecobbs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 6, 2023
@openjdk
Copy link

openjdk bot commented Feb 6, 2023

@archiecobbs The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the compiler compiler-dev@openjdk.org label Feb 6, 2023
@mlbridge
Copy link

mlbridge bot commented Feb 6, 2023

@cushon
Copy link
Contributor

cushon commented Feb 6, 2023

I would be happy to see this fixed, but note that there's going to be some compatibility impact from this to existing programs containing extra ;, and it probably warrants a CSR.

There was some discussion in the original compiler-dev thread about whether the right fix was to change javac, or change the spec.

@jonathan-gibbons
Copy link
Contributor

I would question whether this is the correct/best solution.

A different solution might be to look ahead for the next keyword (i.e. not-a-semi-colon). If the next keyword is import, then you could report an error, ignore the semi-colon and carry on parsing. In other words, look to improve the error recovery in these situations. I'm reminded of a similar-but-different change a few years back to allow the compiler to parse (and reject) if (expr) declaration because it improved the error recovery.

And, maybe check whether we want to tweak the language rules to permit these semicolons, in the same way that we permit them between top-level declarations.

@archiecobbs
Copy link
Contributor Author

Thanks for the comments.

Agree that smarter error checking for the likely scenario of a spurious semi-colon would be an improvement.

I'm agnostic on what would be "best" - just trying to address some stale bugs.

If there is uncertainty about what the spec should be, but also no action, that obviously paralyzes progress. What do you guys recommend we do?

This bug will have its 10th birthday in October :)

@jonathan-gibbons
Copy link
Contributor

Thanks for the comments.

Agree that smarter error checking for the likely scenario of a spurious semi-colon would be an improvement.

I'm agnostic on what would be "best" - just trying to address some stale bugs.

If there is uncertainty about what the spec should be, but also no action, that obviously paralyzes progress. What do you guys recommend we do?

This bug will have its 10th birthday in October :)

I would suggest to go ahead with a solution to report and ignore excess semicolons, since changing JLS is a bigger deal that cannot be done here. And even if we change JLS going forward, a solution to "report and ignore" excess semicolons might be worth back-porting.

@jddarcy
Copy link
Member

jddarcy commented Feb 6, 2023

If javac were to start rejecting previously accepted programs, even programs that violate the letter of the JLS, that would require a CSR, as @cushon noted.

@archiecobbs
Copy link
Contributor Author

I would suggest to go ahead with a solution to report and ignore excess semicolons, since changing JLS is a bigger deal that cannot be done here. And even if we change JLS going forward, a solution to "report and ignore" excess semicolons might be worth back-porting.

Sounds like a good plan - thanks!

@archiecobbs
Copy link
Contributor Author

/csr

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Feb 6, 2023
@openjdk
Copy link

openjdk bot commented Feb 6, 2023

@archiecobbs has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@archiecobbs please create a CSR request for issue JDK-8027682 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

Comment on lines 3827 to 3830
// JLS 7.3 doesn't allow extra semi-colons after package or import declarations,
// but here we try to provide a more helpful error message if we encounter any.
// Do that by slurping in as many semi-colons as possible, and then seeing what
// comes after before deciding how best to handle them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's my fault for misleading you in the discussion, but the dictionary says that "semicolon" is not a hyphenated word.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's my fault for misleading you in the discussion, but the dictionary says that "semicolon" is not a hyphenated word.

Nope, totally my fault. Apparently I've been spelling it wrong for years. Will fix.

@vicente-romero-oracle
Copy link
Contributor

vicente-romero-oracle commented Feb 7, 2023

@archiecobbs I have created the CSR: https://bugs.openjdk.org/browse/JDK-8301905 for you. I guess you are able to create it too but just in case

@vicente-romero-oracle
Copy link
Contributor

vicente-romero-oracle commented Feb 7, 2023

@archiecobbs we use JavacParser::peekToken to look ahead, this method could be useful here

@amaembo
Copy link
Contributor

amaembo commented Feb 7, 2023

Just for information. I've search for the import [\w.]+;;\s+import regex across Java sources corpus. The whole corpus is ~31,600,000 files from ~245,000 repositories. The query matched 896 files from 434 repositories.

A little bit more relaxed query import [\w.]+;(\s*;)+\s+import matches 1,034 files from 493 repositories. So we may estimate that about 0.2% of projects may have compilation error after this change.

Here are particular examples:

import javax.naming.InvalidNameException;;
(in OpenJDK, it's still there!)

https://github.com/bpupadhyaya/openjdk-8/blob/45af329463a45955ea2759b89cb0ebfe40570c3f/jdk/src/macosx/classes/sun/font/CFont.java#L31 (ok, this one was fixed later)

https://github.com/mvpleung/QingTingRadio/blob/83c48010f596db82fe09c9f5619987be5e76f14c/QingTingFM/src/com/google/protobuf/GeneratedMessageLite.java#L3

https://github.com/wallysonlima/mypocket/blob/30038ad6a2b3f18ce0af17e7c4a0f07bbf5db8b8/app/src/main/java/wallyson/com/br/mypocket/view/SpendingYearActivity.java#L4

https://github.com/l33tnoob/MT65x2_kernel_lk/blob/f48b462fe90862b30ea067698167c0a55458eec3/mediatek/packages/apps/Stk1/src/com/android/stk/StkAppService.java#L112

https://github.com/alibaba/dragonwell17/blob/586085bb39516e39ffb616ece3e2ca60fd59d225/test/jdk/java/lang/constant/methodTypeDesc/ResolveConstantDesc.java#L38

https://github.com/ninethousand9000/ninehackv1/blob/6da59fb5d25c52fab5ed2978c298d3392133dc7d/src/main/java/me/ninethousand/ninehack/feature/features/client/HUD.java#L7

https://github.com/erincandescent/Impeller/blob/b4c30710975cb49a4223c9352fba032842e91a28/src/eu/e43/impeller/activity/SettingsActivity.java#L10

https://github.com/swarmcom/jSynapse/blob/70587ce0c16d4cb9e318d6cc0cfd414f2a6d4a84/src/main/java/org/swarmcom/jsynapse/service/accesstoken/AccessTokenServiceImpl.java#L21

https://github.com/ging/fiware-draco/blob/c0fe547f95a0e475aaa892ff8f72185901e88bcc/nifi-ngsi-bundle/nifi-ngsi-processors/src/main/java/org/apache/nifi/processors/ngsi/NGSIToPostgreSQL.java#L21

https://github.com/anyaudio/anyaudio-android-app/blob/31b7d033a5b7a303c786ea7f02849e302588b6f0/app/src/main/java/any/audio/Activity/UpdateThemedActivity.java#L10

https://github.com/HewlettPackard/mds/blob/1883702cbe11ae9866a486a938ffe306dda67fbe/java-api/src/com/hpl/mds/impl/TaskOptionImpl.java#L31

Some of them are outdated and was fixed later, some are Android-specific, so probably we don't care much, but there are also totally legitimate cases.

@archiecobbs
Copy link
Contributor Author

@vicente-romero-oracle wrote:

I have created the CSR

Thanks for doing that. I've updated it, hopefully that's sufficient.

we use JavacParser::peekToken to look ahead, this method could be useful here

Thanks, I wasn't aware of that method. I don't think it helps here though... the issue is I needed to consume arbitrarily many semi-colons at once, so that N consecutive semi-colons only generate one error, but also keep track of their source code positions in case they actually turn out to be valid TopLevelClassOrInterfaceDeclaration(s).

@amaembo wrote:

So we may estimate that about 0.2% of projects may have compilation error after this change.

Thanks, that's really helpful to know.. I've included this info in the CSR.

@vicente-romero-oracle
Copy link
Contributor

vicente-romero-oracle commented Feb 7, 2023

some comments on the CSR:

  • Solution section
    I would remove comments that are implementation details. Probably a matter of style but I would simply go like:
    Disallow extraneous semicolons prior to import statements.
  • Specification section
    I don't think that the links to the PR and the bug belong here as those are already linked to the CSR. I would mention the related section of the spec and probably paste the specific text that applies here. I see the before and after behavior probably more in the Solutions

The comment about how many projects could be affected probably belongs to the Compatibility Risk Description field, which you can access by clicking on the Edit button, of the CSR or to a Notes additional section after the Specification section.

@archiecobbs
Copy link
Contributor Author

some comments on the CSR:

Thanks - updated.

@vicente-romero-oracle
Copy link
Contributor

some comments on the CSR:

Thanks - updated.

thanks, the CSR looks good to me

@vicente-romero-oracle
Copy link
Contributor

vicente-romero-oracle commented Feb 9, 2023

I was playing with this test case:

import java.util.NO;
    ;
import java.io.NO;

only the new extraneous semicolon error is reported but the compiler doesn't report the fact that there is no class named NO inside of java.util or java.io I think that those other errors should be reported too

@archiecobbs
Copy link
Contributor Author

the new extraneous semicolon error is reported but the compiler doesn't report the fact that there is no class named NO inside of java.util or java.io I think that those other errors should be reported too

Hmm, this seems like an orthogonal issue to me.

In other words, the problem you're describing already exists, and it's just being uncovered by this change.

For example, if you compile this class:

import java.util.NO;
while
import java.io.NO;

you get this result:

Test.java:2: error: class, interface, enum, or record expected
while
^
1 error

If you compile this class:

import java.util.NO;
(
import java.io.NO;

you get this result:

Test.java:2: error: class, interface, enum, or record expected
(
^
1 error

The fact that a syntax error inhibits resolution of imported package names is happening somewhere else downstream in the compiler.

I'm not opposed to fixing that, but it seems like it should be filed as a separate issue, because (as the above examples show) it's a more general problem than just with extra semicolons.

@vicente-romero-oracle
Copy link
Contributor

the new extraneous semicolon error is reported but the compiler doesn't report the fact that there is no class named NO inside of java.util or java.io I think that those other errors should be reported too

Hmm, this seems like an orthogonal issue to me.

In other words, the problem you're describing already exists, and it's just being uncovered by this change.

For example, if you compile this class:

import java.util.NO;
while
import java.io.NO;

you get this result:

Test.java:2: error: class, interface, enum, or record expected
while
^
1 error

If you compile this class:

import java.util.NO;
(
import java.io.NO;

you get this result:

Test.java:2: error: class, interface, enum, or record expected
(
^
1 error

The fact that a syntax error inhibits resolution of imported package names is happening somewhere else downstream in the compiler.

I'm not opposed to fixing that, but it seems like it should be filed as a separate issue, because (as the above examples show) it's a more general problem than just with extra semicolons.

yep you are right, my bad, nvm

Copy link
Contributor

@vicente-romero-oracle vicente-romero-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks sensible

This improves backward compatibility as requested in the CSR (JDK-8301905).
import java.util.Map;; // NOTE: extra semi-colon
import java.util.Set;

class ExtraneousSemiColonError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should have a dedicated test for the new feature, apart from the diagnostics oriented ones. The test could be executed twice once with option -source 20 and once without it and provided two golden files accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - done. The regular unit tests are more precise.


// key: compiler.warn.extraneous.semicolon
// key: compiler.warn.source.no.system.modules.path
// options: --source 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use "--release 20" instead if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use "--release 20" instead if possible.

Thanks - that works better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is any --release option required? The test should only be run by JDK 20 and later, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per this CSR comment it was requested to make this an error for source versions >= 21 and a warning for source versions prior to 21.

In order to satisfy diags/CheckExamples.java I needed an example that would generate the warning.

But I guess you're right... so the alternative would be to add compiler.warn.extraneous.semicolon to diags/examples.not-yet.txt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry, I wasn't clear enough; my fault.

It is good to add an example (much better than editing the not-yet.txt file.).

The question was why any option specifying a version number is perceived to be required. What happens if you remove the option completely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you remove the option completely?

Then the diags/CheckExamples.java test fails when it runs against ExtraImportSemicolonWarning.java, because that source file will then generate an error instead of a warning as expected.

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 11, 2023

@archiecobbs This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@archiecobbs
Copy link
Contributor Author

/pingbot

@openjdk
Copy link

openjdk bot commented Mar 12, 2023

@archiecobbs Unknown command pingbot - for a list of valid commands use /help.

@openjdk openjdk bot removed the csr Pull request needs approved CSR before integration label Mar 21, 2023
@openjdk
Copy link

openjdk bot commented Mar 21, 2023

@archiecobbs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8027682: javac wrongly accepts semicolons in package and import decls

Reviewed-by: vromero

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 673 new commits pushed to the master branch:

  • d788a1b: 8304180: Constant Descriptors for MethodHandles::classData and classDataAt
  • bbde215: 8299494: Test vmTestbase/nsk/stress/except/except011.java failed: ExceptionInInitializerError: target class not found
  • 1c04686: 8304387: Fix positions of shared static stubs / trampolines
  • c65bb2c: 8304334: java/awt/color/ICC_ColorSpace/ToFromCIEXYZRoundTrip.java times out on slow platforms
  • 4bf1fbb: 8303648: Add String.indexOf(String str, int beginIndex, int endIndex)
  • c4df9b5: 8304537: Ant-based langtools build fails after JDK-8015831 Add lint check for calling overridable methods from a constructor
  • a6b72f5: 8304230: LShift ideal transform assertion
  • a72ba38: 8303948: HsErrFileUtils.checkHsErrFileContent() fails to check the last pattern.
  • bbca7c3: 8304542: Convert use of internal VM::classFileVersion to ClassFileFormatVersion
  • f593a6b: 8303018: Unicode Emoji Properties
  • ... and 663 more: https://git.openjdk.org/jdk/compare/960c3931337b314417ad33d8a775ee3e251692d7...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@vicente-romero-oracle) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 21, 2023
@archiecobbs
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 21, 2023
@openjdk
Copy link

openjdk bot commented Mar 21, 2023

@archiecobbs
Your change (at version 9067163) is now ready to be sponsored by a Committer.

@vicente-romero-oracle
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Mar 23, 2023

Going to push as commit 4b8f7db.
Since your change was applied there have been 697 commits pushed to the master branch:

  • c00d088: 8043179: Lambda expression can mutate final field
  • 147f347: 8219083: java/net/MulticastSocket/SetGetNetworkInterfaceTest.java failed in same binary run on windows x64
  • bf917ba: 8304687: Move add_to_hierarchy
  • 63d4afb: 8304671: javac regression: Compilation with --release 8 fails on underscore in enum identifiers
  • e2cfcfb: 6817009: Action.SELECTED_KEY not toggled when using key binding
  • af4d560: 8303951: Add asserts before record_method_not_compilable where possible
  • c433862: 6245410: javax.swing.text.html.CSS.Attribute: BACKGROUND_POSITION is not w3c spec compliant
  • 91f407d: 8029301: Confusing error message for array creation method reference
  • e73411a: 8304376: Rename t1/t2 classes in com/sun/jdi/CLETest.java to avoid class duplication error in IDE
  • a2d8f63: 8288730: Add type parameter to Lookup::accessClass and Lookup::ensureInitialized
  • ... and 687 more: https://git.openjdk.org/jdk/compare/960c3931337b314417ad33d8a775ee3e251692d7...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 23, 2023
@openjdk openjdk bot closed this Mar 23, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Mar 23, 2023
@openjdk
Copy link

openjdk bot commented Mar 23, 2023

@vicente-romero-oracle @archiecobbs Pushed as commit 4b8f7db.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler compiler-dev@openjdk.org integrated Pull request has been integrated
6 participants