Skip to content

8344706: Compiler Implementation of Compact Source Files and Instance Main Methods #24438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 29 commits into
base: master
Choose a base branch
from

Conversation

lahodaj
Copy link
Contributor

@lahodaj lahodaj commented Apr 4, 2025

This is a PR that implements JEP: Compact Source Files and Instance Main Methods. Changes include:

  • java.io.IO moved to java.lang.IO, and no longer uses System.console() to implement the methods (thanks to @stuart-marks)
  • java. ... .IO is no longer automatically imported in any compilation unit
  • the feature is finalized (i.e. no longer requires --enable-preview)

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8353437 to be approved
  • Change requires a JEP request to be targeted

Issues

  • JDK-8344706: Compiler Implementation of Compact Source Files and Instance Main Methods (Sub-task - P4)
  • JDK-8344699: Compact Source Files and Instance Main Methods (JEP)
  • JDK-8353437: Compiler Implementation of Compact Source Files and Instance Main Methods (CSR)

Reviewers

Contributors

  • Stuart Marks <smarks@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24438/head:pull/24438
$ git checkout pull/24438

Update a local copy of the PR:
$ git checkout pull/24438
$ git pull https://git.openjdk.org/jdk.git pull/24438/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24438

View PR using the GUI difftool:
$ git pr show -t 24438

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24438.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 4, 2025

👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

⚠️ @lahodaj This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Apr 4, 2025
@lahodaj
Copy link
Contributor Author

lahodaj commented Apr 4, 2025

/contributor add @stuart-marks

@lahodaj
Copy link
Contributor Author

lahodaj commented Apr 4, 2025

/jep JDK-8344699

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@lahodaj
Contributor Stuart Marks <smarks@openjdk.org> successfully added.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@lahodaj
This pull request will not be integrated until the JEP JDK-8344699 has been targeted.

@openjdk openjdk bot added the jep label Apr 4, 2025
@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@lahodaj The following labels will be automatically applied to this pull request:

  • compiler
  • core-libs
  • kulla

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org compiler compiler-dev@openjdk.org kulla kulla-dev@openjdk.org labels Apr 4, 2025
@stuart-marks stuart-marks mentioned this pull request Apr 4, 2025
3 tasks
*
* @return the internal BufferedReader instance
*/
static synchronized BufferedReader reader() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the lock only inteneded for initialization, i.e. should it apply for simply getting the reader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there needs to be some barriers (at least volatile) even for getting the already initialized reader. And then, given the code here, it is probably fine and less error prone to simply use synchronized that something more complex.

But, given we'll hopefully have @StableValue soon, maybe we could use that? What do you think @stuart-marks?

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even more complex than volatile. If a thread comes along and reads a null here, it still needs to take a lock (or something) to prevent more than one thread from creating the BufferedReader and storing it back into the field. This leads down the path of double-checked locking, initialize-on-demand holder, etc.

It's not clear to me this is in such a critical path that it warrants adding any of this stuff. StableValue probably has the right semantics. However, it's not even in the JDK yet (though it is Targeted to 25 as of this moment). But I don't think we want to add a dependency on a Preview API.

Comment on lines 121 to 122
System.out.print(obj);
System.out.flush();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth using a local variable to avoid calling print and flush on different streams in case System.out is reassigned in between?

var out = System.out;
out.print(obj);
out.flush();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not that perf sensitive, guess it is fine as-is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a correctness angle. If you print something to a stream you want to flush the same stream. If System.out is changed in between (unlikely, but possible) then output destined for the old stream could be stranded in its buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as suggested; thanks @lukellmann .

Comment on lines +140 to +147
* @throws IOError if an I/O error occurs
*/
public static String readln() {
try {
return reader().readLine();
} catch (IOException ioe) {
throw new IOError(ioe);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was UncheckedIOException considered? It might be better for beginners to learn throwing exceptions instead of errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is not to be emulated by beginners. If an Exception.is thrown here, a user might be tempted to add handlers in their code, while this is an issue with the setup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone is bound to ask why the readln method throw but the println methods don't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IOError was carried over from Console.readLine(), which throws IOError on error. Of course we're decoupled from Console now, but that was the original reason.

My guess is that Console methods throw IOError because (a) they didn't want to make everybody catch IOException, (b) they didn't want to maintain an internal error state like PrintStream, and (c) they wanted to throw a throwable whose type conveyed the meaning that it wasn't expected to be handled. IOError and Console were both added in JDK 1.6. Maybe that reasoning still applies here.

UncheckedIOException wasn't added until JDK 1.8. Note also that UncheckedIOException has a cause of IOException, so it can't arise from some other throwable. This was used in the previous implementation, where IOError was thrown if Console was null. That doesn't apply anymore, but IOError is still somewhat more flexible than UncheckedIOException in this regard.

I think we need to say something, implicitly or explicitly, about error handling. Note that PrintStream has this internal error state so output is covered already. For input it seems like IOError is reasonable, but maybe there's a better choice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The readln methods handle malformed-input and unmappable-character errors by dropping, and using a replacement value. So erroneous input doesn't throw IOError with a CharacterCodingException as the cause.

System.in.close() would a be wild thing to do, but a case where readln would throw IOError.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Does the specification here need to state the policy around malformed input and unmappable characters? (Or, maybe in the class specification.)

People have ended up in cases where System.in is closed. It can occur when combining use of Scanner to read from System.in and applying the anything-that-you-open-must-be-closed rule, e.g.,

try (Scanner sc = new Scanner(System.in)) {
    String s = sc.nextLine();
}

which is a mistake, but it takes a long time to explain....

Copy link
Member

@stuart-marks stuart-marks Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a clause to the class specification that covers malformed/unmappable byte sequences.

@lahodaj lahodaj marked this pull request as ready for review April 7, 2025 06:57
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 7, 2025
@mlbridge
Copy link

mlbridge bot commented Apr 7, 2025

Webrevs

Copy link
Member

@sormuras sormuras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in source launcher related tests look good.

* If this property is not present, or if the charset it names cannot be loaded, then
* UTF-8 is used instead. These internal objects are created upon the first call to
* either of the {@code readln} methods and are stored for subsequent reuse by these
* methods.
Copy link
Contributor

@AlanBateman AlanBateman Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuart-marks Can we rephrase this paragraph so that it doesn't use phrase "internal objects"? The class does speak of buffering and how it might impact code that mixes use of System.in and IO.readln so I agree with that part it's just the "internal objects" phrase that is confusing to read in this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used "internal objects" because I wanted to avoid naming concrete classes, which might or might not be used. Previous drafts mentioned BufferedReader, InputStreamReader, and CharsetDecoder.

I could replace "internal objects" with something more descriptive like "objects to handle buffering and charset decoding" but I'd still need a noun phrase to refer to them later. Maybe "buffering and decoding objects" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rephrase the 1st sentence to read:
"The readln() and readln(String) methods in this class decode bytes read from System.in into characters."

And remove the last "these internal objects" sentence.

And for the next paragraph, change the first line to:
"The readln methods may buffer additional bytes..."

So we remove references to "internal objects."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in the midst of rewriting all of this stuff. The main concern here is that applications can use System.in more or less as they please (presumably without buffering) until the first call to a readln method, at which time something happens and effectively the application cannot use System.in anymore. I'll post the revision when I've finished it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there reason to create a test subdirectory IO under java.lang.IO package?

@@ -51,7 +51,7 @@
/*
* @test
* @bug 8305457 8342936 8351435
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue id needs to be added (and other test files too)

Copy link
Contributor

@vicente-romero-oracle vicente-romero-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javac compiler related code and tests lgtm

* Charset decoding is set up upon the first call to one of the {@code readln} methods.
* Decoding may buffer additional bytes beyond those that have been decoded to characters
* returned to the application. After the first call to one of the {@code readln} methods,
* any subsequent use of {@code System.in} results in unspecified behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add a note about mixing API usage into System.in too. There are tutorials and books that show examples build on System.in that will add buffering on that input source.

Copy link
Member

@stuart-marks stuart-marks Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea. I'll handle this separately. I don't want it to refer to a Preview API. I'll probably just use Scanner and InputStreamReader as examples. (Oh wait, this supposedly isn't a Preview API anymore...)

* The {@link #readln()} and {@link #readln(String)} methods decode bytes read from
* {@code System.in} into characters. The charset used for decoding is specified by the
* {@link System#getProperties stdin.encoding} property. If this property is not present,
* or if the charset it names cannot be loaded, then UTF-8 is used instead.
Copy link
Contributor

@AlanBateman AlanBateman Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdout.encoding and stderr.encoding are specified (in System.getProperties) to lead to unspecified behavior if started with either property set to a value other than UTF-8. We should work through the issues of introducing stdin.encoding as soon as we can, esp. the redirect cases and whether there is use cases for setting any of these properties on the command line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, believe it or not, I am still studying this...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler compiler-dev@openjdk.org core-libs core-libs-dev@openjdk.org csr Pull request needs approved CSR before integration jep kulla kulla-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

8 participants