Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8191441: (Process) add Readers and Writer access to java.lang.Process streams #4134

Closed
wants to merge 7 commits into from

Conversation

RogerRiggs
Copy link
Contributor

@RogerRiggs RogerRiggs commented May 20, 2021

Methods are added to java.lang.Process to read and write characters and lines from and to a spawned Process.
The Charset used to encode and decode characters to bytes can be specified or use the
operating system native encoding as is available from the "native.encoding" system property.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8191441: (Process) add Readers and Writer access to java.lang.Process streams

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4134/head:pull/4134
$ git checkout pull/4134

Update a local copy of the PR:
$ git checkout pull/4134
$ git pull https://git.openjdk.java.net/jdk pull/4134/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4134

View PR using the GUI difftool:
$ git pr show -t 4134

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4134.diff

@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented May 20, 2021

/csr

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented May 20, 2021

👋 Welcome back rriggs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added rfr csr labels May 20, 2021
@openjdk
Copy link

@openjdk openjdk bot commented May 20, 2021

@RogerRiggs this pull request will not be integrated until the CSR request JDK-8191490 for issue JDK-8191441 has been approved.

@openjdk
Copy link

@openjdk openjdk bot commented May 20, 2021

@RogerRiggs The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs label May 20, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented May 20, 2021

@forax
Copy link
Member

@forax forax commented May 20, 2021

I've added the same comment on the bug itself.
I don't think that using PrintWriter is a good idea here given that a PrintWriter shallow the IOException

@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented May 20, 2021

I don't think that using PrintWriter is a good idea here given that a PrintWriter shallow the IOException

Good point, but...

PrintStream does also and it is used frequently for Stdout and Stderr.

OutputStreamWriter would be a better choice with that in mind. It does not have the convenience methods for converting various types to strings but would not hide the exceptions. Developers could wrap it in a PrintWriter to get the convenience and take on the responsibility of dealing with exceptions by polling.

Copy link
Member

@naotoj naotoj left a comment

Good to see native.encoding is utilized. Some comments follow.

@mlbridge
Copy link

@mlbridge mlbridge bot commented May 20, 2021

Mailing list message from Bernd Eckenfels on core-libs-dev:

Hello,

Hm, how is that list used? - StandardCharaet.ISO_8859_1 is a guaranteed Charset for JVM, and since the encoding is done in Java it should be fine. Added benefit is, it?s 8bit transparent.

As for OS there is not a single standard charset (ebcdic vs latin families) but ASCII is probably the widest available (with latin1 variants to follow)

--
https://Bernd.eckenfels.net
________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.java.net> on behalf of Roger Riggs <rriggs at openjdk.java.net>
Sent: Thursday, May 20, 2021 10:52 PM
To: core-libs-dev at openjdk.java.net
Subject: Re: RFR: 8191441: (Process) add Readers and Writer access to java.lang.Process streams

On Thu, 20 May 2021 20:42:35 GMT, Naoto Sato <naoto at openjdk.org> wrote:

Methods are added to java.lang.Process to read and write characters and lines from and to a spawned Process.
The Charset used to encode and decode characters to bytes can be specified or use the
operating system native encoding as is available from the "native.encoding" system property.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 64:

62: return new Object[][] {
63: {"UTF-8"},
64: {"ISO8859-1"},

`ISO8859-1` may not be available on all underlying OSes.

Is there a safe subset?
I haven't seen a failure yet, if/when it occurs, we make an exception or narrow the test to known systems.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 111:

109: @Test(dataProvider = "CharsetCases", enabled = true)
110: void testCase(String nativeEncoding) throws IOException {
111: String osName = System.getProperty("os.name").toLowerCase(Locale.ROOT);

Not used anywhere else.

Right, dead code now without host dependencies.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 122:

120: "ReaderWriterTest$ChildWithCharset");
121: var env = pb.environment();
122: env.put("LANG", "en_US." + cleanCSName);

Does this work on Windows?

Should be removed, the tests work because they set sun.stdout/stderr.encoding.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4134

1 similar comment
@mlbridge
Copy link

@mlbridge mlbridge bot commented May 20, 2021

Mailing list message from Bernd Eckenfels on core-libs-dev:

Hello,

Hm, how is that list used? - StandardCharaet.ISO_8859_1 is a guaranteed Charset for JVM, and since the encoding is done in Java it should be fine. Added benefit is, it?s 8bit transparent.

As for OS there is not a single standard charset (ebcdic vs latin families) but ASCII is probably the widest available (with latin1 variants to follow)

--
https://Bernd.eckenfels.net
________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.java.net> on behalf of Roger Riggs <rriggs at openjdk.java.net>
Sent: Thursday, May 20, 2021 10:52 PM
To: core-libs-dev at openjdk.java.net
Subject: Re: RFR: 8191441: (Process) add Readers and Writer access to java.lang.Process streams

On Thu, 20 May 2021 20:42:35 GMT, Naoto Sato <naoto at openjdk.org> wrote:

Methods are added to java.lang.Process to read and write characters and lines from and to a spawned Process.
The Charset used to encode and decode characters to bytes can be specified or use the
operating system native encoding as is available from the "native.encoding" system property.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 64:

62: return new Object[][] {
63: {"UTF-8"},
64: {"ISO8859-1"},

`ISO8859-1` may not be available on all underlying OSes.

Is there a safe subset?
I haven't seen a failure yet, if/when it occurs, we make an exception or narrow the test to known systems.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 111:

109: @Test(dataProvider = "CharsetCases", enabled = true)
110: void testCase(String nativeEncoding) throws IOException {
111: String osName = System.getProperty("os.name").toLowerCase(Locale.ROOT);

Not used anywhere else.

Right, dead code now without host dependencies.

test/jdk/java/lang/ProcessBuilder/ReaderWriterTest.java line 122:

120: "ReaderWriterTest$ChildWithCharset");
121: var env = pb.environment();
122: env.put("LANG", "en_US." + cleanCSName);

Does this work on Windows?

Should be removed, the tests work because they set sun.stdout/stderr.encoding.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4134

@mlbridge
Copy link

@mlbridge mlbridge bot commented May 21, 2021

Mailing list message from Roger Riggs on core-libs-dev:

Hi,

The list is used to test the inputReader and errorReader methods that
accept a Charset.

The child process is launched with -Dsun.stdout.encoding and
-Dsun.stderr.encoding
to make the child encode its output in the requested charset (overriding
the native encoding).
Then the parent reads the output with the same charset.
The test will be skipped if the encoding is not supported.

The test for the native charset uses only the native encoding in the
default configuration.

Thanks, Roger

On 5/20/21 5:30 PM, Bernd Eckenfels wrote:

1 similar comment
@mlbridge
Copy link

@mlbridge mlbridge bot commented May 21, 2021

Mailing list message from Roger Riggs on core-libs-dev:

Hi,

The list is used to test the inputReader and errorReader methods that
accept a Charset.

The child process is launched with -Dsun.stdout.encoding and
-Dsun.stderr.encoding
to make the child encode its output in the requested charset (overriding
the native encoding).
Then the parent reads the output with the same charset.
The test will be skipped if the encoding is not supported.

The test for the native charset uses only the native encoding in the
default configuration.

Thanks, Roger

On 5/20/21 5:30 PM, Bernd Eckenfels wrote:

@forax
Copy link
Member

@forax forax commented May 21, 2021

OutputStreamWriter would be a better choice with that in mind. It does not have the convenience methods for converting various types to strings but would not hide the exceptions. Developers could wrap it in a PrintWriter to get the convenience and take on the responsibility of dealing with exceptions by polling.

yes, instead of OutputStreamWriter, i wonder if BufferedWriter is not better given that reader part uses BufferedReader and that is mirror java.nio.file.Files which also uses BufferedReader/BufferedWriter as types for the pair reader/writer.

@mlbridge
Copy link

@mlbridge mlbridge bot commented May 21, 2021

Mailing list message from Roger Riggs on core-libs-dev:

Hi,

Yes, I'm testing BufferedWriter-> OutputStreamWriter now.
And it can be wrapped in PrintWriter if someone is eager to get the
missing text formatting
or auto-flush behaviors.

Roger

On 5/21/21 5:23 PM, R?mi Forax wrote:

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented May 23, 2021

The updated proposal looks reasonable, as does the names of the methods.

I think the javadoc will need to specify how malformed or unmappable byte sequence are handled. The implNote does document that it uses InputStreamReader so this means that erroneous input is replaced but this isn't normative text.

It may be useful to specify how these methods are intended to work when the process has terminated. I realise the existing getXXXX aren't clear on this point and maybe they should.

The API note that warns about unpredictable behavior then trying to use both the byte and character stream probably needs to go into the existing methods too.

NPE will need to be documented as I don't think the Process class description has a statement to avoid clutter in the method descriptions. You will eventually need to add the @SInCE javadoc tag too.

Is there more test coverage to come? I don't see tests that exercise the redirect cases or NPE. The test updates includes an adjustment to how SerialFilterTest is run - is that meant to be included?

A formatting nit is that the line lengths are very really long compared to the rest of the source file.

…ding to

the 1-arg method with a charset, added tests for Redirect cases where the streams are null-input or null-output streams.
@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented May 24, 2021

Thanks for the comments, most are addressed.

The updated proposal looks reasonable, as does the names of the methods.

I think the javadoc will need to specify how malformed or unmappable byte sequence are handled. The implNote does document that it uses InputStreamReader so this means that erroneous input is replaced but this isn't normative text.

It may be useful to specify how these methods are intended to work when the process has terminated. I realise the existing getXXXX aren't clear on this point and maybe they should.

The API note that warns about unpredictable behavior then trying to use both the byte and character stream probably needs to go into the existing methods too.

NPE will need to be documented as I don't think the Process class description has a statement to avoid clutter in the method descriptions. You will eventually need to add the @SInCE javadoc tag too.

Is there more test coverage to come? I don't see tests that exercise the redirect cases or NPE. The test updates includes an adjustment to how SerialFilterTest is run - is that meant to be included?

A formatting nit is that the line lengths are very really long compared to the rest of the source file.

On the question of process termination behavior, I'm not sure what can be said that could be specification.
The implementations vary between Mac, Linux, and Windows; with the common thread
to avoid losing process output. But this may have to be included in the unspecified implementation behavior
or just an API note. Suggestions?

I need to address what happens when these methods are called more than once.
They should return the same instance as the previous call.

src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
src/java.base/share/classes/java/lang/Process.java Outdated Show resolved Hide resolved
@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented May 25, 2021

On the question of process termination behavior, I'm not sure what can be said that could be specification.
The implementations vary between Mac, Linux, and Windows; with the common thread
to avoid losing process output. But this may have to be included in the unspecified implementation behavior
or just an API note. Suggestions?

I think the javadoc could set expectations that the behavior when the process has terminated, and streams have not been redirected, is unspecified. Reading from the process output/error streams may read some or no bytes/characters.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented May 25, 2021

The updated javadoc addresses most of my points. The clarification to inputReader/errorReader about malformed input looks good but we will need the equivalent in outputWriter for the unmappable character case.
I assume the "not null" can be dropped from the description of the charset parameter as NPE is now specified.

Repeated calls to  inputReader, errorReader, and outputWriter now return the same instance
and check for inconsistent charset argument
Added warnings about concurrently use of input/output streams and readers/writers.
@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented May 26, 2021

Process is abstract. Is there any use for these new methods to be overridden?
Perhaps they should be final. The suggestion in the CSR was to document them
using @ implSpec, to acknowledge that the subclass can do something different.

* This is equivalent to:
* {@code
* return new BufferedReader(new InputStreamReader(getInputStream(), charset));
* }
Copy link
Member

@naotoj naotoj May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not seem to be equivalent any longer. Also, should it describe the behavior that it rejects the different Charset after the first invocation?

* @implNote
* This is equivalent to:
* {@code
* return new BufferedReader(new InputStreamReader(getErrorStream(), charset));
* }
Copy link
Member

@naotoj naotoj May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment in inputReader(Charset) stands here too.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented May 27, 2021

Process is abstract. Is there any use for these new methods to be overridden?
Perhaps they should be final.

It's not clear to me that it is useful to extend Process outside of the JDK. Testing, wrapping, ...? It feels like this class wants to be sealed. Maybe we should look at deprecating the no-arg constructor, like Joe did recently with AccessibleObject.

@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented May 28, 2021

Process is abstract. Is there any use for these new methods to be overridden?
Perhaps they should be final.

It's not clear to me that it is useful to extend Process outside of the JDK. Testing, wrapping, ...? It feels like this class wants to be sealed. Maybe we should look at deprecating the no-arg constructor, like Joe did recently with AccessibleObject.

There are subclasses, even in the test suite, see ProcessTools, so its too late to seal it;
It would be a compatibility issue. We can look at deprecation and strength reduction
but in the current time frame (RPD1) I'd suggest making the new methods final.


// Readers and Writers created for this process; so repeated calls return the same object
// All updates must be done while synchronized on this Process.
private BufferedWriter outputWriter = null;
Copy link
Contributor

@AlanBateman AlanBateman Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to explicitly initialise all these fields to null.

* @apiNote
* Use {@link #getOutputStream} and {@link #outputWriter} with extreme care.
* Output to the {@code BufferedWriter} may be held in the buffer until
* {@linkplain BufferedWriter#flush flush} is called.
Copy link
Contributor

@AlanBateman AlanBateman Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will need a bit of wordsmithing to make it clear that its the usage of both streams for the same Process requires great care (as it stands, it reads like the usage of either method is dangerous).

* <p>This method delegates to {@link #inputReader(Charset)} using the
* {@link Charset} named by the {@code native.encoding}
* system property or the {@link Charset#defaultCharset()} if the
* {@code native.encoding} is not supported.
Copy link
Contributor

@AlanBateman AlanBateman Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if the native.encoding is not supported". I think this needs an adjustment to make it clearer that the default charset is used when the value of native.encoding is not a valid charset name.

outputWriter = new BufferedWriter(new OutputStreamWriter(getOutputStream(), charset));
} else {
if (!outputCharset.equals(charset))
throw new IllegalArgumentException("BufferedWriter was created with charset: " + outputCharset);
Copy link
Contributor

@AlanBateman AlanBateman Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that IAE is the right exception here, I think it's closer to IllegalStateException because the first usage of outputWriter has the side effect of setting the charset for the Process's writer. Another option here is to just put it into the "unpredictable" bucket that is using getOutputStream and outputWriter at the same time. In that case, it could just return a new BufferedWriter when it doesn't match the charset of the first usage.

* The {@code BufferedReader} reads and buffers characters from the InputStreamReader.
*
* <p>The first call to this method creates the {@link BufferedReader BufferedReader},
* if called again with the same {@code charset} the same {@code BufferedReader} is returned.
Copy link
Contributor

@AlanBateman AlanBateman Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the same BufferedReader is returned" - a suggestion here to rephrase this to "then the BufferedReader returned by the first call is returned".

Copy link
Contributor

@AlanBateman AlanBateman left a comment

We've gone through a few iterations on the javadoc and I think the latest edition is okay. I don't have time right now for the latest version of the test (I did look at the test in the initial patch).

naotoj
naotoj approved these changes Jun 2, 2021
@openjdk openjdk bot removed the csr label Jun 7, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 7, 2021

@RogerRiggs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8191441: (Process) add Readers and Writer access to java.lang.Process streams

Reviewed-by: naoto, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 252 new commits pushed to the master branch:

  • e663ba9: 8268299: jvms tag produces incorrect URL
  • 3396b69: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests
  • 270ec97: 8268331: Fix crash in humongous object eager reclaim logging
  • ea8274f: 8267875: Shenandoah: Duplicated code in ShenandoahBarrierSetC2::ideal_node()
  • a91f971: 8268267: Remove -Djavatest.security.noSecurityManager=true from jtreg runs
  • 4f9d6b7: 8267465: remove superfluous preview related annotations and test options
  • 728a411: 8268018: remove dead code in commitLimitter
  • 15715a8: 8267924: Misleading G1 eager reclaim detail logging
  • e4d0454: 8267832: SimpleVisitors and Scanners in jdk.compiler should use @implSpec
  • 8130be5: 8268318: Missing comma in copyright header
  • ... and 242 more: https://git.openjdk.java.net/jdk/compare/9425d3de83fe8f4caddef03ffa3f4dd4de58f236...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jun 7, 2021
@RogerRiggs
Copy link
Contributor Author

@RogerRiggs RogerRiggs commented Jun 7, 2021

/integrate

@openjdk openjdk bot closed this Jun 7, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jun 7, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 7, 2021

@RogerRiggs Since your change was applied there have been 253 commits pushed to the master branch:

  • 7e55569: 8261549: Adjust memory size in MTLTexurePool.m
  • e663ba9: 8268299: jvms tag produces incorrect URL
  • 3396b69: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests
  • 270ec97: 8268331: Fix crash in humongous object eager reclaim logging
  • ea8274f: 8267875: Shenandoah: Duplicated code in ShenandoahBarrierSetC2::ideal_node()
  • a91f971: 8268267: Remove -Djavatest.security.noSecurityManager=true from jtreg runs
  • 4f9d6b7: 8267465: remove superfluous preview related annotations and test options
  • 728a411: 8268018: remove dead code in commitLimitter
  • 15715a8: 8267924: Misleading G1 eager reclaim detail logging
  • e4d0454: 8267832: SimpleVisitors and Scanners in jdk.compiler should use @implSpec
  • ... and 243 more: https://git.openjdk.java.net/jdk/compare/9425d3de83fe8f4caddef03ffa3f4dd4de58f236...master

Your commit was automatically rebased without conflicts.

Pushed as commit 81600dc.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@RogerRiggs RogerRiggs deleted the 8191441-process-reader branch Sep 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs integrated
6 participants