8278268 - (ch) InputStream returned by Channels.newInputStream should have fast path for FileChannel targets #6711

Closed
wants to merge 6 commits into from

Conversation

mkarg
Contributor

@mkarg mkarg commented Dec 5, 2021

This sub-issue defines the work to be done to implement JDK-8265891 solely for the particular case of FileChannel.transferFrom(ReadableByteChannel), including special treatment of SelectableByteChannel.
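As a rough illustration of the code path this PR targets, the sketch below copies a ReadableByteChannel into a FileChannel through the stream adapters from Channels. The class name, the `copyToFile` helper, and the in-memory source are assumptions made for this example and are not taken from the PR. Whether a fast path is actually taken depends on the JDK version and the channel types involved; the observable result should be the same either way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelStreamTransferDemo {

    // Hypothetical helper: copies everything readable from src into the file
    // at target and returns the number of bytes transferred.
    static long copyToFile(ReadableByteChannel src, Path target) throws IOException {
        try (FileChannel dst = FileChannel.open(target, StandardOpenOption.WRITE);
             InputStream in = Channels.newInputStream(src);
             OutputStream out = Channels.newOutputStream(dst)) {
            // InputStream.transferTo is the public entry point; the stream
            // returned by Channels.newInputStream may override it internally.
            return in.transferTo(out);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, channels".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("transfer-demo", ".bin");
        try {
            ReadableByteChannel src = Channels.newChannel(new ByteArrayInputStream(data));
            long n = copyToFile(src, target);
            System.out.println("transferred " + n + " bytes");
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```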


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8278268: (ch) InputStream returned by Channels.newInputStream should have fast path for FileChannel targets

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/6711/head:pull/6711
$ git checkout pull/6711

Update a local copy of the PR:
$ git checkout pull/6711
$ git pull https://git.openjdk.org/jdk.git pull/6711/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6711

View PR using the GUI difftool:
$ git pr show -t 6711

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/6711.diff

@bridgekeeper

bridgekeeper bot commented Dec 5, 2021

👋 Welcome back mkarg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 5, 2021
@openjdk

openjdk bot commented Dec 5, 2021

@mkarg The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio nio-dev@openjdk.org label Dec 5, 2021
@mlbridge

mlbridge bot commented Dec 5, 2021

@openjdk openjdk bot added rfr Pull request is ready for review and removed rfr Pull request is ready for review labels Dec 5, 2021
@mkarg mkarg force-pushed the 8278268 branch 2 times, most recently from 3d65823 to 2579cef Compare December 11, 2021 16:51
dst.write(bb);
}
bb.clear();
pos += bytesRead;
Contributor

I've been busy so only getting to this PR now.

Since src is guaranteed to be configured blocking, the only remaining reason for transferFrom to return 0 is that EOF has been reached. Can we drop the temporary buffer and just replace the body of the block with:

            long n;
            while ((n = dst.transferFrom(src, pos, Long.MAX_VALUE)) > 0) {
                pos += n;
            }
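
For readers who want to try the suggested loop in isolation, here is a minimal, self-contained sketch. The class name, the `drain` and `copy` helpers, and the in-memory source channel are illustrative assumptions and not code from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferFromLoopDemo {

    // Drains src into dst starting at file position 0. The loop ends when
    // transferFrom reports that no bytes were transferred, which for a
    // well-behaved blocking source means EOF.
    static long drain(ReadableByteChannel src, FileChannel dst) throws IOException {
        long pos = 0;
        long n;
        while ((n = dst.transferFrom(src, pos, Long.MAX_VALUE)) > 0) {
            pos += n;
        }
        return pos;
    }

    // Hypothetical convenience wrapper: writes data to the file at target.
    static long copy(byte[] data, Path target) throws IOException {
        try (ReadableByteChannel src = Channels.newChannel(new ByteArrayInputStream(data));
             FileChannel dst = FileChannel.open(target, StandardOpenOption.WRITE)) {
            return drain(src, dst);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000]; // larger than a typical internal copy buffer
        Path target = Files.createTempFile("transfer-from-demo", ".bin");
        try {
            System.out.println("copied " + copy(data, target) + " bytes");
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```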

Contributor Author

I would say "yes, for the channels being part of the JRE", and have adopted your proposal in 3c1d8af. The only risk I see (which is why I originally wrote that more complex code) is that in theory anybody can implement a custom ReadableByteChannel (as it is a public interface) that intermittently returns zero before the end of stream is reached. In that case transferFrom will also return zero, so the loop would terminate (possibly long) before EOF is reached.

So do we ignore this hypothetical case?

Contributor

The read method is specified so that "It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read". So in blocking mode read returns >=1 or -1/EOF. You are correct that a blocking implementation of RBC that returns 0 would be problematic. In the case of transferFrom, it would assume the channel is at EOF. In other usages it may cause a loop, as the caller might not expect 0. A completely broken implementation might return random values (-2 !!), so I don't think we can reliably defend against all cases. So I think the updated implementation in 3c1d8af is okay.
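
The hypothetical failure mode discussed here can be sketched as follows. `StutteringChannel` is an invented, deliberately non-conforming ReadableByteChannel that returns 0 twice although data remains; it is not part of the JDK or of this PR. Under that assumption, the simplified loop stops before all bytes are copied.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroReadChannelDemo {

    // Invented misbehaving channel: hands out at most 4 bytes per read, and
    // on its 2nd and 3rd read call returns 0 although it is not at EOF.
    static final class StutteringChannel implements ReadableByteChannel {
        private final ByteBuffer data;
        private int calls;
        private boolean open = true;

        StutteringChannel(byte[] bytes) {
            this.data = ByteBuffer.wrap(bytes);
        }

        @Override
        public int read(ByteBuffer dst) {
            calls++;
            if (calls == 2 || calls == 3) {
                return 0; // violates the blocking-mode read contract
            }
            if (!data.hasRemaining()) {
                return -1;
            }
            int n = Math.min(4, Math.min(dst.remaining(), data.remaining()));
            for (int i = 0; i < n; i++) {
                dst.put(data.get());
            }
            return n;
        }

        @Override
        public boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    // Runs the simplified transferFrom loop against the misbehaving channel
    // and returns how many bytes actually reached the file.
    static long drain(byte[] bytes, Path target) throws IOException {
        try (ReadableByteChannel src = new StutteringChannel(bytes);
             FileChannel dst = FileChannel.open(target, StandardOpenOption.WRITE)) {
            long pos = 0;
            long n;
            while ((n = dst.transferFrom(src, pos, Long.MAX_VALUE)) > 0) {
                pos += n;
            }
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = new byte[12];
        Path target = Files.createTempFile("zero-read-demo", ".bin");
        try {
            long copied = drain(bytes, target);
            System.out.println("copied " + copied + " of " + bytes.length + " bytes");
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

A channel that honours the blocking-mode guarantee quoted above never triggers this early exit, which is the argument for accepting the simpler loop.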

Contributor Author

Ah, right, seems I missed that section about the one-byte-guarantee. Great, so I kindly request approval of this PR!

Contributor

> Ah, right, seems I missed that section about the one-byte-guarantee. Great, so I kindly request approval of this PR!

I think Lance is going to help on this. We need to study the test update closely as this test has been causing a lot of issues in our CI (timeouts, hangs, ...). I think the main issue with the test right now is JDK-8278369 and it may only be on Windows. I haven't had time myself to study it closely.

Contributor Author

@AlanBateman It seems Lance has no time (or is possibly on holiday), so I kindly ask for approval again, as everything he asked me for has been fixed in this PR for a week already.

Contributor

> @AlanBateman It seems Lance has no time (or is possibly on holiday), so I kindly ask for approval again, as everything he asked me for has been fixed in this PR for a week already.

I've been busy and haven't been following the recent discussions about the test. Lance has been busy too, but I think he has been looking at the tests. He mentioned that the new TransferTo2 test actually runs TransferTo due to an issue with the @run tag. I will try to get time to look at the test soon. Once this PR is integrated, I think JDK-8278369 should be the priority; I think that issue may be Windows specific.

Contributor Author

Good catch! Silly me. Fixed in fc7d00b.

JDK-8278369: Tried several times on my Windows 11 laptop, but still cannot reproduce. Maybe Lance's boxes are simply overloaded / too slow?

Contributor

> JDK-8278369: Tried several times on my Windows 11 laptop, but still cannot reproduce. Maybe Lance's boxes are simply overloaded / too slow?

I don't know if anyone has reproduced locally, instead it seems to be random Windows machines in our CI. There are differences between Windows client and server editions that periodically cause issues, I don't know if this is the case here.

Contributor Author

@LanceAndersen AFAIK the problem was not originally caused by this PR itself but by a problem inside Windows, and recent JDKs have somehow worked around that problem in the meantime. So it should be safe to continue with this PR now. If I am wrong, please let me know. Thanks.

Contributor

@LanceAndersen LanceAndersen left a comment

We will have to separate out the added changes into their own test, as we are failing sporadically on some of the Windows boxes due to the execution time.

There is also further cleanup that can be done to the existing test. Attached is an example diff based on the proposed changes in this PR:
8278268-TransferTo-Cleanup.patch.txt

@mkarg
Contributor Author

mkarg commented Dec 16, 2021

> We will have to separate out the added changes into their own test, as we are failing sporadically on some of the Windows boxes due to the execution time.

@LanceAndersen Do I understand correctly that, on top of the diff you sent me, you also want me to move all 2GB tests into a separate test class?

@LanceAndersen
Contributor

> > We will have to separate out the added changes into their own test, as we are failing sporadically on some of the Windows boxes due to the execution time.
>
> @LanceAndersen Do I understand correctly that, on top of the diff you sent me, you also want me to move all 2GB tests into a separate test class?

The updates that you made to add an additional test are resulting in sporadic timeouts on Windows.

  • Please revert TransferTo.java to omit the changes you made to it as part of this PR
    • Yes, we should clean up TransferTo.java while we are addressing this test area
  • Add a new test file to address your changes to ChannelInputStream found in this PR

@mkarg
Contributor Author

mkarg commented Dec 16, 2021

@LanceAndersen Are you sure that the sporadic timeout comes from the changes made by this PR? IIUC the sporadic timeouts are found in the master branch, so reverting the changes from this PR does not bring a benefit. It would be better to separate out all (even existing) 2GB tests instead.

@LanceAndersen
Contributor

> @LanceAndersen Are you sure that the sporadic timeout comes from the changes made by this PR? IIUC the sporadic timeouts are found in the master branch, so reverting the changes from this PR does not bring a benefit. It would be better to separate out all (even existing) 2GB tests instead.

As I mentioned in my earlier comment, we need to make the changes as outlined. We do have some Windows boxes which are older, but we cannot afford to have sporadic test failures due to the disruption they cause. And yes, these failures are occurring with the changes made and do not occur if I back them out.

So please follow the recommendations. I can then re-test the PR to validate that we are clean (or as clean as we can be) for sporadic failures.

@mkarg
Contributor Author

mkarg commented Dec 16, 2021

@LanceAndersen Thanks for the explanation. I hope I did what you asked for. The original TransferTo test no longer contains the new tests of this PR, and there is a new TransferTo2 which contains only the new tests of this PR. Both contain your proposed cleanups. On my Windows laptop I have never had timeout problems.

@mkarg
Contributor Author

mkarg commented Dec 19, 2021

@LanceAndersen The requested changes have been done for three days. Is there anything else you want me to change?

Contributor

@LanceAndersen LanceAndersen left a comment

There is additional cleanup that can still be done. Some methods and constants are duplicated between both tests, and this could be streamlined to further reduce duplication.

I have also attached a diff with minor cleanup of the tests, in addition to the above work.

As Alan mentioned, we are seeing failures in approximately 3% of the runs on various Windows boxes. It appears to be an issue with TransferTo.testStreamContents->TransferTo.checkTransferredContents and selectableChannelOutput(), as it is blocking. I have not been able to determine the cause.

testcleanup-2.patch.txt

@mkarg
Contributor Author

mkarg commented Dec 22, 2021

@LanceAndersen I assumed the second test would exist only until we have fixed the Windows problems, so I have not invested time to reduce duplication so far. So you really want to keep the separated tests forever and want me to reduce duplication?

@LanceAndersen
Contributor

> @LanceAndersen I assumed the second test would exist only until we have fixed the Windows problems, so I have not invested time to reduce duplication so far. So you really want to keep the separated tests forever and want me to reduce duplication?

I would not make any such assumptions, especially for tests that create large files. So yes, please address the duplication. It is not a huge amount of work to create a base class that the tests extend (you will see examples elsewhere within the various test directories).

@mkarg
Contributor Author

mkarg commented Dec 22, 2021

@LanceAndersen Ok, will do that. Stay tuned. :-)

@mkarg
Contributor Author

mkarg commented Dec 23, 2021

@LanceAndersen I have refactored most of the shared code into a common super class. I deliberately left the 2GB+ tests as (mostly) duplicated code until we have sorted out the actual origin of the sporadic failures. If you do not want to wait until then, just tell me, and I will consolidate both 2GB+ test functions into a single one.

@mkarg
Contributor Author

mkarg commented Dec 23, 2021

> As Alan mentioned, we are seeing failures in approximately 3% of the runs on various Windows boxes. It appears to be an issue with TransferTo.testStreamContents->TransferTo.checkTransferredContents and selectableChannelOutput(), as it is blocking. I have not been able to determine the cause.

Given that selectableChannelOutput() uses Channels.newInputStream(), it comes to my mind that Trisha Gee reported that Channels.newInputStream() caused her apparently correct code to hang endlessly, and that the hang was gone once she replaced it with Channels.newReader(). While she reproduced that on a Mac, it might be the same cause on Windows. Trisha is currently working on a reproducer, but she has already found that her endless hang is definitely a thread waiting for the blocking lock. This sounds similar to what you wrote above. So maybe the problem is not inside of TransferTo, but inside of ChannelInputStream?

Contributor

@LanceAndersen LanceAndersen left a comment

Thank you for the current updates. We are starting to move in the right direction.

Alan and I had an offline conversation and concluded the following changes should be made:

  • Rename the base class: it is not an abstract class, so the current name is confusing
  • The base class should only contain constants and helper/utility methods. The actual tests should reside in the test classes
  • The tests should be broken out as
    • A separate test class for each large file test with a more descriptive class name
    • A primary TransferTo test class which includes the remaining tests
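
The split described in the list above could look roughly like the following sketch. All class, constant, and method names here (`TransferTestSupport`, `StreamContentsCheck`, and so on) are hypothetical and chosen only for illustration; the real tests use TestNG and different names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;

// Hypothetical shared base: constants and helper methods only, no tests.
class TransferTestSupport {
    static final int MIN_SIZE = 10_000;
    static final int MAX_SIZE_INCR = 90_000;
    static final Random RND = new Random(42);

    // Creates a random byte array with length in [min, min + maxIncrement).
    static byte[] createRandomBytes(int min, int maxIncrement) {
        byte[] bytes = new byte[min + RND.nextInt(maxIncrement)];
        RND.nextBytes(bytes);
        return bytes;
    }

    // Fails if the transferred content differs from the expected content.
    static void checkTransferred(byte[] expected, byte[] actual) {
        if (!Arrays.equals(expected, actual)) {
            throw new AssertionError("content mismatch");
        }
    }
}

// Hypothetical test class: holds the actual test and inherits the helpers.
class StreamContentsCheck extends TransferTestSupport {
    void testStreamContents() throws IOException {
        byte[] in = createRandomBytes(MIN_SIZE, MAX_SIZE_INCR);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new ByteArrayInputStream(in).transferTo(out);
        checkTransferred(in, out.toByteArray());
    }
}

public class BaseClassSketch {
    public static void main(String[] args) throws IOException {
        new StreamContentsCheck().testStreamContents();
        System.out.println("ok");
    }
}
```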

After the next set of updates, I will kick off some additional Mach5 runs

I realize that we are spending a lot of time on the tests but they are a key aspect of making sure we have adequate coverage for changes such as this and are easy to maintain and understand by future maintainers/developers

@mkarg
Contributor Author

mkarg commented Dec 24, 2021

@LanceAndersen I will do as you requested, but due to the holidays it will need a few days. Stay tuned and Merry Christmas!

@AlanBateman
Contributor

> Given that selectableChannelOutput() uses Channels.newInputStream(), it comes to my mind that Trisha Gee reported that Channels.newInputStream() caused her apparently correct code to hang endlessly, and that the hang was gone once she replaced it with Channels.newReader(). While she reproduced that on a Mac, it might be the same cause on Windows. Trisha is currently working on a reproducer, but she has already found that her endless hang is definitely a thread waiting for the blocking lock. This sounds similar to what you wrote above. So maybe the problem is not inside of TransferTo, but inside of ChannelInputStream?

I think the hang in JDK-8278369 is specific to the Unix domain socket implementation of Pipe on Windows 10+ and Windows Server 2019+. More specifically, I think the issue is that closing the sink after a large number of bytes has been written does not always cause a reader on the source side to wake up with EOF. As a test, I changed the implementation to use TCP sockets unconditionally (so it works like Windows 8 or Windows Server 2012 or 2016) and I cannot duplicate the issue. We need to look closer at this in the new year.
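
For context, the expected behavior (closing the sink makes a blocked reader on the source side see EOF) can be exercised with a small sketch like the one below. It is not a reproducer for JDK-8278369; the class name, the helper method, and the byte count are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class PipeEofDemo {

    // Writes `total` bytes to the sink on a helper thread, closes the sink,
    // and counts what the source side reads until it observes EOF (-1).
    static long writeThenReadUntilEof(int total) throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        Thread writer = new Thread(() -> {
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer buf = ByteBuffer.allocate(8192);
                int remaining = total;
                while (remaining > 0) {
                    buf.clear();
                    buf.limit(Math.min(buf.capacity(), remaining));
                    remaining -= sink.write(buf); // content is irrelevant, only the count matters
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        long read = 0;
        try (Pipe.SourceChannel source = pipe.source()) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            int n;
            // In blocking mode read returns at least 1 byte or -1 at EOF.
            while ((n = source.read(buf)) != -1) {
                read += n;
                buf.clear();
            }
        }
        writer.join();
        return read;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read " + writeThenReadUntilEof(1_000_000) + " bytes before EOF");
    }
}
```

On an affected platform, the read loop is where a hang of the kind described above would presumably show up, since the reader would never be woken with EOF.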

I looked briefly at the chat server and it has a number of issues. There is an ArrayList mutated and read from several threads without synchronization. Several problems are possible when a chat client does not keep up: this will eventually cause the chat server to hang (a thread blocked on the PrintWriter), and all other threads will eventually block trying to echo the messages. So from a quick look, I don't think it has anything to do with the issues we are discussing here.

@mkarg
Contributor Author

mkarg commented Dec 25, 2021

@AlanBateman Thanks a lot for investing your time. At least it is good news for TransferTo. ;-)

@trishagee Can you please double-check that your problem actually is solved solely by replacing streams by writers?

mkarg added 2 commits March 12, 2023 15:21
Signed-off-by: Markus Karg <markus@headcrashing.eu>
@mkarg
Contributor Author

mkarg commented Mar 12, 2023

> Before anything else, you might consider merging the current master into your branch. Aside from the copyright year change of the most recent commit, the branch is over two and a half months out of date.

I have rebased this PR branch on top of the current master branch, so it is up to date now.

@openjdk

openjdk bot commented Mar 12, 2023

@mkarg Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@mkarg
Contributor Author

mkarg commented Mar 12, 2023

@bplb Before I start to refactor anything, it makes sense to briefly discuss your recent proposal:

> In the test classes TransferTo and TransferTo2, the methods testNullPointerException and testStreamContents are identical aside from expecting different DataProviders. The contents of the methods could be abstracted out to common methods called by different Tests which expect different DataProviders.

This is correct and I have no objections to that, but on the other hand, there is no need to keep the TransferTo2 class separate, as the reason for the existence of TransferTo2 was a bug somewhere else in OpenJDK that does not exist anymore (for reference, the bug was in the Pipe implementation on Windows). Due to that fact, Lance proposed to merge these files:

> I haven't gone through this in detail, but we want to merge the remaining TransferTo2 tests into TransferTo.

So can you please briefly confirm that I shall not merge TransferTo2 into TransferTo as Lance proposed (hence TransferTo2 will not be a separate class anymore) but instead I only shall abstract testNullPointerException and testStreamContents as you proposed (hence keep TransferTo2 as a separate class)? Thanks.

> ...I think the two tests *_2GB* should remain separate as they likely each have a sufficiently long run time.

I confirm that we had intentionally kept these two tests in separate classes due to the long run time of each, so I agree to keep them as separate classes, but I will reduce duplicate code.

I need to mention that the file names should not be changed as you proposed, as *_transferFrom is actually not testing the method transferFrom; it is testing the implementation of transferTo, which internally uses transferFrom. Hence it would be wrong w.r.t. the general test file naming convention to rename the test to TransferFrom.

So can you please briefly confirm that I am correct with this objection and the 2GB files should not be renamed?

@bplb
Member

bplb commented Mar 17, 2023

> So can you please briefly confirm that I shall not merge TransferTo2 into TransferTo as Lance proposed (hence TransferTo2 will not be a separate class anymore) but instead I only shall abstract testNullPointerException and testStreamContents as you proposed (hence keep TransferTo2 as a separate class)? Thanks.

That is fine for now. It will be possible to revisit merging at a later time if it appears reasonable.

> > ...I think the two tests *_2GB* should remain separate as they likely each have a sufficiently long run time.
>
> I confirm that we had intentionally kept these two tests in separate classes due to the long run time of each, so I agree to keep them as separate classes, but I will reduce duplicate code.

Sounds good.

> So can you please briefly confirm that I am correct with this objection and the 2GB files should not be renamed?

That is fine.

@mkarg
Contributor Author

mkarg commented Mar 18, 2023

@bplb I am done with the requested changes. There is no duplicated code anymore. :-)

@mkarg
Contributor Author

mkarg commented Mar 31, 2023

@bplb Your requested changes have been done for a week. Kindly asking to proceed with this PR. Thanks. :-)

@bplb
Member

bplb commented Mar 31, 2023

> @bplb Your requested changes have been done for a week. Kindly asking to proceed with this PR. Thanks. :-)

Indeed I was intending to approve this afternoon. I read through it all and the tests look much better now.

@openjdk

openjdk bot commented Mar 31, 2023

@mkarg This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8278268: (ch) InputStream returned by Channels.newInputStream should have fast path for FileChannel targets

Reviewed-by: bpb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1026 new commits pushed to the master branch:

  • abfb900: 8304028: Port fdlibm IEEEremainder to Java
  • a565be4: 8297605: improve DelayQueue removal method javadoc
  • cccb019: 8304928: Optimize ClassDesc.resolveConstantDesc
  • bdbf8fc: 8303930: Fix ConstantUtils.skipOverFieldSignature void case return value
  • 4a5d7ca: 8305227: [s390x] build broken after JDK-8231349
  • dae1ab3: 8304844: JFR: Missing disk parameter in ActiveRecording event
  • e012685: 8305066: [JVMCI] guarantee(ik->is_initialized()) failed: java/lang/Long$LongCache must be initialized
  • fe42312: 8304820: Statically allocate ObjectSynchronizer mutexes
  • 2f36eb0: 8305323: Update java/net/httpclient/ContentLengthHeaderTest.java to use new HttpTestServer factory methods
  • 049b953: 8305223: IGV: mark osr compiled graphs with [OSR] in the name
  • ... and 1016 more: https://git.openjdk.org/jdk/compare/d85243f02b34d03bd7af63a5bcbc73f500f720df...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@LanceAndersen, @bplb) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 31, 2023
@mkarg
Contributor Author

mkarg commented Mar 31, 2023

Thanks a lot for mentoring this PR, Brian. :-)

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 31, 2023
@openjdk

openjdk bot commented Mar 31, 2023

@mkarg
Your change (at version 8b7425b) is now ready to be sponsored by a Committer.

@bplb
Member

bplb commented Apr 3, 2023

/sponsor

@openjdk

openjdk bot commented Apr 3, 2023

Going to push as commit 40aea04.
Since your change was applied there have been 1044 commits pushed to the master branch:

  • 9b9b5a7: 8302323: Add repeat methods to StringBuilder/StringBuffer
  • dd7ca75: 8305478: [REDO] disable gtest/NMTGtests.java sub-tests failing due to JDK-8305414
  • f9827ad: 8288109: HttpExchangeImpl.setAttribute does not allow null value after JDK-8266897
  • 6010de0: 8305417: disable gtest/NMTGtests.java sub-tests failing due to JDK-8305414
  • 127afd3: 8241613: Suspicious calls to MacroAssembler::null_check(Register, offset)
  • 33d09e5: 8305247: On RISC-V generate_fixed_frame() sometimes generate a relativized locals value which is way too large
  • 790aced: 8305100: [REDO] Clean up JavadocTokenizer
  • 2e91585: 8303123: Add line break opportunity to single type parameters
  • 094e03d: 8299718: JavaDoc: Buttons to copy specific documentation URL are not accessible
  • 4de24cd: 8303210: [linux, Windows] Make UseSystemMemoryBarrier available as product flag
  • ... and 1034 more: https://git.openjdk.org/jdk/compare/d85243f02b34d03bd7af63a5bcbc73f500f720df...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 3, 2023
@openjdk openjdk bot closed this Apr 3, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Apr 3, 2023
@openjdk

openjdk bot commented Apr 3, 2023

@bplb @mkarg Pushed as commit 40aea04.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mkarg mkarg deleted the 8278268 branch April 3, 2023 17:12
Labels
integrated Pull request has been integrated nio nio-dev@openjdk.org
5 participants