-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8278268 - (ch) InputStream returned by Channels.newInputStream should have fast path for FileChannel targets #6711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back mkarg! A progress list of the required criteria for merging this PR into |
Webrevs
|
3d65823
to
2579cef
Compare
dst.write(bb); | ||
} | ||
bb.clear(); | ||
pos += bytesRead; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been busy so only getting to this PR now.
Since we are guaranteed that src is configured blocking then the only remaining reason for transferFrom to return 0 is when EOF is reached. Can we drop the temporary buffer and just replace the buffer in the block with:
long n;
while ((n = dst.transferFrom(src, pos, Long.MAX_VALUE)) > 0) {
pos += n;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say "yes, for the channels being part of the JRE", and have adopted your proposal in 3c1d8af. The only risk I see (and hence I wrote that originally complex code) is that in theory anybody is technically able to implement a custom ReadableByteChannel
(as it is a public interface) which does return zero intermittently while the end of stream is not reached. transferFrom
in that case will also return zero. Hence the loop would terminate (possibly long) before EOF is reached.
So do we ignore this hypothetical case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The read method is specified so that "It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read". So in blocking mode it means that read returns >=1 or -1/EOF. You are correct that a blocking of implementation of RBC that returns 0 would be problematic. In the case of transferFrom then it would assume the channel it at EOF. In other usages it may cause a loop as the caller might not expect 0. A completely broken implementation might return random values (-2 !!) so I don't think we can reliably defend against all cases. So I think the updated implementation in 3c1d8af is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, seems I missed that section about the one-byte-guarantee. Great, so I kindly request approval of this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, seems I missed that section about the one-byte-guarantee. Great, so I kindly request approval of this PR!
I think Lance is going to help on this. We need to study the test update closely as this test has been causing a lot of issues in our CI (timeouts, hangs, ...). I think the main issue with the test right now is JDK-8278369 and it may only be on Windows. I haven't had time myself to study it closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlanBateman Seems Lance has no time (or is possibly on holidays) so I kindly ask for approval again, as everything he asked me for is fixed in this PR since a week already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlanBateman Seems Lance has no time (or is possibly on holidays) so I kindly ask for approval again, as everything he asked me for is fixed in this PR since a week already.
I've been busy and haven't been following the recent discussions about the test. Lance has been busy too but I think he has been looking at the tests. He mentioned that the new TransferTo2 tests actually runs TransferTo due to an issue with the @run tag. I will try to get time to look at the test soon. Once this PR is in integrated then I think JDK-8278369 should be priority, I think that issue may be Windows specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Silly me. Fixed in fc7d00b.
JDK-8278369: Tried several times on my Windows 11 laptop, but still cannot reproduce. Maybe Lance's boxes are simply overloaded / too slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JDK-8278369: Tried several times on my Windows 11 laptop, but still cannot reproduce. Maybe Lance's boxes are simply overloaded / too slow?
I don't know if anyone has reproduced locally, instead it seems to be random Windows machines in our CI. There are differences between Windows client and server editions that periodically cause issues, I don't know if this is the case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LanceAndersen AFAIK the problem was not originally caused by this PR itself, but in fact by a problem inside of Windows, and recent JDKs workaround that problem somehow meanwhile. So it should be safe to continue with this PR now. If I am wrong please let me know. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will have to separate out the added changes into its own test as we are failing sporadically on some of the windows boxes due to the execution time.
There is also further cleanup that can be done to existing test. Attached is an example diff based on the proposed changes in this PR
8278268-TransferTo-Cleanup.patch.txt
@LanceAndersen Do I understand correct, ontop of the diff you sent me, you also want me to strip all 2GB tests into a separate test class, correct? |
The updates that you made to add an additional test is resulting in sporadic timeouts on Windows.
|
@LanceAndersen Are you sure that the sporadic timeout comes from the changes made by this PR? IIUC then the sporadic timeouts are found in the master branch, so reverting the changes from this PR do not bring a benefit. It would be better to separate out all (even existing) 2GB tests instead. |
As I mentioned in my earlier comment, we need to make the changes as outlined. We do have some Windows boxes which are older but we cannot afford to have sporadic test failures due to the disruption it causes. And yes, these failures are occurring with the changes made and do not occur if I back them out. So please follow the recommendations, I can then re-test the PR to validate we are clean (or as clean as we ca)n for sporadic failures. |
@LanceAndersen Thanks for the explanation, I (so I hope) did what you asked me for. The original TransferTo test now does not contain the new tests of this PR anymore, but there is a new TransferTo2 which only contain the new tests of this PR. Both contains your proposed cleanups. On my Windows laptop I never have nor had timeout problems. |
@LanceAndersen Requested changes are done since three days. Anything else you want me to change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is additional clean up that can still be done. There is duplication of some methods and constants between both tests and this could be streamlined to further reduce duplication
I have also attached a diff of minor cleanup with the tests as they are additional cleanup to the above work.
As Alan mentioned we are seeing failures in approximately 3% of the runs on various windows boxes. It appears to be an issue with TransferTo.testStreamContents->TransferTo.checkTransferredContents and selectableChannelOutput() as it blocking. I have not been able to determine the cause.
@LanceAndersen I assumed the second test will exist only for the time until we fixed the Windows problems, so I did not invest time to reduce duplication so far. So you really want to keep the separated tests forever and want me to reduce duplication? |
I would not make any such assumptions especially for tests that create large files. So yes please address the duplication. It is not a huge amount of work to create a base class that the tests extend (you will see examples elsewhere within the various test directories). |
@LanceAndersen Ok, will do that. Stay tuned. :-) |
@LanceAndersen I have refactored most of the shared code into a common super class. I deliberately left the 2GB+ tests as (mostly) duplicated code until we sorted out the actual origin of the sporadic failures. If you do not want to wait until then, just tell me, so I will consolidate both 2GB+ test functions into a single one. |
Looking at the fact that selectableChannelOutput() utilizes Channels.newInputStream() it comes to my mind that Trisha Gee reported that Channels.newInputStream() produced an endless hanging of her apparently correct code, and it was gone once she replaced it by Channels.newReader(). While she reproduced that on a Mac, it might be the same cause even on Windows. Trisha currently is working on a reproducer, but she already detected that her endless hanging is definitively a thread waiting for the blocking lock. This sounds similar to what you wrote above. So maybe the problem is not inside of TransferTo, but inside of ChannelInputStream? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the current updates. We are starting to move in the right direction.
Alan and I had an offline conversation and concluded the following changes should be made:
- Rename the base class as it is not an Abstract class so the name is confusing
- The base class should only contain constants and helper/utility methods. The actual tests should reside in the test classes
- That the tests should be broken out as
- A separate test class for each large file test with a more descriptive class name
- A primary TransferTo test class which includes the remaining tests
After the next set of updates, I will kick off some additional Mach5 runs
I realize that we are spending a lot of time on the tests but they are a key aspect of making sure we have adequate coverage for changes such as this and are easy to maintain and understand by future maintainers/developers
@LanceAndersen I will do as you requested, but due to the holidays it will need a few days. Stay tuned and Merry Christmas! |
I think the hang in JDK-8278369 is specific to the Unix domain socket implementation of Pipe on Windows 10+ and Windows Server 2019+. More specifically, I think the issue is closing the sink after a large number of bytes has been written does not always cause a reader on the source side to wakeup with EOF. As a test, I changed the implementation to use TCP sockets unconditional (so it works like Windows 8 or Windows Server 2012 or 2016) and I cannot duplicate the issue. We need to look closer at this in the new year. I looked briefly at the chat server and it has a number of issues. There is a ArrayList mutated and read from several threads without synchronization. There are several issues that are possible when a chat client does not keep up, this will eventually cause the chat server to hang (a thread blocked on the PrintWriter) and all other threads will eventually blocking trying to echo the messages. So from a quick look, I don't think it is anything to do with the issues we are discussing here. |
@AlanBateman Thanks a lot for investing your time. At least it is good news for TransferTo. ;-) @trishagee Can you please double-check that your problem actually is solved solely by replacing streams by writers? |
Signed-off-by: Markus Karg <markus@headcrashing.eu>
I have rebased this PR branch ontop of the current master branch, so it is up-to-date-now. |
@mkarg Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information. |
@bplb Before I stark to refactor anything, it makes sense to briefly discuss your recent proposal:
This is correct and I have no objections to that, but on the other hand, there is no need to keep the separate
So can you please briefly confirm that I shall not merge
I confirm that we had intentionally kept these two tests in separate classes due to the long run time of each, so I agree to keep them as separate classes, but I will reduce duplicate code. I need to mention that the file names should not be changed according to you proposed, as So can you please briefly confirm that I am correct with this objection and the 2GB files should not be renamed? |
That is fine for now. It will be possible to revisit merging at a later time if it appears reasonable.
Sounds good.
That is fine. |
@bplb I am done with the requested changes. There is no duplicated code anymore. :-) |
@bplb Your requested changes are done since one week. Kindly asking to proceed with this PR. Thanks. :-) |
Indeed I was intending to approve this afternoon. I read through it all and the tests look much better now. |
@mkarg This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 1026 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@LanceAndersen, @bplb) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Thanks a lot for mentoring this PR, Brian. :-) /integrate |
/sponsor |
Going to push as commit 40aea04.
Your commit was automatically rebased without conflicts. |
This sub-issue defines the work to be done to implement JDK-8265891 solely for the particular case of FileChannel.transferFrom(ReadableByteChannel), including special treatment of SelectableByteChannel.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/6711/head:pull/6711
$ git checkout pull/6711
Update a local copy of the PR:
$ git checkout pull/6711
$ git pull https://git.openjdk.org/jdk.git pull/6711/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6711
View PR using the GUI difftool:
$ git pr show -t 6711
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/6711.diff