8274548: (fc) FileChannel gathering write fails with IOException "Invalid argument" on macOS 11.6 #5831

Closed
wants to merge 7 commits into from

Member

@bplb bplb commented Oct 6, 2021

On macOS, handle the case where writev() is given an array of struct iovec the sum of whose iov_len fields overflows INT_MAX.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8274548: (fc) FileChannel gathering write fails with IOException "Invalid argument" on macOS 11.6

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5831/head:pull/5831
$ git checkout pull/5831

Update a local copy of the PR:
$ git checkout pull/5831
$ git pull https://git.openjdk.java.net/jdk pull/5831/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5831

View PR using the GUI difftool:
$ git pr show -t 5831

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5831.diff


@bridgekeeper bridgekeeper bot commented Oct 6, 2021

👋 Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Member Author

@bplb bplb commented Oct 6, 2021

The writev() function on Linux is specified to fail if:

     EINVAL The sum of the iov_len values in the iov array would
            overflow an ssize_t.

and similarly on macOS:

     [EINVAL] The sum of the iov_len values in the iov array
              overflows a 32-bit integer.

On Linux, writev() was observed not to fail for this case but to return 0x7ffff000, the maximum number of bytes that sendfile() can transfer. This of course might not be the case on all Linux variants.

The error could be handled by a code change in IOUtil, but then it would affect all platforms. Also, on macOS the problem has only been observed on one version of the operating system, so the added native code would not always be called even though it is compiled. It would, however, probably be harmless to conditionally compile the code for Linux as well as macOS.

Testing has verified the fix on macOS 11.6 and earlier macOS versions with no effect (of course) on Linux.
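
The EINVAL condition quoted above comes down to simple arithmetic: the segment lengths are summed, and the total no longer fits in a signed 32-bit integer. A minimal sketch of that check follows; the class `IovLengthCheck` and its methods are hypothetical names used for illustration only and are not part of the JDK or of this PR.

```java
// Hypothetical helper for illustration; not JDK code.
final class IovLengthCheck {

    /** Returns the exact sum of the segment lengths, computed in 64 bits. */
    static long totalLength(long[] lengths) {
        long sum = 0;
        for (long len : lengths) {
            // addExact throws ArithmeticException if even 64 bits overflow.
            sum = Math.addExact(sum, len);
        }
        return sum;
    }

    /** True if the total would not fit in a signed 32-bit integer. */
    static boolean exceedsInt(long[] lengths) {
        return totalLength(lengths) > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Three 1 GiB segments, as a gathering write of three large buffers might present.
        long[] lengths = { 1L << 30, 1L << 30, 1L << 30 };
        System.out.println(totalLength(lengths));       // 3221225472
        System.out.println(exceedsInt(lengths));        // true
        System.out.println((int) totalLength(lengths)); // -1073741824 (wrapped)
    }
}
```

Two 1 GiB segments are already enough to cross the limit, since 2^31 is one more than Integer.MAX_VALUE.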

@openjdk openjdk bot added the rfr label Oct 6, 2021

@openjdk openjdk bot commented Oct 6, 2021

@bplb The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio label Oct 6, 2021

@mlbridge mlbridge bot commented Oct 6, 2021

Contributor

@AlanBateman AlanBateman commented Oct 6, 2021

We really need someone from Apple here to tell us if this is a man page issue or a bug in macOS 11.6. If I read it correctly, the issue doesn't occur with 10.15.x and the beta of macOS 12, is that correct? If we do need to put a workaround in for this then I think it will need an overhaul of IOUtil first. I'd prefer not to have the iovec array modified in two places.

Member Author

@bplb bplb commented Oct 6, 2021

V1 moves the logic up to the Java level and removes all native code changes.


@mlbridge mlbridge bot commented Oct 6, 2021

Mailing list message from Brian Burkhalter on nio-dev:

On Oct 5, 2021, at 11:55 PM, Alan Bateman <alanb at openjdk.java.net> wrote:

> If I read it correctly, the issue doesn't occur with 10.15.x and the beta of macOS 12, is that correct?

That is correct: the change in behavior was seen on 11.6 only; 10.15.x and 12-beta behave identically and do not manifest it.

Contributor

@AlanBateman AlanBateman commented Oct 7, 2021

The latest version impacts all platforms. I think the first step here is to make contact with Apple and try to detect if this is an intended change in macOS 11.6 or not. The observation that they reverted it again in macOS 12 beta suggests it was unintentional but I think we need to find out more.

Member Author

@bplb bplb commented Oct 7, 2021

Not affecting all platforms is why it was in the C code in the first place.

Apple indicates it was an intentional change in macOS 11.x where x >= 0. I don't know yet about 12.

Member Author

@bplb bplb commented Oct 7, 2021

My original evaluation on macOS 12-beta was incorrect: I re-tested and the problem occurs there as well. Therefore we are seeing it on macOS x.y.z where x >= 11.

Contributor

@AlanBateman AlanBateman commented Oct 8, 2021

One approach is to add a writevMax method alongside iovMax that returns the maximum permitted value of the sum of the iov_len fields. If this is merged with your v2 then it might not be too bad.
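
The idea of a writevMax-style limit can be sketched as follows: given the bytes remaining in each buffer, hand a single gathering write only as many bytes as the limit allows, truncating the last included segment and dropping the rest for a later call. The class `GatherLimit` and the method `clamp` are hypothetical names; this is an illustration of the technique under those assumptions, not the code that was integrated in IOUtil.

```java
import java.util.Arrays;

// Hypothetical sketch; the real logic lives in sun.nio.ch.IOUtil and differs in detail.
final class GatherLimit {

    /**
     * Returns the per-segment lengths to pass to one gathering write so that
     * their sum does not exceed maxTotal. Segments past the limit are omitted;
     * the segment that crosses the limit is shortened.
     */
    static long[] clamp(long[] remaining, long maxTotal) {
        long budget = maxTotal;
        long[] lens = new long[remaining.length];
        int n = 0;
        for (long rem : remaining) {
            if (budget == 0) {
                break; // limit reached: leave the rest for the next write
            }
            long len = Math.min(rem, budget);
            lens[n++] = len;
            budget -= len;
        }
        return Arrays.copyOf(lens, n);
    }

    public static void main(String[] args) {
        long[] remaining = { 1L << 30, 1L << 30, 1L << 30 };
        // With a limit of Integer.MAX_VALUE, only the first two segments are
        // used and the second is shortened by one byte.
        System.out.println(Arrays.toString(clamp(remaining, Integer.MAX_VALUE)));
    }
}
```

Because a gathering write may legitimately transfer fewer bytes than requested, a caller that loops until its buffers are drained would pick up the omitted bytes on a subsequent call.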

Member Author

@bplb bplb commented Oct 8, 2021

v4 was verified to pass the test included in this PR on Linux-aarch64, Linux-x64, macOS 10.15.7 and 11.6, and Windows (Server 2016).

// [EINVAL] The sum of the iov_len values in the iov array
// overflows a 32-bit integer.
//
int darwin_version = get_darwin_version();
Contributor

@AlanBateman AlanBateman Oct 9, 2021

What would you think about dropping the OS version check and just returning Integer.MAX_VALUE on macOS? That would align with the EINVAL documented in the man page on all versions.

In passing, an inconsistency has crept into the native code. In some places we are using ifdef __APPLE__ and in others ifdef MACOSX. We should probably clean this up.

Member Author

@bplb bplb Oct 11, 2021

I'm fine with dropping the OS version check on macOS and removing the Darwin version function.

I also noticed the inconsistency in the __APPLE__ and MACOSX symbolic constants. There's also another one like _ALLBSD_SOURCE. I don't know whether we consider that redundant as well.

Contributor

@AlanBateman AlanBateman Oct 11, 2021

> I'm fine with dropping the OS version check on macOS and removing the Darwin version function.

I think it will align with the man page and remove any questions as to why this is macOS version specific.

> I also noticed the inconsistency in the __APPLE__ and MACOSX symbolic constants. There's also another one like _ALLBSD_SOURCE. I don't know whether we consider that redundant as well.

We can look at this in a separate issue.

Member Author

@bplb bplb Oct 11, 2021

... and dropping the OS version check leaves fewer moving parts.

On the conditional compilation symbolic constants I filed JDK-8275070.
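
The outcome of this review thread, a limit that depends only on the platform and not on the OS version, can be pictured with the sketch below. It is an illustration only: the PR makes this decision in native code through conditional compilation, whereas this sketch inspects an `os.name` string in Java, and the class `WritevMaxSketch` plus the choice of Long.MAX_VALUE for other platforms are assumptions made here.

```java
import java.util.Locale;

// Hypothetical illustration; the PR decides this in native code, not via os.name.
final class WritevMaxSketch {

    /**
     * Returns the assumed maximum total byte count for one gathering write:
     * Integer.MAX_VALUE on macOS regardless of version, otherwise no
     * practical limit (an assumption of this sketch).
     */
    static long writevMax(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        boolean mac = os.contains("mac") || os.contains("darwin");
        return mac ? Integer.MAX_VALUE : Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(writevMax(System.getProperty("os.name")));
    }
}
```

Keying on the platform alone matches the man page wording on every macOS release and avoids having to track which versions enforce the limit.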


@openjdk openjdk bot commented Oct 11, 2021

@bplb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274548: (fc) FileChannel gathering write fails with IOException "Invalid argument" on macOS 11.6

Reviewed-by: alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 76 new commits pushed to the master branch:

  • 75f5145: 8274925: Shenandoah: shenandoah/TestAllocHumongousFragment.java test failed on lock rank check
  • 83c3771: 8273881: Metaspace: test repeated deallocations
  • 3f01d03: 8275021: Test serviceability/sa/TestJmapCore.java fails with: java.io.IOException: Stack frame 0x4 not found
  • 3f07337: 8273614: Shenandoah: intermittent timeout with ConcurrentGCBreakpoint tests
  • 0d80f6c: 8274379: Allow process of unsafe access errors in check_special_condition_for_native_trans
  • b870468: 8274347: Passing a nested switch expression as a parameter causes an NPE during compile
  • 110e38d: 8274753: ZGC: SEGV in MetaspaceShared::link_shared_classes
  • b7af890: 8274430: Remove some debug error printing code added in JDK-8017163
  • aaf2401: 8274927: Remove unnecessary G1ArchiveAllocator code
  • c55dd36: 8275008: gtest build failure due to stringop-overflow warning with gcc11
  • ... and 66 more: https://git.openjdk.java.net/jdk/compare/bb0bab57a1ff447bfb41cfe10c91838a6812b93d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Oct 11, 2021
Member Author

@bplb bplb commented Oct 12, 2021

/integrate


@openjdk openjdk bot commented Oct 12, 2021

Going to push as commit 07b1f1c.
Since your change was applied there have been 91 commits pushed to the master branch:

  • f623460: 8274911: testlibrary_tests/ir_framework/tests/TestIRMatching.java fails with "java.lang.RuntimeException: Should have thrown exception"
  • e393c5e: 8275074: Cleanup unused code in JFR LeakProfiler
  • e16b93a: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers
  • 1ab6414: 8275051: Shenandoah: Correct ordering of requested gc cause and gc request flag
  • b460d6d: 8275091: /src/jdk.management.jfr/share/classes/module-info.java has non-canonical order
  • d04d4ee: 8274894: Use Optional.empty() instead of ofNullable(null) in HttpResponse.BodySubscribers.discarding
  • 33050f8: 8274986: max code printed in hs-err logs should be configurable
  • 8de2636: 8274615: Support relaxed atomic add for linux-aarch64
  • 7d2633f: 8275002: Remove unused AbstractStringBuilder.MAX_ARRAY_SIZE
  • cfe7471: 8177814: jdk/editpad is not in jdk TEST.groups
  • ... and 81 more: https://git.openjdk.java.net/jdk/compare/bb0bab57a1ff447bfb41cfe10c91838a6812b93d...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Oct 12, 2021
@openjdk openjdk bot added integrated and removed ready labels Oct 12, 2021
@openjdk openjdk bot removed the rfr label Oct 12, 2021

@openjdk openjdk bot commented Oct 12, 2021

@bplb Pushed as commit 07b1f1c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@bplb bplb deleted the FileChannel-writev-8274548 branch Oct 12, 2021
Labels
integrated nio
2 participants