This repository has been archived by the owner. It is now read-only.

JDK-8265369 [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory" #44

Closed
wants to merge 5 commits into from

Conversation

msheppar
Contributor

@msheppar msheppar commented Jun 14, 2021

JDK-8265369 [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory"

The test java/net/MulticastSocket/Promiscuous.java has been observed to fail on a regular basis on macosx-aarch64.
This is typically under heavy test load on a test machine. Analysis of the problem has
shown that the setsockopt call for joining a multicast group will intermittently fail with ENOMEM.

While analysis of the test environment shows significant memory usage and some memory pressure, it is
not excessive, and as such it is deemed a transient or temporary condition. A retry of the
setsockopt system call has been seen to mitigate the issue. This adds to the stability of the
Promiscuous.java test and reduces test failure noise.

The proposed fix is in open/src/java.base/unix/native/libnet/PlainDatagramSocketImpl.c
in the mcast_join_leave function. That is, if the setsockopt call to join an mcast group fails with errno == ENOMEM,
then the setsockopt system call for joining the mcast group is re-invoked.
The change has been applied under conditional compilation.
Additionally, this change results in the Promiscuous.java test being removed from
ProblemList.txt.

Please oblige and review the changes for a fix of the issue JDK-8265369
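
For readers who want the retry idea in isolation, here is a minimal sketch. It is not the code in PlainDatagramSocketImpl.c: the names retry_once_on_enomem, attempt_fn, flaky_state and flaky_attempt are hypothetical, and the setsockopt call is replaced by a callback so the behaviour can be exercised without a socket. It assumes a single retry, as described above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: one attempt, returns 0 or -1 with errno set. */
typedef int (*attempt_fn)(void *ctx);

/*
 * Run the operation once; if it fails with ENOMEM, run it exactly one more
 * time. Any other failure, or a second failure, is returned unchanged.
 */
static int retry_once_on_enomem(attempt_fn attempt, void *ctx) {
    int res = attempt(ctx);
    if (res < 0 && errno == ENOMEM) {
        res = attempt(ctx);
    }
    return res;
}

/* Test double standing in for setsockopt(): fails with ENOMEM N times. */
struct flaky_state {
    int failures_left;
    int calls;
};

static int flaky_attempt(void *ctx) {
    struct flaky_state *st = ctx;
    st->calls++;
    if (st->failures_left > 0) {
        st->failures_left--;
        errno = ENOMEM;
        return -1;
    }
    return 0;
}
```

With one simulated ENOMEM the helper succeeds on the second attempt; with two, the second failure is reported to the caller, which matches the "retry once, then throw" behaviour described in this PR.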


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8265369: [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory"

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk17 pull/44/head:pull/44
$ git checkout pull/44

Update a local copy of the PR:
$ git checkout pull/44
$ git pull https://git.openjdk.java.net/jdk17 pull/44/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 44

View PR using the GUI difftool:
$ git pr show -t 44

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk17/pull/44.diff

… failed with "SocketException: Cannot allocate memory

The test java/net/MulticastSocket/Promiscuous.java has been observed to fail on a regular basis on macosx-aarch64.
This is typically under heavy test load on a test machine. Analysis of the problem has
shown that the setsockopt call for joining a multicast group will intermittently fail with ENOMEM.

The proposed fix is in open/src/java.base/unix/native/libnet/PlainDatagramSocketImpl.c
in the mcast_join_leave function. The change has been applied under conditional compilation.
Additionally, this change results in the Promiscuous.java test being removed from
ProblemList.txt.
@bridgekeeper

bridgekeeper bot commented Jun 14, 2021

👋 Welcome back msheppar! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jun 14, 2021

@msheppar The following label will be automatically applied to this pull request:

  • net

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the net net-dev@openjdk.java.net label Jun 14, 2021
@msheppar msheppar changed the title JDK-8265369 [macos-aarch64] java/net/MulticastSocket/Promiscuous.java… [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory" Jun 14, 2021
msheppar added 2 commits Jun 14, 2021
… failed with "SocketException: Cannot allocate memory"

Remove trailing white spaces
… failed with "SocketException: Cannot allocate memory"

Remove another trailing white spaces
@msheppar msheppar changed the title [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory" JDK-8265369 [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory" Jun 14, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 14, 2021
@mlbridge

mlbridge bot commented Jun 14, 2021

Webrevs

}
}
} else {
#endif
Member

@ChrisHegarty ChrisHegarty Jun 15, 2021

The handling of ENOMEM here is consistent with how it is handled in the NIO area - good.

I wonder if a little restructuring may simplify the call flow? For example, something similar to:

  int n;
  ...
  /*
   * Join the multicast group.
   */
   n = setsockopt(fd, IPPROTO_IP, (join ? IP_ADD_MEMBERSHIP:IP_DROP_MEMBERSHIP),
                          (char *) &mname, mname_len);
#ifdef __APPLE__
   // workaround macOS bug where IP_ADD/DROP_MEMBERSHIP fails intermittently
   if (n < 0 && errno == ENOMEM) {
      n = setsockopt(fd, IPPROTO_IP, (join ? IP_ADD_MEMBERSHIP:IP_DROP_MEMBERSHIP),
                                          (char *) &mname, mname_len);
   }
#endif

  if (n < 0) {  ...

Contributor Author

@msheppar msheppar Jun 15, 2021

yes, I'll do that ... originally I did this but reverted to current change to retain the current structure of the file
thanks for the suggestion, it is much neater

… failed with "SocketException: Cannot allocate memory"

amendments as per suggestion from Chris Hegarty
dfuch
dfuch approved these changes Jun 17, 2021
Member

@dfuch dfuch left a comment

Thanks for the update Mark. It makes the code much easier to follow.

@openjdk

openjdk bot commented Jun 17, 2021

@msheppar This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8265369: [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory"

Reviewed-by: dfuchs, michaelm, chegar

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 48 new commits pushed to the master branch:

  • 7ed3634: 8268405: Several regressions 4-17% after CHA changes
  • 483f1ee: 8268678: LetsEncryptCA.java test fails as Let’s Encrypt Authority X3 is retired
  • 80dc262: 8265500: Some impls of javax.crypto.Cipher.init() do not throw UnsupportedOperationExc for unsupported modes
  • 9130b8a: 8268371: C2: assert(_gvn.type(obj)->higher_equal(tjp)) failed: cast_up is no longer needed
  • 8545269: 8268676: assert(!ik->is_interface() && !ik->has_subklass()) failed: inconsistent klass hierarchy
  • c98d508: 8268265: MutableSpaceUsedHelper::take_sample() hits assert(left >= right) failed: avoid overflow
  • b66001a: 8268971: ProblemList tools/jpackage/windows/WinInstallerIconTest.java on win-x64
  • 0011b52: 8264843: Javac crashes with NullPointerException when finding unencoded XML in <pre> tag
  • 2047da7: 8265297: javax/net/ssl/SSLSession/TestEnabledProtocols.java failed with "RuntimeException: java.net.SocketException: Connection reset"
  • 091bc4a: 8268353: Test libsvml.so is and is not present in jdk image
  • ... and 38 more: https://git.openjdk.java.net/jdk17/compare/e39346e708a06cdee2b9a096f08c1cfe2e21dfc2...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 17, 2021
@@ -1837,6 +1837,9 @@ static void mcast_join_leave(JNIEnv *env, jobject this,
     jint fd;
     jint family;
     jint ipv6_join_leave;
+#ifdef __APPLE__
+    int res;
+#endif

Member

@ChrisHegarty ChrisHegarty Jun 17, 2021

res will need to be declared unconditionally, no? ( since it is used on all platforms )

Contributor Author

@msheppar msheppar Jun 17, 2021

yes ... good spot 👍 ..... as testing has been focused on macos, this error didn't show
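
To illustrate the point about the declaration, here is a small sketch. The names call_with_platform_retry, syscall_fn, always_ok and always_einval are hypothetical. The result variable is declared outside any #ifdef because the assignment and the error check are compiled on every platform; only the retry is guarded. Had the declaration been left inside #ifdef __APPLE__, a non-macOS build would fail at the first use of res.

```c
#include <errno.h>

/* Hypothetical stand-in for a system call; returns 0 or -1 with errno set. */
typedef int (*syscall_fn)(void);

static int call_with_platform_retry(syscall_fn call) {
    int res;   /* declared unconditionally: used on all platforms */

    res = call();
#ifdef __APPLE__
    /* Only this retry is platform-specific. */
    if (res < 0 && errno == ENOMEM) {
        res = call();
    }
#endif
    if (res < 0) {
        return -1;   /* common error path, compiled everywhere */
    }
    return 0;
}

/* Trivial callbacks used to exercise both paths. */
static int always_ok(void) { return 0; }
static int always_einval(void) { errno = EINVAL; return -1; }
```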

         if (join) {
-            NET_ThrowCurrent(env, "setsockopt " S_ADD_MEMBERSHIP " failed");
+                NET_ThrowCurrent(env, "setsockopt " S_ADD_MEMBERSHIP " failed");
         } else {
Member

@ChrisHegarty ChrisHegarty Jun 17, 2021

Accidental indentation here? ( four spaces added in front of NET_ThrowCurrent )

-        if (setsockopt(fd, IPPROTO_IPV6, (join ? ADD_MEMBERSHIP : DRP_MEMBERSHIP),
-                       (char *) &mname6, sizeof (mname6)) < 0) {
+        res = setsockopt(fd, IPPROTO_IPV6, (join ? ADD_MEMBERSHIP : DRP_MEMBERSHIP),
+                       (char *) &mname6, sizeof (mname6));

Member

@Michael-Mc-Mahon Michael-Mc-Mahon Jun 17, 2021

Seems to be an extraneous space after sizeof here.

Member

@Michael-Mc-Mahon Michael-Mc-Mahon Jun 17, 2021

A general question. Is "APPLE" the preferred macro name or MACOSX? Not a big deal but MACOSX looks slightly more common. [Seems github has removed the underscores from APPLE]

Contributor Author

@msheppar msheppar Jun 17, 2021

thanks ... I'll attend to the spacing
I used APPLE as it was already in place for a change to set socket buffer size
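
As background to the macro question: __APPLE__ is predefined by compilers targeting Apple platforms, whereas MACOSX is not a compiler built-in and is assumed here to be supplied by the build (for example via a -D flag). A small sketch, with the hypothetical function name macos_guard_in_effect, that reports which guard a given translation unit actually sees:

```c
#include <string.h>

/* Report which of the two macOS guards is visible to this translation unit. */
static const char *macos_guard_in_effect(void) {
#if defined(__APPLE__) && defined(MACOSX)
    return "both";
#elif defined(__APPLE__)
    return "__APPLE__";
#elif defined(MACOSX)
    return "MACOSX";
#else
    return "neither";
#endif
}
```

Either guard selects the same code when both are defined; the difference only matters for a file compiled outside the usual build flags, where the compiler-provided macro would still be present.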

… failed with "SocketException: Cannot allocate memory"

remove #ifdef __APPLE__ from int res; declaration, remove space and fix indentation (as per comments from CH and M3)
dfuch
dfuch approved these changes Jun 18, 2021
@msheppar
Contributor Author

msheppar commented Jun 18, 2021

/integrate

@openjdk

openjdk bot commented Jun 18, 2021

Going to push as commit d8a0582.
Since your change was applied there have been 59 commits pushed to the master branch:

  • 21abcc4: 8268564: mark hotspot serviceability/attach tests which ignore external VM flags
  • f83c6b8: 8268531: mark SDTProbesGNULinuxTest as ignoring external VM flags
  • 8366c69: 8268541: mark hotspot serviceability/sa tests which ignore external VM flags
  • 5b19898: 8268563: mark hotspot serviceability/jvmti tests which ignore external VM flags
  • 2f65d40: 8268599: mark hotspot runtime/sealedClasses tests which ignore external VM flags
  • 3e1dc0a: 8268598: mark hotspot runtime/stringtable tests which ignore external VM flags
  • 58eddc8: 8268594: runtime/handshake tests don't need WhiteBox after AOT removal
  • 9f4f039: 8268596: mark hotspot runtime/verifier tests which ignore external VM flags
  • 4006fe7: 8268597: mark hotspot runtime/symboltable tests which ignore external VM flags
  • 8ccb76e: 8268601: mark hotspot runtime/records tests which ignore external VM flags
  • ... and 49 more: https://git.openjdk.java.net/jdk17/compare/e39346e708a06cdee2b9a096f08c1cfe2e21dfc2...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jun 18, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 18, 2021
@openjdk

openjdk bot commented Jun 18, 2021

@msheppar Pushed as commit d8a0582.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
integrated Pull request has been integrated net net-dev@openjdk.java.net
4 participants