8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message #19314

Closed
wants to merge 9 commits into from

Conversation

sendaoYan
Member

@sendaoYan sendaoYan commented May 20, 2024

Hi all,
The test test/jdk/java/nio/channels/DatagramChannel/Loopback.java fails intermittently; the failure probability is about 2/1000.
This PR adds some entropy (System.nanoTime) to the message content. If the received payload doesn't match what was sent, the test ignores it, because it is probably a stray packet sent by another process. Only if the port does not match and the content matches do we consider it a failure.
The change has been verified locally: I ran the test 20k times with this patch and all runs passed. Only the test case is changed, so there is no risk.
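
A minimal sketch of the idea described above, not the actual patch: the sender tags its payload with System.nanoTime(), and the receiver compares the received bytes against what was sent so that stray datagrams from other processes can be skipped. The class and method names (StrayPacketFilter, uniquePayload, isOwnMessage) are hypothetical and do not come from Loopback.java.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the JDK test.
class StrayPacketFilter {

    // Builds a payload that is unlikely to collide with traffic from other processes.
    static ByteBuffer uniquePayload() {
        String text = "hello " + System.nanoTime();
        return ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
    }

    // Returns true if the received bytes equal the bytes that were sent.
    // ByteBuffer.equals compares the remaining content of both buffers,
    // so the received buffer is expected to be flipped before this call.
    static boolean isOwnMessage(ByteBuffer sent, ByteBuffer received) {
        return sent.equals(received);
    }

    public static void main(String[] args) {
        ByteBuffer sent = uniquePayload();
        // Simulates the datagram being looped back unchanged.
        ByteBuffer echo = ByteBuffer.wrap(sent.array().clone());
        // Simulates a stray datagram from another process using a fixed payload.
        ByteBuffer stray = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));

        System.out.println("echo is ours:  " + isOwnMessage(sent, echo));
        System.out.println("stray is ours: " + isOwnMessage(sent, stray));
    }
}
```

In a receive loop, a datagram for which isOwnMessage returns false would simply be ignored and the loop would wait for the next one.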


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19314/head:pull/19314
$ git checkout pull/19314

Update a local copy of the PR:
$ git checkout pull/19314
$ git pull https://git.openjdk.org/jdk.git pull/19314/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19314

View PR using the GUI difftool:
$ git pr show -t 19314

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19314.diff

Webrev

Link to Webrev Comment

…rmittent

Signed-off-by: sendaoYan <yansendao.ysd@alibaba-inc.com>
@bridgekeeper

bridgekeeper bot commented May 20, 2024

👋 Welcome back syan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented May 20, 2024

@sendaoYan This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message

Reviewed-by: dfuchs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 435 new commits pushed to the master branch:

  • 3b3a19e: 8335314: Problem list compiler/uncommontrap/DeoptReallocFailure.java
  • d457609: 8319947: Recursive lightweight locking: s390x implementation
  • c47a0e0: 8334147: Shenandoah: Avoid taking lock for disabled free set logging
  • 308a812: 8334645: Un-problemlist vmTestbase/nsk/sysdict/vm/stress/chain/chain007/chain007.java
  • b4df380: 8334763: --enable-asan: assert(_thread->is_in_live_stack((address)this)) failed: not on stack?
  • cd46c87: 8334843: RISC-V: Fix wraparound checking for r_array_index in lookup_secondary_supers_table_slow_path
  • 4e8cbf8: 8335134: Test com/sun/jdi/BreakpointOnClassPrepare.java timeout
  • 3b1ca98: 8334895: OpenJDK fails to configure on linux aarch64 when CDS is disabled after JDK-8331942
  • c35e58a: 8309634: Resolve CONSTANT_MethodRef at CDS dump time
  • 243bae7: 8304693: Remove -XX:-UseVtableBasedCHA
  • ... and 425 more: https://git.openjdk.org/jdk/compare/86eb5d9f3be30ff9df1318f18ab73c7129c978f6...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dfuch) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr Pull request is ready for review label May 20, 2024
@openjdk

openjdk bot commented May 20, 2024

@sendaoYan The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio nio-dev@openjdk.org label May 20, 2024
@mlbridge

mlbridge bot commented May 20, 2024

@AlanBateman
Contributor

No objection to adding the keyword. A quick look at the logs hints that this may be a kernel issue rather than a JDK issue, e.g. here is one of the failures in the logs you attached:

UNSPEC socket
join ff02:0:0:0:0:0:0:a @ veth87b01b2
set outgoing multicast interface to veth87b01b2
IP_MULTICAST_LOOP enabled
send /[0:0:0:0:0:0:0:0]:50287 -> /[ff02:0:0:0:0:0:0:a]:50287
received java.nio.HeapByteBuffer[pos=5 lim=100 cap=100] from /[fe80:0:0:0:8cf3:1bff:fe85:f8e6%5]:50287
IP_MULTICAST_LOOP disabled
send /[0:0:0:0:0:0:0:0]:50287 -> /[ff02:0:0:0:0:0:0:a]:50287
selected 1
received java.nio.HeapByteBuffer[pos=5 lim=5 cap=100] from /[fe80:0:0:0:8cf3:1bff:fe85:f8e6%5]:50287
STDERR:
java.lang.RuntimeException: Unexpected message

I assume veth87b01b2 is a virtual interface. When IP_MULTICAST_LOOP is enabled the 5-byte multicast datagram is looped as expected. When IP_MULTICAST_LOOP is disabled then it appears to be looped again. So this may require digging into the kernel to better understand this. It would be interesting to know if it's always IPv6 where the failures happen.
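
For reference, a minimal sketch of how the option discussed above is toggled through the standard NIO API. It only sets StandardSocketOptions.IP_MULTICAST_LOOP on a DatagramChannel and reads it back; it does not join a group or send anything, so it cannot reproduce the kernel behaviour seen in the log. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.DatagramChannel;

// Hypothetical helper for illustration only; not part of the JDK test.
class MulticastLoopOption {

    // Opens a fresh channel, sets IP_MULTICAST_LOOP, and returns the value read back.
    static boolean setAndReadLoop(boolean enabled) {
        try (DatagramChannel dc = DatagramChannel.open()) {
            dc.setOption(StandardSocketOptions.IP_MULTICAST_LOOP, enabled);
            return dc.getOption(StandardSocketOptions.IP_MULTICAST_LOOP);
        } catch (IOException e) {
            // Wrapped so callers do not need to declare a checked exception.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("requested enabled,  read back: " + setAndReadLoop(true));
        System.out.println("requested disabled, read back: " + setAndReadLoop(false));
    }
}
```

With the option disabled, a datagram sent to a group the sending socket has joined is not expected to be delivered back to that socket, which is the expectation the second half of the log above violates.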

@sendaoYan
Member Author

sendaoYan commented May 21, 2024

I reproduced this intermittent failure on an ECS, so veth87b01b2 is a virtual interface. I will disable IPv6 and try to reproduce the failure. If the failure cannot be reproduced with IPv6 disabled, then maybe the failure always happens with IPv6.

@sendaoYan
Member Author

sendaoYan commented May 28, 2024

The GHA test runner reports two failures:

  1. The macos-x64 build fails: the clang14 compiler crashes when compiling ad_x86.cpp. It's unrelated to this PR.
  2. ObjectMonitorUsage.java fails, which has been recorded in JDK-8332923. It's unrelated to this PR.

@bridgekeeper

bridgekeeper bot commented Jun 25, 2024

@sendaoYan This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@sendaoYan
Member Author

Hi, can anyone review this PR?

@openjdk

openjdk bot commented Jun 26, 2024

⚠️ @sendaoYan This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@dfuch
Member

dfuch commented Jun 26, 2024

Please give me some time to test the proposed changes.

@sendaoYan
Member Author

Please give me some time to test the proposed changes.

Okay.

@RealCLanger
Contributor

Hi, can you maybe refrain from adding the intermittent keyword to the test? In our test setup at SAP this would mean that the test is not run any longer.

@sendaoYan
Member Author

sendaoYan commented Jun 27, 2024

Hi, can you maybe refrain from adding the intermittent keyword to the test? In our test setup at SAP this would mean that the test is not run any longer.

@dfuch I ran this test 20k times with this patch and all runs passed, so maybe the @key intermittent tag is no longer needed.

@dfuch
Member

dfuch commented Jun 27, 2024

@dfuch I ran this test 20k times with this patch and all runs passed, so maybe the @key intermittent tag is no longer needed.

Fine to remove the intermittent label then. We will need to change the title of the PR and the JBS issue to reflect what this change is about though.

@sendaoYan
Member Author

/issue JDK-8332535

@openjdk openjdk bot changed the title 8332535: Mark java/nio/channels/DatagramChannel/Loopback.java as intermittent 8332535: Discard any message not expect in Loopback.java Jun 27, 2024
@dfuch
Member

dfuch commented Jun 27, 2024

/issue JDK-8332535

@openjdk

openjdk bot commented Jun 27, 2024

@dfuch Only the author (@sendaoYan) is allowed to issue the /issue command.

@dfuch
Member

dfuch commented Jun 27, 2024

Actually - maybe we should use JDK-8269657 for this change, since it fixes the test failure. You can grab it from @Michael-Mc-Mahon, I don't think he will mind :-)
Otherwise we'll just close it as a duplicate of JDK-8332535 - but it is strange to see an issue closed as a duplicate of its subtask.

FWIW - I have launched some repeated tests in our CI - so far this looks promising :-)

@sendaoYan
Member Author

Actually - maybe we should use JDK-8269657 for this change, since it fixes the test failure. You can grab it from @Michael-Mc-Mahon, I don't think he will mind :-) Otherwise we'll just close it as a duplicate of JDK-8332535 - but it is strange to see an issue closed as a duplicate of its subtask.

FWIW - I have launched some repeated tests in our CI - so far this looks promising :-)

Okay.
/issue JDK-8269657

@openjdk

openjdk bot commented Jun 27, 2024

@sendaoYan
Adding additional issue to issue list: 8269657: java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message.

@sendaoYan
Member Author

/issue remove JDK-8332535

@openjdk

openjdk bot commented Jun 27, 2024

@sendaoYan The issue 8332535 was not found in the list of additional solved issues. The list currently contains these issues: 8269657

@sendaoYan sendaoYan changed the title 8332535: Discard any message not expected in Loopback.java 8269657: Test Loopback.java failed: Unexpected message Jun 27, 2024
@sendaoYan
Member Author

/issue JDK-8269657

@openjdk openjdk bot changed the title 8269657: Test Loopback.java failed: Unexpected message 8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message Jun 28, 2024
@openjdk

openjdk bot commented Jun 28, 2024

@sendaoYan This issue is referenced in the PR title - it will now be updated.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 28, 2024
@sendaoYan
Member Author

Thanks for the patient guidance and review.
/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 28, 2024
@openjdk

openjdk bot commented Jun 28, 2024

@sendaoYan
Your change (at version fb5bdc2) is now ready to be sponsored by a Committer.

@dfuch
Member

dfuch commented Jun 28, 2024

Hi @sendaoYan - thanks for integrating all the feedback. I can sponsor this change - unless you already have a sponsor?

@sendaoYan
Member Author

Hi @sendaoYan - thanks for integrating all the feedback. I can sponsor this change - unless you already have a sponsor?

No, I need your sponsorship, thanks.

@dfuch
Member

dfuch commented Jun 28, 2024

/sponsor

@openjdk

openjdk bot commented Jun 28, 2024

Going to push as commit c798316.
Since your change was applied there have been 437 commits pushed to the master branch:

  • 99d2bbf: 8334433: jshell.exe runs an executable test.exe on startup
  • 6f4ddc2: 8335142: compiler/c1/TestTraceLinearScanLevel.java occasionally times out with -Xcomp
  • 3b3a19e: 8335314: Problem list compiler/uncommontrap/DeoptReallocFailure.java
  • d457609: 8319947: Recursive lightweight locking: s390x implementation
  • c47a0e0: 8334147: Shenandoah: Avoid taking lock for disabled free set logging
  • 308a812: 8334645: Un-problemlist vmTestbase/nsk/sysdict/vm/stress/chain/chain007/chain007.java
  • b4df380: 8334763: --enable-asan: assert(_thread->is_in_live_stack((address)this)) failed: not on stack?
  • cd46c87: 8334843: RISC-V: Fix wraparound checking for r_array_index in lookup_secondary_supers_table_slow_path
  • 4e8cbf8: 8335134: Test com/sun/jdi/BreakpointOnClassPrepare.java timeout
  • 3b1ca98: 8334895: OpenJDK fails to configure on linux aarch64 when CDS is disabled after JDK-8331942
  • ... and 427 more: https://git.openjdk.org/jdk/compare/86eb5d9f3be30ff9df1318f18ab73c7129c978f6...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 28, 2024
@openjdk openjdk bot closed this Jun 28, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jun 28, 2024
@openjdk

openjdk bot commented Jun 28, 2024

@dfuch @sendaoYan Pushed as commit c798316.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@sendaoYan sendaoYan deleted the jbs8332535 branch June 28, 2024 09:44
@sendaoYan
Member Author

Thanks for sponsoring.
