8030616: sun/management/jmxremote/bootstrap/RmiBootstrapTest fails intermittently with cannot find a free port #10322
👋 Welcome back jpai! A progress list of the required criteria for merging this PR into |
@jaikiran The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
While the change is reasonable, I’m not sure about your rationale. This change doesn’t alter this race condition. There are a number of significant attributes to the port allocation strategy: The objective with using SO_REUSEADDR on a serversocket is to allow for efficient server restarts due to Thus, this has a number of favourable consequences with your proposed change, as the address binding is the As such, the change may reduce the rate of intermittent failure, but it won’t solve the issue fully. Nonetheless, it is a reasonable change, provided there is no subtle change in test semantics. I think it |
Hello Mark, this specific change is merely meant to address the specific test case which is running into this problem. As you note the free port identification logic has and will continue to have the race condition. It will be applicable to all tests that use that utility. This change doesn't propose to do anything for such cases. However, in this specific case it wasn't the race condition which was causing the issue. Data that was collected from those failures shows that the issue was because of using the wrong address while trying to bind the free port that was identified. The data collected from these failed runs has been added to the linked JBS issue as a comment to show the port usage. |
@jaikiran This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Please keep open. Waiting for reviews on this one. Mark and I ran some experiments related to this issue and we did conclude that this change will help the issue at hand. @msheppar, please correct me if that's not an accurate summary. |
This looks okay to me.
Thanks,
Serguei
@jaikiran This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 8 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
Hello Serguei, Thank you for your review. I was about to integrate this when I just noticed that I had unintentionally included a new empty file in this commit. I've now updated this PR to remove that stray file (and no other changes). Could you please review the current state of this PR once more? |
Two reviews are required in the Serviceability area. |
Thank you again Serguei for the review.
I wasn't aware of that. Will certainly wait. I have triggered some tests in our internal CI system too to make sure the latest runs too are clean. |
Yes, I think we found the change will be beneficial.
Thank you Mark for the review. The CI tests for this came back fine. |
/integrate |
Going to push as commit 8b010e0.
Your commit was automatically rebased without conflicts. |
Can I please get a review of this test-only change, which proposes to fix the recent intermittent failures in RmiBootstrapTest reported in https://bugs.openjdk.org/browse/JDK-8030616?

The test has been intermittently failing with "cannot find a free port after 10 tries". The test uses the jdk.test.lib.Utils.getFreePort() utility method to get a free port to use in the tests. That port is then used to create and bind a JMX connector server. The issue is that the getFreePort utility uses the loopback address to identify a free port, whereas the JMX connector server uses that identified port to bind to a non-loopback address: there's logic in sun.rmi.transport.tcp.TCPEndpoint line 117 which calls InetAddress.getLocalHost(), which can and does return a non-loopback address. This effectively means that the port that was identified as free (on loopback) may not really be free on some other address, and a subsequent attempt by the connector server to bind to it will end up with a BindException.

Data collected from failures on the CI system has shown that this is indeed the case: the port that was chosen as free (on loopback) was already in use by another process on a different address. The test additionally has logic to retry up to a maximum of 10 times to find a new free port on such a BindException. It turns out the next free port (on loopback) returned by jdk.test.lib.Utils.getFreePort() is incremental, and it so happens that the systems where this failed had a process listening on a range of 10 to 12 ports, so these retries too ended up with a BindException when the JMX connector server used that port for a different address.

The commit here makes sure that the JMX connector server in the test is now configured to use the loopback address as the address to bind to. That way the test uses the correct address (loopback) on which the port was free.

The commit also has a change to the javadoc of the test utility jdk.test.lib.Utils.getFreePort() to clarify that it returns a free port on the loopback address. This now matches what the implementation of that method does.

Multiple test runs after this change haven't yet run into the failure.
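For readers who want to see the idea in isolation, here is a minimal sketch using plain sockets. It is not the code from this commit and does not touch the JMX connector server. The class LoopbackPortSketch and both method names are hypothetical, chosen only for illustration. The point it shows is that the address used to probe for a free port and the address used for the later bind should be the same one (here, loopback).

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; not part of the JDK test library or this PR.
public class LoopbackPortSketch {

    // Probe for a free port on the loopback address: bind to port 0 so the
    // OS picks an ephemeral port, read the port back, then release it.
    // The port is only known to be free on loopback, and only at this moment.
    public static int findFreePortOnLoopback() throws IOException {
        try (ServerSocket probe = new ServerSocket()) {
            probe.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            return probe.getLocalPort();
        }
    }

    // Bind a server socket on the same address that was probed (loopback),
    // rather than on whatever address InetAddress.getLocalHost() resolves to.
    public static ServerSocket bindOnLoopback(int port) throws IOException {
        ServerSocket server = new ServerSocket();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        return server;
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePortOnLoopback();
        try (ServerSocket server = bindOnLoopback(port)) {
            System.out.println("Bound to " + server.getLocalSocketAddress());
        }
    }
}
```

As noted in the review discussion, a probe-then-bind approach like this still has a race between releasing the probed port and binding it again, so the sketch only removes the address mismatch, not the race.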
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10322/head:pull/10322
$ git checkout pull/10322
Update a local copy of the PR:
$ git checkout pull/10322
$ git pull https://git.openjdk.org/jdk pull/10322/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10322
View PR using the GUI difftool:
$ git pr show -t 10322
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10322.diff