Skip to content

Conversation

@kevinjwalls
Copy link
Contributor

@kevinjwalls kevinjwalls commented Feb 23, 2022

Test fails occasionally due to a port clash.
Presumably the port that was returned by Utils.getFreePort(), is no longer free.
The test creates a ProcessBuilder with the parameters for JMX, including port number, and uses that to create a new Process.
It should retry with a new port if we fail due to a port in use, for some limited number of attempts.

main already has some retry logic, but not working:
it checks for an InvocationTargetException to contain a BindException, but it simply gets a BindException, thrown by TestAppRun.start().
TestAppRun.start() runs the new process and scans for errors, but on failure its predicate has only seen the first line of a failure, so a BindException is never recognised and thrown.
Also main does not limit the retries, and handling the port retry in main() is duplicated, for each run of the test method.

So...

Make the error-scanning predicate in TestAppRun recognise a "port in use" message and throw a BindExeption. This is a notification to the caller that it failed, it's not the actual BindException as that was thrown in a different process.
Make the testDefaultAgent method (the main part of the test) handle retrying with a new port, a limited number of times.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8206187: javax/management/remote/mandatory/connection/DefaultAgentFilterTest.java fails with Port already in use

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7589/head:pull/7589
$ git checkout pull/7589

Update a local copy of the PR:
$ git checkout pull/7589
$ git pull https://git.openjdk.java.net/jdk pull/7589/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7589

View PR using the GUI difftool:
$ git pr show -t 7589

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7589.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 23, 2022

👋 Welcome back kevinw! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 23, 2022

@kevinjwalls The following labels will be automatically applied to this pull request:

  • jmx
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org jmx jmx-dev@openjdk.org labels Feb 23, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 23, 2022
@mlbridge
Copy link

mlbridge bot commented Feb 23, 2022

Webrevs

@msheppar
Copy link

This is reasonable approach to take and should ameliorate the BindException failure condition.
But, as it is using the same port allocation strategy which is in current use, then it will be still be
suspect to intermittent BindException failures. This will be especially true, as with the current failures,
there is a lot of network testing activity with concurrent network test execution and the MACH5 test framework itself e.g. logstash etc.

An alternative is to use a fixed port 1098, which previously was the default for RMI Activation daemon.
This is now available as activation has been removed from the java platform in JDK17.

@alexmenkov
Copy link

The test uses warm-up predicate in a strange way - it returns true for any output from child process.
The main purpose of the predicate is to wait until some expected output is produced, and return false to skip other lines (for example in some environments VM may print some warnings).
For this test it may be "main enter" (printed from TestApp.main()).
To handle "port in use" error the predicate can search for "bindexception" or "port already in use" line.
I don't think check for "exception" and "error" makes much sense here.

@kevinjwalls
Copy link
Contributor Author

An alternative is to use a fixed port 1098

Thanks Mark - I will avoid that fixed slot as we no doubt run tests concurrently, and also in case these get backported far enough that it's not free. 8-)

Utils.getFreePort() lets new ServerSocket choose a port, but there's clearly a race to use it. We could make it simply random, but I think we need still need to retry like this to avoid failures -- will monitor and see how the race goes with 10 attempts.

@kevinjwalls
Copy link
Contributor Author

The test uses warm-up predicate in a strange way

Thanks Alex - yes so that's why we only see one line in the predicate, it contains the word "exception" and the predicate returns true, signalling that the process is done starting up. 8-)
But if you do see an exception, you're never going to see the text from the app that it has started OK, so I see why it was returning true always - we were either seeing an error, or otherwise we presume TestApp has started.
But it returned true after seeing the first line, which does not contain "bindexception", so it never saw the BindException later to recognise the port being in use.

Anyway...
I will update shortly to hopefully clarify the predicate part. Still good to keep main() simpler and not have duplicated retry logic there.

private static final String TEST_APP_NAME = "TestApp";
private static final int FREE_PORT_ATTEMPTS = 10;

private static void testDefaultAgent(String propertyFile) throws Exception {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good refactoring change as it also removes some broken logic in the current retry strategy

@openjdk
Copy link

openjdk bot commented Feb 24, 2022

@kevinjwalls This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8206187: javax/management/remote/mandatory/connection/DefaultAgentFilterTest.java fails with Port already in use

Reviewed-by: msheppar, amenkov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 23 new commits pushed to the master branch:

  • bf19fc6: 8280357: user.home = "?" when running with systemd DynamicUser=true
  • b6843a1: 8005885: enhance PrintCodeCache to print more data
  • 23995f8: 8281525: Enable Zc:strictStrings flag in Visual Studio build
  • 20e78f7: 8282307: Parallel: Incorrect discovery mode in PCReferenceProcessor
  • 0b6862e: 8282348: Remove unused CardTable::dirty_card_iterate
  • 6fab8a2: 8277204: Implement PAC-RET branch protection on Linux/AArch64
  • abc0ce1: 8282316: Operation before String case conversion
  • 0796620: 8281944: JavaDoc throws java.lang.IllegalStateException: ERRONEOUS
  • 231e48f: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace after JDK-8280422
  • f4486a1: 8262400: runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java fails in test_ame5_compiled_vtable_stub with wrapper
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/5035bf5e6cb0ae2892e128b9a7c4014d01addb26...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 24, 2022
@kevinjwalls
Copy link
Contributor Author

Thanks for the reviews!

Apologies, one more update - I had a break in there earlier to terminate the retry loop, and I removed it accidentally when changing things to make it fail.
The retry loop has to let the testDefaultAgent method complete normally (which needs the break), or by throwing an Exception.
Both looking good now.

@msheppar
Copy link

A good spot by yourself ... as the focus was on the BindException correction and the pass condition is an Exception being thrown, it was easy to miss the break for a failure condition ... a similar issue existed in the current test albeit only if an InvocationTargetException with a nested BindException was thrown and the retry resulted in the
InvalidClassException or UnmarcshalledException !!

@kevinjwalls
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Feb 25, 2022

Going to push as commit cd36be4.
Since your change was applied there have been 23 commits pushed to the master branch:

  • bf19fc6: 8280357: user.home = "?" when running with systemd DynamicUser=true
  • b6843a1: 8005885: enhance PrintCodeCache to print more data
  • 23995f8: 8281525: Enable Zc:strictStrings flag in Visual Studio build
  • 20e78f7: 8282307: Parallel: Incorrect discovery mode in PCReferenceProcessor
  • 0b6862e: 8282348: Remove unused CardTable::dirty_card_iterate
  • 6fab8a2: 8277204: Implement PAC-RET branch protection on Linux/AArch64
  • abc0ce1: 8282316: Operation before String case conversion
  • 0796620: 8281944: JavaDoc throws java.lang.IllegalStateException: ERRONEOUS
  • 231e48f: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace after JDK-8280422
  • f4486a1: 8262400: runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java fails in test_ame5_compiled_vtable_stub with wrapper
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/5035bf5e6cb0ae2892e128b9a7c4014d01addb26...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 25, 2022
@openjdk openjdk bot closed this Feb 25, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 25, 2022
@openjdk
Copy link

openjdk bot commented Feb 25, 2022

@kevinjwalls Pushed as commit cd36be4.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integrated Pull request has been integrated jmx jmx-dev@openjdk.org serviceability serviceability-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

3 participants