Skip to content

Conversation

@demery-pivotal
Copy link
Contributor

AvailablePort and AvailablePortHelper have methods to retrieve an
"available" port from the membership port range. This feature is both
unnecessary and inherently unreliable.

INHERENTLY UNRELIABLE

The default membership port range overlaps each OS's ephemeral port
range. An ephemeral port deemed to be "available" by one of these
methods can be put into use by another process even before the method
returns. It is not safe to rely on such a port to remain available after
it is checked.

UNNECESSARY

Currently the only callers that request ports from the membership port
range are two tests in DistributedSystemDUnitTest. In each case, the
test uses the returned ports to set the membership port range for a
member. There is no need for the ports to come from any particular
range.

There are no other uses of this feature, either in the product or in
tests.

SOLUTION

Remove the "membership port range" feature from AvailablePort and
AvailablePortHelper.

Change DistributedSystemDUnitTest to get available ports from the
"available port range." This range is safe to use (assuming all other
concurrently-running tests are well-behaved).

`AvailablePort` and `AvailablePortHelper` have methods to retrieve an
"available" port from the membership port range. This feature is both
unnecessary and inherently unreliable.

INHERENTLY UNRELIABLE

The default membership port range overlaps each OS's ephemeral port
range. An ephemeral port deemed to be "available" by one of these
methods can be put into use by another process even before the method
returns. It is not safe to rely on such a port to remain available after
it is checked.

UNNECESSARY

Currently the only callers that request ports from the membership port
range are two tests in `DistributedSystemDUnitTest`. In each case, the
test uses the returned ports to *set* the membership port range for a
member. There is no need for the ports to come from any particular
range.

There are no other uses of this feature, either in the product or in
tests.

SOLUTION

Remove the "membership port range" feature from `AvailablePort` and
`AvailablePortHelper`.

Change `DistributedSystemDUnitTest` to get available ports from the
"available port range." This range is safe to use (assuming all other
concurrently-running tests are well-behaved).
@demery-pivotal demery-pivotal added windows add this label to get Windows JDK11 PR checks too jdk8 add this label to get Linux JDK8 PR checks too labels Oct 12, 2021
@demery-pivotal demery-pivotal marked this pull request as ready for review October 13, 2021 16:25
@demery-pivotal
Copy link
Contributor Author

I'm confident that the distributed test failure in QueryConfigurationServiceConstraintsDistributedTest is unrelated to my changes. The AuthenticationRequiredException's initial cause has a stack trace similar to the one in https://issues.apache.org/jira/browse/GEODE-9689.

* @throws IllegalArgumentException <code>protocol</code> is unknown
*/
public static int getRandomAvailablePort(int protocol, InetAddress addr) {
return getRandomAvailablePort(protocol, addr, false);
Copy link
Contributor

@Bill Bill Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't getRandomAvailablePort(int protocol) be removed?

I think the only use of this method is in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, in another PR.

@Bill
Copy link
Contributor

Bill commented Oct 14, 2021

What about AvailablePort.main()? Is that entrypoint used anywhere?

@Bill
Copy link
Contributor

Bill commented Oct 14, 2021

I understand your "INHERENTLY UNRELIABLE" analysis. What I don't understand is: why does that argument apply only to ports in the membership port range? Seems to me that any method that purports to return an "available port range" is misguided. What am I missing here?

@demery-pivotal
Copy link
Contributor Author

I understand your "INHERENTLY UNRELIABLE" analysis. What I don't understand is: why does that argument apply only to ports in the membership port range? Seems to me that any method that purports to return an "available port range" is misguided. What am I missing here?

The key is that the membership port range overlaps with each operating system's ephemeral port range. The operating system is free to give those ports to anyone who asks for an ephemeral port. Many tests launch members in a way that uses ephemeral ports, making the risk significantly higher in CI than in normal use.

A port in any other range (such as the "available port" range) is likely to be put into use only if a caller specifically asks for that port. We have a fairly strong convention to use those ports only after checking their availability. And the way tests check availability significantly minimizes the risk of conflict. The risk is minimized by the build system allocating a distinct range of ports to each concurrently-executing test JVM, and by AvailablePortHelper allocating those ports round-robin instead of randomly.

@demery-pivotal
Copy link
Contributor Author

What about AvailablePort.main()? Is that entrypoint used anywhere?

My understanding (perhaps out of date) is that some developers uses it for unknown purposes.

@Bill
Copy link
Contributor

Bill commented Oct 14, 2021

Ah so whereas the test in question, for no apparent reason, used to try make membership use a port range within the default membership port range (which also happened to lie in the ephemeral port range), with this PR the test has membership use ports from from the (non-ephemeral) port range specified by our test-running infrastructure.

Copy link
Contributor

@Bill Bill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one @dhemery !

Approved ✓

@demery-pivotal demery-pivotal merged commit 4a6b917 into apache:develop Oct 14, 2021
@demery-pivotal demery-pivotal deleted the geode-9725/available-port branch October 14, 2021 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jdk8 add this label to get Linux JDK8 PR checks too windows add this label to get Windows JDK11 PR checks too

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants