@demery-pivotal

Make nearly all tests use AvailablePortHelper instead of AvailablePort
to obtain ports. See the rationale below.

Also: Remove unused methods from AvailablePort.

Rationale:

AvailablePort is inherently risky as a source of ports for tests.
Each "get available port" method obtains candidate port numbers from the
desired range by randomly sampling with replacement. This means that
multiple calls can return the same port number if the port is not put
into use between calls.
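
The repeat behavior can be reproduced with a minimal sketch. This is
not the AvailablePort code; RandomPortSampler and its methods are
hypothetical names, and the sketch assumes candidates are drawn
uniformly and independently from an inclusive range.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Illustrative sketch only: RandomPortSampler is a hypothetical class,
// not the AvailablePort implementation.
final class RandomPortSampler {
    private final int lower;
    private final int upper; // inclusive
    private final Random random;

    RandomPortSampler(int lower, int upper, Random random) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        this.lower = lower;
        this.upper = upper;
        this.random = random;
    }

    // Each call samples independently of earlier calls (sampling with
    // replacement), so a port returned before can be returned again.
    int nextCandidate() {
        return lower + random.nextInt(upper - lower + 1);
    }

    // Number of calls made up to and including the first repeated candidate.
    int callsUntilRepeat() {
        Set<Integer> seen = new HashSet<>();
        int calls = 1;
        while (seen.add(nextCandidate())) {
            calls++;
        }
        return calls;
    }
}
```

Nothing in the sampler remembers earlier results, so the only thing
that prevents a repeat is the caller binding the port before asking
again.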

Some tests failed intermittently because they made multiple calls to
AvailablePort, received the same port on multiple calls, and unknowingly
attempted to bind multiple sockets to the same port number, resulting in
a BindException. See GEODE-6622 for examples.

AvailablePortHelper does not have this problem. It obtains candidate
port numbers round robin. After returning an available port,
AvailablePortHelper will not return that port again in that JVM until it
has tested every other port in the range for availability.
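
A round robin search of this kind might look like the sketch below. It
is not the AvailablePortHelper code; RoundRobinPortAllocator is a
hypothetical name, and the availability check is passed in as a
predicate (an assumption made so the sketch stays self-contained) where
a real implementation would probe the port.

```java
import java.util.function.IntPredicate;

// Illustrative sketch only: RoundRobinPortAllocator is a hypothetical
// class, not the AvailablePortHelper implementation.
final class RoundRobinPortAllocator {
    private final int lower;
    private final int upper; // inclusive
    private final IntPredicate isAvailable; // assumed availability probe
    private int next;

    RoundRobinPortAllocator(int lower, int upper, int start, IntPredicate isAvailable) {
        if (lower > upper || start < lower || start > upper) {
            throw new IllegalArgumentException("start must lie within [lower, upper]");
        }
        this.lower = lower;
        this.upper = upper;
        this.next = start;
        this.isAvailable = isAvailable;
    }

    // Walks the range in order, wrapping at the upper bound. The cursor
    // always advances past the returned port, so that port is not offered
    // again until every other port in the range has been tested.
    synchronized int nextAvailablePort() {
        int size = upper - lower + 1;
        for (int i = 0; i < size; i++) {
            int candidate = next;
            next = (next == upper) ? lower : next + 1;
            if (isAvailable.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no available port in range");
    }
}
```

Because the cursor is the only state, consecutive calls in one JVM
cannot return the same port unless the range has been exhausted in
between.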

To reduce the chance of different JVMs selecting each other's ports,
AvailablePortHelper selects a random starting point for its round robin
search in each JVM.

For distributed tests, DUnit further arranges for the
AvailablePortHelper in each JVM to start its round robin search in a
distinct place, maximally distant from the starting points of all other
JVMs. Because AvailablePort selects randomly from the full port range,
it cannot benefit from this technique.
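
One way to pick maximally distant starting points is to space them
evenly across the range. The sketch below is an assumption about how
that could be done, not a description of what DUnit does; StartingPoints
and evenlySpaced are hypothetical names.

```java
// Illustrative sketch only: StartingPoints is a hypothetical class and
// does not reflect DUnit's actual scheme.
final class StartingPoints {
    private StartingPoints() {}

    // Returns one starting port per JVM, spaced evenly over the inclusive
    // range [lower, upper], so each JVM's round robin search begins as far
    // as possible from its neighbors.
    static int[] evenlySpaced(int lower, int upper, int jvmCount) {
        int size = upper - lower + 1;
        if (size < 1 || jvmCount < 1 || jvmCount > size) {
            throw new IllegalArgumentException("need 1 <= jvmCount <= range size");
        }
        int[] starts = new int[jvmCount];
        for (int i = 0; i < jvmCount; i++) {
            // long arithmetic guards against int overflow for large ranges
            starts[i] = lower + (int) ((long) i * size / jvmCount);
        }
        return starts;
    }
}
```

For example, four JVMs sharing a 400-port range starting at 20000 would
begin their searches at 20000, 20100, 20200, and 20300.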

The problems caused by AvailablePort are rare, but inevitable, with a
frequency determined by the total size of the port range. An upcoming
change will make the available port range much smaller (~400 ports
instead of the current ~10000 ports), which will greatly increase the
frequency of this problem. But the problem exists now, and results in
intermittent BindExceptions.
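
The effect of the range size can be estimated with a short calculation.
The sketch below assumes each call draws uniformly and independently
from the range and that no port is put into use between calls; the
range sizes are the approximate figures quoted above, and CollisionOdds
is a hypothetical name.

```java
// Illustrative sketch only: a simplified model of random port selection.
final class CollisionOdds {
    private CollisionOdds() {}

    // Probability that `calls` independent uniform draws from a range of
    // `rangeSize` ports include at least one repeated port.
    static double repeatProbability(int rangeSize, int calls) {
        if (rangeSize < 1 || calls < 0) {
            throw new IllegalArgumentException("rangeSize must be >= 1 and calls >= 0");
        }
        if (calls > rangeSize) {
            return 1.0; // more draws than ports guarantees a repeat
        }
        double allDistinct = 1.0;
        for (int i = 0; i < calls; i++) {
            allDistinct *= (double) (rangeSize - i) / rangeSize;
        }
        return 1.0 - allDistinct;
    }
}
```

Under these assumptions, a test that asks for two ports gets the same
port twice with probability 1/10000 in the current range and 1/400 in
the smaller one, a 25-fold increase.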

@demery-pivotal demery-pivotal merged commit b3867a8 into apache:develop Dec 17, 2020
@demery-pivotal demery-pivotal deleted the geode-8404/use-available-port-helper branch December 17, 2020 19:11