Make NodeConnectionsService#connectToNodes deterministic #94949

DaveCTurner
Contributor

Today the `CoordinatorTests` test suite is not fully deterministic because its behaviour depends on the iteration order of the JDK's unordered collections, which is not under the control of our test seed.

This commit makes `NodeConnectionsService#connectToNodes` iterate through the nodes using `DiscoveryNodes#mastersFirstStream`, which is made deterministic by #94948. It's an ugly hack to do some extra work in production just for the sake of tests, but we're only sorting at most a few hundred elements here, so it's not a huge deal.

Relates #94946
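
The ordering problem described above can be illustrated with a minimal sketch. This is not Elasticsearch code: the `Node` class and the `deterministicIds` helper are hypothetical, and the PR does not say which JDK collections are involved, so a `HashSet` of objects with identity-based hash codes is assumed here purely as an example of an order the test seed cannot control.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DeterministicOrderSketch {

    // Hypothetical node type that does not override hashCode(), so a HashSet
    // places it according to its identity hash, which can vary between JVM runs.
    static final class Node {
        final String id;

        Node(String id) {
            this.id = id;
        }
    }

    // Sorting by a stable key removes the dependence on the set's iteration order.
    static List<String> deterministicIds(Set<Node> nodes) {
        return nodes.stream()
            .sorted(Comparator.comparing((Node n) -> n.id))
            .map(n -> n.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<Node> nodes = new HashSet<>();
        for (String id : List.of("node-c", "node-a", "node-b")) {
            nodes.add(new Node(id));
        }
        // The order printed here is whatever the hash table happens to produce.
        nodes.forEach(n -> System.out.print(n.id + " "));
        System.out.println();
        // The order printed here is the same on every run.
        System.out.println(deterministicIds(nodes));
    }
}
```

The sort costs O(n log n) per call, which matches the trade-off the description accepts for at most a few hundred nodes.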

@DaveCTurner added the >non-issue, :Distributed/Cluster Coordination (Cluster formation and cluster state publication, including cluster membership and fault detection), and v8.8.0 labels on Mar 31, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine added the Team:Distributed (Meta label for distributed team) label on Mar 31, 2023
Contributor

@fcofdez fcofdez left a comment

LGTM

Contributor

@arteam arteam left a comment

LGTM! I've left a small suggestion

@@ -99,7 +100,9 @@ public void connectToNodes(DiscoveryNodes discoveryNodes, Runnable onCompletion)
         final List<Runnable> runnables = new ArrayList<>(discoveryNodes.getSize());
         try (var refs = new RefCountingRunnable(onCompletion)) {
             synchronized (mutex) {
-                for (final DiscoveryNode discoveryNode : discoveryNodes) {
+                // Ugly hack: when https://github.com/elastic/elasticsearch/issues/94946 is fixed, just iterate over discoveryNodes here
+                for (final Iterator<DiscoveryNode> iterator = discoveryNodes.mastersFirstStream().iterator(); iterator.hasNext();) {

If we are going to use this pattern in multiple places, what do you think about creating a helper method that returns an Iterable over the discoveryNodes in masters-first order?

public Iterable<DiscoveryNode> mastersFirst() {
    return mastersFirstStream()::iterator;
}

Then we can continue using for loops as

for (final DiscoveryNode discoveryNode : discoveryNodes.mastersFirst()) {
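
As a self-contained illustration of the suggestion above, here is a minimal sketch with hypothetical stand-ins: `Node` and `Nodes` take the place of `DiscoveryNode` and `DiscoveryNodes`, and the sort inside `mastersFirstStream` is only an assumption about what a deterministic masters-first ordering could look like, not the actual Elasticsearch implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

public class MastersFirstSketch {

    // Hypothetical stand-in for DiscoveryNode: an id and a master-eligibility flag.
    record Node(String id, boolean masterEligible) {}

    // Hypothetical stand-in for DiscoveryNodes.
    static final class Nodes {
        private final List<Node> nodes;

        Nodes(List<Node> nodes) {
            this.nodes = List.copyOf(nodes);
        }

        // Assumed behaviour: master-eligible nodes first, ties broken by id so the order is fixed.
        Stream<Node> mastersFirstStream() {
            return nodes.stream()
                .sorted(Comparator.comparing((Node n) -> n.masterEligible() == false).thenComparing(Node::id));
        }

        // The suggested helper: Iterable has a single abstract method, iterator(),
        // so a method reference to Stream#iterator can serve as one.
        Iterable<Node> mastersFirst() {
            return mastersFirstStream()::iterator;
        }
    }

    // Walks the nodes with an ordinary for-each loop and records the visiting order.
    static List<String> visitOrder(Nodes nodes) {
        List<String> ids = new ArrayList<>();
        for (final Node node : nodes.mastersFirst()) {
            ids.add(node.id());
        }
        return ids;
    }

    public static void main(String[] args) {
        Nodes nodes = new Nodes(List.of(
            new Node("d2", false), new Node("m2", true), new Node("d1", false), new Node("m1", true)));
        System.out.println(visitOrder(nodes)); // prints [m1, m2, d1, d2]
    }
}
```

One design point worth noting: `mastersFirstStream()::iterator` evaluates `mastersFirstStream()` once, when the method reference is created, so the returned Iterable can be traversed only a single time. That is enough for a one-shot for-each loop; a lambda such as `() -> mastersFirstStream().iterator()` would build a fresh stream on every traversal.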

Contributor Author

Thanks, yes, that'd make sense. I think it's just here for now so I'll leave it as it is to avoid adding too much more cruft to the DiscoveryNodes API, but I will do this if I encounter any more of these.

@DaveCTurner
Contributor Author

@elasticmachine update branch

@DaveCTurner added the auto-merge (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!) label on Mar 31, 2023
@DaveCTurner
Contributor Author

@elasticmachine please run elasticsearch-ci/part-2

@elasticsearchmachine elasticsearchmachine merged commit ff2591d into elastic:main Mar 31, 2023
@DaveCTurner DaveCTurner deleted the 2023-03-31-NodeConnectionsService-determinism branch March 31, 2023 14:25
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Apr 6, 2023
saarikabhasi pushed a commit to saarikabhasi/elasticsearch that referenced this pull request Apr 10, 2023