
Don't announce ready until file settings are applied #92856

Merged
merged 19 commits into elastic:main on Jan 18, 2023

Conversation

grcevski
Contributor

@grcevski grcevski commented Jan 11, 2023

The current file-based settings design prevents a node from starting if an invalid file settings file is discovered during startup. The initial implementation caused a deadlock in certain situations, as described in #92812.

This PR changes the startup requirement so that "not starting" is expressed as "not ready" in terms of the readiness probe. Here's the updated flow:

  • Upon creation of the file-based settings service we don't check whether we are the master; we sign up for cluster state updates before the cluster state is recovered.
  • When we are notified through cluster state that we are the elected master, we simply launch the watcher thread; we don't attempt to process any file-based settings yet.
  • We use the readiness service on the master node to avoid declaring the master ready until it has successfully finished processing the file-based settings. This is accomplished by allowing the readiness service to subscribe to updates on applied file-based settings. On a master node, the readiness service needs an additional flag, 'file settings applied' or 'no file settings', to declare the node ready.
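
The gating logic in the bullets above can be sketched as follows. This is a minimal illustration, not the actual ReadinessService/FileSettingsService code; all class and method names here are hypothetical:

```java
// Illustrative sketch only: names are hypothetical, not the real
// Elasticsearch ReadinessService API.
import java.util.concurrent.atomic.AtomicBoolean;

class ReadinessSketch {
    private final AtomicBoolean fileSettingsApplied = new AtomicBoolean(false);
    private final boolean isElectedMaster;
    private final boolean hasFileSettings;

    ReadinessSketch(boolean isElectedMaster, boolean hasFileSettings) {
        this.isElectedMaster = isElectedMaster;
        this.hasFileSettings = hasFileSettings;
    }

    // Called by the file settings watcher once the initial file is applied.
    void onFileSettingsApplied() {
        fileSettingsApplied.set(true);
    }

    // A master node is ready only when there are no file settings, or the
    // initial file settings have been applied successfully.
    boolean ready() {
        if (!isElectedMaster) {
            return true; // non-master nodes are unaffected by this gate
        }
        return !hasFileSettings || fileSettingsApplied.get();
    }
}
```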

TODO:

  • Write a test with multiple masters and re-election of a master
  • Can we propagate the file settings applied flag through the ClusterChangedEvent so that the rest of the nodes don't declare readiness until the master has applied the settings? (follow-up PR)
  • Write tests with readiness.

Closes #92812

@grcevski grcevski added the following labels on Jan 11, 2023: >bug, :Core/Infra/Core (Core issues without another label), Team:Core/Infra (Meta label for core/infra team), auto-backport-and-merge (Automatically create backport pull requests and merge when ready), v8.6.1, v8.7.0
@elasticsearchmachine
Collaborator

Hi @grcevski, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @grcevski, I've updated the changelog YAML for you.

            } else {
                stateService.process(NAMESPACE, parsedState, (e) -> completeProcessing(e, completion));
            }
        } catch (Exception e) {
            completion.completeExceptionally(e);
        }

        return completion;
Contributor

Could we also move this to a PlainActionFuture<Void> in order to assert that we never end up blocking an inappropriate thread?
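
For context, `PlainActionFuture` is an Elasticsearch-internal future type that, among other things, can assert it is never blocked on from an inappropriate thread. A minimal sketch of that idea, using a plain `CompletableFuture` and a purely illustrative thread-name check (the real Elasticsearch check works differently):

```java
// Sketch of the idea behind PlainActionFuture<Void>: fail fast if a
// blocking join() happens on a thread that must never block. The
// "transport" thread-name prefix is an illustrative stand-in, not the
// actual Elasticsearch assertion.
import java.util.concurrent.CompletableFuture;

class CheckedFuture<T> extends CompletableFuture<T> {
    @Override
    public T join() {
        if (Thread.currentThread().getName().startsWith("transport")) {
            throw new AssertionError("blocking on a transport thread is forbidden");
        }
        return super.join();
    }
}
```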

            }
        } catch (ExecutionException e) {
            logger.error("Error processing operator settings json file", e.getCause());
            startupLatch.onFailure((Exception) e.getCause());
Contributor

If the cluster ends up running multiple concurrent elections for the first master then it's possible that the first elected master might be usurped shortly after, which could result in an exception here on which I think we should retry rather than just shutting the node down. Use MasterService#isPublishFailureException to detect that.
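
The suggested retry could look roughly like the sketch below. `MasterService#isPublishFailureException` is the real helper the reviewer mentions; the generic retry loop, names, and exception plumbing here are illustrative only:

```java
// Illustrative retry-on-publish-failure loop; not the actual
// Elasticsearch implementation.
import java.util.concurrent.Callable;
import java.util.function.Predicate;

class RetrySketch {
    // Retries `attempt` while the failure looks like a publication failure
    // (e.g. the first elected master was usurped right after election);
    // rethrows any other failure.
    static <T> T retryOnPublishFailure(Callable<T> attempt,
                                       Predicate<Exception> isPublishFailure,
                                       int maxRetries) throws Exception {
        for (int i = 0; ; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                if (i < maxRetries && isPublishFailure.test(e)) {
                    continue; // transient: retry instead of shutting the node down
                }
                throw e;
            }
        }
    }
}
```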

        });

        try {
            startTask.get(1, TimeUnit.MINUTES);
Member

I think this could result in a flaky test. By starting ES in a separate thread, we risk that the check below of whether security was auto configured or not looks at a version of the keystore and elasticsearch.yml before auto configuration actually runs.

@grcevski grcevski marked this pull request as ready for review January 16, 2023 01:26
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@grcevski
Contributor Author

grcevski commented Jan 16, 2023

I decided to change the design of this PR a bit. The main reason we wanted to block on startup was to ensure the master node (which applies pre-existing file-based settings) doesn't declare itself ready if it was given corrupt settings. However, I realized that from the viewpoint of the readiness probe, a node that didn't start and a node that never became ready are indistinguishable. Essentially, never bringing up the readiness socket for a node that fails to apply its initial file-based settings is, as seen by the readiness probe, equivalent to bringing the node down on startup.

This equality in externally observed behaviour allows us to make a much safer change. Since we never block on startup, just like today, no behaviour change will be observed by anyone not using file-based settings together with the readiness service. We don't have to worry about masters being temporarily usurped, or about nodes failing to start because of a sudden new master election.

With this change things work just as before, except that a master node isn't ready (as per the readiness probe) until it has successfully applied the initial settings.
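
To make the equivalence concrete: a readiness probe typically just attempts a TCP connect to the readiness port, so "never started" and "not ready" produce the same observation. A minimal sketch of such a probe (host, port, and timeout values are illustrative; Elasticsearch's readiness port is configurable):

```java
// Illustrative TCP readiness check, as an external probe might perform it.
import java.net.InetSocketAddress;
import java.net.Socket;

class ProbeSketch {
    static boolean ready(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 500);
            return true;   // readiness socket is bound: node declares itself ready
        } catch (Exception e) {
            return false;  // refused or timed out: never started OR not ready
        }
    }
}
```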

Contributor

@DaveCTurner DaveCTurner left a comment

I have had a quick look over the production code changes, particularly in FileSettingsService, and they look good to me. I'll leave a full review to Ryan.

        this.clusterService = clusterService;
        this.stateService = stateService;
        this.operatorSettingsDir = environment.configFile().toAbsolutePath().resolve(OPERATOR_DIRECTORY);
        this.nodeClient = nodeClient;
        this.eventListeners = new ArrayList<>();
Contributor

I think this is properly synchronized as written here, but there is a risk that a future reshuffle might break that. Can we use something stronger just to be sure? (a CopyOnWriteArrayList is good enough I think).
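
A minimal sketch of the suggested change, assuming `eventListeners` only needs add-then-iterate semantics (names are illustrative, not the exact FileSettingsService fields):

```java
// CopyOnWriteArrayList makes listener registration safe against concurrent
// iteration without explicit locking: each mutation copies the backing
// array, so iteration always sees a stable snapshot.
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ListenersSketch {
    private final List<Runnable> eventListeners = new CopyOnWriteArrayList<>();

    void addListener(Runnable listener) {
        eventListeners.add(listener);
    }

    // Safe even if addListener runs concurrently on another thread.
    void notifyListeners() {
        eventListeners.forEach(Runnable::run);
    }
}
```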

Member

@rjernst rjernst left a comment

This looks ok to push. Thanks for all the additional tests.

One note: I think the commit message will need to be updated. The original description still notes causing the node to terminate on failure, which is no longer the case.

@grcevski grcevski merged commit 808ce72 into elastic:main Jan 18, 2023
@grcevski
Contributor Author

Thanks Ryan!

@elasticsearchmachine
Collaborator

💔 Backport failed

Branch 8.6: Commit could not be cherry-picked due to conflicts

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 92856`.

@thecoop
Member

thecoop commented Jan 18, 2023

💚 All backports created successfully

Branch 8.6

Questions? Please refer to the Backport tool documentation.

thecoop pushed a commit to thecoop/elasticsearch that referenced this pull request Jan 18, 2023
Instead of failing startup on incorrect file based settings, prevent the node
from declaring readiness. This is equivalent from an outside perspective of
the readiness probe, however, it doesn't block on the cluster state update
task.

(cherry picked from commit 808ce72)

# Conflicts:
#	server/src/internalClusterTest/java/org/elasticsearch/reservedstate/service/RepositoriesFileSettingsIT.java
#	x-pack/plugin/ilm/src/internalClusterTest/java/org/elasticsearch/xpack/slm/SLMFileSettingsIT.java
#	x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsRestartIT.java
#	x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsStartupIT.java
thecoop added a commit that referenced this pull request Jan 18, 2023
Instead of failing startup on incorrect file based settings, prevent the node
from declaring readiness. This is equivalent from an outside perspective of
the readiness probe, however, it doesn't block on the cluster state update
task.

(cherry picked from commit 808ce72)

Co-authored-by: Nikola Grcevski <6207777+grcevski@users.noreply.github.com>

Successfully merging this pull request may close these issues.

Deadlock in FileSettingsService
5 participants