Don't announce ready until file settings are applied #92856
Conversation
Hi @grcevski, I've created a changelog YAML for you.
Hi @grcevski, I've updated the changelog YAML for you.
} else {
stateService.process(NAMESPACE, parsedState, (e) -> completeProcessing(e, completion));
}
stateService.process(NAMESPACE, parser, (e) -> completeProcessing(e, completion));
} catch (Exception e) {
completion.completeExceptionally(e);
}

return completion;
Could we also move this to a PlainActionFuture<Void> in order to assert that we never end up blocking an inappropriate thread?
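Elasticsearch's PlainActionFuture performs this kind of thread check when a caller blocks on it. As a minimal JDK-only sketch of the idea (the thread-name check and class names here are illustrative assumptions, not the actual Elasticsearch API), a future can refuse a blocking `get()` from threads that must never block:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class AssertingFutureDemo {
    // Hypothetical sketch: a future whose blocking get() asserts it is not
    // running on a thread that must never block (e.g. a cluster-state
    // applier thread). The name-based check is illustrative only.
    static class AssertingFuture<T> extends CompletableFuture<T> {
        @Override
        public T get() throws InterruptedException, ExecutionException {
            String name = Thread.currentThread().getName();
            if (name.contains("clusterApplier")) {
                throw new AssertionError("blocking get() called on " + name);
            }
            return super.get();
        }
    }

    public static void main(String[] args) throws Exception {
        AssertingFuture<String> future = new AssertingFuture<>();
        future.complete("done");
        // Safe on the main thread; would trip the assertion on a
        // "clusterApplier" thread.
        System.out.println(future.get());
    }
}
```

The benefit over a plain CompletableFuture is that an accidental future refactor which blocks on a forbidden thread fails loudly in tests instead of deadlocking.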
}
} catch (ExecutionException e) {
logger.error("Error processing operator settings json file", e.getCause());
startupLatch.onFailure((Exception) e.getCause());
If the cluster ends up running multiple concurrent elections for the first master then it's possible that the first elected master might be usurped shortly after, which could result in an exception here on which I think we should retry rather than just shutting the node down. Use MasterService#isPublishFailureException to detect that.
- Add tests for the retry.
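The suggested retry can be sketched with a generic retry loop that distinguishes transient publication failures from real errors. This is a hedged, JDK-only sketch: `isTransient` stands in for a check like MasterService#isPublishFailureException, and the names are illustrative, not the PR's actual code.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class RetryOnTransient {
    // Retry apply.get() while failures match the isTransient predicate
    // (e.g. the elected master was usurped mid-publication); rethrow
    // non-transient failures immediately.
    static <T> T applyWithRetry(Supplier<T> apply, Predicate<RuntimeException> isTransient, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return apply.get();
            } catch (RuntimeException e) {
                if (!isTransient.test(e)) {
                    throw e; // real failures still propagate
                }
                last = e; // transient publish failure: try again
            }
        }
        throw last; // retries exhausted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = applyWithRetry(
            () -> {
                if (++calls[0] < 3) {
                    throw new IllegalStateException("publish failed: master usurped");
                }
                return "settings applied";
            },
            e -> e.getMessage() != null && e.getMessage().startsWith("publish failed"),
            5
        );
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```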
});

try {
startTask.get(1, TimeUnit.MINUTES);
I think this could result in a flaky test. By starting ES in a separate thread, we risk that the check below of whether security was auto configured looks at a version of the keystore and elasticsearch.yml from before auto configuration actually runs.
- Adjust tests for new behaviour
- Use variable initial timeout everywhere
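The race the reviewer describes can be removed with an explicit signal that auto configuration has finished before the test inspects any files. A minimal sketch of that pattern, using only a JDK CountDownLatch (the simulated startup step is hypothetical, not the actual ES startup code):

```java
import java.util.concurrent.CountDownLatch;

public class ConfigRaceDemo {
    static volatile boolean autoConfigured = false;

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch configured = new CountDownLatch(1);

        // Background thread plays the role of node startup, which ends with
        // auto configuration writing the keystore and elasticsearch.yml.
        Thread startup = new Thread(() -> {
            autoConfigured = true;   // simulated auto configuration step
            configured.countDown();  // signal: safe to inspect config now
        });
        startup.start();

        // The test thread waits for the signal before asserting on the
        // configuration, instead of racing the startup thread.
        configured.await();
        System.out.println("autoConfigured=" + autoConfigured);
        startup.join();
    }
}
```

Without the `await()`, the main thread could observe `autoConfigured == false` even though startup eventually sets it, which is exactly the flakiness being flagged.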
Pinging @elastic/es-core-infra (Team:Core/Infra)
I decided to change the design of this PR a bit. The main reason we wanted to block on startup is to ensure the master node (which applies pre-existing file based settings) doesn't say it's ready if it was given corrupt settings. However, I realized that from the viewpoint of the readiness probe, a node that didn't start and a node that never became ready are indistinguishable. Essentially, if we never bring up the readiness socket for a node that fails to apply initial file based settings, it's equivalent to bringing the node down on startup, as seen by the readiness probe.

This equivalence in externally observed behaviour allows us to make a much safer change. Since we never block on startup, just as today, no behaviour change will be observed by anyone not using file based settings together with the readiness service. We don't have to worry about masters being temporarily usurped and nodes failing to start because of a sudden new master election. With this change things work just as before, except that a master node isn't ready (as per the readiness probe) until it has successfully applied the initial settings.
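The equivalence argument can be made concrete with a tiny sketch: the readiness "socket" is only brought up once file settings apply successfully, so a probe cannot tell "never started" apart from "never ready". All names here are hypothetical, not the actual ReadinessService API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ReadinessSketch {
    private final AtomicBoolean listening = new AtomicBoolean(false);

    // Called when initial file based settings processing completes.
    void onFileSettingsProcessed(boolean success) {
        if (success) {
            listening.set(true); // only now would the readiness socket be bound
        }
        // On failure: do nothing. The node stays "not ready", which the
        // probe cannot distinguish from a node that never started at all.
    }

    // What an external readiness probe effectively observes.
    boolean probe() {
        return listening.get();
    }

    public static void main(String[] args) {
        ReadinessSketch node = new ReadinessSketch();
        System.out.println("before settings applied: " + node.probe());
        node.onFileSettingsProcessed(true);
        System.out.println("after settings applied:  " + node.probe());
    }
}
```

The design win is that no code path ever blocks startup; readiness is just withheld, so the usurped-master and failed-election concerns from the earlier approach disappear.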
I have had a quick look over the production code changes, particularly in FileSettingsService, and they look good to me. I'll leave a full review to Ryan.
this.clusterService = clusterService;
this.stateService = stateService;
this.operatorSettingsDir = environment.configFile().toAbsolutePath().resolve(OPERATOR_DIRECTORY);
this.nodeClient = nodeClient;
this.eventListeners = new ArrayList<>();
I think this is properly synchronized as written here, but there is a risk that a future reshuffle might break that. Can we use something stronger just to be sure? (a CopyOnWriteArrayList is good enough I think).
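The suggestion can be sketched as follows: CopyOnWriteArrayList lets listeners be registered and notified from different threads without external synchronization, so a later refactor cannot silently break the locking discipline. This is a minimal illustration, not the PR's actual listener code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class ListenerListDemo {
    // CopyOnWriteArrayList: writes copy the backing array, so iteration
    // always sees a consistent snapshot and can never throw
    // ConcurrentModificationException, even with concurrent add() calls.
    private final List<Runnable> eventListeners = new CopyOnWriteArrayList<>();

    void addListener(Runnable listener) {
        eventListeners.add(listener);
    }

    void notifyListeners() {
        for (Runnable listener : eventListeners) {
            listener.run();
        }
    }

    public static void main(String[] args) {
        ListenerListDemo demo = new ListenerListDemo();
        AtomicInteger fired = new AtomicInteger();
        demo.addListener(fired::incrementAndGet);
        demo.addListener(fired::incrementAndGet);
        demo.notifyListeners();
        System.out.println("listeners notified: " + fired.get());
    }
}
```

The trade-off is that each mutation copies the array, which is fine for a small, rarely-changed listener list like this one.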
This looks ok to push. Thanks for all the additional tests.
One note: I think the commit message will need to be updated. The original description still notes causing the node to terminate on failure, which is no longer the case.
Thanks Ryan!
💔 Backport failed
You can use sqren/backport to manually backport by running
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
Instead of failing startup on incorrect file based settings, prevent the node from declaring readiness. This is equivalent from an outside perspective of the readiness probe; however, it doesn't block on the cluster state update task.

(cherry picked from commit 808ce72)

# Conflicts:
# server/src/internalClusterTest/java/org/elasticsearch/reservedstate/service/RepositoriesFileSettingsIT.java
# x-pack/plugin/ilm/src/internalClusterTest/java/org/elasticsearch/xpack/slm/SLMFileSettingsIT.java
# x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsRestartIT.java
# x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsStartupIT.java
Instead of failing startup on incorrect file based settings, prevent the node from declaring readiness. This is equivalent from an outside perspective of the readiness probe; however, it doesn't block on the cluster state update task.

(cherry picked from commit 808ce72)

Co-authored-by: Nikola Grcevski <6207777+grcevski@users.noreply.github.com>
The current file based settings design prevents a node from starting if an invalid file settings file is discovered during startup. The initial implementation caused a deadlock in certain situations, as described in #92812.
This PR changes the startup requirement, such that not-starting is expressed as not-ready in terms of the readiness probe. Here's the updated flow:

- A master node will not declare itself ready until it has successfully finished processing the file based settings. This is accomplished by allowing the readiness service to subscribe to updates on applied file based settings.
- On a master node, the readiness service needs an additional flag of 'file settings applied' or 'no file settings' to declare the node as ready.

TODO:
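The 'file settings applied' / 'no file settings' flag described above can be sketched as a small state machine consulted by the readiness check. Method and field names are hypothetical illustrations, not the actual service's API.

```java
public class MasterReadiness {
    enum FileSettingsState { UNKNOWN, NO_FILE, APPLIED }

    private volatile FileSettingsState fileSettings = FileSettingsState.UNKNOWN;
    private volatile boolean electedMaster = true; // assume elected, for the sketch

    // Fired by the (hypothetical) file settings subscription.
    void onFileSettingsApplied() {
        fileSettings = FileSettingsState.APPLIED;
    }

    void onNoFileSettings() {
        fileSettings = FileSettingsState.NO_FILE;
    }

    boolean ready() {
        // A master is ready only once the flag has moved out of UNKNOWN:
        // either settings were applied, or there were none to apply.
        return electedMaster && fileSettings != FileSettingsState.UNKNOWN;
    }

    public static void main(String[] args) {
        MasterReadiness node = new MasterReadiness();
        System.out.println("initial: " + node.ready());
        node.onFileSettingsApplied();
        System.out.println("applied: " + node.ready());
    }
}
```

The 'no file settings' branch matters because most clusters have no operator settings file at all, and those masters must still become ready.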
Closes #92812