
Don't announce ready until file settings are applied #92856

Merged
merged 19 commits into elastic:main on Jan 18, 2023

Conversation

grcevski
Contributor

@grcevski grcevski commented Jan 11, 2023

The current file-based settings design prevents a node from starting if an invalid file settings file is discovered during startup. The initial implementation caused a deadlock in certain situations, as described in #92812.

This PR changes the startup requirement so that "not starting" is expressed as "not ready" in terms of the readiness probe. Here's the updated flow:

  • Upon creation of the file-based settings service we don't check whether we are the master; we sign up for cluster state updates before the cluster state is recovered.
  • When we are notified through cluster state that we are the elected master, we simply launch the watcher thread; we don't attempt to process any file-based settings yet.
  • We use the readiness service on the master node to avoid declaring the master ready until it has successfully finished processing the file-based settings. This is accomplished by allowing the readiness service to subscribe to updates on applied file-based settings. On a master node, the readiness service needs an additional flag, 'file settings applied' or 'no file settings', to declare the node ready.
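
The gating logic in the bullets above can be sketched as follows. This is a minimal illustration, not the actual ReadinessService/FileSettingsService code; all class and method names here are hypothetical:

```java
// Illustrative sketch only: names are hypothetical, not the real
// Elasticsearch ReadinessService API.
import java.util.concurrent.atomic.AtomicBoolean;

class ReadinessSketch {
    private final AtomicBoolean fileSettingsApplied = new AtomicBoolean(false);
    private final boolean isElectedMaster;
    private final boolean hasFileSettings;

    ReadinessSketch(boolean isElectedMaster, boolean hasFileSettings) {
        this.isElectedMaster = isElectedMaster;
        this.hasFileSettings = hasFileSettings;
    }

    // Called by the file settings watcher once the initial file is applied.
    void onFileSettingsApplied() {
        fileSettingsApplied.set(true);
    }

    // A master node is ready only when there are no file settings, or the
    // initial file settings have been applied successfully.
    boolean ready() {
        if (!isElectedMaster) {
            return true; // non-master nodes are unaffected by this gate
        }
        return !hasFileSettings || fileSettingsApplied.get();
    }
}
```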

TODO:

  • Write a test with multiple masters and re-election of a master
  • Can we propagate the file settings applied flag through the ClusterChangedEvent so that the rest of the nodes don't declare readiness until the master has applied the settings? (follow-up PR)
  • Write tests with readiness.

Closes #92812

@grcevski grcevski added the following labels on Jan 11, 2023: >bug, :Core/Infra/Core (Core issues without another label), Team:Core/Infra (Meta label for core/infra team), auto-backport-and-merge (Automatically create backport pull requests and merge when ready), v8.6.1, v8.7.0
@elasticsearchmachine
Collaborator

Hi @grcevski, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @grcevski, I've updated the changelog YAML for you.

            } else {
                stateService.process(NAMESPACE, parsedState, (e) -> completeProcessing(e, completion));
            }
        } catch (Exception e) {
            completion.completeExceptionally(e);
        }

        return completion;
Contributor

Could we also move this to a PlainActionFuture<Void> in order to assert that we never end up blocking an inappropriate thread?
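
For context, `PlainActionFuture` is an Elasticsearch-internal future type that, among other things, can assert it is never blocked on from an inappropriate thread. A minimal sketch of that idea, using a plain `CompletableFuture` and a purely illustrative thread-name check (the real Elasticsearch check works differently):

```java
// Sketch of the idea behind PlainActionFuture<Void>: fail fast if a
// blocking join() happens on a thread that must never block. The
// "transport" thread-name prefix is an illustrative stand-in, not the
// actual Elasticsearch assertion.
import java.util.concurrent.CompletableFuture;

class CheckedFuture<T> extends CompletableFuture<T> {
    @Override
    public T join() {
        if (Thread.currentThread().getName().startsWith("transport")) {
            throw new AssertionError("blocking on a transport thread is forbidden");
        }
        return super.join();
    }
}
```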

            }
        } catch (ExecutionException e) {
            logger.error("Error processing operator settings json file", e.getCause());
            startupLatch.onFailure((Exception) e.getCause());
Contributor

If the cluster ends up running multiple concurrent elections for the first master then it's possible that the first elected master might be usurped shortly after, which could result in an exception here on which I think we should retry rather than just shutting the node down. Use MasterService#isPublishFailureException to detect that.
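
The suggested retry could look roughly like the sketch below. `MasterService#isPublishFailureException` is the real helper the reviewer mentions; the generic retry loop, names, and exception plumbing here are illustrative only:

```java
// Illustrative retry-on-publish-failure loop; not the actual
// Elasticsearch implementation.
import java.util.concurrent.Callable;
import java.util.function.Predicate;

class RetrySketch {
    // Retries `attempt` while the failure looks like a publication failure
    // (e.g. the first elected master was usurped right after election);
    // rethrows any other failure.
    static <T> T retryOnPublishFailure(Callable<T> attempt,
                                       Predicate<Exception> isPublishFailure,
                                       int maxRetries) throws Exception {
        for (int i = 0; ; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                if (i < maxRetries && isPublishFailure.test(e)) {
                    continue; // transient: retry instead of shutting the node down
                }
                throw e;
            }
        }
    }
}
```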

        });

        try {
            startTask.get(1, TimeUnit.MINUTES);
Member

I think this could result in a flaky test. By starting ES in a separate thread, we risk that the check below of whether security was auto configured or not looks at a version of the keystore and elasticsearch.yml before auto configuration actually runs.

@grcevski grcevski marked this pull request as ready for review January 16, 2023 01:26
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@grcevski
Contributor Author

grcevski commented Jan 16, 2023

I decided to change the design of this PR a bit. The main reason we wanted to block on startup was to ensure the master node (which applies pre-existing file-based settings) doesn't declare itself ready if it was given corrupt settings. However, I realized that from the viewpoint of the readiness probe, a node that didn't start and a node that never became ready are indistinguishable. Essentially, never bringing up the readiness socket for a node that fails to apply its initial file-based settings is, as seen by the readiness probe, equivalent to bringing the node down on startup.

This equality in externally observed behaviour allows us to make a much safer change. Since we never block on startup, just like today, no behaviour change will be observed by anyone not using file-based settings together with the readiness service. We don't have to worry about masters being temporarily usurped, or about nodes failing to start because of a sudden new master election.

With this change things work just as before, except that a master node isn't ready (as per the readiness probe) until it has successfully applied the initial settings.
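
To make the equivalence concrete: a readiness probe typically just attempts a TCP connect to the readiness port, so "never started" and "not ready" produce the same observation. A minimal sketch of such a probe (host, port, and timeout values are illustrative; Elasticsearch's readiness port is configurable):

```java
// Illustrative TCP readiness check, as an external probe might perform it.
import java.net.InetSocketAddress;
import java.net.Socket;

class ProbeSketch {
    static boolean ready(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 500);
            return true;   // readiness socket is bound: node declares itself ready
        } catch (Exception e) {
            return false;  // refused or timed out: never started OR not ready
        }
    }
}
```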

Contributor

@DaveCTurner DaveCTurner left a comment

I have had a quick look over the production code changes, particularly in FileSettingsService, and they look good to me. I'll leave a full review to Ryan.

        this.clusterService = clusterService;
        this.stateService = stateService;
        this.operatorSettingsDir = environment.configFile().toAbsolutePath().resolve(OPERATOR_DIRECTORY);
        this.nodeClient = nodeClient;
        this.eventListeners = new ArrayList<>();
Contributor

I think this is properly synchronized as written here, but there is a risk that a future reshuffle might break that. Can we use something stronger just to be sure? (a CopyOnWriteArrayList is good enough I think).
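
A minimal sketch of the suggested change, assuming `eventListeners` only needs add-then-iterate semantics (names are illustrative, not the exact FileSettingsService fields):

```java
// CopyOnWriteArrayList makes listener registration safe against concurrent
// iteration without explicit locking: each mutation copies the backing
// array, so iteration always sees a stable snapshot.
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ListenersSketch {
    private final List<Runnable> eventListeners = new CopyOnWriteArrayList<>();

    void addListener(Runnable listener) {
        eventListeners.add(listener);
    }

    // Safe even if addListener runs concurrently on another thread.
    void notifyListeners() {
        eventListeners.forEach(Runnable::run);
    }
}
```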

Member

@rjernst rjernst left a comment

This looks ok to push. Thanks for all the additional tests.

One note: I think the commit message will need to be updated. The original description still notes causing the node to terminate on failure, which is no longer the case.

@grcevski grcevski merged commit 808ce72 into elastic:main Jan 18, 2023
@grcevski
Contributor Author

Thanks Ryan!

@elasticsearchmachine
Collaborator

💔 Backport failed

Branch 8.6: Commit could not be cherry-picked due to conflicts

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 92856`.

@thecoop
Member

thecoop commented Jan 18, 2023

💚 All backports created successfully

Branch 8.6

Questions? Please refer to the Backport tool documentation.

thecoop pushed a commit to thecoop/elasticsearch that referenced this pull request Jan 18, 2023
Instead of failing startup on incorrect file based settings, prevent the node
from declaring readiness. This is equivalent from an outside perspective of
the readiness probe, however, it doesn't block on the cluster state update
task.

(cherry picked from commit 808ce72)

# Conflicts:
#	server/src/internalClusterTest/java/org/elasticsearch/reservedstate/service/RepositoriesFileSettingsIT.java
#	x-pack/plugin/ilm/src/internalClusterTest/java/org/elasticsearch/xpack/slm/SLMFileSettingsIT.java
#	x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsRestartIT.java
#	x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/FileSettingsRoleMappingsStartupIT.java
thecoop added a commit that referenced this pull request Jan 18, 2023
Instead of failing startup on incorrect file based settings, prevent the node
from declaring readiness. This is equivalent from an outside perspective of
the readiness probe, however, it doesn't block on the cluster state update
task.

(cherry picked from commit 808ce72)

Co-authored-by: Nikola Grcevski <6207777+grcevski@users.noreply.github.com>

Successfully merging this pull request may close these issues.

Deadlock in FileSettingsService
5 participants