Batch rollover cluster state updates. #79945

martijnvg · 2021-10-27T18:37:33Z

In cases where many indices are managed by ILM,
it is likely that rollovers for different indices or data streams
happen concurrently. This change allows the cluster state updates
that these rollovers generate to be batched.

This change also changes the rollover service to no longer do a reroute
for each rollover; instead, a single reroute is performed for multiple batched rollovers.

Relates to #77466
Closes #79782
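
As a rough illustration of the batching idea described above (a sketch, not the code in this PR): each task transforms the running state, and the expensive step runs once for the whole batch. The cluster state is modeled as a plain list of index names, and `BatchedRolloverSketch`, `rollover`, `executeBatch` and the counting `reroute` are hypothetical stand-ins, not Elasticsearch APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class BatchedRolloverSketch {
    /** Number of times the reroute stand-in ran; lets the example show the saving. */
    static int rerouteCalls = 0;

    /** Hypothetical stand-in for the allocation reroute; here it only counts calls. */
    static List<String> reroute(List<String> state, String reason) {
        rerouteCalls++;
        return state;
    }

    /** Hypothetical rollover task: returns a new state with the new write index added. */
    static UnaryOperator<List<String>> rollover(String newIndex) {
        return state -> {
            List<String> next = new ArrayList<>(state);
            next.add(newIndex);
            return next;
        };
    }

    /** Applies every task to the running state, then reroutes once for the whole batch. */
    static List<String> executeBatch(List<String> currentState, List<UnaryOperator<List<String>>> tasks) {
        List<String> state = currentState;
        for (UnaryOperator<List<String>> task : tasks) {
            state = task.apply(state);
        }
        return reroute(state, "bulk rollover of " + tasks.size() + " tasks");
    }

    public static void main(String[] args) {
        List<String> result = executeBatch(
            List.of("logs-000001"),
            List.of(rollover("logs-000002"), rollover("metrics-000002"))
        );
        System.out.println(result + " after " + rerouteCalls + " reroute call(s)");
    }
}
```

With N tasks in a batch, the stand-in reroute runs once instead of N times, which is the saving the description refers to.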


elasticmachine · 2021-10-28T13:06:08Z

Pinging @elastic/es-data-management (Team:Data Management)

original-brownbear · 2021-11-02T19:17:28Z

Just a quick update here from our offline discussion on it earlier:

This PR seems to work correctly. The remaining problem with it is that running a large number of index creates in a loop even without the reroute in the mix can take a very long time (as in many minutes) due to the mapping validation etc. via the temporary index services that we do. We are looking into a fix for this that avoids validating the same mapping over and over. => No need to review this yet.

UPDATE:

After further discussion and thought, we decided the above is an acceptable trade-off in the short term. This change prevents a massive queue of rollovers from breaking things like snapshotting for an extended period of time and saves a potentially significant amount of (master-)work in the cluster overall. The cases where this would run up a multi-minute task on the master are terminally broken without this change, so we believe it is strictly a move in the right direction, and this is good for review now.

original-brownbear

Thanks @martijnvg this looks really good. I have a few small questions, one thing that needs fixing and one suggestion :) But this looks pretty close.

original-brownbear · 2021-11-04T11:24:41Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

+                }
+            }
+            String reason = "bulk rollover ["
+                + tasks.stream().map(t -> t.sourceIndex.get() + "->" + t.rolloverIndex.get()).collect(Collectors.joining())


We gotta limit the output length here. See #79443 which you can probably reuse here easily.

Fixed: cc0aeb9
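
For readers following along, a minimal sketch of what bounding the reason string could look like. This is not the helper from #79443 and not the code in cc0aeb9; `BoundedReasonSketch` and `joinWithLimit` are hypothetical names, and the limit here applies to the joined items only (the "more" suffix is appended on top).

```java
import java.util.List;

final class BoundedReasonSketch {
    /**
     * Joins items with the delimiter, stopping before the joined text would exceed
     * maxLength characters, and appends a note with the number of omitted items.
     */
    static String joinWithLimit(List<String> items, String delimiter, int maxLength) {
        StringBuilder sb = new StringBuilder();
        int appended = 0;
        for (String item : items) {
            // Cost of adding this item, including the delimiter when it is not the first one.
            int extra = (appended == 0 ? 0 : delimiter.length()) + item.length();
            if (sb.length() + extra > maxLength) {
                break;
            }
            if (appended > 0) {
                sb.append(delimiter);
            }
            sb.append(item);
            appended++;
        }
        int omitted = items.size() - appended;
        if (omitted > 0) {
            sb.append("... (").append(omitted).append(" more)");
        }
        return sb.toString();
    }
}
```

Something like `"bulk rollover [" + joinWithLimit(pairs, ",", 1024) + "]"` would then keep the task description bounded no matter how many rollovers end up in one batch; the 1024 is an arbitrary illustrative value.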

original-brownbear · 2021-11-04T11:25:30Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

+            String reason = "bulk rollover ["
+                + tasks.stream().map(t -> t.sourceIndex.get() + "->" + t.rolloverIndex.get()).collect(Collectors.joining())
+                + "]";
+            state = allocationService.reroute(state, reason);


Can we have a situation where the rollovers were noops? Maybe check for an unchanged state before triggering reroute?

fixed: 7fadeb8
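
A small sketch of the guard being asked for, under the assumption that a no-op task returns the very same state instance, so a reference comparison is enough to detect "nothing changed". `RerouteIfChangedSketch` and `rerouteIfChanged` are hypothetical names; the actual change is in 7fadeb8.

```java
import java.util.function.BiFunction;

final class RerouteIfChangedSketch {
    /**
     * Runs the reroute function only when the batch produced a different state object.
     * Assumes no-op tasks hand back the instance they were given.
     */
    static <S> S rerouteIfChanged(S before, S after, String reason, BiFunction<S, String, S> reroute) {
        return after == before ? after : reroute.apply(after, reason);
    }
}
```

The reference comparison is cheap, which matters because this check runs on the master for every batch.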

original-brownbear · 2021-11-04T11:28:13Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

+        public void clusterStateProcessed(String source, ClusterState oldState, ClusterState newState) {
+            // Now assuming we have a new state and the name of the rolled over index, we need to wait for the
+            // configured number of active shards, as well as return the names of the indices that were rolled/created
+            if (conditionsMet.get()) {


Can we have a noop task here by any chance and have both states equal, do we have to account for that somehow? It seems no, but just double checking :)

I think this may be true if something changed between the trial rollover and the actual rollover that makes a condition go from true to false. The if statement that I will add for the previous comment will catch this if it happens.

original-brownbear · 2021-11-04T11:30:24Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

+            ClusterState state = currentState;
+            for (RolloverTask task : tasks) {
+                try {
+                    state = task.performRollover(state);


I do wonder, aren't we just using the Metadata here, both as input and because we do the reroute later? Maybe we can get away with just looping over building Metadata, which would be slightly cheaper? :)

Eventually down in MetadataCreateIndexService#clusterStateCreateIndex(...) we do update the routing table (adding unassigned shards iirc). This gets invoked from the MetadataRolloverService. So I don't think this is possible?

Yea this would require a little extra effort to do :) When I last profiled this it wasn't a big deal though, we can look into that kind of optimization later I suppose.

original-brownbear · 2021-11-04T14:42:36Z

server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateIndexService.java

@@ -478,7 +478,14 @@ private ClusterState applyCreateIndexWithTemporaryService(
            );

            indexService.getIndexEventListener().beforeIndexAddedToCluster(indexMetadata.getIndex(), indexMetadata.getSettings());
-            return clusterStateCreateIndex(currentState, request.blocks(), indexMetadata, allocationService::reroute, metadataTransformer);
+            BiFunction<ClusterState, String, ClusterState> rerouteFunction = (current, reason) -> {


Just realized this: we never mutate the request later. So we could just do:

rerouteFunction = request.performReroute() ? allocationService::reroute : (cs, reason) -> cs;

right?

done: 51077a9
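
A self-contained sketch of the suggestion above, with the cluster state modeled as a `String` so it compiles on its own. `RerouteSelectionSketch` and its `rerouteFunction` method are hypothetical; only the shape of the ternary mirrors the comment.

```java
import java.util.function.BiFunction;

final class RerouteSelectionSketch {
    /**
     * Picks the reroute function once, up front: the real one when a reroute was
     * requested, otherwise an identity function that returns the state untouched.
     */
    static BiFunction<String, String, String> rerouteFunction(
        boolean performReroute,
        BiFunction<String, String, String> reroute
    ) {
        return performReroute ? reroute : (cs, reason) -> cs;
    }
}
```

Choosing the function up front is only safe because the flag cannot change afterwards, which is the point made above about the request never being mutated later.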

original-brownbear

LGTM, one cleanup ask that'd be nice to resolve only :)

Also, this shouldn't go into 7.16 right? The state of settings+mappings dedup. in those versions doesn't really make it worth it at the scales that'll work with 7.16 and it's not a 100% risk-free change here obviously :)

elasticsearchmachine · 2021-11-05T08:29:16Z

💚 Backport successful

Status	Branch	Result
✅	8.0

henningandersen

I was reviewing this as it got merged but thought I would drop my drive-by comments anyway. Can we add more testing here, to check that the batched case works (perhaps just an integ test doing X rollovers concurrently)?

henningandersen · 2021-11-05T08:30:34Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

@@ -80,6 +90,33 @@ public TransportRolloverAction(
        this.rolloverService = rolloverService;
        this.client = client;
        this.activeShardsObserver = new ActiveShardsObserver(clusterService, threadPool);
+        this.rolloverTaskExecutor = (currentState, tasks) -> {


Can we turn this lambda into an explicit class? It makes it easier to reason about the state that it uses (should not depend on local state).

henningandersen · 2021-11-05T08:32:02Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

                final List<Condition<?>> trialMetConditions = rolloverRequest.getConditions()
                    .values()
                    .stream()
                    .filter(condition -> trialConditionResults.get(condition.toString()))
                    .collect(Collectors.toList());

+                final RolloverResponse trailRolloverResponse = new RolloverResponse(


I think this should be trial, not trail?

henningandersen · 2021-11-05T08:36:42Z

...r/src/main/java/org/elasticsearch/action/admin/indices/rollover/TransportRolloverAction.java

+        private final RolloverResponse trialRolloverResponse;
+        private final ActionListener<RolloverResponse> listener;
+
+        private final AtomicBoolean conditionsMet = new AtomicBoolean(false);


Could this not be a plain boolean?

Also, I wonder if changed or performedRollover would be better names, since this is used to figure out if anything changed when the clusterStateProcessed method is called?

I think this can be a boolean, but it should be volatile? I'm not 100% sure whether performRollover() method is executed by the same thread as clusterStateProcessed() method.

I like the name clusterStateProcessed and will change it to that.

I'm not 100% sure whether performRollover() method is executed by the same thread as clusterStateProcessed() method.

I think it is executed by the same thread. I will change it to be an ordinary boolean.

… test. (#80397) Code cleanups around rollover executor in TransportRolloverAction and added an integration test that tests rollover concurrently. Relates to #79945

… test. (#80397) (#80478) Code cleanups around rollover executor in TransportRolloverAction and added an integration test that tests rollover concurrently. Relates to #79945 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

elasticsearchmachine added the v8.1.0 label Oct 27, 2021

martijnvg force-pushed the batch_rollover branch from c8f326a to bff9018 Compare October 28, 2021 07:58

martijnvg force-pushed the batch_rollover branch from bc35a15 to 41ba6e1 Compare October 28, 2021 13:04

martijnvg changed the title ~~Batch rollover cluster state update tasks~~ Batch rollover cluster state updates. Oct 28, 2021

martijnvg added the :Data Management/Indices APIs, >enhancement and v8.0.0 labels Oct 28, 2021

martijnvg requested a review from original-brownbear October 28, 2021 13:05

martijnvg marked this pull request as ready for review October 28, 2021 13:06

elasticmachine added the Team:Data Management label Oct 28, 2021

martijnvg added the v7.16.1 label Oct 28, 2021

original-brownbear reviewed Nov 4, 2021

martijnvg added 3 commits November 4, 2021 13:09

Merge remote-tracking branch 'es/master' into batch_rollover

4eca23f

limit reason message

cc0aeb9

only reroute if cluster state has been changed

7fadeb8

martijnvg requested a review from original-brownbear November 4, 2021 13:20

original-brownbear reviewed Nov 4, 2021

original-brownbear approved these changes Nov 4, 2021

martijnvg removed the v7.16.1 label Nov 4, 2021

martijnvg added 3 commits November 4, 2021 15:51

optimize

51077a9

Merge remote-tracking branch 'es/master' into batch_rollover

da020d8

spotless

1e421e8

martijnvg added the auto-backport-and-merge label Nov 5, 2021

martijnvg merged commit ce1ed43 into elastic:master Nov 5, 2021

martijnvg mentioned this pull request Nov 5, 2021

[8.0] Batch rollover cluster state updates. (#79945) #80392

Merged

henningandersen reviewed Nov 5, 2021

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Nov 5, 2021

Addressed followups from batch rollover PR.

1aeda48

Relates to elastic#79945

martijnvg mentioned this pull request Nov 5, 2021

Code cleanup in TransportRolloverAction and added concurrent rollover test. #80397

Merged

martijnvg mentioned this pull request Nov 8, 2021

Batch Cluster State Updates in Datastream Rollover #79782

Closed

probakowski mentioned this pull request Nov 10, 2021

Batch auto-create cluster state updates #80644

Closed

mark-vieira added v8.0.0-rc1 and removed v8.0.0 labels Jan 12, 2022
