Batch open-indices cluster state updates #83760
Conversation
These aren't used from tests or other code, so they might as well be private.
Mostly a bunch of ceremony around moving the innermost openIndices() call to a custom ClusterStateTaskExecutor. For now this does the simplest thing possible and 'just' pulls all the indices from all the requests and gloms them together (well, with de-duplication) into one big openIndices call.
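The "glom together with de-duplication" step can be sketched roughly as below. This is a minimal, hypothetical illustration only: `OpenIndicesTask` and `collectIndices` are made-up stand-ins, and the real executor operates on `ClusterStateTaskExecutor` tasks and `Index` objects rather than plain strings.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a batched task: each open-indices request
// carries the list of index names it wants opened.
record OpenIndicesTask(List<String> indices) {}

public class BatchedOpenIndices {
    // Pull the indices from every task in the batch into one de-duplicated
    // set (preserving first-seen order) so that a single openIndices() call
    // can service the whole batch.
    static Set<String> collectIndices(List<OpenIndicesTask> tasks) {
        Set<String> all = new LinkedHashSet<>();
        for (OpenIndicesTask task : tasks) {
            all.addAll(task.indices());
        }
        return all;
    }
}
```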
Looks good, I left only small comments.
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataIndexStateService.java (outdated review threads, resolved)
FWIW I'm undecided about the ShardLimitValidator question. IMO it's a bit of a bug that an automatic close-and-reopen goes through this state where shards don't count towards the limit for a brief period and can therefore fail to reopen like this. We know we're reopening them soon, so we should keep their spot in the cluster reserved and fail other things instead.
Looks great, Joe, just one addition to David's points.
FWIW I'm undecided about the ShardLimitValidator question but I do lean towards the simple fail-everything solution proposed here.
++ good enough for now IMO
Force-pushed from e21727b to 1df639c
plus it parallels what we already have for "indices closed".
Pinging @elastic/es-data-management (Team:Data Management)
LGTM
Hi @joegallo, I've created a changelog YAML for you.
Related to #81627 and #83432
The crux of this is that it processes the whole batch all at once -- either every task succeeds or every task fails. If we want to do better than that, then we'll need to be more clever about `shardLimitValidator.validateShardLimit(currentState, indices)`. At present, it takes a cluster state, and we want to avoid creating new intermediate cluster states inside this batching executor.

Here's a scenario where the difference matters. Imagine three requests to open indices: 3 indices for the first request, 2 for the second, and 1 for the third (all with 1 primary and 0 replicas), and suppose the `shardLimitValidator` thinks we have space for four more shards. Prior to this PR, the first and third requests would have succeeded, while the second would have failed. With this batching, all of them could fail (if they are executed as a single batch).

If we want to avoid that, we could run internal batching, but then we're generating intermediate cluster states just to pass them off to the `shardLimitValidator`, and at that point we might as well go back to using `AckedClusterStateUpdateTask`. Or we could rewrite the `shardLimitValidator` to accept something cheaper that we could build/have throughout. Yet another approach (my personal favorite) would be to invoke the `shardLimitValidator` multiple times against the various subsets of possible tasks to execute (e.g. if the current cluster state accepts the shards for the first task's indices (yes), will it accept the first and second tasks' (no)? how about the first and third (yes)?).
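That last subset-based approach can be sketched as a greedy per-task check. This is a hypothetical illustration, not the real `ShardLimitValidator` API: `fitsWithinLimit` and `acceptTasks` are made-up names, and shard counts are reduced to plain integers.

```java
import java.util.ArrayList;
import java.util.List;

public class PerTaskShardValidation {
    // Hypothetical cheap stand-in for ShardLimitValidator: checks whether a
    // task's shards still fit in the remaining capacity.
    static boolean fitsWithinLimit(int shardsSoFar, int taskShards, int capacity) {
        return shardsSoFar + taskShards <= capacity;
    }

    // Accept tasks one at a time: a task is accepted only if the shards of
    // the tasks already accepted plus its own still fit. With tasks needing
    // 3, 2, and 1 shards and capacity for 4 more, this accepts the first
    // task, rejects the second (3 + 2 > 4), and accepts the third (3 + 1 <= 4),
    // matching the pre-batching per-request behavior described above.
    static List<Boolean> acceptTasks(List<Integer> shardsPerTask, int capacity) {
        List<Boolean> accepted = new ArrayList<>();
        int used = 0;
        for (int shards : shardsPerTask) {
            if (fitsWithinLimit(used, shards, capacity)) {
                accepted.add(true);
                used += shards;
            } else {
                accepted.add(false);
            }
        }
        return accepted;
    }
}
```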