Add wait_if_ongoing for refresh API increasing refresh reliability #91578
Problem Statement
ISSUE: #91579
When many indices receive frequent update and refresh requests (I know refresh is a resource-intensive API, as the docs say), the refresh queue becomes blocked, with hundreds of thousands of queued requests.
This happens even on a cluster of 3 nodes with 16G heap, 10 primary shards, 20 replica shards, 400G storage, ES version 8.4.3, and 96-core CPUs.
I think this is because:
- The `REFRESH` thread pool type is `ThreadPoolType.SCALING`.
- `TransportShardRefreshAction` extends `TransportReplicationAction`, in which `forceExecution` defaults to true for replication (see TransportReplicationAction.java#L200).
- The `REFRESH` API calls the `InternalEngine` refresh with `block = true` (see InternalEngine.java#L1795), and hot threads show threads blocked while acquiring the lock.

So a single refresh request expands to `indices * shards * replications` shard-level executions (30 in our test case), each of which can block.
Proposal
Maybe we can add a `wait_if_ongoing` parameter to the refresh API, like the one in the `flush` API, so that refresh requests can be non-blocking by just calling `InternalEngine#maybeRefresh`. When that call cannot acquire the lock, an in-flight refresh must already be running, so the request can return without waiting.
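
To illustrate the intended semantics, here is a minimal, self-contained sketch of blocking versus non-blocking refresh around a single lock. It is a toy model, not Elasticsearch code: the `RefreshSketch` class, its `refresh(boolean waitIfOngoing)` method, and the `demo()` helper are hypothetical names made up for this example, and a plain `ReentrantLock` stands in for whatever lock the engine actually uses.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the proposed semantics; this is NOT Elasticsearch code.
class RefreshSketch {
    private final ReentrantLock refreshLock = new ReentrantLock();
    private final Runnable refreshWork; // stands in for the real refresh work

    RefreshSketch(Runnable refreshWork) {
        this.refreshWork = refreshWork;
    }

    /**
     * @param waitIfOngoing true: block until the lock is free;
     *                      false: return immediately if a refresh is in flight.
     * @return true if this call performed a refresh, false if it was skipped
     */
    boolean refresh(boolean waitIfOngoing) {
        if (waitIfOngoing) {
            refreshLock.lock();
        } else if (!refreshLock.tryLock()) {
            return false; // lock is held, so another refresh is already running
        }
        try {
            refreshWork.run();
            return true;
        } finally {
            refreshLock.unlock();
        }
    }

    /** Runs one slow refresh on a second thread and probes it without blocking. */
    static boolean[] demo() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        RefreshSketch engine = new RefreshSketch(() -> {
            started.countDown();
            try {
                release.await(); // simulate a long-running refresh
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread slow = new Thread(() -> engine.refresh(true));
        slow.start();
        try {
            started.await();                        // the slow refresh now holds the lock
            boolean during = engine.refresh(false); // skipped instead of queued
            release.countDown();                    // let the slow refresh finish
            slow.join();
            boolean after = engine.refresh(false);  // lock is free again
            return new boolean[] { during, after };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("refreshed during ongoing refresh: " + r[0]);
        System.out.println("refreshed after it finished: " + r[1]);
    }
}
```

With `waitIfOngoing = false`, a caller that arrives while another refresh holds the lock returns right away instead of occupying a thread, which is the behavior the proposal is after for the `indices * shards * replications` fan-out.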