
Refactor TransportShardBulkAction to better support retries #31821

Merged
merged 38 commits into elastic:master from bulk_retry on Aug 10, 2018

Conversation

bleskes
Contributor

@bleskes bleskes commented Jul 5, 2018

Processing a bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a RetryOnPrimaryException, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem, because all operations that were already done before the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck, and with it all the seq#-based infrastructure.
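
To illustrate the problem, here is a self-contained toy model (plain Java, not Elasticsearch code; all names and numbers are illustrative) contrasting the old retry-from-scratch behavior with the resume-in-place approach this PR moves towards:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ResumableBulkDemo {
    public static void main(String[] args) {
        List<String> items = Arrays.asList("index a", "index b", "needs mapping update", "index c");
        long seqNo = 0;

        // Old behavior: hitting the item that needs a mapping update aborts the loop,
        // and the retry re-executes the whole request, issuing fresh sequence numbers
        // for items that already ran.
        List<Long> retried = new ArrayList<>();
        for (int attempt = 0; attempt < 2; attempt++) {
            retried.clear();
            for (String item : items) {
                if (attempt == 0 && item.equals("needs mapping update")) {
                    break; // simulate RetryOnPrimaryException: start over on the next cluster state
                }
                retried.add(seqNo++); // each execution consumes a new sequence number
            }
        }
        // seq#s 0 and 1 were issued on the first attempt but are never replicated
        System.out.println("retry from scratch: " + retried); // [2, 3, 4, 5]

        // New behavior: the execution context remembers the current item; once the
        // mapping update arrives, execution resumes at that item, and the sequence
        // numbers already issued stay valid.
        seqNo = 0;
        List<Long> resumed = new ArrayList<>();
        int currentItem = 0; // conceptually what the execution context tracks
        boolean mappingUpdated = false;
        while (currentItem < items.size()) {
            if (!mappingUpdated && items.get(currentItem).equals("needs mapping update")) {
                mappingUpdated = true; // simulate waiting for the mapping to arrive
                continue;              // resume at the same item, not at item 0
            }
            resumed.add(seqNo++);
            currentItem++;
        }
        System.out.println("resume in place:    " + resumed); // [0, 1, 2, 3]
    }
}
```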

This PR refactors how we deal with retries, with the goal of removing RetryOnPrimaryException and RetryOnReplicaException (not done yet). It does so by introducing a class PrimaryExecutionContext that is used to capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through (a rough sketch of the resulting state machine follows the list):

  1. A translation phase for updates
  2. Execution phase (always index/delete)
  3. Two kinds of retries
  4. A finalization phase which allows update requests to convert the index/delete result into an update result.
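
For orientation, a sketch of the per-item state machine these phases imply. INITIAL, TRANSLATED, EXECUTED and COMPLETED appear in code snippets quoted later in this review; the two retry-state names are assumptions, and the real class may define different states and transitions:

```java
// Illustrative only: per-item states implied by the phases above.
enum ItemProcessingState {
    INITIAL,                 // nothing has been done for the current item yet
    TRANSLATED,              // an update request was translated into an index/delete request
    WAIT_FOR_MAPPING_UPDATE, // execution paused until the new mapping is applied on the node
    IMMEDIATE_RETRY,         // the item is re-executed right away (e.g. an update retry)
    EXECUTED,                // the index/delete operation ran on the primary
    COMPLETED                // the final response was recorded; move on to the next item
}
```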

This PR is still rough around the edges. There are no proper unit tests yet, and the integration tests roughly pass. It is in good enough shape to get feedback from people and to see if this is where we want things to go.

If we like it, the same approach can be applied to the replica execution.

@bleskes bleskes added the >enhancement, WIP, and :Distributed/CRUD labels on Jul 5, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@bleskes bleskes changed the title WIP: Refactor TransportShardBulkAction to be support retries WIP: Refactor TransportShardBulkAction to better support retries Jul 5, 2018
@bleskes
Contributor Author

bleskes commented Aug 2, 2018

@ywelsch thanks for taking a look. This one is not easy to review. I addressed all your comments. Can you please take another look?

@ywelsch ywelsch left a comment

Left some smaller comments but LGTM otherwise. I could not find any unit test that checks the situation where the waitForMappingUpdate fails; maybe good to look into that.

currentItemState = ItemProcessingState.EXECUTED;
final DocWriteRequest docWriteRequest = getCurrentItem().request();
markAsCompleted(new BulkItemResponse(getCurrentItem().id(), docWriteRequest.opType(),
// Make sure to use request.index() here, if you use docWriteRequest.index() it will use the

Contributor:

I was confused because request.index() does not exist here. There is getCurrentItem().index() though.

Contributor Author:

fair point. Updated the comment

* received from the user (specifically, an update request is translated to an indexing or delete request).
*/
public void setRequestToExecute(DocWriteRequest writeRequest) {
assert currentItemState != ItemProcessingState.TRANSLATED &&

Contributor:

Instead of ruling out the states in which this is not allowed to be called, I think it's easier to understand if we list the states in which it is allowed to be called.

Contributor:

I think this should just be assert currentItemState == INITIAL

Contributor Author:

Agreed. This started before I had the explicit reset back to INITIAL, and it grew out of hand. I took your suggestion and added some more :)


/** completes the operation without doing anything on the primary */
public void markOperationAsNoOp(DocWriteResponse response) {
assert currentItemState != ItemProcessingState.EXECUTED &&

Contributor:

I think this is only called in INITIAL state, so let's assert currentItemState == INITIAL

/** the current operation has been executed on the primary with the specified result */
public void markOperationAsExecuted(Engine.Result result) {
assert currentItemState == ItemProcessingState.TRANSLATED: currentItemState;
assert executionResult == null : executionResult;

Contributor:

I wonder if we can add this (and similar ones) as an invariant to the class (similar to what was done for ReplicationTracker); we would then call assert invariant() in each of these methods.
For example, one invariant might state that if we are in the TRANSLATED state, the executionResult is null.
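
For illustration, a minimal self-contained sketch of that pattern, modeled on how ReplicationTracker uses assert invariant(). Field and state names follow the snippets quoted in this review; the real invariants would cover more conditions:

```java
// Sketch of the suggested invariant() idiom: the method encodes cross-field
// consistency checks and is asserted on entry and exit of every state-mutating
// method, so the checks run only when assertions are enabled.
class InvariantSketch {
    enum ItemProcessingState { INITIAL, TRANSLATED, EXECUTED, COMPLETED }

    private ItemProcessingState currentItemState = ItemProcessingState.INITIAL;
    private Object executionResult; // Engine.Result in the real class

    private boolean invariant() {
        // in TRANSLATED state, no result may have been recorded yet
        assert currentItemState != ItemProcessingState.TRANSLATED || executionResult == null
            : executionResult;
        // once EXECUTED, a result must have been recorded
        assert currentItemState != ItemProcessingState.EXECUTED || executionResult != null;
        return true; // always true, so it can be wrapped in `assert invariant();`
    }

    public void markOperationAsExecuted(Object result) {
        assert invariant();
        assert currentItemState == ItemProcessingState.TRANSLATED : currentItemState;
        currentItemState = ItemProcessingState.EXECUTED;
        executionResult = result;
        assert invariant();
    }
}
```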

return new BulkShardResponse(request.shardId(), responses);
}

private static boolean isAborted(BulkItemResponse response) {

Contributor:

please move this method up next to findNextNonAborted

for (int i = 0; i < items.length; i++) {
responses[i] = items[i].getPrimaryResponse();
}
return new BulkShardResponse(request.shardId(), responses);

Contributor:

this whole method can be abbreviated to

return new BulkShardResponse(request.shardId(),
    Arrays.stream(request.items()).map(BulkItemRequest::getPrimaryResponse).toArray(BulkItemResponse[]::new));

Contributor Author:

💯

.primaryTerm(0, 1).build();
}

private ClusterService clusterService;

Contributor:

Forgot to remove this in the commit I added; the tests don't need it anymore.

Contributor Author:

removed

@bleskes
Contributor Author

bleskes commented Aug 9, 2018

> I could not find any unit test that checks the situation where the waitForMappingUpdate fails

See testExecuteBulkIndexRequestWithErrorWhileUpdatingMapping and other usages of ThrowingMappingUpdatePerformer.

@ywelsch Can you please take another look?

@ywelsch
Contributor

ywelsch commented Aug 9, 2018

> See testExecuteBulkIndexRequestWithErrorWhileUpdatingMapping and other usages of ThrowingMappingUpdatePerformer.

Those are tests where the mapping update fails. I meant the situation where the subsequent waitForMappingUpdate fails (i.e. https://github.com/elastic/elasticsearch/pull/31821/files#diff-720a796f6beda1dfa6af60b45ffe1010R225).

@bleskes
Contributor Author

bleskes commented Aug 9, 2018

> Those are tests where the mapping update fails. I meant the situation where the subsequent waitForMappingUpdate fails

I see. Let me work something up.

@bleskes bleskes requested a review from ywelsch August 9, 2018 15:43

@ywelsch ywelsch left a comment

LGTM. Thanks for this PR and all the assertions!

/** returns a translog location that is needed to be synced in order to persist all operations executed so far */
public Translog.Location getLocationToSync() {
assert hasMoreOperationsToExecute() == false;
assert assertInvariants(ItemProcessingState.INITIAL, ItemProcessingState.COMPLETED);

Contributor:

I would have expected this to only be INITIAL?

Contributor Author:

If you have a bulk with all aborted items, you can overflow in advance and end up in INITIAL here.

Contributor:

My comment was that we should always end up in INITIAL, because we always move to INITIAL after completing an item.

Contributor Author:

GRR. Misread your comment. That is correct as far as I can tell. I pushed 5eeb932

@bleskes bleskes merged commit f58ed21 into elastic:master Aug 10, 2018
@bleskes bleskes deleted the bulk_retry branch August 10, 2018 08:15
@bleskes
Contributor Author

bleskes commented Aug 10, 2018

Thanks @ywelsch for the review and the good suggestions.

bleskes added a commit that referenced this pull request Aug 10, 2018
Processing a bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem, because all operations that were already done before the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck, and with it all the seq#-based infrastructure.

This commit refactors how we deal with retries, with the goal of removing `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It does so by introducing a class `BulkPrimaryExecutionContext` that is used to capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through:
1) A translation phase for updates
2) Execution phase (always index/delete)
3) Waiting for a mapping update to come in, if needed
4) A retry, if required (for updates, and for cases where the mappings are still not available after the put-mapping call returns)
5) A finalization phase which allows update requests to convert the index/delete result into an update result.
@ywelsch ywelsch mentioned this pull request Aug 13, 2018
ywelsch added a commit that referenced this pull request Aug 14, 2018
#31821 introduced an unreleased bug where NOOP updates were incorrectly mutating the bulk shard request, inserting a null item to be replicated, which would result in NullPointerExceptions when serializing the request to be shipped to the replicas.

Closes #32808
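
To make that failure mode concrete, a self-contained toy illustration (plain Java, not the actual Elasticsearch serialization code): a null element slipped into the array of items survives request construction and only blows up later, when the request is written to a stream:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullItemSerializationDemo {
    public static void main(String[] args) throws IOException {
        // a NOOP update incorrectly left a null slot among the items to replicate
        String[] itemsToReplicate = {"index item 0", null, "index item 2"};

        // building the "request" succeeds; the bug stays hidden until serialization
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        for (String item : itemsToReplicate) {
            out.writeUTF(item); // throws NullPointerException on the null element
        }
    }
}
```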
Labels: :Distributed/CRUD, >enhancement, v6.5.0, v7.0.0-beta1