Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Transport Shard Bulk Action Async (#39793) #41112

Conversation

original-brownbear
Copy link
Member

This is a dependency of #39504

Motivation:
By refactoring TransportShardBulkAction#shardOperationOnPrimary to async, we enable using DeterministicTaskQueue based tests to run indexing operations. This was previously impossible since we were blocking on the write thread until the update thread finished the mapping update.
With this change, the mapping update will trigger a new task in the write queue instead.
This change significantly enhances the amount of coverage we get from SnapshotResiliencyTests (and other potential future tests) when it comes to tracking down concurrency issues with distributed state machines.

The logical change is effectively all in TransportShardBulkAction, the rest of the changes is then simply mechanically moving the caller code and tests to being async and passing the ActionListener down.

Since the move to async would've added more parameters to the private static steps in this logic, I decided to inline and dry up (between delete and update) the logic as much as I could instead of passing the listener + wait-consumer down through all of them.


backport of #39793 #41006 #40923 #40940 (the initial PR to master was causing test failures that have been resolved in the subsequent PRs in this list, so I squashed them all into one to not break 7.x)

This is a dependency of elastic#39504

Motivation:
By refactoring `TransportShardBulkAction#shardOperationOnPrimary` to async, we enable using `DeterministicTaskQueue` based tests to run indexing operations. This was previously impossible since we were blocking on the `write` thread until the `update` thread finished the mapping update.
With this change, the mapping update will trigger a new task in the `write` queue instead.
This change significantly enhances the amount of coverage we get from `SnapshotResiliencyTests` (and other potential future tests) when it comes to tracking down concurrency issues with distributed state machines.

The logical change is effectively all in `TransportShardBulkAction`, the rest of the changes is then simply mechanically moving the caller code and tests to being async and passing the `ActionListener` down.

Since the move to async would've added more parameters to the `private static` steps in this logic, I decided to inline and dry up (between delete and update) the logic as much as I could instead of passing the listener + wait-consumer down through all of them.
@original-brownbear original-brownbear added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >refactoring backport labels Apr 11, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@original-brownbear original-brownbear merged commit 233df6b into elastic:7.x Apr 11, 2019
@original-brownbear original-brownbear deleted the async-transport-shard-bulk-action-7.x branch April 11, 2019 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants