improv(batch): Propagate trace entity to worker threads during parallel processing. #2300
+288
−61


Summary
This PR fixes an issue when using @Tracing together with parallel batch processing.
Before
When using parallel batch processing (without passing a custom Executor), we use a parallelStream() under the hood. A parallelStream() uses the common fork-join pool plus the main thread for processing. As a result, some sub-segments were not correctly attached to the parent ## handleRequest sub-segment. The reason is that the X-Ray SDK keeps trace entity instances in thread-local storage. Hence, only the sub-segments created on the main thread were attached to the parent segment; those created on fork-join pool threads were not. This led to a tracing output like this:
[Screenshot: trace output before the fix]
After
This PR propagates the main thread's trace entity to the worker threads used for parallel batch processing. With this change, all sub-segments are correctly attached to their parent segments. See the screenshot after the fix:
[Screenshot: trace output after the fix]
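The before/after behavior can be illustrated with a minimal sketch that does not use the X-Ray SDK at all. Here a plain ThreadLocal<String> named TRACE_ENTITY is a hypothetical stand-in for the SDK's thread-local trace entity, and both method names are made up for illustration; this is not the code from this PR.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in: a plain ThreadLocal plays the role of the
// thread-local trace entity that the X-Ray SDK maintains.
class TraceContextPropagationSketch {
    static final ThreadLocal<String> TRACE_ENTITY = new ThreadLocal<>();

    // "Before": each thread reads its own thread-local value. Fork-join
    // worker threads never had it set, so they may observe "null".
    static List<String> processWithoutPropagation(List<Integer> messages) {
        return messages.parallelStream()
                .map(m -> String.valueOf(TRACE_ENTITY.get()))
                .collect(Collectors.toList());
    }

    // "After": capture the entity on the calling thread, then set it on
    // whichever thread ends up processing each message.
    static List<String> processWithPropagation(List<Integer> messages) {
        String parent = TRACE_ENTITY.get(); // captured on the calling thread
        return messages.parallelStream()
                .map(m -> {
                    String previous = TRACE_ENTITY.get();
                    TRACE_ENTITY.set(parent);
                    try {
                        return String.valueOf(TRACE_ENTITY.get());
                    } finally {
                        // Restore the worker's previous state so pooled
                        // threads do not keep a stale value.
                        if (previous == null) {
                            TRACE_ENTITY.remove();
                        } else {
                            TRACE_ENTITY.set(previous);
                        }
                    }
                })
                .collect(Collectors.toList());
    }
}
```

Restoring the previous value in the finally block is a design choice of this sketch: common fork-join pool threads are shared across the JVM, so leaving a value behind could leak context into unrelated work.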
Changes
Added XRayTraceEntityPropagator as a reflection wrapper around the X-Ray SDK methods needed for trace propagation, following the documentation at https://docs.aws.amazon.com/xray/latest/devguide/scorekeep-workerthreads.html. We use reflection because the batch module should not take a dependency on the X-Ray SDK. When X-Ray is not available on the classpath, the propagator is a no-op.
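A reflection wrapper of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the XRayTraceEntityPropagator from this PR: the class name ReflectiveTracePropagatorSketch and its three methods are hypothetical, and it assumes the SDK exposes static AWSXRay.getTraceEntity() and AWSXRay.setTraceEntity(Entity) methods. If the lookup fails for any reason, the wrapper simply runs the task unchanged.

```java
import java.lang.reflect.Method;
import java.util.function.Supplier;

// Hypothetical sketch of a reflection-based, optional X-Ray integration.
class ReflectiveTracePropagatorSketch {
    private static final Method GET_TRACE_ENTITY;
    private static final Method SET_TRACE_ENTITY;

    static {
        Method getter = null;
        Method setter = null;
        try {
            // Assumed SDK class and method names; resolved at runtime so
            // there is no compile-time dependency on the X-Ray SDK.
            Class<?> xray = Class.forName("com.amazonaws.xray.AWSXRay");
            Class<?> entity = Class.forName("com.amazonaws.xray.entities.Entity");
            getter = xray.getMethod("getTraceEntity");
            setter = xray.getMethod("setTraceEntity", entity);
        } catch (ReflectiveOperationException | LinkageError e) {
            // SDK missing or incompatible: fall back to no-op behavior.
            getter = null;
            setter = null;
        }
        GET_TRACE_ENTITY = getter;
        SET_TRACE_ENTITY = setter;
    }

    static boolean isXRayAvailable() {
        return GET_TRACE_ENTITY != null && SET_TRACE_ENTITY != null;
    }

    // Call on the main thread; returns null when there is nothing to propagate.
    static Object captureTraceEntity() {
        if (!isXRayAvailable()) {
            return null;
        }
        try {
            return GET_TRACE_ENTITY.invoke(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }

    // Call on the worker thread; sets the captured entity, then runs the task.
    static <T> T callWithTraceEntity(Object entity, Supplier<T> task) {
        if (isXRayAvailable() && entity != null) {
            try {
                SET_TRACE_ENTITY.invoke(null, entity);
            } catch (ReflectiveOperationException | RuntimeException e) {
                // Tracing is best effort; never fail the batch because of it.
            }
        }
        return task.get();
    }
}
```

A caller would capture the entity once before starting the parallel stream and pass it into callWithTraceEntity inside each worker task. Without the SDK on the classpath, both calls cost little more than a null check.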
We also update the documentation with better guidance on when to choose which concurrency model.
Note for reviewers: Created separate issue for addressing code duplication: #2302.
Issue number: #1671
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.