feat(broker): extract doScatter/doReduce hooks in single-stage broker request handlers #18316
Merged
yashmayya merged 6 commits into apache:master on Apr 25, 2026
…gle-stage engine

Add protected extension points to `SingleConnectionBrokerRequestHandler` and `GrpcBrokerRequestHandler` that allow subclasses to inject cached DataTables before the reduce step, without duplicating the scatter-gather path.

Changes:
- `SingleConnectionBrokerRequestHandler`: extract `doScatter()` + `ScatterResult`; add `mergeWithCachedDataTables()` no-op hook called between scatter and reduce; expose `_brokerReduceService`, `_queryRouter`, `_failureDetector` as `protected final`
- `GrpcBrokerRequestHandler`: add `mergeWithCachedStreamingResponses()` no-op hook; `dataTableToStreamingIterator()` now `throws IOException` instead of swallowing it; expose `_streamingReduceService`, `_streamingQueryClient`, `_failureDetector` as `protected final`; add `createGrpcBrokerRequestHandler()` factory hook on `BaseBrokerStarter`
- `GrpcBrokerRequestHandlerTest`: round-trip and IOException-propagation unit tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aa86026 to 2245d58
Codecov Report

Coverage Diff

| | master | #18316 | +/- |
|---|---|---|---|
| Coverage | 63.60% | 63.67% | +0.07% |
| Complexity | 1659 | 1659 | |
| Files | 3246 | 3246 | |
| Lines | 197510 | 197549 | +39 |
| Branches | 30578 | 30577 | -1 |
| Hits | 125620 | 125798 | +178 |
| Misses | 61845 | 61706 | -139 |
| Partials | 10045 | 10045 | |
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…aware subclassing

Replace the hook-based design (mergeWithCachedDataTables, mergeWithCachedStreamingResponses) with explicit doScatter/doReduce protected methods in both SingleConnectionBrokerRequestHandler and GrpcBrokerRequestHandler. Subclasses can now own the full processBrokerRequest pipeline (preScatter → doScatter → merge → doReduce) without OSS hooks or ThreadLocals.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
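The subclassing pattern described in this commit can be sketched roughly as follows. This is only an illustration with simplified stand-in types, not Pinot's real classes: `BaseHandler`, `MergingHandler`, the `String`-keyed map, and the `preScatter`/`merge` helpers are all hypothetical, and the real `doScatter`/`doReduce` methods take broker requests, timeouts, and other arguments.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a single-stage broker request handler. Not Pinot's real API.
class BaseHandler {
  // Default pipeline: scatter to servers, then reduce the gathered responses.
  String processBrokerRequest(String query) {
    Map<String, Integer> responses = doScatter(query);
    return doReduce(query, responses);
  }

  // Extension point: fan the query out and gather one partial result per server.
  protected Map<String, Integer> doScatter(String query) {
    Map<String, Integer> responses = new HashMap<>();
    responses.put("server-1", 10);
    responses.put("server-2", 32);
    return responses;
  }

  // Extension point: combine the partial results into the final response.
  protected String doReduce(String query, Map<String, Integer> responses) {
    int total = 0;
    for (int partial : responses.values()) {
      total += partial;
    }
    return query + " -> " + total;
  }
}

// Hypothetical subclass that owns the whole pipeline and merges extra
// partial results between the scatter and reduce steps.
class MergingHandler extends BaseHandler {
  private final Map<String, Integer> _extraResults;

  MergingHandler(Map<String, Integer> extraResults) {
    _extraResults = extraResults;
  }

  @Override
  String processBrokerRequest(String query) {
    String prepared = preScatter(query);
    Map<String, Integer> responses = super.doScatter(prepared);
    Map<String, Integer> merged = merge(responses);
    return super.doReduce(prepared, merged);
  }

  // Hypothetical pre-processing step; here it only trims whitespace.
  private String preScatter(String query) {
    return query.trim();
  }

  // Copies the scattered responses and adds the extra entries on top.
  private Map<String, Integer> merge(Map<String, Integer> responses) {
    Map<String, Integer> merged = new HashMap<>(responses);
    merged.putAll(_extraResults);
    return merged;
  }
}
```

Because the subclass calls `super.doScatter` and `super.doReduce` directly, the base class needs no no-op hook and no shared mutable state to pass data between the two steps.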
…nstruction time

BrokerReduceService.reduceOnDataTable mutates the passed dataTableMap via iterator.remove() as it processes entries, so computing numServersQueried and numServersResponded lazily from _dataTableMap.size() after reduce returns 0. Snapshot both counts eagerly at ScatterResult construction, before the map is passed to doReduce.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
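The failure mode above can be reproduced in isolation. The sketch below assumes simplified stand-ins: `LazyScatterResult`, `EagerScatterResult`, and `DrainDemo.drainingReduce` are hypothetical names, and the `String`-valued map only imitates the real data table map.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Lazy variant: the count is read from the live map on every call.
class LazyScatterResult {
  private final Map<String, String> _dataTableMap;

  LazyScatterResult(Map<String, String> dataTableMap) {
    _dataTableMap = dataTableMap;
  }

  Map<String, String> getDataTableMap() {
    return _dataTableMap;
  }

  int getNumServersResponded() {
    return _dataTableMap.size();
  }
}

// Eager variant: the count is snapshotted once, at construction time.
class EagerScatterResult {
  private final Map<String, String> _dataTableMap;
  private final int _numServersResponded;

  EagerScatterResult(Map<String, String> dataTableMap) {
    _dataTableMap = dataTableMap;
    _numServersResponded = dataTableMap.size();
  }

  Map<String, String> getDataTableMap() {
    return _dataTableMap;
  }

  int getNumServersResponded() {
    return _numServersResponded;
  }
}

class DrainDemo {
  // Imitates a reduce step that consumes entries with iterator.remove().
  static int drainingReduce(Map<String, String> dataTableMap) {
    int processed = 0;
    Iterator<Map.Entry<String, String>> it = dataTableMap.entrySet().iterator();
    while (it.hasNext()) {
      it.next();
      it.remove();
      processed++;
    }
    return processed;
  }

  // Builds a fresh two-entry response map for each experiment.
  static Map<String, String> twoServerResponses() {
    Map<String, String> responses = new HashMap<>();
    responses.put("server-1", "table-1");
    responses.put("server-2", "table-2");
    return responses;
  }
}
```

After `drainingReduce` runs, the lazy getter reports 0 because the map is empty, while the eager getter still reports 2.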
yashmayya reviewed Apr 24, 2026
xiangfu0 reviewed Apr 25, 2026
…accept ScatterResult

- Add ScatterResultStats inner class to snapshot live-server counts at scatter time, decoupling them from the data table map
- Replace the ScatterResult constructor (which derived the counts from the map size) with one that accepts an explicit ScatterResultStats, so subclasses can substitute the map without corrupting numServersQueried/numServersResponded
- doReduce now accepts ScatterResult directly (the standalone dataTableMap param is dropped); the call site uses scatterResult.getDataTableMap() internally
- Addresses reviewer comment on PR apache#18316

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
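A rough sketch of how explicit stats let a subclass swap in a different map without changing the server counts. The class shapes are assumptions based only on the commit message: the constructor arguments, getters, `StatsDemo`, and `withExtraTable` are hypothetical, and the real classes carry more fields.

```java
import java.util.HashMap;
import java.util.Map;

// Counts captured at scatter time, independent of any map.
class ScatterResultStats {
  private final int _numServersQueried;
  private final int _numServersResponded;

  ScatterResultStats(int numServersQueried, int numServersResponded) {
    _numServersQueried = numServersQueried;
    _numServersResponded = numServersResponded;
  }

  int getNumServersQueried() {
    return _numServersQueried;
  }

  int getNumServersResponded() {
    return _numServersResponded;
  }
}

// Simplified result: the counts come from the stats, never from the map size.
class ScatterResult {
  private final Map<String, String> _dataTableMap;
  private final int _numServersQueried;
  private final int _numServersResponded;

  ScatterResult(Map<String, String> dataTableMap, ScatterResultStats stats) {
    _dataTableMap = dataTableMap;
    _numServersQueried = stats.getNumServersQueried();
    _numServersResponded = stats.getNumServersResponded();
  }

  Map<String, String> getDataTableMap() {
    return _dataTableMap;
  }

  int getNumServersQueried() {
    return _numServersQueried;
  }

  int getNumServersResponded() {
    return _numServersResponded;
  }
}

class StatsDemo {
  // Stand-in for doReduce: takes the ScatterResult and reads the map from it.
  static String doReduce(ScatterResult result) {
    return result.getDataTableMap().size() + " tables reduced, "
        + result.getNumServersResponded() + "/" + result.getNumServersQueried()
        + " servers responded";
  }

  // Builds a new result with one extra table while carrying the counts over.
  static ScatterResult withExtraTable(ScatterResult original, String key, String table) {
    Map<String, String> merged = new HashMap<>(original.getDataTableMap());
    merged.put(key, table);
    ScatterResultStats stats = new ScatterResultStats(
        original.getNumServersQueried(), original.getNumServersResponded());
    return new ScatterResult(merged, stats);
  }
}
```

With three servers queried and two responding, adding a third table to the map changes what is reduced but leaves the reported counts at 2 of 3.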
…ve cache language from OSS

- Remove dataTableToStreamingIterator from GrpcBrokerRequestHandler: it has no callers in OSS; only StarTree's gRPC handler uses it
- Delete GrpcBrokerRequestHandlerTest (only tested the removed method)
- Remove ByteString, Collections, DataTable imports that are no longer used
- Scrub cache-specific language from doScatter/doReduce Javadocs
- Addresses reviewer comment on PR apache#18316

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yashmayya reviewed Apr 25, 2026
Comment on lines +236 to +240
    private final long _totalResponseSize;
    private final boolean _timedOut;
    private final Exception _sendException;
    private final int _numServersQueried;
    private final int _numServersResponded;
Contributor
Why not hold a ScatterResultStats object inside here instead?
Contributor
Author
I think the guidance was to avoid deviating too much from the existing abstractions and to minimize changes to the existing query handling path. That's probably why the code ended up this way.
I can make the change if you insist.
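For comparison, the alternative raised in this thread (holding the stats object instead of copying its counts into separate fields) might look like the sketch below. It is not code from the PR; the nested class shapes and the `CompositionSketch` wrapper are assumptions for illustration only.

```java
// Hypothetical wrapper, used only to keep the sketch self-contained.
class CompositionSketch {
  // Counts captured at scatter time.
  static final class ScatterResultStats {
    private final int _numServersQueried;
    private final int _numServersResponded;

    ScatterResultStats(int numServersQueried, int numServersResponded) {
      _numServersQueried = numServersQueried;
      _numServersResponded = numServersResponded;
    }

    int getNumServersQueried() {
      return _numServersQueried;
    }

    int getNumServersResponded() {
      return _numServersResponded;
    }
  }

  // Variant suggested in review: keep one reference and delegate to it.
  static final class ScatterResult {
    private final ScatterResultStats _stats;

    ScatterResult(ScatterResultStats stats) {
      _stats = stats;
    }

    ScatterResultStats getStats() {
      return _stats;
    }

    int getNumServersQueried() {
      return _stats.getNumServersQueried();
    }

    int getNumServersResponded() {
      return _stats.getNumServersResponded();
    }
  }
}
```

Both shapes expose the same getters. The composed form has fewer fields and lets a caller reuse the stats object as-is, while the flat form shown in the diff keeps `ScatterResult` closer to its previous layout.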
yashmayya approved these changes Apr 25, 2026
Summary
- Extracts `protected doScatter` and `doReduce` methods in `SingleConnectionBrokerRequestHandler` and `GrpcBrokerRequestHandler`, replacing the previous hook-method design (`mergeWithCachedDataTables`, `mergeWithCachedStreamingResponses`)
- `processBrokerRequest` delegates to `doScatter` then `doReduce`, with behavior identical to before
- Subclasses can run `preScatter → super.doScatter → merge → super.doReduce` without needing OSS hook methods or ThreadLocals
- `ScatterResult.getNumServersQueried()`/`getNumServersResponded()` were computed lazily from `_dataTableMap.size()`, but `BrokerReduceService.reduceOnDataTable` drains the map via `iterator.remove()` during reduce. Counts are now snapshotted eagerly at `ScatterResult` construction time.

Changes

`SingleConnectionBrokerRequestHandler`
- Removed the `mergeWithCachedDataTables` hook; `processBrokerRequest` now calls `doScatter` + `doReduce`
- `protected doReduce(originalBrokerRequest, serverBrokerRequest, dataTableMap, scatterResult, scatterGatherStartTimeNs, timeoutMs, rawTableName)`: takes `dataTableMap` and `scatterResult` separately so subclasses can pass a merged map while stats always reflect the real servers queried
- `ScatterResult`: added `_numServersQueried`/`_numServersResponded` fields snapshotted at construction

`GrpcBrokerRequestHandler`
- Removed the `mergeWithCachedStreamingResponses` hook; `processBrokerRequest` now calls `doScatter` + `doReduce`
- `protected doReduce(originalBrokerRequest, responseMap, timeoutMs)`