Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster metric scans #7920

Merged
merged 5 commits into from
Jan 5, 2022
Merged

Conversation

richardstartin
Copy link
Member

@richardstartin richardstartin commented Dec 17, 2021

This change is motivated by a profile where a user performed a summation over a raw column in a group by query, and a significant amount of time was spent in bounds checks:

Screenshot 2021-12-17 at 13 35 02

Screenshot 2021-12-17 at 13 35 31

This change adds methods fillValues to the ForwardIndexReader interface which can avoid bounds checks and perform vectorized copies when the range of docIds is contiguous. There is no way to avoid bounds checks with the current APIs otherwise as there is no way for the compiler to infer that the docIds array is monotonic.

With

    "noDictionaryColumns": [
      "clicks"
    ]

This speeds up select platform, sum(clicks) from complexWebsite group by platform on a 7.5M row segment by ~25%: 60ms down to 45ms.

Summation is not the best case for the change because it requires conversion from long to double - accumulating the sum as a long would amplify the effect.

@codecov-commenter
Copy link

codecov-commenter commented Dec 17, 2021

Codecov Report

Merging #7920 (3a96ce8) into master (428e3f2) will decrease coverage by 0.11%.
The diff coverage is 49.42%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7920      +/-   ##
============================================
- Coverage     71.37%   71.25%   -0.12%     
- Complexity     4193     4210      +17     
============================================
  Files          1595     1594       -1     
  Lines         82514    82613      +99     
  Branches      12304    12316      +12     
============================================
- Hits          58895    58870      -25     
- Misses        19643    19768     +125     
+ Partials       3976     3975       -1     
Flag Coverage Δ
integration1 28.92% <2.29%> (-0.21%) ⬇️
integration2 27.48% <2.29%> (-0.09%) ⬇️
unittests1 68.20% <49.42%> (-0.03%) ⬇️
unittests2 14.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...readers/forward/BaseChunkSVForwardIndexReader.java 46.15% <18.36%> (-46.95%) ⬇️
...t/segment/spi/index/reader/ForwardIndexReader.java 73.86% <88.88%> (+67.61%) ⬆️
...java/org/apache/pinot/core/common/DataFetcher.java 85.18% <100.00%> (-0.53%) ⬇️
...data/manager/realtime/DefaultSegmentCommitter.java 0.00% <0.00%> (-80.00%) ⬇️
...a/manager/realtime/RealtimeSegmentDataManager.java 50.00% <0.00%> (-25.00%) ⬇️
...er/api/resources/LLCSegmentCompletionHandlers.java 43.56% <0.00%> (-18.82%) ⬇️
.../common/request/context/predicate/EqPredicate.java 66.66% <0.00%> (-13.34%) ⬇️
...data/manager/realtime/SegmentCommitterFactory.java 88.23% <0.00%> (-11.77%) ⬇️
...ache/pinot/core/operator/docidsets/OrDocIdSet.java 86.36% <0.00%> (-11.37%) ⬇️
...altime/ServerSegmentCompletionProtocolHandler.java 51.42% <0.00%> (-6.67%) ⬇️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 428e3f2...3a96ce8. Read the comment docs.

@richardstartin richardstartin force-pushed the faster-metric-scans branch 3 times, most recently from a999be7 to e71d582 Compare December 17, 2021 16:52
@richardstartin richardstartin marked this pull request as ready for review December 17, 2021 17:57
@richardstartin richardstartin changed the title [WIP] faster metric scans faster metric scans Dec 17, 2021
@richardstartin
Copy link
Member Author

richardstartin commented Dec 19, 2021

I added a benchmark to show where the difference comes from, but the benchmark also hints at the minimum cost of query vectorization: an array copy. This cost could be eliminated for reductions.

Benchmark                                                (_blockSize)  (_numBlocks)  Mode  Cnt   Score   Error  Units
BenchmarkFixedByteSVForwardIndexReader.readDoubles              10000          1000  avgt    5  37.284 ± 0.919  ms/op <- bad, long to double conversion
BenchmarkFixedByteSVForwardIndexReader.readDoublesBatch         10000          1000  avgt    5  15.674 ± 0.245  ms/op <- no bounds checks
BenchmarkFixedByteSVForwardIndexReader.readLongs                10000          1000  avgt    5  35.244 ± 1.947  ms/op <- bad, but no type conversion
BenchmarkFixedByteSVForwardIndexReader.readLongsBatch           10000          1000  avgt    5  10.777 ± 0.163  ms/op <- best case, vectorized copy, no type conversion

@@ -91,4 +264,8 @@ public double getDouble(int docId, ChunkReaderContext context) {
return _rawData.getDouble(docId * Double.BYTES);
}
}

private boolean isContiguousRange(int[] docIds, int length) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it'd be helpful to comment a bit when the range can be contiguous and when not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? That doesn't seem like a responsibility of this class, does it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I'm lack of some context to understand this new check. Not necessarily a comment to the code. Would appreciate it if you could shed some light in the conversion here. Just looking at this check method, I assume it'd require the docIds to be unique and sorted in asc. So when that happens and when not. This may help me understand when the optimization can kick in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, got it. However, I think that kind of info would be better off in a readme about the how the query and storage layers interact. I can do that (because understanding this area of the code has been hard for me) but not in this PR. I don't think the comment belongs here, and would be superfluous after reading a decent readme.

@richardstartin
Copy link
Member Author

@siddharthteotia to review before merging.

@siddharthteotia
Copy link
Contributor

@siddharthteotia to review before merging.

Yes, I will make sure to go through this by EOD. Thank you for waiting

@richardstartin
Copy link
Member Author

This feels like a fairly low risk change for a decent improvement to me. Are there any blockers here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants