Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize IndexedTable #7373

Merged
merged 1 commit into from Aug 30, 2021
Merged

Conversation

Jackie-Jiang
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang commented Aug 27, 2021

  • When resizing the records map, do not recreate the map but clear it to avoid growing the map again. This can also reduce the GC required
  • When calculating the final results on server side, do not recreate the map but directly create a list to return the results
  • Move the common logic to the base class IndexedTable

Observes less GC and lower memory usage:
Before:
Screen Shot 2021-08-27 at 5 29 42 PM
After:
Screen Shot 2021-08-27 at 5 29 57 PM

@codecov-commenter
Copy link

Codecov Report

Merging #7373 (c9c0de5) into master (c62a1fb) will decrease coverage by 2.19%.
The diff coverage is 81.81%.

❗ Current head c9c0de5 differs from pull request most recent head f223162. Consider uploading reports for the commit f223162 to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7373      +/-   ##
============================================
- Coverage     71.65%   69.46%   -2.20%     
+ Complexity     3311     3217      -94     
============================================
  Files          1515     1121     -394     
  Lines         75006    52894   -22112     
  Branches      10910     7955    -2955     
============================================
- Hits          53749    36745   -17004     
+ Misses        17628    13534    -4094     
+ Partials       3629     2615    -1014     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 69.46% <81.81%> (-0.05%) ⬇️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...not/common/metadata/segment/SegmentZKMetadata.java 47.87% <0.00%> (-24.85%) ⬇️
...ache/pinot/core/data/table/SimpleIndexedTable.java 92.30% <66.66%> (-2.52%) ⬇️
...org/apache/pinot/core/data/table/IndexedTable.java 79.48% <70.00%> (-9.99%) ⬇️
.../pinot/core/data/table/ConcurrentIndexedTable.java 81.81% <80.00%> (-7.74%) ⬇️
...org/apache/pinot/core/data/table/TableResizer.java 91.61% <90.24%> (-1.54%) ⬇️
...ache/pinot/core/data/table/IntermediateRecord.java 100.00% <100.00%> (ø)
...re/data/table/UnboundedConcurrentIndexedTable.java 26.08% <100.00%> (+3.35%) ⬆️
...perator/combine/GroupByOrderByCombineOperator.java 82.71% <100.00%> (ø)
...a/org/apache/pinot/common/metrics/MinionMeter.java 0.00% <0.00%> (-100.00%) ⬇️
...g/apache/pinot/common/metrics/ControllerMeter.java 0.00% <0.00%> (-100.00%) ⬇️
... and 599 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c62a1fb...f223162. Read the comment docs.

} else {
// Fewer records to retain than evict, make PQ of records to retain
Comparator<IntermediateRecord> comparator = _intermediateRecordComparator.reversed();
PriorityQueue<IntermediateRecord> priorityQueue = convertToIntermediateRecordsPQ(recordsMap, size, comparator);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PriorityQueue<IntermediateRecord> priorityQueue = convertToIntermediateRecordsPQ(recordsMap, size, _intermediateRecordComparator.reversed());

protected void resize() {
long startTimeMs = System.currentTimeMillis();
_tableResizer.resizeRecordsMap(_lookupMap, _trimSize);
long resizeTimeMs = System.currentTimeMillis() - startTimeMs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clock is subject to adjustments, so can jump forwards and backwards. For relative time measurement of synchronous code, System.nanoTime() is the right clock to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Changed to use ns to track the resize time

@Jackie-Jiang Jackie-Jiang merged commit f5f072c into apache:master Aug 30, 2021
@Jackie-Jiang Jackie-Jiang deleted the optimize_index_table branch August 30, 2021 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants