[BugFix] The weights for dimension fields not getting set for some ta…#12369
[BugFix] The weights for dimension fields not getting set for some ta…#12369PrachiKhobragade wants to merge 1 commit intoapache:masterfrom
Conversation
…bles during PinotTablePartitionRule While FixedLenBitset is being set for weights, the dimension weight being added can fall out of range of the FixedLenBitset length and not get added
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #12369 +/- ##
=============================================
- Coverage 61.72% 34.85% -26.88%
- Complexity 207 953 +746
=============================================
Files 2426 2350 -76
Lines 132682 128936 -3746
Branches 20506 19948 -558
=============================================
- Hits 81901 44942 -36959
- Misses 44765 80772 +36007
+ Partials 6016 3222 -2794
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
jasperjiaguo
left a comment
There was a problem hiding this comment.
Thanks for the fix! Can you take a look at the failed test?
sajjad-moradi
left a comment
There was a problem hiding this comment.
Other than fixing the failing test, could you also make the following small changes on PinotTablePartitionRule on this PR (or in a separate PR):
Currently provided qps and latency are checked at the beginning of the rule:
if (_input.getQps()
< _params._thresholdMinQpsPartition) { //For a table whose QPS < Q (say 200 or 300) NO partitioning is needed.
LOGGER.info("*Input QPS {} < threshold {}, no partition needed", _input.getQps(),
_params._thresholdMinQpsPartition);
return;
}
if (_input.getLatencySLA()
> _params._thresholdMaxLatencySlaPartition) { //For a table whose latency SLA > L (say 1000ms) NO
// partitioning is needed.
LOGGER.info("*Input SLA {} > threshold {}, no partition needed", _input.getLatencySLA(),
_params._thresholdMaxLatencySlaPartition);
return;
}There's really no need to check qps, because even for low qps tables, if the latency requirement is strict, partitioning is going to be required.
The other suggestion is to bring down the default latency SLA from 1s to 300ms.
…bles during PinotTablePartitionRule
The FixedLenBitset array is set during PinotTablePartitionRule with dimension weights. The length of this array is same as numDimensions in the table. When the parsePredicateList is called in PinotTablePartitionRule https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/recommender/rules/impl/PinotTablePartitionRule.java#L258
The index of the dimension colname may fall out of range for FixedLenBitset array, and thus never get set, leading to partition column not being correctly detected by the algorithm.