Refactor: Ensure that CompactionCandidate.currentStatus is non-null#19058
Refactor: Ensure that CompactionCandidate.currentStatus is non-null#19058kfaraz merged 5 commits intoapache:masterfrom
Conversation
cecemei
left a comment
There was a problem hiding this comment.
Looks more clear with a non-null compaction status, left a few nits.
| { | ||
| final Set<Interval> segmentIntervals = | ||
| segments.stream().map(DataSegment::getInterval).collect(Collectors.toSet()); | ||
| final Interval umbrellaInterval = JodaUtils.umbrellaInterval(segmentIntervals); |
There was a problem hiding this comment.
wondering if umbrellaInterval would be different from compactionInterval? seems like DataSourceCompactibleSegmentIterator.populateQueue would create DataSegment with updated interval and shard spec, if config.getSegmentGranularity is set.
There was a problem hiding this comment.
Yes, they can be different. Umbrella interval is the interval spanning all the segments (i.e. start of earliest segment to end of latest segment).
Compaction interval is nothing but the umbrella interval expanded to align with the desired segment granularity (if specified).
| @@ -68,7 +86,7 @@ public static CompactionCandidate from( | |||
| umbrellaInterval, | |||
| compactionInterval, | |||
| segmentIntervals.size(), | |||
There was a problem hiding this comment.
nit: could use CompactionStatistics to wrap numIntervals, totalBytes and numSegments.
There was a problem hiding this comment.
Sure, will include in a follow up.
|
Thanks for the review, @cecemei ! |
Description
Inspired by #18968 , this PR tries to ensure that the
currentStatusinside aCompactionCandidateis always non-null,with a relatively smaller changeset.
This PR has: