Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set #12334

Merged
maytasm merged 8 commits into apache:master from maytasm:IMPLY-16895 on Mar 18, 2022
Contributor

@maytasm maytasm commented Mar 15, 2022

Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set

Description

This bug can cause data loss if segmentGranularity is set in auto compaction.

If segmentGranularity is set, the intervals of the segments to be compacted may not align with the configured segmentGranularity. We must adjust the interval of the compaction task so that it fully covers and aligns with the segmentGranularity, to prevent unexpected data loss.

For example,

  • The umbrella interval of the segments to be compacted is 2015-04-11/2015-04-12, but the configured segmentGranularity is YEAR. If the compaction task's interval is 2015-04-11/2015-04-12, we can hit a race condition: a new segment outside that interval (e.g. 2015-02-11/2015-02-12) that is created after the compaction task is submitted will be lost, because it is overshadowed by the compacted segment (which has interval 2015-01-01/2016-01-01). Hence, in this case, we must adjust the compaction task interval to 2015-01-01/2016-01-01.

  • The umbrella interval of the segments to be compacted is 2015-02-01/2015-03-01, but the configured segmentGranularity is WEEK. If the compaction task's interval is 2015-02-01/2015-03-01, the compacted segments created will be 2015-01-26/2015-02-02, 2015-02-02/2015-02-09, 2015-02-09/2015-02-16, 2015-02-16/2015-02-23, and 2015-02-23/2015-03-02. The compacted segments would cause existing data from 2015-01-26 to 2015-02-01 and from 2015-03-01 to 2015-03-02 to be lost. Hence, in this case, we must adjust the compaction task interval to 2015-01-26/2015-03-02.
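
The widening described in the two examples above can be sketched as follows. This is only an illustration using java.time, not the Druid code (which goes through Granularity.getIterable() and JodaUtils.umbrellaInterval()). The class and helper names are hypothetical, dates are treated as day-level [start, end) bounds, and WEEK buckets are assumed to start on Monday, which matches the intervals listed above.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical sketch: widen a [start, end) interval to whole granularity buckets.
final class AlignIntervalSketch {

    // Widen to whole Monday-based weeks (assumed bucket boundary).
    static LocalDate[] alignToWeek(LocalDate start, LocalDate endExclusive) {
        LocalDate alignedStart = start.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        // The last covered day is endExclusive - 1; the bucket ends on the Monday after it.
        LocalDate alignedEnd = endExclusive.minusDays(1).with(TemporalAdjusters.next(DayOfWeek.MONDAY));
        return new LocalDate[] {alignedStart, alignedEnd};
    }

    // Widen to whole calendar years.
    static LocalDate[] alignToYear(LocalDate start, LocalDate endExclusive) {
        LocalDate alignedStart = start.withDayOfYear(1);
        LocalDate alignedEnd = endExclusive.minusDays(1).withDayOfYear(1).plusYears(1);
        return new LocalDate[] {alignedStart, alignedEnd};
    }

    public static void main(String[] args) {
        LocalDate[] week = alignToWeek(LocalDate.of(2015, 2, 1), LocalDate.of(2015, 3, 1));
        System.out.println(week[0] + "/" + week[1]); // prints 2015-01-26/2015-03-02

        LocalDate[] year = alignToYear(LocalDate.of(2015, 4, 11), LocalDate.of(2015, 4, 12));
        System.out.println(year[0] + "/" + year[1]); // prints 2015-01-01/2016-01-01
    }
}
```

An interval that is already aligned comes back unchanged, so applying the adjustment unconditionally should be harmless under these assumptions.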

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@maytasm maytasm changed the title Fix auto compaction task interval when segmentGranularity is set Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set Mar 15, 2022
private final String sha256OfSortedSegmentIds;

public static ClientCompactionIntervalSpec fromSegments(List<DataSegment> segments)
public static ClientCompactionIntervalSpec fromSegments(List<DataSegment> segments, Granularity segmentGranularity)
Contributor

Suggested change
public static ClientCompactionIntervalSpec fromSegments(List<DataSegment> segments, Granularity segmentGranularity)
public static ClientCompactionIntervalSpec fromSegments(List<DataSegment> segments, @Nullable Granularity segmentGranularity)

Contributor Author

Done

Comment on lines 60 to 62
// - The umbrella interval of the segments is 2015-02-01/2015-03-01 but configured segmentGranularity is MONTH,
// if the compaction task's interval is 2015-02-01/2015-03-01 then compacted segments created will be
// 2015-01-26/2015-02-02, 2015-02-02/2015-02-09, 2015-02-09/2015-02-16, 2015-02-16/2015-02-23, 2015-02-23/2015-03-02.
Contributor

I'm not sure how the segments would end up with these intervals after compaction. Can you elaborate? It would be nice to add your explanation to the comment.

Contributor Author

Oops, typo. The configured segmentGranularity in auto compaction is WEEK. I added more details to the comment too.

// 2015-01-26/2015-02-02, 2015-02-02/2015-02-09, 2015-02-09/2015-02-16, 2015-02-16/2015-02-23, 2015-02-23/2015-03-02.
// The compacted segment would cause existing data from 2015-01-26 to 2015-02-01 and 2015-03-01 to 2015-03-02 to be lost.
// Hence, in this case, we must adjust the compaction task interval to 2015-01-26/2015-03-02
interval = JodaUtils.umbrellaInterval(segmentGranularity.getIterable(interval));
Contributor

Looking at JodaUtils.umbrellaInterval(), I think this code can cause OOM if segmentGranularity.getIterable() returns lots of intervals. JodaUtils.umbrellaInterval() doesn't seem to need to store all startDates and endDates in memory. Instead, it can keep only one pair of minStartDate and maxEndDate which are updated in the loop. Should we fix it in this PR too?

Contributor

Or should we have a guardrail so that segmentGranularity does not return so many intervals?

Contributor Author

Done

Contributor Author

I don't think we need a guardrail here. The list of segments passed to this method should come from a single time chunk (from the CompactionSegmentIterator). That single time chunk is then adjusted based on the configured segmentGranularity. Since the original interval (the umbrellaInterval of the list of segments) is limited to a single time chunk, segmentGranularity.getIterable() should not return many intervals. The edge case is when the granularity is ALL or NONE. I think we should add a guardrail against ALL or NONE, but not here. (Actually, I think it would blow up in the NewestSegmentFirstIterator before getting here anyway.)

Contributor

I agree that the guardrail should be somewhere else than here.
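
The single-pass idea raised earlier in this thread (track one running minimum start and one running maximum end instead of collecting every start and end date) can be sketched as below. This is an illustration, not the JodaUtils.umbrellaInterval() code from the PR: the Span class is a hypothetical stand-in for a Joda Interval holding epoch millis, and throwing on empty input is an assumption made for this sketch.

```java
import java.util.Arrays;
import java.util.Collections;

// Hypothetical sketch of a single-pass umbrella interval with O(1) extra memory.
final class UmbrellaSketch {

    // Stand-in for an interval [start, end) in epoch millis.
    static final class Span {
        final long start;
        final long end;

        Span(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    static Span umbrella(Iterable<Span> spans) {
        boolean seenAny = false;
        long minStart = Long.MAX_VALUE;
        long maxEnd = Long.MIN_VALUE;
        for (Span span : spans) {
            seenAny = true;
            minStart = Math.min(minStart, span.start);
            maxEnd = Math.max(maxEnd, span.end);
        }
        // Without this check, the sentinel values above would leak out for empty input.
        if (!seenAny) {
            throw new IllegalArgumentException("Empty list of intervals");
        }
        return new Span(minStart, maxEnd);
    }

    public static void main(String[] args) {
        Span result = umbrella(Arrays.asList(new Span(10, 20), new Span(5, 12), new Span(15, 30)));
        System.out.println(result.start + "/" + result.end); // prints 5/30

        try {
            umbrella(Collections.<Span>emptyList());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected empty input");
        }
    }
}
```

Because the loop only consumes an Iterable, the intervals produced by a granularity never need to be materialized into a list.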

}

@Test
public void testAutoCompactionDutyWithSegmentGranularityCoarserAndNotAlignWithSegment() throws Exception
Contributor

Should this test week -> month instead of month -> year, if it's for the case where the new granularity does not align with the previous one?

Contributor Author

Done

// from 2015-01-26 to 2015-02-01 and 2015-03-01 to 2015-03-02 to be lost. Hence, in this case,
// we must adjust the compaction task interval to 2015-01-26/2015-03-02
interval = JodaUtils.umbrellaInterval(segmentGranularity.getIterable(interval));
}
Contributor

Is it necessary to log some message about the interval extension? If the interval was extended, extra segments may be included in this compaction task and then the total input bytes may exceed inputSegmentSizeBytes.

Contributor Author

Added some extra logs.
Regarding inputSegmentSizeBytes, I think inputSegmentSizeBytes should actually be deprecated now that the issued compaction task can run in parallel.

Contributor

@jihoonson jihoonson left a comment

LGTM overall. Please check the CI failure. It seems legit.

DateTime minStart = minDateTime(startDates.toArray(new DateTime[0]));
DateTime maxEnd = maxDateTime(endDates.toArray(new DateTime[0]));

if (minStart == null || maxEnd == null) {
Contributor

This function will now return an interval of Long.MIN_VALUE, Long.MAX_VALUE when the input intervals are empty. It should throw an exception instead.

Contributor

BTW, I'm OK if you don't want to make this change in this PR. It's optional and up to you.

Contributor Author

Oops. Fixed.

Contributor

@jihoonson jihoonson left a comment

LGTM. Thanks @maytasm!

@maytasm maytasm merged commit dbb9518 into apache:master Mar 18, 2022
@maytasm maytasm deleted the IMPLY-16895 branch March 18, 2022 19:46
TSFenwick pushed a commit to TSFenwick/druid that referenced this pull request Apr 11, 2022
…with segmentGranularity when segmentGranularity is set (apache#12334)

* add impl

* add ITs

* address comments

* address comments

* address comments

* fix failure

* fix checkstyle

* fix checkstyle
@abhishekagarwal87 abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022
writer-jill pushed a commit to writer-jill/druid that referenced this pull request Jun 22, 2022
…with segmentGranularity when segmentGranularity is set (apache#12334)
