-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat #1681
Conversation
private List<DataSegment> segments; | ||
private Interval INTERVAL_FULL = new Interval("2014-10-22T00:00:00Z/P1D"); | ||
private Interval INTERVAL_PARTIAL = new Interval("2014-10-22T00:00:00Z/PT2H"); | ||
private DataSegment SEGMENT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since they are non-static, can we call them intervalFull, intervalPartial and segment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was thinking they were static; should they just be static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, they don't change, we can make everything static as well if you want.. I am fine both ways as it doesn't matter in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, updated them to statics
👍 |
a6148dc
to
4094df8
Compare
4094df8
to
d8f7c82
Compare
…mat. Fixes apache#1678. IngestSegmentFirehose (and its users) need to remember which windows of which segments should actually be read, based on a timeline.
d8f7c82
to
414a6fb
Compare
Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat
should we backport this for 0.8.1-rc3? (and if so who does that?) |
(Backport #1681) Fix overlapping segments in IngestSegmentFirehose, DatasourceInputFormat.
List<DataSegment> segmentsList = segmentLister.getUsedSegmentsForInterval( | ||
ingestionSpecObj.getDataSource(), | ||
ingestionSpecObj.getInterval() | ||
); | ||
datasourcePathSpec.put(segments, segmentsList); | ||
VersionedIntervalTimeline<String, DataSegment> timeline = new VersionedIntervalTimeline<>(Ordering.natural()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gianm , I was wondering why we need to scan the timeline here since the returned segments from segmentLister.getUsedSegmentsForInterval
already did that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guobingkun We also need to know which sections of those segments we should scan. Some segments may only be partially valid due to overshadowing. (you might have a newer hourly granularity segment partially overshadowing an older day granularity segment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks!
Fixes #1678. IngestSegmentFirehose (and its users) need to remember which
windows of which segments should actually be read, based on a timeline.