support multiple intervals in dataSource inputSpec #1988
@@ -177,13 +177,13 @@ Here is what goes inside "ingestionSpec"
 |Field|Type|Description|Required|
 |-----|----|-----------|--------|
 |dataSource|String|Druid dataSource name from which you are loading the data.|yes|
-|interval|String|A string representing ISO-8601 Intervals.|yes|
+|interval|String|This is deprecated, please use intervals.|no|
+|intervals|List|A list representing ISO-8601 Intervals.|yes|
list of strings
changed
dd8643a to 7924d00
Preconditions.checkArgument(
    interval != null && intervals != null && !intervals.isEmpty(),
    "pls specify intervals only"
Please use full words too!
done
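For readers following the thread, the rule implied by the error message (the deprecated `interval` and the newer `intervals` should not both be set) can be sketched roughly as follows. This is a minimal illustration only: `IntervalSpecSketch` and `resolveIntervals` are hypothetical names, intervals are kept as plain strings, and rejecting the case where neither field is set is an assumption based on `intervals` being marked as required in the table above. It is not the code from this pull request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not a Druid class.
final class IntervalSpecSketch {
    private IntervalSpecSketch() {}

    /**
     * Folds the deprecated single `interval` and the newer `intervals` list
     * into one list. Setting both is rejected; setting neither is also
     * rejected (an assumption, since `intervals` is documented as required).
     */
    static List<String> resolveIntervals(String interval, List<String> intervals) {
        boolean hasInterval = interval != null;
        boolean hasIntervals = intervals != null && !intervals.isEmpty();

        if (hasInterval && hasIntervals) {
            throw new IllegalArgumentException("please specify intervals only");
        }
        if (!hasInterval && !hasIntervals) {
            throw new IllegalArgumentException("please specify intervals");
        }
        if (hasInterval) {
            // Backward compatibility: treat the old field as a one-element list.
            return Collections.singletonList(interval);
        }
        // Defensive copy so later changes to the caller's list have no effect.
        return Collections.unmodifiableList(new ArrayList<>(intervals));
    }
}
```

Normalizing to a single list in one place would let the rest of the code ignore the deprecated field entirely.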
37a0499
to
60bcd10
Compare
👍
}

@Override
public List<DataSegment> getUsedSegmentsForIntervals( |
This could potentially return the same segments multiple times, I think. Is that bad?
I'm thinking of the case where two of the intervals in the list both partially overlap the same segment.
Or even if two intervals in the list overlap each other.
Perhaps add a test for that?
Good catch.
It wouldn't matter in practice, though, because the list of intervals is always a list of disjoint intervals, which is ensured by calling JodaUtils.condenseIntervals(..). If two intervals overlap the same segment, this would have returned that segment twice, which again wouldn't matter because the caller uses "windowed" segments appropriately.
However, that looks odd from an API perspective, so I updated the code to remove duplicates and also updated the test case to verify this.
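The two ideas in this reply, condensing the requested intervals into disjoint ones and returning each matching segment only once, can be sketched as below. This is a self-contained illustration under stated assumptions: `Range` and `Segment` are hypothetical stand-ins for a Joda `Interval` and a `DataSegment`, `condense` is a simplified merge written for this example rather than the `JodaUtils.condenseIntervals` implementation, and deduplicating by segment id is an assumed strategy, not necessarily what the pull request does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are Druid or Joda classes.
final class SegmentLookupSketch {
    private SegmentLookupSketch() {}

    /** Half-open time range [start, end) in epoch millis. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            if (end < start) {
                throw new IllegalArgumentException("end is before start");
            }
            this.start = start;
            this.end = end;
        }

        boolean overlaps(Range other) {
            return start < other.end && other.start < end;
        }
    }

    /** A segment is reduced here to an id and the range it covers. */
    static final class Segment {
        final String id;
        final Range range;

        Segment(String id, Range range) {
            this.id = id;
            this.range = range;
        }
    }

    /** Merges overlapping or abutting ranges into a sorted, disjoint list. */
    static List<Range> condense(List<Range> ranges) {
        List<Range> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong((Range r) -> r.start));

        List<Range> out = new ArrayList<>();
        for (Range r : sorted) {
            int lastIndex = out.size() - 1;
            if (lastIndex >= 0 && r.start <= out.get(lastIndex).end) {
                // Extend the previous range instead of adding a new one.
                Range last = out.remove(lastIndex);
                out.add(new Range(last.start, Math.max(last.end, r.end)));
            } else {
                out.add(r);
            }
        }
        return out;
    }

    /** Returns every segment overlapping any range, each at most once. */
    static List<Segment> usedSegmentsForRanges(List<Segment> all, List<Range> ranges) {
        // Keyed by id so a segment hit by two ranges is kept only once;
        // LinkedHashMap preserves first-seen order.
        Map<String, Segment> byId = new LinkedHashMap<>();
        for (Range r : ranges) {
            for (Segment s : all) {
                if (s.range.overlaps(r)) {
                    byId.putIfAbsent(s.id, s);
                }
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

Note that condensing alone does not prevent duplicates: two disjoint ranges can still fall inside the same segment, which is why the lookup deduplicates as well.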
60bcd10 to 49f6cd7
@himanshug looks like a legitimate CI failure. Could you fix that and squash the commits? 👍 from me after that; everything else looks good.
49f6cd7 to 61aaa09
@gianm Ah, that happened because the rebase did not update a new class. Fixed.
support multiple intervals in dataSource inputSpec
Changes:
- Rename `UsedSegmentChecker` to `PublishedSegmentsRetriever`
- Remove deprecated single `Interval` argument from `RetrieveUsedSegmentsAction` as it is now unused and has been deprecated since #1988
- Return `Set` of segments instead of a `Collection` from `IndexerMetadataStorageCoordinator.retrieveUsedSegments()`
so that users can read data from multiple intervals from input dataSource