Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient SegmentMetadataQuery #2107

Merged
merged 1 commit into from
Dec 19, 2015
Merged

More efficient SegmentMetadataQuery #2107

merged 1 commit into from
Dec 19, 2015

Conversation

jon-wei
Copy link
Contributor

@jon-wei jon-wei commented Dec 17, 2015

Addresses issue #2081

This PR makes 3 changes:

  • During merging of SegmentMetadataQuery results, condensing of intervals is postponed. The merging will now just append intervals to the LHS result from the RHS result. JodaUtils.condenseIntervals is now only called when returning a result from the query result Yielder.
  • Adds "Interval" as a AnalysisType to the SegmentMetadataQuery. This determines if the query will return the list of intervals associated with the queried segments.
  • Changes the interval endTime on FireHydrants persisted by RealtimePlumber to use the interval of their Sink which is bucketed by segment granularity, instead of getInterval() on the hydrant's index, which has an end time determined by the max ingested event time. This should help with condensing of intervals.

Running with a 3,450 segment test set on my laptop, the SegmentMetadataQuery takes ~16s to complete without the patch, and ~70ms with the new interval condensing behavior.

@jon-wei jon-wei closed this Dec 17, 2015
@jon-wei jon-wei reopened this Dec 17, 2015
@himanshug
Copy link
Contributor

@jon-wei this might be in conflict with #1995 , can you review/merge that one? I think it is ready and very useful to me in many use cases.

{
@Nullable
@Override
public SegmentAnalysis apply(@Nullable SegmentAnalysis analysis)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nullable is not valid probably. I think IDE is adding it automatically ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@himanshug Hm, yes, it was added by IDE since it was part of the Function interface definition. I've removed the @nullables.

@himanshug
Copy link
Contributor

👍

@jon-wei
Copy link
Contributor Author

jon-wei commented Dec 17, 2015

@himanshug Merged your other PR, resolving conflicts

@himanshug
Copy link
Contributor

@jon-wei see if you need to make changes similar to #2112

@jon-wei
Copy link
Contributor Author

jon-wei commented Dec 18, 2015

@himanshug This one's fine in that respect, the result output won't change since the query uses default analysis types

@jon-wei jon-wei closed this Dec 18, 2015
@jon-wei jon-wei reopened this Dec 18, 2015
@fjy
Copy link
Contributor

fjy commented Dec 18, 2015

👍

@fjy fjy closed this Dec 18, 2015
@fjy fjy reopened this Dec 18, 2015
@jon-wei jon-wei closed this Dec 18, 2015
@jon-wei jon-wei reopened this Dec 18, 2015
@jon-wei jon-wei closed this Dec 18, 2015
@jon-wei jon-wei reopened this Dec 18, 2015
@jon-wei jon-wei closed this Dec 18, 2015
@jon-wei jon-wei reopened this Dec 18, 2015
@jon-wei jon-wei closed this Dec 18, 2015
@jon-wei jon-wei reopened this Dec 18, 2015
fjy added a commit that referenced this pull request Dec 19, 2015
More efficient SegmentMetadataQuery
@fjy fjy merged commit 7019d3c into apache:master Dec 19, 2015
@himanshug
Copy link
Contributor

@jon-wei pls see #2129

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants