Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load rules should honor partial overlap #5595

Merged
merged 1 commit into from
Apr 12, 2018

Conversation

palanieppan-m
Copy link
Contributor

@palanieppan-m palanieppan-m commented Apr 9, 2018

Load rules should load segments that partially overlap with rule window,
instead of loading only segments that fully overlap. Otherwise, valid data is
skipped and result produced is incorrect.

}

public static boolean eligibleForLoad(Period period, Interval interval, DateTime referenceTimestamp)
{
final Interval currInterval = new Interval(period, referenceTimestamp);
return currInterval.overlaps(interval) && interval.getStartMillis() >= currInterval.getStartMillis();
return currInterval.overlaps(interval);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: better to call eligibleForLoad(currInterval, interval) for easier code maintenance.

@jihoonson
Copy link
Contributor

@palanieppan-m thanks for your contribution! Would you fill CLA (http://druid.io/community/cla.html)?

Copy link
Member

@nishantmonu51 nishantmonu51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

}

public static boolean eligibleForLoad(Period period, Interval interval, DateTime referenceTimestamp)
{
final Interval currInterval = new Interval(period, referenceTimestamp);
return currInterval.overlaps(interval) && interval.getStartMillis() >= currInterval.getStartMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone remember or know why && interval.getStartMillis() >= currInterval.getStartMillis() was here in the first place? I can't think of a good reason for it - probably you'd want a load rule to apply if it overlaps a segment all, even if the segment doesn't start within the load rule interval.

Load rules should load segments that partially overlap with rule window,
instead of loading only segments that fully overlap.
@palanieppan-m palanieppan-m force-pushed the pala_fix_bg_segment_merge_orig branch from 48c5562 to b4fb3d0 Compare April 9, 2018 18:17
@palanieppan-m
Copy link
Contributor Author

@jihoonson good suggestion, changed accordingly. Also filled in CLA.

Thanks

@xvrl
Copy link
Member

xvrl commented Apr 9, 2018

@gianm @palanieppan-m I believe we've had this issue raised before and it was rejected at the time, see #1362
Maybe things have changed since, or we are ok doing so now, but we should understand the implications that were raised last time.

@gianm
Copy link
Contributor

gianm commented Apr 9, 2018

Thanks @xvrl. My feeling is that the justification given in #1362 for the current logic is not worth the unintuitive behavior in the case where a segment starts before the load interval but also extends into it. The intuitive thing would be to have the load rule apply if there is any overlap.

I agree with the comment @cheddar made at the time,

Also, those semantics about dropping even though it has relevant data are very unintuitive, was that introduced when we added the router? Just as a general rule, the load/drop/retention rule's semantics should be defined according to what makes sense for the retention of segments.

My feeling is we should go ahead with this PR as-is and then also raise an issue to look into the router's behavior and potentially rethink it (or add a comment in TieredBrokerHostSelector).

cc/ @fjy @cheddar @nishantmonu51 @anubhgup from #1362

@pdeva
Copy link
Contributor

pdeva commented Apr 9, 2018

can this be added to the 0.12.1 milestone?

@jihoonson jihoonson added this to the 0.12.1 milestone Apr 9, 2018
@jihoonson
Copy link
Contributor

@pdeva added.

@xvrl
Copy link
Member

xvrl commented Apr 9, 2018

@gianm I agree the current behavior is unintuitive and we should make the change. However, it still is a backwards incompatible change at the router level, so we should deal with it appropriately and note it in the release notes.

@palanieppan-m
Copy link
Contributor Author

@gianm @xvrl what is a proposal that would be acceptable? Should we have original behavior and restrict it to be used only by router? Is the router using this logic through PeriodBroadcastDistributionRule and IntervalBroadcastDistributionRule?

@gianm
Copy link
Contributor

gianm commented Apr 12, 2018

IMO, it would be acceptable to merge this as-is and just mention in the release notes that router behavior in this case may change as a result. It sounds like I am in agreement with @xvrl on this. However I think on balance the behavior changes means that it makes sense to put this in 0.13.0 not 0.12.1.

Has anyone else got an opinion on this?

@xvrl
Copy link
Member

xvrl commented Apr 12, 2018

0.13.0 sounds like the reasonable choice here unless someone wants to do additional work to preserve the existing router behavior.

@gianm gianm modified the milestones: 0.12.1, 0.13.0 Apr 12, 2018
@gianm
Copy link
Contributor

gianm commented Apr 12, 2018

@xvrl - I agree.

I'm milestoning this to 0.13.0 and will merge it soon if there are no further comments.

@palanieppan-m
Copy link
Contributor Author

thanks folks. I will go ahead and port this into our internal branch for use now.

@b-slim b-slim merged commit dbea5cb into apache:master Apr 12, 2018
sathishsri88 pushed a commit to sathishs/druid that referenced this pull request May 8, 2018
Load rules should load segments that partially overlap with rule window,
instead of loading only segments that fully overlap.
gianm pushed a commit to implydata/druid-public that referenced this pull request May 16, 2018
Load rules should load segments that partially overlap with rule window,
instead of loading only segments that fully overlap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants