New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InternalPercentilesBucket should not rely on ordered percents array #24336

Merged
merged 3 commits into from Apr 26, 2017

Conversation

Projects
None yet
2 participants
@cbuescher
Member

cbuescher commented Apr 26, 2017

Currently InternalPercentilesBucket#percentile() relies on the percent array passed in to be in sorted order. This changes the aggregation to store an internal lookup table that is constructed from the percent/percentiles arrays passed in that can be used to look up the percentile values.

Closes #24331

public InternalPercentilesBucket(String name, double[] percents, double[] percentiles,
DocValueFormat formatter, List<PipelineAggregator> pipelineAggregators,
Map<String, Object> metaData) {
super(name, pipelineAggregators, metaData);
assert percentiles.length == percents.length;

This comment has been minimized.

@cbuescher

cbuescher Apr 26, 2017

Member

Maybe we should even throw an error if this is not the case, wdyt?

This comment has been minimized.

@polyfractal

polyfractal Apr 26, 2017

Member

++ I can't see anything good happening if they aren't the same length. The Iterator for example will throw an out-of-range exception if they don't match

This comment has been minimized.

@cbuescher

cbuescher Apr 26, 2017

Member

I will change this to throw an IllegalArgumentException then and add a test for it.

@colings86 colings86 requested review from polyfractal and removed request for colings86 Apr 26, 2017

@polyfractal

This comment has been minimized.

Member

polyfractal commented Apr 26, 2017

I'm wondering if we should perhaps use a map from the start, rather than converting from the double arrays? E.g. in PercentilesBucketPipelineAggregator#buildAggregation(), we just build a map from the beginning and pass that to InternalPercentilesBucket's ctor.

But maybe there's a reason not to that I haven't thought about. The current code LGTM otherwise :)

@cbuescher

This comment has been minimized.

Member

cbuescher commented Apr 26, 2017

@polyfractal thanks for the review, I didn't change the ctor arguments and the underlying implementation because we still want to maintain the original order of the percents the user supplied (ok, that could be solved with a sorted map) and also because currently the serialization relies on reading/writing double arrays. I didn't want to break that at this point. If you think its worth it let me know...

@polyfractal

This comment has been minimized.

Member

polyfractal commented Apr 26, 2017

Ah good point re: serialization and user values. Ignore me!

@cbuescher cbuescher merged commit db1b243 into elastic:master Apr 26, 2017

2 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details

cbuescher added a commit that referenced this pull request Apr 26, 2017

InternalPercentilesBucket should not rely on ordered percents array (#…
…24336)

Currently InternalPercentilesBucket#percentile() relies on the percent array passed in
to be in sorted order. This changes the aggregation to store an internal lookup table that
is constructed from the percent/percentiles arrays passed in that can be used to look up
the percentile values.

Closes #24331
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment