Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reject query if top hits result window exceeds index max result window #29199

Closed
wants to merge 2 commits into from

Conversation

Projects
None yet
6 participants
@liketic
Copy link
Contributor

commented Mar 22, 2018

Add a check for max result window in top hits aggregation, which should not exceed the setting index.max_result_window.

Closes #29190

@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Mar 22, 2018

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Mar 22, 2018

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@cbuescher
Copy link
Member

left a comment

Hi @liketic, thanks for picking this up, the PR looks good to me overall. However I found some small problems in the test I believe, would you mind taking a look and maybe simplifying the randomization a bit?

MultiBucketConsumerService.MultiBucketConsumer consumer = new MultiBucketConsumerService.MultiBucketConsumer(DEFAULT_MAX_BUCKETS);
// from + size > maxResultWindow
int from = randomInt(10);
int size = maxResultWindow - from + randomIntBetween(1, 10);

This comment has been minimized.

Copy link
@cbuescher

cbuescher Mar 22, 2018

Member

I think this way size can be negative, e.g. maxResultWindow = 5, from = 10, randomInt = 1 would lead to size = -4 ? Can this calculation be simplified? Maybe just randomize from, then set size to a positive value that leads to from + size > maxResultWindow?


// from + size <= maxResultWindow
int from2 = randomIntBetween(1, maxResultWindow - 5);
int size2 = randomIntBetween(1, maxResultWindow - from);

This comment has been minimized.

Copy link
@cbuescher

cbuescher Mar 22, 2018

Member

I think "from" should be "from2" here. Also I would make the calculation simpler here as well, I think otherwise min might be smaller than max in the calls to "randomIntBetween"

This comment has been minimized.

Copy link
@liketic

liketic Mar 22, 2018

Author Contributor

Thansk @cbuescher , I pushed 803e3f6

@cbuescher cbuescher self-assigned this Mar 22, 2018

@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Mar 22, 2018

Pinging @elastic/es-search-aggs

@cbuescher

This comment has been minimized.

Copy link
Member

commented Mar 22, 2018

@liketic thanks for updating the test. In the meanwhile I've been looking into the MAX_INNER_RESULT_WINDOW_SETTING, the other setting that constrains the number of hits in the top hits aggregation. From its javadocs:

/**
     * Index setting describing the maximum value of from + size on an individual inner hit definition or
     * top hits aggregation. The default maximum of 100 is defensive for the reason that the number of inner hit responses
     * and number of top hits buckets returned is unbounded. Profile your cluster when increasing this setting.
     */
    public static final Setting<Integer> MAX_INNER_RESULT_WINDOW_SETTING =
        Setting.intSetting("index.max_inner_result_window", 100, 1, Property.Dynamic, Property.IndexScope);

Because this clearly mentions top hits aggregation, and is much lower than MAX_RESULT_WINDOW_SETTING, I wonder if we shouldn't leave it the only setting limiting the aggregation result size. Maybe @jimczi who opened the issue has an opinion or can explain to me what I'm missing?

@jimczi
Copy link
Member

left a comment

Thanks @liketic and @cbuescher . I forgot that we have the index.max_inner_result_window already which defaults to 100 per bucket so we're fine. Adding the
index. max_result_window on top is just confusing. I'll close the issue, sorry @liketic.

@cbuescher

This comment has been minimized.

Copy link
Member

commented Mar 22, 2018

@liketic sorry for the confusion and the work put into this, the PR was great otherwise.

@liketic liketic closed this Mar 22, 2018

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.