
Results window is too large - Kaminari #577

Open
thorlando opened this issue Jun 7, 2016 · 13 comments
@thorlando

When using this gem with Kaminari (rather than passing from + size directly), the following error occurs when navigating to the last page of the paginated results:

ActionView::Template::Error ([500] {"error":{"root_cause":[{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [11880]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"users","node":"bB32lITgax0OBbQtim3hFbI","reason":{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [11880]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter."}}]},"status":500}):

I'd limit my results to 10,000 if I knew how, but I can't seem to figure it out (similar to how in MySQL you could say SELECT * FROM users LIMIT 10000). Is this an issue with Kaminari?
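
One way to impose that kind of LIMIT yourself is to clamp the requested page number so that from + size never exceeds the 10,000 window. A minimal sketch, assuming a Rails controller, a User model backed by elasticsearch-rails, and a page size of 30 (all names here are illustrative, not from this thread):

# Clamp the Kaminari page so that from + size <= 10_000.
PER_PAGE   = 30
MAX_WINDOW = 10_000                     # Elasticsearch's default index.max_result_window
max_page   = MAX_WINDOW / PER_PAGE      # highest page that stays inside the window
page       = params[:page].to_i.clamp(1, max_page)

@users = User.search(params[:q]).page(page).per(PER_PAGE).records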

@Ashviniv

Ashviniv commented Jun 30, 2016

@realmadrid2727 I was also getting the same error. I fixed it by setting index.max_result_window: 100000 in the Elasticsearch configuration file, i.e. elasticsearch.yml. Basically, we need to set it to the maximum possible size of our result set.
This is not a Kaminari issue; it is one of Elasticsearch's configuration settings. You can read more about it at https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking_21_search_changes.html
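
For illustration, the same setting can also be raised per index over the REST API instead of elasticsearch.yml (on recent Elasticsearch versions it is an index-level setting, so the YAML file may not accept it). A sketch using the elasticsearch-ruby client, with the users index name as an assumption:

require "elasticsearch"

client = Elasticsearch::Client.new
# Raise the result window for one existing index only.
client.indices.put_settings(
  index: "users",
  body: { index: { max_result_window: 100_000 } }
)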

@chikamichi

chikamichi commented Oct 13, 2016

Increasing index.max_result_window beyond 10000 (which is ES's default value, btw) might not be considered a proper fix. The failure occurs because the dataset has more than 10000 records and the user is trying to access a page beyond that limit, breaking the constraint that from + size cannot exceed index.max_result_window.

To make deep pagination work "properly", one can either:

  • increase max_result_window — which is bounded by memory/CPU consumption and cluster size
  • use search_after — but it may not be trivial to implement in one's code, for it basically works like the scroll API: one needs to specify a sort value for offsetting the search, and finding that value in a reliable way requires walking the pages, starting from the first one (see the sketch after this list)
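
For reference, a minimal search_after sketch, assuming an elasticsearch-ruby client and a users index (both names are assumptions):

require "elasticsearch"

client = Elasticsearch::Client.new
sort   = [{ created_at: "asc" }, { id: "asc" }]  # deterministic order with a tie-breaker

# First page: no cursor yet.
page   = client.search(index: "users", body: { size: 25, sort: sort, query: { match_all: {} } })
cursor = page["hits"]["hits"].last["sort"]       # sort values of the last hit

# Next page: same query and sort, plus the cursor. No from is involved,
# so index.max_result_window never comes into play.
next_page = client.search(
  index: "users",
  body: { size: 25, sort: sort, search_after: cursor, query: { match_all: {} } }
)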

@michniewicz

Can you please give us the status of this issue? Since increasing max_result_window is a bad idea, what is a better way to deal with large data sets (e.g. pagination with the scroll API)?

@c-moyer

c-moyer commented Aug 30, 2017

Same problem

@NickCraver

Is there any recommendation here on how to handle large page numbers in large result sets? We're now hitting this on Stack Overflow as well.

@spodlecki

bump on this... I'd love to hear what others are doing... It seems like search_after might be the easiest option, but it feels silly to have to pass through the last known ID on a search when that isn't pagination at all. Keeping a session in search just isn't an option.

@karmi
Contributor

karmi commented May 15, 2018

Argh, this went under my radar, sorry for the silence, everybody.

search_after is indeed the right way to work around the "result window too large" limit, because it is something that can be optimized on the Elasticsearch side, as opposed to just increasing the limit, which could potentially overload the server. (That would be a problem for any database, I imagine.)

As a note, the Scroll API is not meant to be used for "regular searching", especially in high-concurrency scenarios (many people searching at the same time), since it keeps an "open window" into the index, and that can break things again. The Scroll API is great for something like an "export this dataset" feature, which runs only now and then, triggered either by people or by a cron job.
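
For contrast, a minimal scroll-based export sketch along those lines, assuming an elasticsearch-ruby client (the index name and batch size are assumptions):

require "elasticsearch"

client   = Elasticsearch::Client.new
response = client.search(
  index: "users",
  scroll: "2m",                                    # keep the scroll context alive between batches
  body: { size: 1_000, query: { match_all: {} } }
)

while (hits = response["hits"]["hits"]).any?
  hits.each { |hit| puts hit["_id"] }              # stand-in for real export work
  response = client.scroll(body: { scroll_id: response["_scroll_id"], scroll: "2m" })
end

client.clear_scroll(body: { scroll_id: response["_scroll_id"] })  # free the context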

Supporting pagination over a very large dataset probably needs a bit of re-designing and re-implementing, going in a different direction than the regular from/size pattern. As I've mentioned on a couple of other issues, we have scheduled an effort to re-focus on the Rails integration in the upcoming months, so we'll try to have a look at this particular problem as well!

@spodlecki

To follow up here: we presented the business end of things with the different options, and after a bit of discussion we ended up just upping the limit to an acceptable level (~60k) to cover most of the most-viewed content. At the end of the day, we're still using from/size -- I'm hoping there are no plans to remove this entirely? It'd seem pretty silly, in the sense of actual pagination, to get rid of something like this.

@unavailabl3

unavailabl3 commented Apr 1, 2019

products  = []
after     = 0    # cursor: sort value (id) of the last hit seen so far
page_size = 100  # query_size was undefined in the original; any page size works

loop do
  query = {
    size: page_size,
    query: {
      multi_match: {
        query: "black",
        fields: [ "description", "title", "information", "params" ]
      }
    },
    search_after: [after],
    sort: [ {id: "asc"} ]
  }
  results = Product.search(query).records.to_a
  break if results.empty?

  products += results
  # Advance the cursor to the last hit's sort value. The original after += 10000
  # only works if ids are dense; search_after expects sort values, not offsets.
  after = results.last.id
end

@stale

stale bot commented Aug 31, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 31, 2020
@stale stale bot closed this as completed Sep 7, 2020
@montdidier

This really isn't stale.

@philsmy

philsmy commented Apr 9, 2021

Is it really possible that 5 years on there still isn't a good way to do front-end driven pagination of large result sets?

We want to let users scroll through a month's worth of orders. Sometimes that is 100,000+ orders. What are we supposed to do? We use datatables on the front end.

@lulessa

lulessa commented Apr 10, 2021

> Is it really possible that 5 years on there still isn't a good way to do front-end driven pagination of large result sets?
>
> We want to let users scroll through a month's worth of orders. Sometimes that is 100,000+ orders. What are we supposed to do? We use datatables on the front end.

@philsmy Limit/offset pagination is not a scalable pattern for a distributed system like Elasticsearch.

Use cursor-based pagination with search_after; see the example loop in #577 (comment) for a pattern. In pagination, the cursor for the next page is the sort value of the last doc on the current page. To detect whether there is a next page, make size the desired number of docs per page + 1.
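
A minimal sketch of that size + 1 trick, assuming an elasticsearch-ruby client and an orders index (both names are assumptions):

require "elasticsearch"

client   = Elasticsearch::Client.new
per_page = 25
cursor   = nil  # sort values of the last doc on the previous page, if any

body = {
  size: per_page + 1,                             # one extra hit, purely to detect a next page
  sort: [{ created_at: "desc" }, { id: "asc" }],  # tie-breaker keeps the order stable
  query: { match_all: {} }
}
body[:search_after] = cursor if cursor

hits          = client.search(index: "orders", body: body)["hits"]["hits"]
has_next_page = hits.size > per_page
page_hits     = hits.first(per_page)
next_cursor   = page_hits.last["sort"] if page_hits.any?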
