Limit number of pages on search pagination #1604
Conversation
@sonalkr132 What is the reason you are picking this value?
```diff
@@ -1,8 +1,10 @@
 class SearchesController < ApplicationController
+  before_action :set_page, only: :show
+  LAST_PAGE = 333
```
The reason for this number is a little hard to understand just looking at the code. Maybe a comment or link to this PR would be helpful for future contributors.
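For what it's worth, the cap of 333 lines up with Elasticsearch's default `index.max_result_window` of 10,000 combined with 30 results per page. A back-of-the-envelope sketch (constant names are illustrative, not from the codebase):

```ruby
# Elasticsearch rejects any request where from + size exceeds
# index.max_result_window (10,000 by default).
MAX_RESULT_WINDOW = 10_000
PER_PAGE = 30

# The deepest full page that still fits inside the result window.
last_page = MAX_RESULT_WINDOW / PER_PAGE # integer division
puts last_page # => 333
```

A comment along these lines next to the constant would make the origin of the number self-explanatory.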
At my company we also had to solve the same problem after upgrading Elasticsearch. To avoid limiting the last page, I would recommend scrolling through pages instead, using a suitable scroll page size.
Hi @janko-m Thank you for having a look ✌️
Yes, I was recommending the scroll API. I wasn't aware of this warning, but we are using scrolling for serving real-time requests in production; in my benchmarks, fetching the first 10,000 documents using the scroll API had equal performance to fetching the first 10,000 documents without it, so there should be no performance downsides. Naturally, fetching a second page of 10,000 documents is going to be less performant than fetching the first page, but it's usually like that with pagination. Considering it's not possible to fetch the second page of 10,000 documents with `from`/`size` in the first place, scrolling seems like the way to go.

Note that if you'll be using the scroll API, it's a good idea to manually clear the scroll cursor after the requested documents have been fetched, rather than waiting for it to time out.
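A minimal sketch of the scroll-and-clear pattern described above, assuming a client object that responds to `search`, `scroll`, and `clear_scroll` the way elasticsearch-ruby does (method names and options are assumptions, not code from this PR):

```ruby
# Drains all hits for a query via the scroll API, then clears the
# scroll cursor so the server can free its resources early.
def scroll_each(client, index:, body:, page_size: 1_000)
  response = client.search(index: index, body: body,
                           scroll: '1m', size: page_size)
  scroll_id = response['_scroll_id']
  loop do
    hits = response.dig('hits', 'hits')
    break if hits.nil? || hits.empty?
    hits.each { |hit| yield hit }
    # Fetch the next batch using the cursor returned by the last call.
    response = client.scroll(scroll_id: scroll_id, scroll: '1m')
    scroll_id = response['_scroll_id']
  end
ensure
  # Clearing manually avoids keeping the cursor open until its timeout.
  client.clear_scroll(scroll_id: scroll_id) if scroll_id
end
```

The `ensure` block is what implements the "manually clear the scroll cursor" advice: it runs whether iteration finishes or the caller bails out early.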
Hi all, sorry for being so late to the party.

First of all, I wonder what is the rationale for allowing people to load the 1,000th (or even 100th) page of results. I would argue that people look at a couple of the first pages at most, and the majority doesn't even click to the second page (but of course, website analytics would give a more realistic answer). Loading the 10,000th page is very costly for Elasticsearch, since it has to fetch all the results, and skip and throw out the 9,000 "pages". For this reason, e.g. Kaminari has the `max_pages` configuration option, which sets the hard limit for paging.

I guess there *are* valid reasons for using the "Scroll" API for returning the whole set of results, as @janko-m suggests, but in my experience this is limited to a narrow set of use-cases. The "Scroll" API is indeed not intended for a user-facing operation like paging the search results on a webpage, but for something like exporting the results to CSV, for example.

So, in practical terms, I would set a hard limit on the 100th page of results, in the spirit of Kaminari's `max_pages` variable. With 30 results per page, that's still 3,000 search results, presumably a *lot* for the user to find what they are looking for, or to change their query.

Again, sorry for the late reply; please ping me in any Rubygems.org issue related to Elasticsearch, I'll do my best to help.
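For comparison, Kaminari's hard limit is a one-line configuration (a sketch only; rubygems.org uses will_paginate, so this does not apply to this PR directly):

```ruby
# config/initializers/kaminari.rb
Kaminari.configure do |config|
  # Sets a hard upper limit on the number of pages offered for paging.
  config.max_pages = 100
end
```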
I'm totally on board with setting a reasonable limit short of 1,000 pages.
David Radcliffe
Force-pushed ed54d41 to cf6c6cc (compare)
This PR does the same. I doubt we need to tinker with will_paginate to set this limit everywhere.
@sonalkr132 I still think 333 pages is maybe too much; I would cap it at 100 pages.
```ruby
def show
  return unless params[:query] && params[:query].is_a?(String)
  @page = LAST_PAGE if @page > LAST_PAGE
```
I know this is really an edge case, but what about showing a proper warning or error message?
Force-pushed cf6c6cc to 15ac06d (compare)
100 pages is good. 👍 I think we need to return an error and fail the request if the page is higher than that. We don't want to return a 200, and we don't want the same results returned for pages > 100. Looks like GitHub returns 404 for this case, which is what I would do too.
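A hedged sketch of that behaviour in controller-free form (`validate_page` and the return convention are illustrative, not from this PR):

```ruby
LAST_PAGE = 100

# Returns the page as an Integer, or :not_found for pages outside the
# allowed range, so the controller can render a 404 instead of silently
# serving the last page's results again. A missing page defaults to 1.
def validate_page(raw_page)
  page = (raw_page || 1).to_i
  return :not_found if page < 1 || page > LAST_PAGE
  page
end
```

In a Rails controller this could translate to something like `render_404 if validate_page(params[:page]) == :not_found` in a before-action.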
Fixes: https://app.honeybadger.io/projects/40972/faults/33345740

It was erroring for large page numbers:

```
org.elasticsearch.search.query.QueryPhaseExecutionException: Result window is too large, from + size must be less than or equal to: [10000] but was [74520].
```

`terminate_after` was considered but it doesn't seem to take precedence over `from`.
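The numbers in that error line up with 30 results per page; a quick sketch of the offset arithmetic (function name is illustrative):

```ruby
PER_PAGE = 30

# Elasticsearch checks from + size against index.max_result_window,
# where offset pagination computes from = (page - 1) * per_page.
def result_window(page, per_page: PER_PAGE)
  from = (page - 1) * per_page
  from + per_page
end

result_window(2484) # => 74520, the [74520] from the error message
result_window(333)  # => 9990, still inside the 10,000 window
result_window(334)  # => 10020, the first page that overflows it
```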
Force-pushed 15ac06d to da013dc (compare)
Updated! @simi Sorry for not paying heed to your suggestion 👷
@sonalkr132 don't worry. You're doing amazing work.
cc: @karmi