-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
issue 53: increase the window limit #54
Conversation
You were correct. It just has to be added when the index is created. Made change. |
I noticed that the documentation for index.max_result_window and the original error message refer to the scroll api and the scroll api refers to the search_after parameter (discussed very briefly here). Maybe we should give this a look since it appears to have less of a worst-case performance impact? |
@jimmie good call. let's maybe take a look at this first. @al-niessner ☝️ |
@jordanpadams @jimmie @tloubrieu-jpl The search window requires state as stated in #53 which means abandoning RESTful API because the API would no longer be stateless. While it is an approach, it would mean an overhaul of the API, maybe for the better but most likely not, to use state to paginate through a million entries. Let me try once more to point out that opensearch is not the technology you want if you desire a million records. The idea of opensearch is to search not query. In a query, you request all matching records then post process those records. In a search, you enter terms an look at the top N (usually smaller than 10 but never more than 50 because who ever goes past page 2 on google anymore) records. Adjust the search criteria if not in the first 10 and do it again until what one is looking for is in the first 10 select it and post process that single record -- maybe 2 or 3 if the first is not what you wanted which means you probably go back to adjusting search criteria again. If you look at opensearch and analytics to give a relevance score and limits on return sizes etc you can quickly see that it is not an SQL query that returns a million results. In other words, search is find a needle in a haystack quickly while query is for bulk processing. So, if this need to return or page through a million records is real, then you should probably rethink opensearch or search in general. If the need is to show the top 10 relevant records out if million, then we need to rethink our search and return records. |
@al-niessner going to merge this one. in the future, if possible when we have a PR that will fix a ticket, if we can use the github "keywords" that will automatically close the tickets when it is merged. I think there are a bunch but fixes, resolves, and closes are few |
Ad stated previously in this thread, such a change requires us to abandon
REST and maintain the scroll state.
…On Wed, Aug 17, 2022, 15:19 Jimmie Young ***@***.***> wrote:
I noticed that the documentation for index.max_result_window and the
original error message refer to the scroll api and the scroll api refers to
the search_after parameter (discussed very briefly here
<https://opensearch.org/docs/1.0/opensearch/ux/>). Maybe we should give
this a look since it appears to have less of a worst-case performance
impact?
—
Reply to this email directly, view it on GitHub
<#54 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAIUBIQQ4H3MUWGU7SLOQL3VZVQOXANCNFSM5624DNKQ>
.
You are receiving this because you were assigned.Message ID:
***@***.***>
|
🗒️ Summary
Add it to the index like the documentation says.
⚙️ Test Data and/or Report
None
♻️ Related Issues
fixes #53