Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue 53: increase the window limit #54

Merged
merged 1 commit into from
Aug 25, 2022
Merged

issue 53: increase the window limit #54

merged 1 commit into from
Aug 25, 2022

Conversation

al-niessner
Copy link
Contributor

@al-niessner al-niessner commented Aug 17, 2022

🗒️ Summary

Add it to the index like the documentation says.

⚙️ Test Data and/or Report

None

♻️ Related Issues

fixes #53

@al-niessner al-niessner self-assigned this Aug 17, 2022
@al-niessner al-niessner requested a review from a team as a code owner August 17, 2022 20:43
@al-niessner
Copy link
Contributor Author

@jordanpadams @jimmie

You were correct. It just has to be added when the index is created. Made change.

@jimmie
Copy link
Member

jimmie commented Aug 17, 2022

I noticed that the documentation for index.max_result_window and the original error message refer to the scroll api and the scroll api refers to the search_after parameter (discussed very briefly here). Maybe we should give this a look since it appears to have less of a worst-case performance impact?

@jordanpadams
Copy link
Member

@jimmie good call. let's maybe take a look at this first.

@al-niessner ☝️

@al-niessner
Copy link
Contributor Author

al-niessner commented Aug 20, 2022

I noticed that the documentation for index.max_result_window and the original error message refer to the scroll api and the scroll api refers to the search_after parameter (discussed very briefly here). Maybe we should give this a look since it appears to have less of a worst-case performance impact?

@jordanpadams @jimmie @tloubrieu-jpl

The search window requires state as stated in #53 which means abandoning RESTful API because the API would no longer be stateless. While it is an approach, it would mean an overhaul of the API, maybe for the better but most likely not, to use state to paginate through a million entries.

Let me try once more to point out that opensearch is not the technology you want if you desire a million records. The idea of opensearch is to search not query. In a query, you request all matching records then post process those records. In a search, you enter terms an look at the top N (usually smaller than 10 but never more than 50 because who ever goes past page 2 on google anymore) records. Adjust the search criteria if not in the first 10 and do it again until what one is looking for is in the first 10 select it and post process that single record -- maybe 2 or 3 if the first is not what you wanted which means you probably go back to adjusting search criteria again. If you look at opensearch and analytics to give a relevance score and limits on return sizes etc you can quickly see that it is not an SQL query that returns a million results. In other words, search is find a needle in a haystack quickly while query is for bulk processing.

So, if this need to return or page through a million records is real, then you should probably rethink opensearch or search in general. If the need is to show the top 10 relevant records out if million, then we need to rethink our search and return records.

@jordanpadams
Copy link
Member

@al-niessner going to merge this one.

in the future, if possible when we have a PR that will fix a ticket, if we can use the github "keywords" that will automatically close the tickets when it is merged. I think there are a bunch but fixes, resolves, and closes are few

@jordanpadams jordanpadams merged commit ac4c6cf into main Aug 25, 2022
@jordanpadams jordanpadams deleted the issue_53 branch August 25, 2022 17:07
@al-niessner
Copy link
Contributor Author

al-niessner commented Oct 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug with pagination limitations per OpenSearch config
3 participants