Preliminary support for page_above in base entry collections #1560
Codecov Report
@@            Coverage Diff             @@
##           master    #1560      +/-   ##
==========================================
- Coverage   91.09%   91.04%   -0.06%
==========================================
  Files          74       74
  Lines        4538     4578      +40
==========================================
+ Hits         4134     4168      +34
- Misses        404      410       +6
    more_data_available = page_offset + limit < data_returned
else:
    # Needs to be set by some custom elastic query: is the ID the last ID?
    more_data_available = False
More data is only available if data_returned == len(results). There is inevitably a chance that you will make a request that returns an empty response.
I do not understand your comment.
I think data_returned is the total number of entries that match the filter, while len(results) is the number of returned entries. So if data_returned == len(results), all the matching entries have been returned and no more data is available.
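To make the two readings in this thread concrete, here is a minimal sketch of the checks being discussed. The function names are hypothetical and not part of the PR; `data_returned` is taken to be the total number of matches for the filter and `results` the entries on the current page, as described in the comment above.

```python
from typing import Sequence


def more_data_with_offset(page_offset: int, limit: int, data_returned: int) -> bool:
    """Offset-based check from the diff: more data remains if the
    current page ends before the last matching entry."""
    return page_offset + limit < data_returned


def all_matches_returned(results: Sequence, data_returned: int) -> bool:
    """Hypothetical helper: True when the current page already contains
    every entry that matches the filter."""
    return len(results) == data_returned


# 25 matches, pages of 10: the first two pages leave data, the third does not.
print(more_data_with_offset(0, 10, 25))
print(more_data_with_offset(20, 10, 25))
```

Note that the second check only says something about a single response that holds every match; it does not by itself cover later pages reached through page_above.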
Just to clarify here, I was working under a potentially wonky assumption that search_after would mangle the total hits for a filter, but perhaps this is not the case, in which case we can still directly use data_returned to set more_data_available (as suggested by @JPBergsma below).
The Elasticsearch documentation recommends setting "track_total_hits": false to speed things up.
So I would expect that data_returned would give a sane result, as we have set "track_total_hits": true.
We could perhaps include the value of data_returned in the next link, so we can set "track_total_hits": false when retrieving the rest of the data.
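A rough sketch of that suggestion, using plain dictionaries for the Elasticsearch request body. The helper names, the sort field `id`, and the extra `data_returned` query parameter in the next link are all assumptions made for illustration; none of them are taken from this PR.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode


def build_search_body(
    limit: int,
    page_above: Optional[str] = None,
    known_total: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a search body that only counts total hits when the total
    is not already known from a previous page."""
    body: Dict[str, Any] = {
        "size": limit,
        # search_after needs a deterministic sort; sorting on "id" is an assumption.
        "sort": [{"id": "asc"}],
        "track_total_hits": known_total is None,
    }
    if page_above is not None:
        body["search_after"] = [page_above]
    return body


def build_next_link(base_url: str, limit: int, last_id: str, data_returned: int) -> str:
    """Carry the already-known total in the next link (hypothetical parameter)."""
    query = urlencode(
        {"page_limit": limit, "page_above": last_id, "data_returned": data_returned}
    )
    return f"{base_url}?{query}"


first_page = build_search_body(limit=10)
later_page = build_search_body(limit=10, page_above="abc", known_total=42)
print(first_page["track_total_hits"], later_page["track_total_hits"])
```

The first request pays for the full count once; follow-up requests reuse it and skip the counting.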
Hi @markus1978 and @JPBergsma, I think the bones of this are all in now, even if we are not testing I think I'd like to merge this PR and then continue working on the actual
- Fix potential for infinite loop in broken test
What I see so far looks good.
Good luck with creating the implementations.
This PR should make it easier to customize which pagination mechanism each backend supports, by adding a class attribute for it (that follows the names of the query parameters).
All supported pagination parameters are now passed down to the collection (with page_cursor and page_below still left as unimplemented for all collections).
Once this PR is merged, we can work to make page_above the default ES pagination mechanism (i.e., the one that is used in next links) and maybe even consider disabling offset-based pagination for ES. It might be cleaner to move the unsupported params attribute down to the collection itself too.
page_above through query param handler
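As an illustration of the design described in the PR description above, the following is a minimal sketch of a per-collection class attribute naming the pagination mechanism. All class, attribute, and method names here are hypothetical and may differ from what the PR actually adds; only the query parameter names come from the discussion.

```python
import enum
from typing import Any, Dict


class PaginationMechanism(enum.Enum):
    """Values follow the names of the pagination query parameters."""
    OFFSET = "page_offset"
    NUMBER = "page_number"
    CURSOR = "page_cursor"
    ABOVE = "page_above"
    BELOW = "page_below"


class EntryCollection:
    # Hypothetical class attribute: the mechanism used for `next` links.
    pagination_mechanism = PaginationMechanism.OFFSET
    # Still left as unimplemented for all collections, per the description.
    unimplemented = frozenset({PaginationMechanism.CURSOR, PaginationMechanism.BELOW})

    def pagination_kwargs(self, query_params: Dict[str, Any]) -> Dict[str, Any]:
        """Pick out the pagination parameters passed down to the collection."""
        selected: Dict[str, Any] = {}
        for mechanism in PaginationMechanism:
            value = query_params.get(mechanism.value)
            if value is None:
                continue
            if mechanism in self.unimplemented:
                raise NotImplementedError(f"{mechanism.value} is not implemented")
            selected[mechanism.value] = value
        return selected


class ElasticCollection(EntryCollection):
    # A backend opts in to a different default by overriding the attribute.
    pagination_mechanism = PaginationMechanism.ABOVE


print(ElasticCollection().pagination_kwargs({"page_above": "abc", "filter": "x"}))
```

Keeping the unimplemented set on the collection class reflects the closing remark that the unsupported params attribute might be cleaner there.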