Improve 'all' tab pagination to handle edge cases #290

jazairi · 2025-11-26T20:46:48Z

Why these changes are being introduced:

The zipper merge we implemented naively queries n/2 results from each API and interleaves them, where n is the per-page value. This works if both APIs return many results, but it can cause problems in smaller, unbalanced result sets.

For example, the query term `doc edgerton` returns 50 Primo results and 4 TIMDEX results. Page 1 only shows 14 results (4 TIMDEX and 10 Primo), and each subsequent page returns only 10 (all Primo).
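To make the arithmetic concrete, here is a minimal sketch of the naive behavior described above. It is not the code from this repository: `naive_zipper_page` is a hypothetical helper, the result arrays are placeholder strings, and the per-page value of 20 is inferred from the example (10 results requested from each API).

```ruby
# Hypothetical illustration of a naive zipper merge: request per_page / 2
# results from each source at the same offset and interleave them.
def naive_zipper_page(primo, timdex, page:, per_page: 20)
  half = per_page / 2
  offset = (page - 1) * half

  # Array#[] returns nil when the offset is past the end, so fall back to [].
  primo_slice = primo[offset, half] || []
  timdex_slice = timdex[offset, half] || []

  # Interleave as many pairs as both slices can supply, then append leftovers.
  paired = [primo_slice.length, timdex_slice.length].min
  zipped = primo_slice.first(paired).zip(timdex_slice.first(paired)).flatten
  zipped + primo_slice.drop(paired) + timdex_slice.drop(paired)
end

# Placeholder data matching the example: 50 Primo hits and 4 TIMDEX hits.
primo = (1..50).map { |i| "P#{i}" }
timdex = (1..4).map { |i| "T#{i}" }

page_one = naive_zipper_page(primo, timdex, page: 1)
page_two = naive_zipper_page(primo, timdex, page: 2)

puts page_one.length # 14 (4 TIMDEX and 10 Primo)
puts page_two.length # 10 (all Primo)
```

Because each source is capped at half a page, the smaller source runs out immediately and the larger one can never fill the remaining slots.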

Relevant ticket(s):

[USE-179](https://mitlibraries.atlassian.net/browse/USE-179)

How this addresses that need:

This implements more sophisticated logic that first checks the number of hits returned by each API and passes that, along with the pagination information, to a Merged Search Paginator class. This service object develops a 'merge plan', calculates API offsets, and merges the results for each page.
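One way such a merge plan could work is sketched below. This is an illustration under stated assumptions, not the actual Merged Search Paginator: the method names (`merge_plan`, `api_offsets`, `merge_results`) are hypothetical, the interleave is assumed to start with Primo, and the per-page value is assumed to be 20.

```ruby
# Hypothetical sketch: decide which source fills each slot on a page.
# Slots alternate between sources until the smaller one is exhausted,
# after which the larger source fills every remaining slot.
def merge_plan(primo_hits:, timdex_hits:, page:, per_page: 20)
  shorter = [primo_hits, timdex_hits].min
  total = primo_hits + timdex_hits
  longer_source = primo_hits >= timdex_hits ? :primo : :timdex

  first = (page - 1) * per_page
  last = [first + per_page, total].min

  (first...last).map do |position|
    if position < 2 * shorter
      position.even? ? :primo : :timdex
    else
      longer_source
    end
  end
end

# Hypothetical sketch: how many results from each API were consumed by
# earlier pages, which is the offset to request for the current page.
def api_offsets(primo_hits:, timdex_hits:, page:, per_page: 20)
  consumed = (page - 1) * per_page
  shorter = [primo_hits, timdex_hits].min
  zipped = [consumed, 2 * shorter].min

  primo_used = (zipped + 1) / 2
  timdex_used = zipped / 2
  extra = consumed - zipped
  if primo_hits >= timdex_hits
    primo_used += extra
  else
    timdex_used += extra
  end

  { primo: [primo_used, primo_hits].min, timdex: [timdex_used, timdex_hits].min }
end

# Fill the plan's slots from results already fetched at those offsets.
def merge_results(plan, primo_results, timdex_results)
  queues = { primo: primo_results.dup, timdex: timdex_results.dup }
  plan.map { |source| queues[source].shift }.compact
end

plan = merge_plan(primo_hits: 50, timdex_hits: 4, page: 1)
puts plan.tally.inspect
puts api_offsets(primo_hits: 50, timdex_hits: 4, page: 2).inspect
```

With 50 Primo hits and 4 TIMDEX hits, this plan fills page 1 with 16 Primo and 4 TIMDEX results, and page 2 starts at Primo offset 16 rather than 10, so every page except the last is full.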

Queries on the 'all' tab now fetch twice from each API: once to determine the total number of hits for the Merged Search Paginator, then again to fetch results at the appropriate offset. While hardly ideal, this was the only option I could figure out to avoid losing results. I limited these extra calls to queries beyond page 1, which is the only case where they are needed.

Side effects of this change:

* We now clear cache before each search controller test. This was done to avoid odd test behavior, but I ran the suite 50 times without any issues, so it might be excessively cautious.
* The search controller continues to grow with this new logic. I tried to split things into multiple helper methods, so if we want to move more things to service objects later, it might be easier to do so.
* A failing cassette has been replaced with a mock.

Developer

Accessibility

ANDI or WAVE has been run in accordance with our guide.
This PR contains no changes to the view layer.
New issues flagged by ANDI or WAVE have been resolved.
New issues flagged by ANDI or WAVE have been ticketed (link in the Pull Request details above).
No new accessibility issues have been flagged.

New ENV

All new ENV is documented in README.
All new ENV has been added to Heroku Pipeline, Staging and Prod.
ENV has not changed.

Approval beyond code review

UXWS/stakeholder approval has been confirmed.
UXWS/stakeholder review will be completed retroactively.
UXWS/stakeholder review is not needed.

Additional context needed to review

This is a pretty unwieldy changeset, so please reach out if you have questions!

Code Reviewer

Code

I have confirmed that the code works as intended.
Any CodeClimate issues have been fixed or confirmed as added technical debt.

Documentation

The commit message is clear and follows our guidelines (not just this pull request message).
The documentation has been updated or is unnecessary.
New dependencies are appropriate or there were no changes.

Testing

There are appropriate tests covering any new functionality.
No additional test coverage is required.

jazairi · 2025-12-03T20:42:42Z

@JPrevost Let me know what you think of the refactor I just pushed up. I know we talked about just adding caching, but this felt like a good opportunity to move the 'all' tab logic to a service and slim down the controller. I find it a bit easier to follow, and the decoupling will become useful if we do move to an external orchestrator. I guess the downside is now we have a 200+ line service. 🙃

I left this as a separate commit in case you wanted to see the differences between the first and second efforts. Happy to squash before review instead if that's easier.


Why these changes are being introduced:

In discussions of the PR in review for USE-179, we determined that the proposed pagination improvements could be more efficient. We had also determined that the code changes were difficult to follow, and could use better documentation.

Relevant ticket(s):

- USE-179

How this addresses that need:

This commit caches the page 1 'summary' API calls, which we use to gather the hit counts from each API to calculate pagination on deeper pages. It also abstracts the 'all' tab code to a 'Merged Search Service', mirroring the design pattern of the Merged Search Paginator, and adds docstrings to the methods in that service.

Side effects of this change:

We are still making two API calls on deeper pages when the hit totals are not cached. I could not find a workaround to this while still supporting nonlinear pagination. However, the vast majority of users (even bots, presumably) will begin their search at page 1, so hopefully this is a rare occurrence.
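The caching idea in this commit can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `TinyCache` stands in for whatever cache store the app uses (in a Rails app, `Rails.cache.fetch` plays a similar role), `FakeApi` stands in for the Primo and TIMDEX clients, and the cache key format is invented for the example.

```ruby
# Hypothetical in-memory cache with a fetch-or-compute interface.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value for key, or run the block once and store its result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

# Hypothetical API client that counts how many summary calls it receives.
class FakeApi
  attr_reader :calls

  def initialize(hits)
    @hits = hits
    @calls = 0
  end

  def summary(_query)
    @calls += 1
    @hits
  end
end

# Look up hit counts for a query, calling the APIs only on a cache miss.
def hit_counts(cache, query, primo:, timdex:)
  cache.fetch("hits/#{query}") do
    { primo: primo.summary(query), timdex: timdex.summary(query) }
  end
end

cache = TinyCache.new
primo = FakeApi.new(50)
timdex = FakeApi.new(4)

hit_counts(cache, "doc edgerton", primo: primo, timdex: timdex) # cache miss
hit_counts(cache, "doc edgerton", primo: primo, timdex: timdex) # cache hit
puts primo.calls
```

A request for a deeper page that arrives after the counts are cached skips the summary calls, while a request that jumps straight to a deep page with a cold cache still pays for both calls.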

coveralls · 2025-12-04T14:05:55Z

Pull Request Test Coverage Report for Build 19934295824

Details

169 of 169 (100.0%) changed or added relevant lines in 3 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.3%) to 98.127%

Totals
Change from base Build 19906931799:	0.3%
Covered Lines:	1100
Relevant Lines:	1121

💛 - Coveralls

jazairi requested review from JPrevost and matt-bernhardt November 26, 2025 20:46

mitlib temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 November 26, 2025 20:49 Inactive

jazairi force-pushed the use-179-pagination-bug branch from bea3fd9 to 7e80c7d Compare December 1, 2025 14:37

jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 1, 2025 14:37 Inactive

jazairi force-pushed the use-179-pagination-bug branch from 7e80c7d to 3584c1b Compare December 1, 2025 14:44

jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 1, 2025 14:44 Inactive

jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 3, 2025 20:33 Inactive

jazairi force-pushed the use-179-pagination-bug branch from 1e7cf2d to 91ad3c3 Compare December 3, 2025 20:35

jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 3, 2025 20:36 Inactive

JPrevost self-assigned this Dec 4, 2025

jazairi added 2 commits December 4, 2025 09:04

jazairi force-pushed the use-179-pagination-bug branch from 91ad3c3 to 0106e5f Compare December 4, 2025 14:04

jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 4, 2025 14:04 Inactive

Add additional docs for merged_search_service

c8695b0

JPrevost deployed to timdex-ui-pi-use-179-pa-olfde1 December 4, 2025 15:29 View deployment
