@jazairi
Contributor

@jazairi jazairi commented Nov 26, 2025

Why these changes are being introduced:

The zipper merge we implemented naively queries n/2 results from each API and interleaves them, where n is the per-page value. This works if both APIs return many results, but it can cause problems in smaller, unbalanced result sets.

For example, the query term `doc edgerton` returns 50 Primo results and 4 TIMDEX results. Page 1 shows only 14 results (4 TIMDEX and 10 Primo), and each subsequent page returns only 10 (all Primo).
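To make the failure mode concrete, here is a minimal sketch of the naive zipper approach. It is not the application's actual code: the `naive_zipper_page` method, the in-memory arrays standing in for the two APIs, the interleave order, and the per-page value of 20 (implied by the 10-per-source figure above) are all assumptions for illustration.

```ruby
# Hypothetical stand-ins for the two APIs: each "result" is just a label.
PRIMO  = (1..50).map { |i| "primo-#{i}" }
TIMDEX = (1..4).map { |i| "timdex-#{i}" }

# Naive approach: take per_page / 2 results from each source at the same
# offset and interleave them. Nothing backfills a source that runs dry.
def naive_zipper_page(primo, timdex, page:, per_page: 20)
  half = per_page / 2
  offset = (page - 1) * half
  # Array#[] returns nil when the start index is past the end of the array.
  primo_slice = primo[offset, half] || []
  timdex_slice = timdex[offset, half] || []

  longest = [primo_slice.length, timdex_slice.length].max
  (0...longest).flat_map { |i| [primo_slice[i], timdex_slice[i]] }.compact
end

naive_zipper_page(PRIMO, TIMDEX, page: 1).length # => 14
naive_zipper_page(PRIMO, TIMDEX, page: 2).length # => 10
```

With these inputs every page comes up short of 20, so 54 results are spread over five pages instead of three.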

Relevant ticket(s):

  • USE-179 (https://mitlibraries.atlassian.net/browse/USE-179)

How this addresses that need:

This implements more sophisticated logic that first checks the number of hits returned by each API and passes that, along with the pagination information, to a Merged Search Paginator class. This service object develops a 'merge plan', calculates API offsets, and merges the results for each page.
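The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the Merged Search Paginator itself: `MergePlan`, `source_at`, `merge_plan`, and `merge_page` are hypothetical names, and the merge rule (alternate sources while both have results, then fill from whichever remains) is one plausible reading of the description.

```ruby
# Hypothetical value object describing one merged page.
MergePlan = Struct.new(:order, :primo_offset, :timdex_offset, keyword_init: true)

# Which source supplies merged position k (0-based)? Sources alternate while
# both still have items; after that the larger source supplies everything.
def source_at(k, primo_hits, timdex_hits)
  paired = 2 * [primo_hits, timdex_hits].min
  return (k.even? ? :primo : :timdex) if k < paired

  primo_hits > timdex_hits ? :primo : :timdex
end

# Build the plan for one page from the hit totals alone.
def merge_plan(primo_hits:, timdex_hits:, page:, per_page: 20)
  total = primo_hits + timdex_hits
  start = [(page - 1) * per_page, total].min
  stop = [start + per_page, total].min

  # Primo items consumed by earlier pages: every other slot in the paired
  # region, plus the whole tail when Primo is the larger source.
  paired = 2 * [primo_hits, timdex_hits].min
  in_pairs = [start, paired].min
  primo_offset = (in_pairs + 1) / 2
  primo_offset += start - in_pairs if primo_hits > timdex_hits

  MergePlan.new(
    order: (start...stop).map { |k| source_at(k, primo_hits, timdex_hits) },
    primo_offset: primo_offset,
    timdex_offset: start - primo_offset
  )
end

# Combine the slices fetched at the planned offsets, following the plan's order.
def merge_page(plan, primo_results, timdex_results)
  queues = { primo: primo_results.dup, timdex: timdex_results.dup }
  plan.order.map { |source| queues[source].shift }.compact
end

plan = merge_plan(primo_hits: 50, timdex_hits: 4, page: 2)
plan.primo_offset  # => 16
plan.timdex_offset # => 4
plan.order.length  # => 20
```

Under these assumptions, the 50/4 example fills pages of 20, 20, and 14 results, and each API is asked only for the slice its share of the page requires.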

Queries on the 'all' tab beyond page 1 now fetch twice from each API: once to determine the total number of hits for the Merged Search Paginator, then again to fetch results at the appropriate offset. While hardly ideal, this was the only option I could find that avoids losing results. Page 1 does not need the extra calls, so they are skipped there.

Side effects of this change:

  • We now clear cache before each search controller test. This was done to avoid odd test behavior, but I ran the suite 50 times without any issues, so it might be excessively cautious.
  • The search controller continues to grow with this new logic. I tried to split things into multiple helper methods, so if we want to move more things to service objects later, it might be easier to do so.
  • A failing cassette has been replaced with a mock.

Developer

Accessibility
  • ANDI or WAVE has been run in accordance with our guide.
  • This PR contains no changes to the view layer.
  • New issues flagged by ANDI or WAVE have been resolved.
  • New issues flagged by ANDI or WAVE have been ticketed (link in the Pull Request details above).
  • No new accessibility issues have been flagged.
New ENV
  • All new ENV is documented in README.
  • All new ENV has been added to Heroku Pipeline, Staging and Prod.
  • ENV has not changed.
Approval beyond code review
  • UXWS/stakeholder approval has been confirmed.
  • UXWS/stakeholder review will be completed retroactively.
  • UXWS/stakeholder review is not needed.
Additional context needed to review

This is a pretty unwieldy changeset, so please reach out if you have questions!

Code Reviewer

Code
  • I have confirmed that the code works as intended.
  • Any CodeClimate issues have been fixed or confirmed as
    added technical debt.
Documentation
  • The commit message is clear and follows our guidelines
    (not just this pull request message).
  • The documentation has been updated or is unnecessary.
  • New dependencies are appropriate or there were no changes.
Testing
  • There are appropriate tests covering any new functionality.
  • No additional test coverage is required.

@mitlib temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 November 26, 2025 20:49 Inactive
@jazairi force-pushed the use-179-pagination-bug branch from bea3fd9 to 7e80c7d December 1, 2025 14:37
@jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 1, 2025 14:37 Inactive
@jazairi force-pushed the use-179-pagination-bug branch from 7e80c7d to 3584c1b December 1, 2025 14:44
@jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 1, 2025 14:44 Inactive
@jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 3, 2025 20:33 Inactive
@jazairi force-pushed the use-179-pagination-bug branch from 1e7cf2d to 91ad3c3 December 3, 2025 20:35
@jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 3, 2025 20:36 Inactive
@jazairi
Contributor Author

jazairi commented Dec 3, 2025

@JPrevost Let me know what you think of the refactor I just pushed up. I know we talked about just adding caching, but this felt like a good opportunity to move the 'all' tab logic to a service and slim down the controller. I find it a bit easier to follow, and the decoupling will become useful if we do move to an external orchestrator. I guess the downside is now we have a 200+ line service. 🙃

I left this as a separate commit in case you wanted to see the differences between the first and second efforts. Happy to squash before review instead if that's easier.

@JPrevost JPrevost self-assigned this Dec 4, 2025
Why these changes are being introduced:

The zipper merge we implemented naively queries n/2 results from each
API and interleaves them, where n is the per-page value. This works if
both APIs return many results, but it can cause problems in smaller,
unbalanced result sets.

For example, the query term `doc edgerton` returns 50 Primo results and
4 TIMDEX results. Page 1 only shows 14 results (4 TIMDEX and 10 Primo),
and each subsequent page returns only 10 (all Primo).

Relevant ticket(s):

- [USE-179](https://mitlibraries.atlassian.net/browse/USE-179)

How this addresses that need:

This implements more sophisticated logic that first checks the number
of hits returned by each API and passes that, along with the pagination
information, to a Merged Search Paginator class. This service object
develops a 'merge plan', calculates API offsets, and merges the results
for each page.

Queries on the 'all' tab beyond page 1 now fetch twice from each API:
once to determine the total number of hits for the Merged Search
Paginator, then again to fetch results at the appropriate offset. While
hardly ideal, this was the only option I could find that avoids losing
results. Page 1 does not need the extra calls, so they are skipped
there.

Side effects of this change:

* We now clear cache before each search controller test. This was done
to avoid odd test behavior, but I ran the suite 50 times without any
issues, so it might be excessively cautious.
* The search controller continues to grow with this new logic. I tried
to split things into multiple helper methods, so if we want to move
more things to service objects later, it might be easier to do so.
* A failing cassette has been replaced with a mock.
Why these changes are being introduced:

In discussions of the PR in review for USE-179, we determined that the
proposed pagination improvements could be more efficient. We had also
determined that the code changes were difficult to follow, and could
use better documentation.

Relevant ticket(s):

- USE-179

How this addresses that need:

This commit caches the page 1 'summary' API calls, which we use to
gather the hit counts from each API to calculate pagination on deeper
pages. It also abstracts the 'all' tab code to a 'Merged Search Service',
mirroring the design pattern of the Merged Search Paginator, and adds
docstrings to the methods in that service.

Side effects of this change:

We are still making two API calls on deeper pages when the hit totals
are not cached. I could not find a workaround for this while still
supporting nonlinear pagination. However, the vast majority of users
(even bots, presumably) will begin their search at page 1, so hopefully
this is a rare occurrence.
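
A rough sketch of the caching behavior described in this commit message, under loud assumptions: `TinyCache` is a hypothetical in-memory stand-in for a real cache store (in a Rails app, `Rails.cache.fetch` with a block plays a similar role), `CountingApi` is a fake client that only reports hit counts, and the `hit_totals` method and its cache key format are invented for illustration.

```ruby
# Hypothetical in-memory cache: returns the stored value for a key, or runs
# the block, stores its result, and returns it.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

# Hypothetical API client that counts how many summary calls it receives.
class CountingApi
  attr_reader :summary_calls

  def initialize(hits)
    @hits = hits
    @summary_calls = 0
  end

  def hit_count(_query)
    @summary_calls += 1
    @hits
  end
end

# Look up hit totals for a query, calling the APIs only on a cache miss.
def hit_totals(query, cache:, primo:, timdex:)
  cache.fetch("hit_totals/#{query}") do
    { primo: primo.hit_count(query), timdex: timdex.hit_count(query) }
  end
end

cache = TinyCache.new
primo = CountingApi.new(50)
timdex = CountingApi.new(4)

hit_totals("doc edgerton", cache: cache, primo: primo, timdex: timdex) # page 1: calls both APIs
hit_totals("doc edgerton", cache: cache, primo: primo, timdex: timdex) # page 2: served from cache
primo.summary_calls # => 1
```

A request that lands directly on a deeper page with a cold cache still pays for the summary calls, which matches the side effect noted above.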
@jazairi force-pushed the use-179-pagination-bug branch from 91ad3c3 to 0106e5f December 4, 2025 14:04
@jazairi temporarily deployed to timdex-ui-pi-use-179-pa-olfde1 December 4, 2025 14:04 Inactive
@coveralls

coveralls commented Dec 4, 2025

Pull Request Test Coverage Report for Build 19934295824

Details

  • 169 of 169 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 98.127%

Totals Coverage Status
Change from base Build 19906931799: 0.3%
Covered Lines: 1100
Relevant Lines: 1121

💛 - Coveralls
