Fix pagination in Content Delivery API Index Helper #19606

Merged
merged 3 commits into from
Jul 1, 2025

Conversation

Brynjarth
Contributor

@Brynjarth Brynjarth commented Jun 25, 2025

Resolves: #18683

Description

Rebuilding the Content Delivery API index currently fetches at most 10,000 descendants. The loop logic was flawed and always stopped at 10,000 items.
This fix updates the logic so that it fetches all descendants and inserts them into the index.
We tested this on our own website, and the fix is currently running in production.

Improved loop condition to allow for processing of more than 10.000 descendants for indexing.

github-actions bot commented Jun 25, 2025

Hi there @Brynjarth, thank you for this contribution! 👍

While we wait for one of the Core Collaborators team to have a look at your work, we wanted to let you know that we have a checklist of things we will consider during review:

  • It's clear what problem this is solving, there's a connected issue or a description of what the changes do and how to test them
  • The automated tests all pass (see "Checks" tab on this PR)
  • The level of security for this contribution is the same or improved
  • The level of performance for this contribution is the same or improved
  • Avoids creating breaking changes; note that behavioral changes might also be perceived as breaking
  • If this is a new feature, Umbraco HQ provided guidance on the implementation beforehand
  • 💡 The contribution looks original and the contributor is presumably allowed to share it

Don't worry if you got something wrong. We like to think of a pull request as the start of a conversation, we're happy to provide guidance on improving your contribution.

If you realize that you might want to make some changes then you can do that by adding new commits to the branch you created for this work and pushing new commits. They should then automatically show up as updates to this pull request.

Thanks, from your friendly Umbraco GitHub bot 🤖 🙂

@emmagarland
Contributor

Hi @Brynjarth

Thanks for your PR to resolve #18683 with the maximum amount of results.

One of the core contributors team will take a look. I think HQ might want to confirm this one as well, since it was discussed on the issue thread.

Cheers,

Emma

Contributor

@AndyButland AndyButland left a comment

I'm struggling to see what the problem was with the original code, which makes me nervous about accepting the change, even if it has been verified in production, without understanding where the issue was.

Can you help me understand the logic error with the original code please?

Let's assume 25000 documents.

First time through, pageIndex = 0, pageSize = 10000 - retrieves 10000 records, so descendants.Length == pageSize and we loop again with pageIndex incremented.

Second time through, pageIndex = 1, pageSize = 10000 - retrieves a second batch of 10000 records, so descendants.Length == pageSize and we loop again with pageIndex incremented.

Third time through, pageIndex = 2, pageSize = 10000 - retrieves the last 5000 records, so descendants.Length != pageSize and we stop.
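The walk-through above can be checked with a small simulation. This is only a sketch in Python rather than the project's C#: `get_paged` and `enumerate_original` are hypothetical stand-ins for the paged descendants query and the original do/while loop, and no content-type filter is applied here.

```python
def get_paged(items, page_index, page_size):
    """Hypothetical stand-in for a paged descendants query."""
    start = page_index * page_size
    return items[start:start + page_size]


def enumerate_original(items, page_size=10_000):
    """Mimics a do/while loop that continues only while a page is exactly full."""
    page_index = 0
    page_lengths = []
    while True:
        descendants = get_paged(items, page_index, page_size)
        page_lengths.append(len(descendants))
        page_index += 1
        # Original condition: loop again only if descendants.Length == pageSize.
        if len(descendants) != page_size:
            break
    return page_lengths


# 25000 documents, as in the example above.
page_lengths = enumerate_original(list(range(25_000)))
print(page_lengths)
```

With no filtering between the query and the length check, the loop visits three pages of 10000, 10000 and 5000 items, which matches the reasoning above.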

@gardarthorsteins

Hi Andy,

It's actually this that fixes the issue: while (descendants.Length > 0 && pageIndex < total);

The problem is that the current code is looking for an exact match: while (descendants.Length == pageSize);

But .Where(descendant => _deliveryApiSettings.IsAllowedContentType(descendant.ContentType.Alias)) filters out values, so the resulting page holds fewer than 10000 items. This ends the loop early.

My guess is that only this index has this specific Where clause, which is why it's the only one breaking and not the others.

This means that only projects with more than 10000 nodes that also use the DisallowedContentTypeAliases feature are affected.

This is tested with version 13.8.1
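The effect of the filter on the two loop conditions quoted above can be sketched as follows. This is an illustration in Python, not the project's C# code: `get_paged`, `is_allowed` and `index_with` are hypothetical names, and the rule that every tenth item is disallowed is an assumption made purely for the example.

```python
def get_paged(items, page_index, page_size):
    """Hypothetical stand-in for a paged descendants query."""
    start = page_index * page_size
    return items[start:start + page_size]


def is_allowed(item):
    """Assumption for illustration: every tenth item has a disallowed type."""
    return item % 10 != 0


def original(descendants, page_size, page_index, total):
    # while (descendants.Length == pageSize)
    return len(descendants) == page_size


def quoted_fix(descendants, page_size, page_index, total):
    # while (descendants.Length > 0 && pageIndex < total)
    return len(descendants) > 0 and page_index < total


def index_with(condition, items, page_size=10_000):
    """Runs the do/while loop with the filter applied before the length check."""
    total = len(items)
    page_index = 0
    indexed = 0
    while True:
        page = get_paged(items, page_index, page_size)
        descendants = [d for d in page if is_allowed(d)]
        indexed += len(descendants)
        page_index += 1
        if not condition(descendants, page_size, page_index, total):
            break
    return indexed


items = list(range(25_000))
indexed_original = index_with(original, items)
indexed_fixed = index_with(quoted_fix, items)
print(indexed_original, indexed_fixed)
```

Under these assumptions the exact-match condition stops after the first page, because the filtered page is shorter than the page size, while the quoted condition keeps going until a page comes back empty.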

@AndyButland
Contributor

Thanks @gardarthorsteins for the explanation - makes sense. I added an initially failing integration test to first ensure we could replicate the problem via that means, and then to verify your implementation, which looks to work as expected.

@AndyButland AndyButland merged commit 1f5c21c into umbraco:v13/main Jul 1, 2025
17 checks passed
AndyButland added a commit that referenced this pull request Jul 1, 2025
* Refactor descendant enumeration in DeliveryApiContentIndexHelper

Improved loop condition to allow for processing of more than 10.000 descendants for indexing.

* Add failing test for original issue.

* Renamed variable for clarity.

---------

Co-authored-by: Brynjar Þorsteinsson <brynjar@vettvangur.is>
Co-authored-by: Andy Butland <abutland73@gmail.com>