Entry::descendants() is O(localizations) queries; multisite link/redirect resolution scales with site count #14767

@SteveEdson

Bug description

On a multisite install, Statamic\Entries\Entry::descendants() resolves localizations recursively and issues one query of the form where origin = ? per localization. Resolving a single localization therefore costs O(number of localizations) queries.

The hot path is Entry::in($locale):

public function in($locale)
{
    if ($locale === $this->locale()) {
        return $this;
    }

    if (! $this->isRoot()) {
        return $this->root()->in($locale);
    }

    return $this->descendants()->get($locale); // loads + recursively walks the WHOLE tree to pick one locale
}

and descendants() itself:

public function descendants()
{
    $localizations = $this->directDescendants();          // 1 query

    foreach ($localizations as $loc) {
        $localizations = $localizations->merge($loc->descendants()); // 1 query EACH, recursively
    }

    return $localizations;
}

directDescendants() is Blink-cached per entry, but the recursion still issues a separate query for every node in the descendant tree, including the leaf localizations, whose queries return nothing.
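A minimal standalone sketch of that query pattern, assuming an in-memory array in place of the entries table. The array shape and the helpers flatTree() and recursiveDescendants() are hypothetical and not part of Statamic; the sketch only counts simulated lookups and does not model Blink caching.

```php
<?php

/**
 * Hypothetical stand-in for the entries table:
 * entry id => origin id (null for the root).
 */
function flatTree(int $localizations): array
{
    $origins = ['root' => null];
    for ($i = 1; $i <= $localizations; $i++) {
        $origins["loc-$i"] = 'root'; // every localization points at the root
    }

    return $origins;
}

/**
 * Same shape as the recursive descendants(): one simulated lookup for the
 * direct children of $id, then one more for every child, recursively.
 */
function recursiveDescendants(array $origins, string $id, int &$queries): array
{
    $queries++; // simulated "where origin = ?" lookup
    $children = array_keys($origins, $id, true);

    $all = $children;
    foreach ($children as $child) {
        // Leaves still cost one lookup each, even though they return nothing.
        $all = array_merge($all, recursiveDescendants($origins, $child, $queries));
    }

    return $all;
}

$queries = 0;
$found = recursiveDescendants(flatTree(41), 'root', $queries);

printf("%d descendants, %d simulated lookups\n", count($found), $queries);
```

For a flat tree of 41 localizations this model reports 42 lookups: one for the root plus one per leaf. It is only a model of the traversal, so real query counts on a given dataset may differ.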

Why it matters in practice

Routing\ResolveRedirect calls $entry->in($site) for every entry link it resolves (Link and entries fieldtype values, CTAs, nav link targets, etc). So on a page with several entry links, the cost is roughly (number of links) x (number of localizations of each linked entry).

On a real multisite project (around 70 sites) this produced:

  • A single linked entry resolved through in($site): about 70 queries.
  • A home page with a handful of CTAs and nav links: 587 queries (measured), most of them select * from entries where collection = ? and origin_id = ?.
  • Even 404 pages hit around 558 queries because the chrome still resolves links.

This is separate from the trait-based slowdown discussed in #10157 (which is about first load / Stache warming) and the control panel issue in #10429. This one is a per-request, front-end render cost that scales with site count.

Environment

  • statamic/cms 6.20.0
  • statamic/eloquent-driver 5.9.0 (the methods above are core and inherited unchanged, so this is not driver-specific)
  • PHP 8.4, Laravel 12

Steps to reproduce

  1. Multisite install with a structured collection localized into many sites (the more sites, the clearer the effect).
  2. A page with one or more entry link fields (or a nav with entry:: link targets).
  3. Enable barryvdh/laravel-debugbar and load the page.
  4. Observe the Queries tab: select * from entries where collection = ? and origin_id = ? repeated once per localization, per resolved link.

Minimal isolation in tinker against a root entry with N localizations:

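use Illuminate\Support\Facades\DB;
use Statamic\Facades\Entry;

$root = Entry::find($rootId); // $rootId: ID of a root entry with N localizations
DB::enableQueryLog();
$root->descendants();
count(DB::getQueryLog()); // ~ N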

Proposed fix

Keep the already cached and invalidated directDescendants() for level one, then fetch each deeper level with a single batched whereIn instead of recursing node by node:

public function descendants()
{
    $localizations = $this->directDescendants();   // level 1, unchanged (Blink cached + invalidated)
    $origins = $localizations->map->id()->all();

    while (! empty($origins)) {
        $children = Facades\Entry::query()
            ->where('collection', $this->collectionHandle())
            ->whereIn('origin', $origins)
            ->get();

        if ($children->isEmpty()) {
            break;
        }

        $localizations = $localizations->merge($children->keyBy->locale());
        $origins = $children->map->id()->all();
    }

    return $localizations;
}

This is O(depth) queries instead of O(nodes). For the common flat localization tree (every localization points directly at the root) it is about 2 queries regardless of how many sites exist. It is driver-agnostic (whereIn works on both the Stache and Eloquent query builders), touches no Blink keys or invalidation logic, and leaves directDescendants() intact for callers that only want direct children.
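The level-by-level idea can be checked in isolation with a standalone sketch. It assumes an in-memory array mapping entry ID to origin ID in place of a real query builder; batchedDescendants() and the two sample trees are hypothetical and not part of Statamic.

```php
<?php

/**
 * Level-by-level traversal: one simulated whereIn lookup per tree level.
 * $origins maps entry id => origin id (null for the root).
 */
function batchedDescendants(array $origins, string $rootId, int &$queries): array
{
    $all = [];
    $frontier = [$rootId];

    while ($frontier !== []) {
        $queries++; // simulated "whereIn('origin', $frontier)" lookup

        $children = [];
        foreach ($origins as $id => $origin) {
            if ($origin !== null && in_array($origin, $frontier, true)) {
                $children[] = $id;
            }
        }

        $all = array_merge($all, $children);
        $frontier = $children; // the next level is fetched in one batch
    }

    return $all;
}

// Flat tree: 41 localizations that all point directly at the root.
$flat = ['root' => null];
for ($i = 1; $i <= 41; $i++) {
    $flat["loc-$i"] = 'root';
}

// Chained tree: each localization originates from the previous one.
$chain = ['root' => null, 'a' => 'root', 'b' => 'a', 'c' => 'b'];

foreach (['flat' => $flat, 'chain' => $chain] as $label => $tree) {
    $queries = 0;
    $found = batchedDescendants($tree, 'root', $queries);
    printf("%s: %d descendants, %d simulated lookups\n", $label, count($found), $queries);
}
```

In this model the flat tree needs 2 lookups for 41 descendants and the three-deep chain needs 4, so the count follows tree depth, not node count. Unlike the proposed method, the sketch also counts level one as a lookup, and it always ends with one lookup that returns nothing.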

Verification

Prototyped against the real dataset on 6.20.0. A root with 41 localizations:

  • Current descendants(): 44 queries
  • Batched version above: 4 queries
  • Identical result: same set of locales and the same entry IDs.

Happy to open a PR with this change plus a regression test asserting the query count is O(depth) and the resulting set is unchanged, if the approach looks right to the team.

Related: #10157, #10429, #2396.
