perf: parallelize getAllDescendantPages sibling fetches#113

Merged: pchuri merged 2 commits into main from perf/parallelize-descendant-fetch on Apr 22, 2026

@pchuri pchuri commented Apr 22, 2026

Pull Request Template

Description

getAllDescendantPages previously awaited each child's recursive fetch sequentially, producing N sequential API calls for an N-node subtree (a classic N+1 pattern). For deep or wide page trees this dominates wall-clock time on export, copy, and tree-traversal commands.

This PR introduces a single semaphore shared across the entire traversal that caps concurrent getChildPages requests at 10, regardless of tree shape.

Revision after review (commit f6e637c): an earlier version of this PR placed the concurrency cap inside each recursive frame, so the limit only applied per parent. For a tree with branching factor B and depth D, concurrent requests could grow to min(B, 10)^D — exactly the request flood the cap was meant to prevent. The fix threads a single createSemaphore(10) instance through a private _collectDescendants helper; every getChildPages call in the traversal acquires against that one semaphore, giving a hard global cap.
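The shared-semaphore design described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the body of `createSemaphore`, the free-standing `collectDescendants` function (standing in for the private `_collectDescendants` helper), the `getChildPages` parameter, and the in-memory demo tree are all assumptions made for the example. The depth check and default `maxDepth` of 10 are likewise guesses.

```javascript
// Hypothetical limiter: at most `max` tasks run at once; extra callers queue FIFO.
function createSemaphore(max) {
  let active = 0;
  const waiting = [];
  return async function limit(task) {
    if (active >= max) {
      // All slots are taken: wait until a finishing task hands its slot over.
      await new Promise((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) next(); // pass the slot on; `active` stays the same
      else active--;
    }
  };
}

// Recursive helper. Only the getChildPages call holds a slot; the recursion
// itself does not, so a parent never blocks its own children from acquiring.
async function collectDescendants(getChildPages, pageId, maxDepth, depth, limit) {
  if (depth >= maxDepth) return [];
  const children = (await limit(() => getChildPages(pageId)))
    .map((page) => ({ ...page, parentId: pageId }));
  // Promise.all keeps input order, so the result does not depend on
  // which fetch happens to finish first.
  const nested = await Promise.all(
    children.map((c) => collectDescendants(getChildPages, c.id, maxDepth, depth + 1, limit))
  );
  return [...children, ...nested.flat()];
}

// Entry point: one semaphore per traversal, shared by every recursive frame.
function getAllDescendantPages(getChildPages, pageId, maxDepth = 10) {
  return collectDescendants(getChildPages, pageId, maxDepth, 0, createSemaphore(10));
}

// Demo against an in-memory tree (a stand-in for the real API client).
const demoTree = { root: ['a', 'b'], a: ['a1', 'a2'], b: ['b1'] };
const fakeGetChildPages = async (id) => (demoTree[id] || []).map((cid) => ({ id: cid }));

getAllDescendantPages(fakeGetChildPages, 'root').then((pages) => {
  console.log(pages.map((p) => `${p.id}<-${p.parentId}`).join(' '));
});
```

The point of the sketch is that the limiter wraps only the fetch, never the recursive call. If a frame held a slot while waiting for its children, a deep tree could exhaust all 10 slots with waiting parents and deadlock.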

Preserved behavior:

  • Public method signature (getAllDescendantPages(pageId, maxDepth, currentDepth))
  • Return order: DFS pre-order (current-level children first, then each child's descendants in original order)
  • parentId attachment on each descendant
  • maxDepth short-circuit
  • Total API call count

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring

Testing

  • Tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Added the test "getAllDescendantPages caps concurrent getChildPages across the whole traversal" to tests/confluence-client.test.js. It builds a 15-wide, 2-deep tree (15 + 15 × 15 = 240 descendants), instruments getChildPages with an in-flight counter, and asserts that the observed peak stays ≤ 10. This test would have failed on the previous per-parent implementation (peak ~100, i.e. min(15, 10)^2) and passes on the current one.
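To illustrate how an in-flight counter exposes the per-parent problem, here is a hedged sketch that measures peak concurrency on a 15-wide, 2-deep tree. It does not reproduce the test in tests/confluence-client.test.js: `buildTree`, `instrument`, `perFrameChunked`, and `measure` are hypothetical names, the fake fetch uses a timer to simulate latency, and `perFrameChunked` is a reconstruction of the earlier per-frame chunking based only on the commit message.

```javascript
// Builds an in-memory tree: every node down to `depth` levels has `width` children.
function buildTree(width, depth) {
  const childrenOf = new Map();
  let counter = 0;
  const grow = (id, level) => {
    const kids = level < depth
      ? Array.from({ length: width }, () => `p${++counter}`)
      : [];
    childrenOf.set(id, kids);
    kids.forEach((kid) => grow(kid, level + 1));
  };
  grow('root', 0);
  return childrenOf;
}

// Wraps a fake fetch with counters for total calls and peak concurrency.
function instrument(childrenOf, delayMs = 5) {
  const stats = { calls: 0, inFlight: 0, peak: 0 };
  const getChildPages = async (id) => {
    stats.calls++;
    stats.inFlight++;
    stats.peak = Math.max(stats.peak, stats.inFlight);
    await new Promise((resolve) => setTimeout(resolve, delayMs)); // simulated latency
    stats.inFlight--;
    return childrenOf.get(id).map((cid) => ({ id: cid }));
  };
  return { getChildPages, stats };
}

// Reconstruction of the per-frame approach: each frame chunks only its own
// children by 10, so sibling frames are unaware of each other's requests.
async function perFrameChunked(getChildPages, pageId, maxDepth = 10, depth = 0) {
  if (depth >= maxDepth) return [];
  const children = (await getChildPages(pageId)).map((p) => ({ ...p, parentId: pageId }));
  const result = [...children];
  for (let i = 0; i < children.length; i += 10) {
    const chunk = children.slice(i, i + 10);
    const nested = await Promise.all(
      chunk.map((c) => perFrameChunked(getChildPages, c.id, maxDepth, depth + 1))
    );
    result.push(...nested.flat());
  }
  return result;
}

// Runs the traversal once and reports what the counters saw.
async function measure() {
  const { getChildPages, stats } = instrument(buildTree(15, 2));
  const pages = await perFrameChunked(getChildPages, 'root');
  return { descendants: pages.length, calls: stats.calls, peak: stats.peak };
}

measure().then((summary) => console.log(summary));
```

With this setup the reported peak should land well above 10, because up to 10 sibling frames each start up to 10 grandchild fetches at about the same time. Pointing the same `instrument` wrapper at a traversal that shares one limiter is the kind of check that would be expected to keep the peak at or below 10.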

Full suite: npx jest → 236 tests pass.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Additional Context

Why a semaphore instead of a library like p-limit? createSemaphore is ~15 lines of zero-dependency code and is the only place in this codebase that needs a limiter. A dependency seemed like overreach.

Why not switch to a single CQL ancestor = X query? That's a larger behavior change (different pagination semantics, loss of per-level parentId attribution without a second pass) and deserves its own PR. This change is the minimal fix for the N+1 pattern.

pchuri added 2 commits April 22, 2026 22:29
Sibling pages at each tree level are independent, but the previous
implementation awaited each child's recursive fetch sequentially,
producing N sequential API calls for an N-node tree.

Fetch siblings in chunks of 10 with Promise.all. Wall-clock time drops
from O(N) to roughly O(depth * ceil(width/10)); total API call count is
unchanged. Public signature, result order, parentId attachment, and
maxDepth behavior are all preserved.

The previous commit created SIBLING_CONCURRENCY inside every recursive
frame, so the cap of 10 was enforced per parent, not per traversal. In
a tree with branching factor B and depth D, concurrent getChildPages
requests could grow to min(B, 10)^D rather than staying at 10.

Replace the per-frame chunking with a single semaphore created at the
top-level call and threaded through a private _collectDescendants
helper. Every getChildPages request in the traversal now acquires
against the same semaphore, giving a hard cap of 10 concurrent
in-flight requests regardless of tree shape. Public signature, DFS
pre-order result, parentId attachment, and maxDepth behavior are
unchanged.

Add a test that builds a 15-wide, 2-deep tree, counts concurrent
getChildPages invocations, and asserts the observed peak stays at or
below 10 — this would have failed on the previous per-parent cap
(peak ~100) and now passes.
@pchuri pchuri merged commit bdee5de into main Apr 22, 2026
6 checks passed
@pchuri pchuri deleted the perf/parallelize-descendant-fetch branch April 22, 2026 13:52
github-actions Bot pushed a commit that referenced this pull request Apr 22, 2026
## [1.31.1](v1.31.0...v1.31.1) (2026-04-22)

### Performance Improvements

* parallelize getAllDescendantPages sibling fetches ([#113](#113)) ([bdee5de](bdee5de))
@github-actions

🎉 This PR is included in version 1.31.1 🎉

The release is available on:

Your semantic-release bot 📦🚀
