Navigation Architecture #918

@NathanDrake2406

I'm quite unhappy with #876, so I did some research on navigation architecture in other frameworks. Please let me know your thoughts. It would be great if we could ask bonk for an adversarial review as well :)

Context

Recent App Router fixes moved us in the right direction:

The recurring smell is that we are solving each race locally: activeNavigationId checks, pending promise settlement, popstate handling, same-URL server action commits, cache seeding, and hard-navigation recovery are spread across the browser entry and navigation shim.

I think the deeper primitive should be:

Async navigation work produces candidate results. A single navigation lifecycle owner decides whether those results may commit visible route state.
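To make that primitive concrete, here is a minimal sketch of "resolution does not imply authority". Every name in it (`NavigationLifecycleOwner`, `CandidateResult`, `tryCommit`) is hypothetical and does not exist in vinext today. It only illustrates the shape of the idea.

```typescript
// Hypothetical names throughout; none of these exist in vinext today.
type CandidateResult<T> = { operationId: number; payload: T };

class NavigationLifecycleOwner<T> {
  private currentOperationId = 0;
  private visible: T | undefined;

  // Starting a new visible operation supersedes every earlier one.
  begin(): number {
    this.currentOperationId += 1;
    return this.currentOperationId;
  }

  // Async work calls this when it resolves. Resolving grants nothing by
  // itself: only the current operation's candidate may become visible state.
  tryCommit(candidate: CandidateResult<T>): "committed" | "superseded" {
    if (candidate.operationId !== this.currentOperationId) return "superseded";
    this.visible = candidate.payload;
    return "committed";
  }

  get visibleState(): T | undefined {
    return this.visible;
  }
}
```

With this shape, a late response for an older operation still resolves normally. It just gets `"superseded"` back instead of mutating anything.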

Existing Vinext Prior Work

#690 is important context because it was already a major App Router navigation rework, not just a bug patch. It introduced the two-phase navigation model: same-route changes can stay inside startTransition, while cross-route changes use synchronous updates to avoid the Firefox scheduler hang. It also moved URL/history commit into a layout-effect lifecycle, added navigation IDs for stale bailouts, added render snapshots for hook consistency, and added visited RSC response caching. That PR shows the shape of the real problem: visible route commits span React scheduling, URL/history effects, snapshots, cache state, and stale async work.

#745 is the smaller version of the same lesson. Because #690 defers URL commits, window.location.pathname can be stale during rapid A → B → C navigations. #745 fixed that by making the pending destination explicit via pendingPathname, with navId ownership so superseded navigations cannot clear the active navigation's pending state. That is already a controller-like idea: represent the in-flight intent directly, and only let the owner settle it.
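As a rough illustration of that ownership rule (not the actual #745 code; the function names and the module-level variable here are hypothetical), the guard can be sketched like this:

```typescript
// Hypothetical model of owner-guarded pending intent; not the code from #745.
interface PendingIntent {
  navId: number;
  pathname: string;
}

let pendingIntent: PendingIntent | null = null;

// The newest navigation records where it intends to land.
function setPendingPathname(navId: number, pathname: string): void {
  pendingIntent = { navId, pathname };
}

// Only the navigation that owns the pending intent may clear it, so a
// superseded navigation settling late cannot wipe the active one's state.
function clearPendingPathname(navId: number): boolean {
  if (pendingIntent === null || pendingIntent.navId !== navId) return false;
  pendingIntent = null;
  return true;
}

function getPendingPathname(): string | null {
  return pendingIntent?.pathname ?? null;
}
```

In an A → B → C sequence, B's late cleanup returns `false` and C's pending pathname survives.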

So I do not think the proposal below should undo #690 or #745. It should consolidate their lessons into one lifecycle owner instead of adding one more local guard for every new race.

Code Reality Check

After reading the current navigation code, I think this proposal is realistic because the primitives already exist; they are just not owned by one boundary yet.

Current operation identity already exists as activeNavigationId plus pending browser-router state in app-browser-entry.ts. Programmatic pending state is published through PendingBrowserRouterState, beginPendingBrowserRouterState(), settlePendingBrowserRouterState(), and resolvePendingBrowserRouterState(). That is already an operation lifecycle; it just is not named as one.

The candidate-commit seam already exists in app-browser-state.ts. createPendingNavigationCommit() builds a commit-ready router action, and resolvePendingNavigationCommitDisposition() classifies it as dispatch, hard-navigate, or skip. That maps very closely to the proposed rule: async work produces a candidate result, then lifecycle logic decides whether it may commit.
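A simplified sketch of that classification step follows. The types and the `classifyCandidate` function are hypothetical stand-ins; the real signatures of createPendingNavigationCommit() and resolvePendingNavigationCommitDisposition() may differ, and the `requiresDocumentLoad` flag is an assumption used only to give the hard-navigate branch a trigger.

```typescript
// Hypothetical shapes; the real vinext functions may look different.
type Disposition = "dispatch" | "hard-navigate" | "skip";

interface CandidateCommit {
  navId: number;
  href: string;
  // Assumption: some candidates cannot be applied in place and need a
  // full document load instead.
  requiresDocumentLoad: boolean;
}

function classifyCandidate(
  candidate: CandidateCommit,
  activeNavId: number,
): Disposition {
  // Stale work never touches visible state.
  if (candidate.navId !== activeNavId) return "skip";
  if (candidate.requiresDocumentLoad) return "hard-navigate";
  return "dispatch";
}
```

The point is only that the decision is a pure function of the candidate plus lifecycle state, which is what makes it easy to move behind one owner.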

#745's pendingPathname is already explicit pending intent in navigation.ts. It exists because committed window.location can intentionally lag behind the navigation being rendered. That is the same philosophy as a controller: visible intent needs a durable owner instead of re-reading incidental browser state.

Prefetch is already close to the proposed cache-only lane. prefetchRscResponse() snapshots RSC responses into cache, while consumePrefetchResponse() only hands compatible settled snapshots to a later navigation; prefetch itself does not commit visible UI. The proposal should preserve that shape and make it explicit.
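The cache-only lane can be sketched as below. `PrefetchCache` and its methods are hypothetical and are not the vinext implementation; "compatible" is reduced to an exact URL match here, which is an assumption.

```typescript
// Hypothetical cache; prefetchRscResponse()/consumePrefetchResponse() may differ.
interface Snapshot {
  url: string;
  body: string;
  settled: boolean;
}

class PrefetchCache {
  private entries = new Map<string, Snapshot>();

  // A prefetch that has started but not finished is recorded as unsettled.
  markInFlight(url: string): void {
    this.entries.set(url, { url, body: "", settled: false });
  }

  // When the prefetch resolves it writes to the cache and does nothing else.
  seed(url: string, body: string): void {
    this.entries.set(url, { url, body, settled: true });
  }

  // A later navigation may take a settled snapshot, at most once.
  consume(url: string): Snapshot | undefined {
    const snapshot = this.entries.get(url);
    if (snapshot === undefined || !snapshot.settled) return undefined;
    this.entries.delete(url);
    return snapshot;
  }
}
```

Nothing in this class has access to visible route state, so a prefetch resolving late cannot commit anything by construction.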

The back/forward gap is also visible in the code today. router.back() and router.forward() currently call window.history.back() / window.history.forward() directly, and the App Router popstate listener starts the "traverse" RSC navigation after the browser event. That means there is no traversal intent before the synchronous history call, which explains why #876 needed extra machinery to keep isPending latched.

The server-action gap is explicitly documented in commitSameUrlNavigatePayload(): activeNavigationId is not strong enough if a same-URL navigation fully commits while a server action is awaiting its pending commit. That is the strongest evidence for adding visibleCommitVersion rather than continuing to patch around activeNavigationId.

So the implementation direction should not be “replace the navigation system.” It should be “make the existing architecture explicit”: wrap the current operation ID, pending intent, candidate commit, pending promise, snapshot lifecycle, and cache-only prefetch behavior behind one lifecycle owner.

Philosophy

The visible route should have one commit authority.

A late RSC response resolving is not inherently wrong. A late server action resolving is not inherently wrong. A late prefetch resolving is not wrong. The bug is letting resolution imply authority to mutate visible state.

So the core invariant should be:

Only the current visible operation may commit URL, router tree, client params/search/path snapshots, pending pathname, scroll/focus side effects, and transition promise resolution.

Abort is useful, but abort is not correctness. React Router's concurrency docs call out the same underlying web-platform reality: canceling a browser request releases client resources, but the request may still reach the server, so stale work still needs commit/revalidation rules rather than relying on abort alone (React Router: Network Concurrency Management).

Prior Art

React Router / Remix have the cleanest product semantics: latest navigation wins, interrupted requests are cancelled, and stale revalidation results are discarded. Their docs explicitly frame this as browser-like concurrency management: the latest link click or form submission takes priority, while older work is cancelled or prevented from committing stale data.

Next.js App Router has the RSC-specific lesson: centralize actions in a queue, mark superseded actions as discarded, and let navigation/restore actions preempt pending work. Its server action reducer also records when a server action revalidated data so the queue can trigger a refresh if that action was discarded instead of applying stale state.

TanStack Router separates normal loads from preloads and gives preloads separate freshness/cache behavior. That maps well to our “prefetch seeds cache only” rule.

Angular and Vue Router both expose typed cancellation/failure reasons such as superseded/cancelled/aborted. That suggests our internal lifecycle should use explicit terminal states rather than boolean flags and early returns.

SvelteKit's navigation APIs are also useful prior art: it exposes navigation lifecycle hooks (beforeNavigate, afterNavigate, onNavigate) plus explicit invalidation/preload/refresh APIs, which reinforces that navigation, preloading, invalidation, and refresh should be named lifecycle concepts rather than incidental branches.

Proposed Model

Do not replace the current App Router navigation architecture. Extract and centralize the lifecycle it already has.

The first version of the controller should wrap existing primitives rather than replacing them:

  • keep ClientNavigationState as the hook/external-store layer
  • keep routerReducer() and AppRouterAction as the tree update mechanism
  • keep createPendingNavigationCommit() / resolvePendingNavigationCommitDisposition() as the candidate-commit seam
  • move operation identity, pending promise ownership, commit permission, terminal state, and same-URL commit versioning into one lifecycle owner

The controller should understand lanes as policy labels, not necessarily separate queues or classes on day one:

  • visible: push, replace, link navigation, redirect continuation
  • traverse: browser back/forward
  • refresh: same-URL visible revalidation
  • action: server action POST plus optional RSC patch/redirect/revalidation
  • prefetch: background cache fill only
  • recovery: hydration/HMR/hard-navigation recovery

It should expose terminal states as real data:

  • committed
  • superseded
  • aborted
  • failed
  • hard-navigated
  • cache-seeded
  • refresh-scheduled
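The lanes and terminal states listed above could be written down as plain types. This is a sketch only: the payload fields (`byOperationId`, `error`, `href`) and both helper functions are assumptions added for illustration.

```typescript
// Lane names come from the list above; everything else is hypothetical.
type Lane = "visible" | "traverse" | "refresh" | "action" | "prefetch" | "recovery";

type TerminalState =
  | { kind: "committed" }
  | { kind: "superseded"; byOperationId: number }
  | { kind: "aborted" }
  | { kind: "failed"; error: unknown }
  | { kind: "hard-navigated"; href: string }
  | { kind: "cache-seeded" }
  | { kind: "refresh-scheduled" };

// Prefetch is the one lane that may never commit visible UI.
function mayCommitVisibleState(lane: Lane): boolean {
  return lane !== "prefetch";
}

// An exhaustive switch forces every new terminal state to be handled here.
function describeTerminalState(state: TerminalState): string {
  switch (state.kind) {
    case "committed":
      return "committed visible state";
    case "superseded":
      return `superseded by operation ${state.byOperationId}`;
    case "aborted":
      return "aborted";
    case "failed":
      return `failed: ${String(state.error)}`;
    case "hard-navigated":
      return `hard-navigated to ${state.href}`;
    case "cache-seeded":
      return "seeded cache only";
    case "refresh-scheduled":
      return "scheduled a refresh";
  }
}
```

A discriminated union like this is what replaces the boolean flags and early returns: each exit path has to name the state it ended in.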

The important rules:

  1. A newer visible operation supersedes older visible work.
  2. Prefetch never commits visible UI. It may only seed compatible cache entries.
  3. Server actions may return values, redirect, invalidate, seed cache, or schedule refresh.
  4. A server action must not patch visible route state after a newer visible commit.
  5. Refresh is a real operation, not a special branch.
  6. RSC redirects stay inside one operation lifecycle.
  7. Same-URL commits need a visibleCommitVersion, not just activeNavigationId.

That last point matters because URL-based or navigation-id-only checks are too weak. A same-URL server action/refresh can change visible route state without changing the URL. If an older action resumes after that, it needs to know its base visible commit is stale.
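A minimal sketch of the version check, assuming a hypothetical `VisibleCommitVersion` helper (the real design may attach the counter to existing state instead). It would complement, not replace, the operation-identity check: visible operations still supersede each other by ID, while same-URL patches additionally verify their base version.

```typescript
// Hypothetical helper; only the name visibleCommitVersion comes from the proposal.
class VisibleCommitVersion {
  private version = 0;

  // Work that may patch visible state later captures this before awaiting.
  current(): number {
    return this.version;
  }

  // Every visible commit bumps the version, including same-URL ones.
  recordVisibleCommit(): number {
    this.version += 1;
    return this.version;
  }

  // A patch is only valid against the visible state it was based on.
  canPatch(baseVersion: number): boolean {
    return baseVersion === this.version;
  }
}
```

Example: a server action captures `current()` and awaits. A same-URL refresh then commits and calls `recordVisibleCommit()`. When the action resumes, `canPatch(base)` is `false` even though neither the URL nor the action's own ID changed.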

Back/Forward

router.back() and router.forward() are the awkward case because the call is synchronous and the popstate arrives later.

#876 found the real requirement: to keep useTransition().isPending alive, we need to arm pending state inside the caller’s transition before calling history.back/forward.

But that should be modeled as a traversal intent, not a global FIFO queue in the shim.

The browser's newer Navigation API gives us some useful information here: it centralizes navigation/history handling and exposes current/nearby history entry details like currentEntry, entries(), canGoBack, and canGoForward. Where that API can prove the traversal is possible and same-document, we can safely arm pending. Where it cannot, we should degrade deliberately rather than guessing.

Suggested shape:

  • router.back/forward asks the lifecycle controller to create a traversal intent.
  • If the Navigation API can prove same-document traversal and expected entry/index, arm pending.
  • If the traversal is known no-op, do not arm.
  • If the browser lacks enough introspection, degrade deliberately instead of guessing.
  • On popstate, match the browser event/current entry to the traversal intent.
  • The matched traversal enters the same commit barrier as push/replace.
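The suggested shape could look roughly like this behind a small history adapter. `HistoryAdapter`, `createTraversalIntent`, and `matchesIntent` are hypothetical names. A real adapter would presumably read `navigation.currentEntry` and `navigation.entries()` where the Navigation API exists and return `null` elsewhere. Treating a target that is missing from the introspected entry list as a no-op is an assumption; either way the sketch does not arm pending in that case.

```typescript
// Hypothetical adapter over browser history introspection.
interface HistoryEntryInfo {
  index: number;
  sameDocument: boolean;
}

interface HistoryAdapter {
  // Returns null when the browser offers no usable introspection.
  inspect(): { currentIndex: number; entries: HistoryEntryInfo[] } | null;
}

type TraversalIntent =
  | { kind: "armed"; expectedIndex: number }
  | { kind: "noop" }
  | { kind: "degraded" };

function createTraversalIntent(
  adapter: HistoryAdapter,
  delta: -1 | 1,
): TraversalIntent {
  const info = adapter.inspect();
  if (info === null) return { kind: "degraded" }; // cannot prove anything
  const targetIndex = info.currentIndex + delta;
  const target = info.entries.find((entry) => entry.index === targetIndex);
  if (target === undefined) return { kind: "noop" }; // assumed no-op: do not arm
  if (!target.sameDocument) return { kind: "degraded" };
  return { kind: "armed", expectedIndex: targetIndex };
}

// On popstate, only an armed intent whose expected index matches is adopted.
function matchesIntent(intent: TraversalIntent, landedIndex: number): boolean {
  return intent.kind === "armed" && intent.expectedIndex === landedIndex;
}
```

Because the adapter is an interface, the arming rules can be unit-tested with a fake history, independent of which browsers ship the Navigation API.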

What To Salvage From #876

Keep:

  • E2E scenarios for pending continuity across back/forward
  • rapid back/back and back/forward cases
  • no-op traversal cases
  • StrictMode readiness/mount cleanup lessons
  • Navigation API entry/index probing, but move it behind a small history adapter

Do not keep as architecture:

  • traversal pending FIFO as global truth
  • optimistic traversal offsets owned by the shim
  • more scattered navId !== activeNavigationId checks as the main model
  • boolean programmaticTransition as a domain concept

Acceptance Criteria

This issue is solved when these behaviors are structurally true, not just patched case-by-case:

  • newer navigation beats older RSC response
  • old RSC response can resolve late without committing visible state
  • prefetch can resolve late and seed cache only
  • server action resolving after newer visible commit cannot clobber the route
  • discarded revalidating server action schedules explicit refresh
  • refresh can be superseded like any other visible operation
  • RSC redirect chains keep one pending lifecycle
  • hard-navigation recovery only fires for the current operation
  • back/forward pending continuity works where the platform gives us enough history information
  • no-op back/forward cannot leave pending stuck
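As an example of what "structurally true" could mean for the two server-action criteria above, here is a small sketch. `settleServerAction` and `ActionResult` are hypothetical names; the outcome strings reuse the terminal states proposed earlier.

```typescript
// Hypothetical settle step for a server action that has finished awaiting.
type ActionOutcome = "committed" | "refresh-scheduled" | "superseded";

interface ActionResult {
  baseVersion: number; // visible commit version captured before awaiting
  revalidated: boolean; // whether the action invalidated server data
}

function settleServerAction(
  result: ActionResult,
  currentVersion: number,
): ActionOutcome {
  // Base is still current: the action may patch visible route state.
  if (result.baseVersion === currentVersion) return "committed";
  // Base is stale: never patch. If the action revalidated data, visible
  // state may now be out of date, so ask for an explicit refresh instead.
  return result.revalidated ? "refresh-scheduled" : "superseded";
}
```

Each acceptance criterion then maps to one branch that a unit test can hit directly, rather than to a timing-dependent E2E scenario alone.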

Review Questions

  • Do we agree that App Router needs one visible commit authority?
  • Should visibleCommitVersion be the primitive for same-URL/server-action races?
  • Should unsupported traversal introspection degrade rather than guess?
  • Should discarded server-action revalidation schedule refresh like Next?
  • How much of the current #690 / #745 lifecycle machinery should become part of the controller boundary versus remain in the browser entry/shim?
  • Do we agree that the controller should wrap the existing candidate-commit seam (createPendingNavigationCommit() / resolvePendingNavigationCommitDisposition()) rather than replacing the reducer/snapshot machinery?
