Skip to content

[codex] Rewrite client connection architecture#2978

Draft
juliusmarminge wants to merge 5 commits into
codex/hosted-web-tracing-pocfrom
codex/connection-state-audit
Draft

[codex] Rewrite client connection architecture#2978
juliusmarminge wants to merge 5 commits into
codex/hosted-web-tracing-pocfrom
codex/connection-state-audit

Conversation

@juliusmarminge

@juliusmarminge juliusmarminge commented Jun 6, 2026

Copy link
Copy Markdown
Member

Summary

  • replace the legacy web and mobile connection runtimes with the shared Effect client-runtime architecture
  • centralize connection supervision, retry/backoff, online/offline handling, RPC session ownership, persistence, and projections
  • migrate web, mobile, and desktop integration points to the new connection catalog and runtime
  • add focused tests for transport lifecycle, supervision, registry ownership, subscriptions, authorization, and storage

Stack

Validation

  • vp check
  • vp run typecheck
  • vp run lint:mobile
  • client-runtime suite: 193 tests passed

The stacked tree is byte-for-byte identical to the original pre-split PR tip.

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 02e9cb5b-9214-4451-8038-f37e5a87e585

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/connection-state-audit

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. size:XXL 1,000+ changed lines (additions + deletions). labels Jun 6, 2026
@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

🚀 Expo continuous deployment is ready!

  • Project → t3-code
  • Platforms → android, ios
  • Scheme → t3code-preview
  🤖 Android 🍎 iOS
Fingerprint 30eb942851c2693b81dc607a4de2651c7bf7cc4c 092a8bb81d8fc363a745d8a7d891ae056a15f49b
Build Details Build Permalink
DetailsDistribution: INTERNAL
Build profile: preview:dev
Runtime version: 30eb942851c2693b81dc607a4de2651c7bf7cc4c
App version: 0.1.0
Git commit: 510617bcdf049968cbe7adedcee881bee3e8ed64
Build Permalink
DetailsDistribution: INTERNAL
Build profile: preview:dev
Runtime version: 092a8bb81d8fc363a745d8a7d891ae056a15f49b
App version: 0.1.0
Git commit: 510617bcdf049968cbe7adedcee881bee3e8ed64
Update Details Update Permalink
DetailsBranch: pr-2978
Runtime version: 30eb942851c2693b81dc607a4de2651c7bf7cc4c
Git commit: 510617bcdf049968cbe7adedcee881bee3e8ed64
Update Permalink
DetailsBranch: pr-2978
Runtime version: 092a8bb81d8fc363a745d8a7d891ae056a15f49b
Git commit: 510617bcdf049968cbe7adedcee881bee3e8ed64
Update QR

Comment thread apps/web/src/environments/runtime/service.ts Outdated
Comment thread apps/web/src/connection/webAppQueries.ts
Comment thread apps/web/src/connection/webConnectionStorage.ts
Comment thread apps/mobile/src/app/settings/environments.tsx
Comment thread apps/web/src/connection/webTerminalSessions.ts
Comment thread packages/client-runtime/src/connection/brokers.ts
Comment thread packages/client-runtime/src/connection/threads.ts
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from 16c9aba to 2a29e34 Compare June 7, 2026 19:53
@juliusmarminge juliusmarminge changed the title Harden remote connection state handling across mobile, web, and relay [codex] Rewrite client connection architecture Jun 7, 2026
@juliusmarminge juliusmarminge changed the base branch from main to codex/connection-infra-otel-base June 7, 2026 19:53
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from 2a29e34 to b976d5f Compare June 7, 2026 20:09
Comment on lines +87 to +113
const openDatabase = Effect.fn("web.connectionStorage.openDatabase")(function* () {
return yield* Effect.callback<IDBDatabase, ConnectionTransientError>((resume) => {
if (typeof indexedDB === "undefined") {
resume(
Effect.fail(catalogError("open", "IndexedDB is unavailable in this browser context.")),
);
return;
}
const request = indexedDB.open(DATABASE_NAME, DATABASE_VERSION);
request.addEventListener("upgradeneeded", () => {
if (!request.result.objectStoreNames.contains(CATALOG_STORE_NAME)) {
request.result.createObjectStore(CATALOG_STORE_NAME);
}
if (!request.result.objectStoreNames.contains(SHELL_STORE_NAME)) {
request.result.createObjectStore(SHELL_STORE_NAME);
}
if (!request.result.objectStoreNames.contains(THREAD_STORE_NAME)) {
request.result.createObjectStore(THREAD_STORE_NAME);
}
});
request.addEventListener("error", () => {
resume(Effect.fail(catalogError("open", request.error ?? "Unknown IndexedDB error")));
});
request.addEventListener("success", () => {
resume(Effect.succeed(request.result));
});
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium connection/webConnectionStorage.ts:87

When indexedDB.open() needs to upgrade the schema but another tab holds an older version, the blocked event fires and the Effect never completes — neither success nor error handlers execute until the blocking tab closes. If that tab never closes, openDatabase hangs indefinitely. Consider handling the blocked event by resuming with a descriptive error or triggering a retry with timeout.

+    request.addEventListener("blocked", () => {
+      resume(Effect.fail(catalogError("open", "Database upgrade blocked by another tab")));
+    });
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/connection/webConnectionStorage.ts around lines 87-113:

When `indexedDB.open()` needs to upgrade the schema but another tab holds an older version, the `blocked` event fires and the Effect never completes — neither `success` nor `error` handlers execute until the blocking tab closes. If that tab never closes, `openDatabase` hangs indefinitely. Consider handling the `blocked` event by resuming with a descriptive error or triggering a retry with timeout.

Evidence trail:
apps/web/src/connection/webConnectionStorage.ts lines 87-114 at REVIEWED_COMMIT: `openDatabase` registers handlers for `upgradeneeded` (line 96), `error` (line 107), and `success` (line 110), but not for `blocked`. IndexedDB spec: https://w3c.github.io/IndexedDB/#request-api — the `blocked` event fires on IDBOpenDBRequest when the open operation is blocked by existing connections, and `success`/`error` don't fire until the block is resolved.

@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 2 times, most recently from fb36d5e to 8de82fb Compare June 7, 2026 20:23
Comment thread apps/web/src/connection/webConnectionPlatform.ts
Comment thread apps/mobile/src/features/cloud/CloudAuthProvider.tsx Outdated
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 4 times, most recently from 9ba3552 to a01b7f9 Compare June 7, 2026 20:45
@juliusmarminge juliusmarminge force-pushed the codex/connection-infra-otel-base branch from ced8bdf to 22b4b24 Compare June 7, 2026 20:50
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 4 times, most recently from 409fd6c to ecd84a7 Compare June 7, 2026 21:06
Base automatically changed from codex/connection-infra-otel-base to main June 8, 2026 03:21
juliusmarminge and others added 2 commits June 7, 2026 21:02
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from ecd84a7 to f5c0afe Compare June 8, 2026 04:09
@juliusmarminge juliusmarminge changed the base branch from main to codex/hosted-web-tracing-poc June 8, 2026 04:09
…t and tracing

- Updated the EnvironmentSupervisor to handle connection states more effectively, including new states for connecting and available.
- Introduced tracing for connection attempts to capture detailed error information and improve debugging.
- Removed the environment runtime state management code as it was deemed unnecessary.
- Adjusted tests to reflect changes in connection state handling and ensure proper functionality.
- Enhanced error handling in relay tracing to ensure safe error propagation without affecting application behavior.
Comment thread packages/client-runtime/src/connection/registry.ts Outdated
- Refactored EnvironmentRegistry to utilize new EnvironmentServices and EnvironmentServicesFactory.
- Replaced runtime-related methods with service-based methods for better abstraction.
- Introduced new layers for environment services, including commands and threads.
- Removed deprecated rpcGenerationChanges and other unused methods from EnvironmentRegistryService.
- Updated tests to reflect changes in the registry and runtime structure.
- Cleaned up unused imports and adjusted types accordingly.
…ment

- Deleted filesystemBrowseState and sourceControlDiscoveryState modules along with their associated tests.
- Removed related imports from index.ts and knownEnvironment.ts.
- Updated knownEnvironment tests to remove unused HTTP base URL checks.
- Refactored shellSnapshotState and threadDetailState to simplify interfaces.
- Cleaned up vcsRefState and vcsStatusState by removing unused types and functions.
);
if (!isSignedIn || !userId) {
setManagedRelaySession(appAtomRegistry, null);
if (previousAccount !== null) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium cloud/managedAuth.tsx:59

On first render when signed out, previousAccount is undefined, so undefined !== null evaluates to true and queueAccountCleanup() runs unnecessarily. This triggers removeRelayEnvironments() and relay client token cache reset even though no account was ever established. Change the condition to if (previousAccount) so cleanup only happens when transitioning from an actual signed-in state.

-      if (previousAccount !== null) {
+      if (previousAccount) {
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/cloud/managedAuth.tsx around line 59:

On first render when signed out, `previousAccount` is `undefined`, so `undefined !== null` evaluates to `true` and `queueAccountCleanup()` runs unnecessarily. This triggers `removeRelayEnvironments()` and relay client token cache reset even though no account was ever established. Change the condition to `if (previousAccount)` so cleanup only happens when transitioning from an actual signed-in state.

Evidence trail:
apps/web/src/cloud/managedAuth.tsx lines 26, 35, 37, 57-61, 63 at REVIEWED_COMMIT. Line 26: ref initialized to `undefined`. Line 35: `previousAccount = observedAccountRef.current` (undefined on first render). Line 59: condition `previousAccount !== null` doesn't account for `undefined`. Line 63: correctly checks all three states (`!== undefined && !== null && !== userId`).

? (activeSavedEnvironmentRuntime?.connectionState ?? "disconnected")
: "connected";
const primaryEnvironmentId = primaryEnvironment?.environmentId ?? null;
const activeEnvironment =

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium components/ChatView.tsx:1119

The refactored code removed the guard that excluded the primary environment from unavailability checks. Now activeEnvironmentUnavailable is computed for all environments including the primary, so if the primary's connection.phase is temporarily "connecting" or "available" during startup, the UI shows an unavailable banner and blocks message dispatch — behavior that never occurred before.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/components/ChatView.tsx around line 1119:

The refactored code removed the guard that excluded the primary environment from unavailability checks. Now `activeEnvironmentUnavailable` is computed for all environments including the primary, so if the primary's `connection.phase` is temporarily `"connecting"` or `"available"` during startup, the UI shows an unavailable banner and blocks message dispatch — behavior that never occurred before.

Evidence trail:
Old guard at MERGE_BASE in ChatView.tsx: `activeThread.environmentId !== primaryEnvironmentId` check in `activeSavedEnvironmentRecord` assignment; new code at REVIEWED_COMMIT ChatView.tsx lines 1119-1123 with no primary guard; `AVAILABLE_CONNECTION_STATE` defined at packages/client-runtime/src/connection/model.ts:148-154 with `phase: "available"`; `activeEnvironmentUnavailable` used to block sends at ChatView.tsx:2806, ChatView.tsx:3427; used to block revert at ChatView.tsx:2746; used for banner at ChatView.tsx:1353; `isConnecting` always false at ChatView.tsx:904 (`_setIsConnecting` unused).

(patch: Partial<UnifiedSettings>) => {
const { serverPatch, clientPatch } = splitPatch(patch);

if (Object.keys(serverPatch).length > 0) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium hooks/useSettings.ts:204

When primaryEnvironment is null, server settings are optimistically applied locally via applySettingsUpdated() but the RPC call to serverActions.updateSettings() is skipped. The user sees the change in the UI, but on refresh the setting reverts because it was never persisted to the server.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/hooks/useSettings.ts around line 204:

When `primaryEnvironment` is `null`, server settings are optimistically applied locally via `applySettingsUpdated()` but the RPC call to `serverActions.updateSettings()` is skipped. The user sees the change in the UI, but on refresh the setting reverts because it was never persisted to the server.

Evidence trail:
apps/web/src/hooks/useSettings.ts lines 197-225 (REVIEWED_COMMIT) — independent guards for optimistic update (line 206: checks `currentServerConfig`) vs RPC persist (line 209: checks `primaryEnvironment`). apps/web/src/connection/useWebEnvironments.ts lines 59-68 — `useWebPrimaryEnvironment()` returns `WebEnvironmentPresentation | null`. apps/web/src/rpc/serverState.ts line 58 — `serverConfigAtom` initialized to `null`. apps/web/src/rpc/serverState.ts line 176-178 — `resolveServerConfig` sets the atom. apps/web/src/connection/WebServerStateProjection.tsx lines 27-43 — sets config snapshot from primary environment data, but never clears atom when `environmentId` becomes null. apps/web/src/rpc/serverState.ts lines 169-172 — only test reset exists, no production clearing of stale config.

Comment on lines +17 to +22
function formatError(cause: Cause.Cause<unknown>): string {
const error = Cause.squash(cause);
return error instanceof Error && error.message.trim().length > 0
? error.message
: "The environment request failed.";
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low connection/webEnvironmentQuery.ts:17

When Cause.squash(cause) returns a plain string (e.g., from Cause.fail("some message")), the instanceof Error check fails and the function returns the generic fallback message, discarding the actual error value. Consider checking for non-Error return values from squash and converting them to string to preserve the error information.

function formatError(cause: Cause.Cause<unknown>): string {
-  const error = Cause.squash(cause);
-  return error instanceof Error && error.message.trim().length > 0
-    ? error.message
-    : "The environment request failed.";
+  const error = Cause.squash(cause);
+  if (error instanceof Error) {
+    return error.message.trim().length > 0 ? error.message : "The environment request failed.";
+  }
+  return String(error);
}
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/connection/webEnvironmentQuery.ts around lines 17-22:

When `Cause.squash(cause)` returns a plain string (e.g., from `Cause.fail("some message")`), the `instanceof Error` check fails and the function returns the generic fallback message, discarding the actual error value. Consider checking for non-Error return values from `squash` and converting them to string to preserve the error information.

Evidence trail:
- `apps/web/src/connection/webEnvironmentQuery.ts` lines 17-22: `formatError` function that calls `Cause.squash(cause)` and checks `error instanceof Error`
- Effect library source `packages/effect/src/Cause.ts`: `squash` documentation states "it returns the raw error value" for failures, return type is `unknown` (`export const squash: <E>(self: Cause<E>) => unknown`)
- Effect documentation at https://effect.website/docs/getting-started/creating-effects/ confirms `Effect.fail` can accept strings, numbers, or objects
- Same pattern appears in `packages/client-runtime/src/connection/threads.ts:52-57` and `apps/web/src/connection/webAppQueries.ts:170-174`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

When the toast is created while primaryEnvironment is loading (null), clicking "Update" silently does nothing because runUpdates captures the stale null value in its closure. The effect re-runs when primaryEnvironment resolves, but it early-returns at line 174 because activeToastRef.current is already set, so the callback is never recreated with the resolved environment value.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/components/ProviderUpdateLaunchNotification.tsx around line 186:

When the toast is created while `primaryEnvironment` is loading (`null`), clicking "Update" silently does nothing because `runUpdates` captures the stale `null` value in its closure. The effect re-runs when `primaryEnvironment` resolves, but it early-returns at line 174 because `activeToastRef.current` is already set, so the callback is never recreated with the resolved environment value.

Evidence trail:
apps/web/src/components/ProviderUpdateLaunchNotification.tsx lines 163-305 (REVIEWED_COMMIT): The useEffect at line 163 has primaryEnvironment in its deps (line 303). runUpdates at line 190 captures primaryEnvironment from closure; line 191 checks `!primaryEnvironment` and returns early if null. Line 179 adds notificationKey to seenProviderUpdateNotificationKeys. Line 173 checks seenProviderUpdateNotificationKeys.has(notificationKey), and line 174 checks activeToastRef.current — both are truthy on re-run, causing early return. The toast callback at line 269 (onClick: runUpdates) is never recreated with the resolved primaryEnvironment value.

[threadActions],
);

const onCreateThread = useCallback(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High threads/use-project-actions.ts:111

onCreateThread calls onCreateThreadWithOptions with initialMessageText: "" (line 133), but onCreateThreadWithOptions returns null when initialMessageText.trim().length === 0 (lines 54-56). This causes onCreateThread to always return null without creating a thread, making the function useless.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/mobile/src/features/threads/use-project-actions.ts around line 111:

`onCreateThread` calls `onCreateThreadWithOptions` with `initialMessageText: ""` (line 133), but `onCreateThreadWithOptions` returns `null` when `initialMessageText.trim().length === 0` (lines 54-56). This causes `onCreateThread` to always return `null` without creating a thread, making the function useless.

Evidence trail:
apps/mobile/src/features/threads/use-project-actions.ts lines 51-56 (trim + length check), lines 111-135 (onCreateThread passes initialMessageText: ""), lines 140-143 (function exported from hook). git_grep confirms onCreateThread is not currently called externally but is returned for use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XXL 1,000+ changed lines (additions + deletions). vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant