
History

Revisions

  • audit Phase-J: empty states for biometrics + advanced tab Before the first capture frame arrives, both the consumer biometrics card and the advanced developer tab now render "Start a session to ..." placeholders. Without this, the BPM / HRV / BLK numerics read as "--" and the signal-quality bars sit at zero — both states are easily mis-read as "Cortex is stuck" rather than "Cortex hasn't started yet". The audit-w2 reconciliation noted the timeline panel already had a "No events yet" empty state but the BPM card and the dev-debug widgets did not. Phase J-3 closes that gap on both surfaces. Contract: * ``_ConsumerTab._bio_empty_state`` (QLabel) inside the biometrics card carries "Start a session to see your biometrics." Hidden the moment ``update_state`` is called for the first time. * ``_AdvancedTab._empty_state`` carries the longer "Start a session to populate signal quality, heart-rate trace, and state scores." copy. Same flag-flip on first update. * Both flags are sticky — once we've seen state, a transient WS disconnect must not collapse the UI back to the empty state because the cached numerics are more useful than a placeholder. Tests (test_dashboard_empty_state.py, 7 cases): placeholder visible before first frame on both tabs; hidden after the first ``update_state`` on each; flag is sticky across disconnect/reconnect; ``DashboardWindow.update_state`` clears both at once; accessible names present (sets up the J-5 a11y sweep). Files: cortex/apps/desktop_shell/dashboard.py (+_bio_empty_state on _ConsumerTab, +_empty_state on _AdvancedTab, +_has_received_state flag on both); cortex/tests/unit/test_dashboard_empty_state.py (new).

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    814a165
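The sticky empty-state contract described above can be modelled without Qt. The following is a minimal sketch, not the dashboard code: `EmptyStateModel` and `on_disconnect` are hypothetical names, and only the `_has_received_state` flag and the placeholder copy come from the commit message.

```python
from typing import Any, Optional


class EmptyStateModel:
    """Toolkit-free model of a placeholder that hides on first state (hypothetical)."""

    def __init__(self, placeholder: str) -> None:
        self.placeholder = placeholder
        self._has_received_state = False
        self.last_state: Optional[Any] = None

    @property
    def placeholder_visible(self) -> bool:
        # Visible only until the very first update_state call.
        return not self._has_received_state

    def update_state(self, state: Any) -> None:
        # One-way flip: the flag is never reset after this point.
        self._has_received_state = True
        self.last_state = state

    def on_disconnect(self) -> None:
        # Deliberately a no-op for the flag: cached numerics stay on screen
        # across a transient WS disconnect.
        pass
```

Keeping the flag one-way means a reconnect cannot make the card fall back to "Start a session to see your biometrics." once real values have been shown.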
  • audit-w2: thread F18 degraded/source through STATE_UPDATE WS payload F18 added ``source`` and ``degraded`` to ``StateInferResponse`` and the dashboard advanced tab reads both off the payload to toggle the "classifier unavailable" banner. The dashboard is fed by the WS STATE_UPDATE broadcast, not by ``/state/infer`` — but ``WebSocketServer._make_state_update`` never stamped the two fields, so the banner could not fire through the WS path and F18 was silently broken end-to-end. Also fixes a brittle dashboard fallback check that conflated the envelope ``source`` literal (``classifier``/``fallback``) with the debug-overlay ``classifier_source`` field (``rule``/``ml``/``ensemble``); on a healthy ``classifier_source="rule"`` payload the banner would have flipped True and stuck visible. Test ``test_ws_state_update_degraded.py`` (3 cases) pins the contract: classifier estimates emit ``source=classifier, degraded=False``; the fallback path (``classifier_source is None``) emits ``source=fallback, degraded=True``; the new fields land alongside the existing debug fields without removing them.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    a7bcf70
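The distinction between the envelope ``source`` field and the debug ``classifier_source`` field can be illustrated as follows. This is a sketch under assumptions: `make_state_update` and `banner_visible` here are simplified stand-ins with invented signatures, not the real ``WebSocketServer._make_state_update`` or dashboard code.

```python
from typing import Any, Optional


def make_state_update(state: str, classifier_source: Optional[str]) -> dict[str, Any]:
    """Build a STATE_UPDATE payload that carries both field families."""
    # The fallback path is identified by a missing classifier source,
    # never by comparing against the "rule" / "ml" / "ensemble" literals.
    degraded = classifier_source is None
    return {
        "type": "STATE_UPDATE",
        "payload": {
            "state": state,
            "classifier_source": classifier_source,  # debug-overlay field
            "source": "fallback" if degraded else "classifier",  # envelope field
            "degraded": degraded,
        },
    }


def banner_visible(payload: dict[str, Any]) -> bool:
    """Decide the 'classifier unavailable' banner from envelope fields only."""
    return bool(payload.get("degraded")) or payload.get("source") == "fallback"
```

Because the banner never inspects ``classifier_source``, a healthy ``classifier_source="rule"`` payload cannot flip it on.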
  • audit Phase-J: error toast with cid quote-back Adds a top-bar Toast widget to the dashboard that surfaces daemon errors with the F19 correlation id quoted back to the user. The cid is rendered in mono font ("ref: <cid>") and is selectable via ``TextSelectableByMouse`` so the user can copy it into a support ticket — the audit root cause was a cid users could see but not copy. Components: * cortex/apps/desktop_shell/components.py (new) — small ``Toast`` widget with title / body / ref(cid) slots, auto-dismiss after 8 s (the duration is a module constant so a future refactor can't silently shrink it), close button for manual dismissal, accessible names on every interactive sub-widget. * DashboardWindow gains a ``show_error(title, body, cid)`` method that forwards to the embedded Toast. The toast lives under the segmented control so it is visible from either tab. * DaemonBridge gains an ``error_occurred(str, str, str)`` Signal and an ``on_error(title, body, cid)`` callback. The signal is routed through CortexAppController._on_error_occurred to the dashboard's show_error method. Tests (test_dashboard_toast.py, 7 cases): toast renders title / body / cid; cid label is selectable (the contract pin against stylesheet refactors); auto-dismiss timer fires at the documented duration; close button dismisses immediately; empty cid does not crash; bridge signal round-trips to the toast; bridge defaults the cid arg to "" so daemon threads that lack a bound cid can still surface an error. Files: cortex/apps/desktop_shell/{components.py (new), dashboard.py, controller.py}; cortex/tests/unit/test_dashboard_toast.py (new).

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    7cfbb35
  • audit Phase-J: onboarding refinement — Continuity callout + Why expanders Adds two first-run polish affordances to the onboarding wizard: 1. Continuity Camera detection. When AVFoundation reports a paired iPhone / iPad / Continuity Camera device, the Camera card surfaces an inline callout — "We will skip your iPhone camera and use the MacBook camera." The webcam.py selection logic already silently skips Continuity Cameras (CLAUDE.md rules 5-6); this closes the feedback gap so first-runners aren't left wondering whether Cortex is about to grab the wrong feed. 2. "Why we need this" expander. Each of the four cards (Camera, Accessibility, BYOK, Extensions) now carries a "Why?" chevron in its header. Clicking toggles a collapsible rationale paragraph with trust-building copy explaining why Cortex needs the permission / setup and where the data lives (e.g. "video stream never leaves your Mac", "Bedrock token stays in the macOS Keychain"). Buttons get accessible names so VoiceOver announces them semantically. Tests (test_onboarding_hints.py, 6 cases): every card has the expander; rationale copy contains the spec keyword; the Continuity callout appears only when AVFoundation lists an iPhone-family device; the detector matches iPhone / iPad / Continuity keywords; expander buttons carry an accessible name. Existing test_onboarding_state.py (F49 — back/forward navigation) still passes. Files: cortex/apps/desktop_shell/onboarding.py (+_WHY_COPY, _detect_continuity_camera, expander helper in _make_section, callout in _make_step, step_id threaded through _make_step / _make_llm_step / _make_section); cortex/tests/unit/test_onboarding_hints.py (new).

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    a2f0072
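The keyword-based Continuity detection can be sketched as below. The function names and the plain list-of-names input are assumptions for illustration; the real ``_detect_continuity_camera`` reads devices from AVFoundation, which is not reproduced here.

```python
from typing import Iterable, Optional

# Keywords named in the commit message; matching is case-insensitive here
# as an assumption.
_CONTINUITY_KEYWORDS = ("iphone", "ipad", "continuity")

_CALLOUT = "We will skip your iPhone camera and use the MacBook camera."


def detect_continuity_camera(device_names: Iterable[str]) -> bool:
    """Return True if any device name looks like an iPhone-family camera."""
    return any(
        keyword in name.lower()
        for name in device_names
        for keyword in _CONTINUITY_KEYWORDS
    )


def camera_callout(device_names: Iterable[str]) -> Optional[str]:
    """Return the inline callout text, or None when no such device is listed."""
    return _CALLOUT if detect_continuity_camera(device_names) else None
```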
  • test: mark Phase-I throughput fixture clients authenticated for Debt-2 Phase H's broadcast filter skips peers in pending_auth. The Phase I throughput harness creates fake clients to measure parallel-gather behaviour; the contract under test is the throughput of legitimate authenticated clients, not the auth gate. Updated the fixture to mark clients authenticated=True so the broadcast actually reaches them.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    8d67922
  • merge: Phase H — Debt-2 capability-token systemic auth (HTTP dep + WS AUTH-first + rotation UI) # Conflicts: # cortex/services/api_gateway/websocket_server.py

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    1e02c3f
  • audit Debt-2: append closure section to audit/execution-log.md Documents the systemic capability-token client bootstrap: the five-commit close-out (server HTTP dep, WS AUTH-first handshake, desktop_shell client, browser-extension client, rotation UI), the intentional retention of the F07/F08 single-endpoint gates as defense-in-depth, the migration path (token file already on disk from Wave-1), the threat model (cross-origin localhost closed; malware-as-the-user out of scope per audit/findings.md), and the reproducible verification commands for the new auth tests plus the manual adversarial smoke test.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    2b7da5c
  • audit Debt-2: capability-token rotation UI in Settings Adds a user-visible escape hatch for the cross-origin-localhost threat model: when a user suspects another account on the same Mac read their auth.token file, the Settings panel's new "Security" section lets them mint a fresh token in one click. The old value becomes invalid immediately (the file is atomically replaced; the in-memory cache on the desktop_shell side is refreshed via ``WebSocketBridge.refresh_auth_token``; the browser extension picks up the new value via the native host's ``get_auth_token`` command on its next connect after the daemon closes its socket). * ``cortex.libs.auth.local_token.rotate_token`` mints a fresh ``secrets.token_hex(32)``, writes it atomically (sibling ``.tmp`` chmod 0600 then ``os.replace``), and returns the new value. * ``cortex.libs.auth.__init__`` re-exports ``rotate_token`` alongside the existing helpers. * ``cortex/apps/desktop_shell/settings.py`` gains a "Security" section with the rotation button + status label + an ``auth_token_rotated(str)`` signal. The button briefly disables itself after a click to prevent stacked rotations on a fast double-click; the status label confirms success or surfaces the failure reason. ``EventType.AUTH_TOKEN_ROTATED`` (added in Commit 1) emits on every rotation so log aggregators see the audit trail. * ``cortex/apps/desktop_shell/main.py`` wires the dialog's signal to ``WebSocketBridge.refresh_auth_token`` so the in-memory cache is in lockstep with the new on-disk value and the active socket is closed (the connect loop reconnects with the new token in its AUTH frame). Test plan: ``test_token_rotation.py`` covers three cases — rotation produces a distinct value, the file is written atomically with mode 0600 on POSIX, and ``verify_token`` rejects the old token immediately after rotation. Full ``pytest cortex/tests/`` (excl. desktop_shell mock-pollution): 1307 passed.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    9066df1
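The atomic rotation described above (sibling ``.tmp`` file, mode 0600, then ``os.replace``) might look roughly like this. It is a sketch, not the contents of ``cortex.libs.auth.local_token``: the signatures taking an explicit `token_path` are assumptions, and event emission is omitted.

```python
import hmac
import os
import secrets
from pathlib import Path


def rotate_token(token_path: Path) -> str:
    """Mint a fresh token and atomically replace the on-disk value."""
    new_token = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    tmp_path = token_path.with_name(token_path.name + ".tmp")
    # Create the sibling temp file owner-only, then write and flush it.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(new_token)
        handle.flush()
        os.fsync(handle.fileno())
    # Re-assert the mode in case a stale .tmp file already existed.
    os.chmod(tmp_path, 0o600)
    # os.replace is atomic on the same filesystem: readers see either the
    # old token or the new one, never a partial write.
    os.replace(tmp_path, token_path)
    return new_token


def verify_token(token_path: Path, candidate: str) -> bool:
    """Constant-time comparison against the current on-disk token."""
    expected = token_path.read_text(encoding="utf-8").strip()
    return hmac.compare_digest(expected.encode("utf-8"), candidate.encode("utf-8"))
```

Because verification re-reads the file in this sketch, the old token is rejected as soon as ``os.replace`` returns; a process that caches the token in memory needs an explicit refresh step, which is the role the commit gives to ``WebSocketBridge.refresh_auth_token``.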
  • test: reconcile F22 + Phase-I broadcast budget; exempt dev configs from bundle gate Phase I bumped the per-send WS timeout from 1s to 2s AND added a 100ms total-broadcast budget. F22's slow-consumer-close contract only fires when the per-send timeout elapses; with the 100ms budget the slow task gets cancelled first and lands in the 'slow but alive' branch (intended Phase I behaviour for transient jitter). The F22 tests now widen the budget locally so per-send fires first, preserving both contracts: the 100ms budget is exercised by test_broadcast_throughput, and the F22 disconnect path is exercised here. Also: bundle-size test now exempts dev-time configs (vitest.config.ts, plasmo.config.ts, tsconfig.json) — they don't ship in the runtime bundle and are not entry points.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    84cbc90
  • audit Debt-2: extension WS sends AUTH before IDENTIFY The browser extension's background service worker already cached the capability token (Wave-1 F07b/F08b) for SHUTDOWN payloads and ``/stop``/``/shutdown`` fetches. Debt-2 promotes that into the WebSocket lifecycle: every ``onopen`` now fires an ``AUTH`` frame as the FIRST outbound message (token from ``getAuthToken()``), with the existing ``IDENTIFY`` frame deliberately moved AFTER the AUTH so the daemon's systemic gate (Commit 2) accepts it. * ``connect()``: the ``onopen`` callback fires-and-forgets a ``getAuthToken().then(send AUTH then IDENTIFY)`` sequence. If the token fetch fails (cache miss + native-host unreachable), we let the daemon close us; the reconnect loop will retry once the token is available. * ``handleMessage``: new ``case "AUTH_OK"`` no-op so the daemon's ACK is recognised rather than falling into the unknown-type bucket. * The existing ``DAEMON_HTTP_URL`` fetches (Step 3 of the kill chain, ``/shutdown`` and ``/stop`` already attached the token from Wave-1) remain — they now align with the same systemic auth gate the WS protocol enforces. Test plan: new ``__tests__/systemic_auth.spec.ts`` covers two cases — the first outbound frame is ``AUTH`` carrying the cached token (index 0 in ``sock.sent``); ``IDENTIFY`` is the next frame (index >0); a simulated ``AUTH_OK`` + follow-on ``STATE_UPDATE`` reaches the popup broadcast path. Full ``vitest run``: 33/33 pass across 11 spec files.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    5eaef88
  • merge: Phase I — performance work (capture sub-sample, parallel broadcast, lazy imports, content-script leetcode observer) # Conflicts: # cortex/services/api_gateway/websocket_server.py

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    15cf352
  • audit Debt-2: desktop_shell WS client AUTHs before IDENTIFY The WebSocket bridge in the desktop_shell (the WS-mode client, used when ``cortex.apps.desktop_shell.main`` runs against an out-of-process daemon) now reads the capability token at startup, caches it, and sends an ``AUTH`` frame as the FIRST message on every (re)connect — ahead of ``IDENTIFY``. Without this the AUTH-first gate landed in Commit 2 would close the desktop_shell's socket immediately on each reconnect, breaking state-stream UX. * ``WebSocketBridge.__init__`` loads the token via ``cortex.libs.auth.load_or_create_token`` and stores it on ``self._auth_token``. A read failure is logged (not raised) so an install with a momentarily-unwritable Application Support dir does not crash the UI; the AUTH frame is omitted in that path and the daemon's gate refuses the socket, which is the safer fallback. * ``_connect_loop`` sends ``{type: "AUTH", payload: {auth_token: <hex>}}`` immediately after the socket opens, before ``IDENTIFY``. * ``refresh_auth_token`` exposes a re-read + active-WS-close hook for the Settings rotation affordance (Commit 5). Test plan: ``test_desktop_controller_auth.py`` adds two cases — ``test_bridge_caches_token_at_init`` (reading the token file at construction time, the contract rotation leans on) and ``test_bridge_sends_auth_first_after_connect`` (a fake ``websockets`` module captures every outbound frame; AUTH is index 0, IDENTIFY is index 1). The in-process ``controller.py`` path is unaffected — it does not speak WS over the wire, so no AUTH frame is needed.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    f16a46a
  • audit Phase-I: lazy mediapipe + keyring imports for sub-2s startup

    Defers two heavyweight imports until the code that actually needs them runs. Combined with the mediapipe lazy-import already shipped in commit 1 (capture-loop), this brings daemon import wall-time on a developer M-series Mac from ~3-4 s anecdotal to ~1.3 s measured, comfortably inside the 2 s target.

    Changes:
    - cortex/services/llm_engine/anthropic_planner.py: move `import keyring` from module top to inside `_keychain_get_bedrock_token` (the only call site). The function already had a try/except for missing backends, so the lazy import composes cleanly.
    - cortex/scripts/run_dev.py: add `--profile-startup` CLI flag. When set, `record_milestone(label)` accumulates monotonic timestamps at key points (entrypoint, config-loaded, server-built, daemon-task-spawned, daemon-warmup-elapsed) and a formatted table prints on exit.

    Measurement on dev M-series laptop (Python 3.11, warm fs cache):

      $ time .venv/bin/python -c "import cortex.scripts.run_dev"
      real 0m1.306s

      mediapipe lazy-import savings: ~563 ms (verified via `import mediapipe` timing in a fresh interpreter)
      keyring lazy-import savings: ~41 ms

      sys.modules after run_dev import:
        mediapipe loaded: False
        keyring loaded: False

    Regression guard: cortex/tests/performance/test_startup_latency.py runs the import in a fresh subprocess and asserts mediapipe / keyring are NOT in sys.modules after the import returns. A regression that re-introduces an eager import fails loudly.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    463dd34
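A fresh-subprocess guard of the kind described above can be sketched with the standard library alone. Everything here is illustrative: `colorsys` stands in for a heavy dependency such as mediapipe or keyring, and `demo_mod` and `loaded_after_import` are hypothetical names, not part of test_startup_latency.py.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module bodies; colorsys plays the role of the heavy import.
LAZY_SOURCE = (
    "def hue(r, g, b):\n"
    "    import colorsys  # deferred until the first call\n"
    "    return colorsys.rgb_to_hsv(r, g, b)[0]\n"
)
EAGER_SOURCE = "import colorsys\n" + LAZY_SOURCE


def loaded_after_import(module_source: str, heavy_name: str) -> bool:
    """Import the module in a fresh interpreter and report whether
    heavy_name landed in sys.modules as a side effect of the import."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_mod.py").write_text(module_source, encoding="utf-8")
        probe = f"import sys, demo_mod; print({heavy_name!r} in sys.modules)"
        env = dict(os.environ, PYTHONPATH=tmp)  # make demo_mod importable
        result = subprocess.run(
            [sys.executable, "-c", probe],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip() == "True"
```

Running the probe in a subprocess matters: the test process itself may already have the heavy module loaded, which would hide an eager import.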
  • audit Phase-I: lazy-load tab manager, content-script-only leetcode observer

    Browser-extension bundle hygiene. The current chrome-mv3-prod build already meets the audit Phase-I target, measured on the parent repo build/ directory:

      Uncompressed total: ~549 KB
      Gzipped total: ~175 KB (target: < 250 KB)
      Largest file: popup.100f6462.js, 169 KB uncompressed

    This commit makes the bundle layout explicit and adds a regression guard so the size budget is enforced going forward.

    Changes:
    - leetcode-observer.ts moved from the top-level (where Plasmo bundles it as a separate content-script entry, but its location invited future refactors to import it into background.ts) into contents/ where the convention makes its content-script-only status visible at file-tree level. Comments referencing the old path updated in background.ts and activity-tracker.ts.
    - tab-manager.ts is the only large module that ships in the background bundle (background.ts imports its top-level functions for hide/restore lifecycle hooks). Lazy-loading it via dynamic import would require restructuring the hot path; the current bundle is already 30% under target so the speedup is not worth the hot-path indirection. Documented in test_bundle_size.py.

    Regression guard: cortex/tests/performance/test_bundle_size.py pins per-file source-size budgets for every Plasmo entry point (background, popup, newtab, tabs/, contents/), plus a total-source-budget assert. The bundler ratio (~2:1 source:gzip) is stable enough that source budgets correlate with bundle budgets without needing a build.

    Measurements (uncompressed source bytes vs budget):
    - background.ts: 127,868 / 200,000 (64%)
    - popup.tsx: 44,689 / 80,000 (56%)
    - newtab.tsx: 24,636 / 80,000 (31%)
    - tab-manager.ts: 14,122 / 60,000 (24%)
    - activity-tracker.ts: 29,459 / 40,000 (74%)
    - contents/ambient.ts: 13,189 / 40,000 (33%)
    - contents/leetcode-observer.ts: 25,044 / 40,000 (63%)
    - tabs/onboarding.tsx: 8,603 / 80,000 (11%)

    Total entry-point source: 292,527 bytes, 49% of the 600 KB aggregate budget (well inside the 250 KB compressed target).

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    d95ae74
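A per-file source-size budget check of the kind described above could be written as follows. This is a hedged sketch: `budget_violations` is a hypothetical helper, and the real test_bundle_size.py may structure its assertions differently.

```python
from pathlib import Path


def budget_violations(root: Path, budgets: dict[str, int], total_budget: int) -> list[str]:
    """Return human-readable violations; an empty list means within budget."""
    violations: list[str] = []
    total = 0
    for rel_path, limit in sorted(budgets.items()):
        size = (root / rel_path).stat().st_size  # uncompressed source bytes
        total += size
        if size > limit:
            violations.append(f"{rel_path}: {size} bytes exceeds {limit}")
    if total > total_budget:
        violations.append(f"total: {total} bytes exceeds {total_budget}")
    return violations
```

A test would then assert that the returned list is empty, so the failure message names exactly which entry point outgrew its budget.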
  • audit Debt-2: WebSocket AUTH-first handshake

    Promotes the F07 SHUTDOWN-only inline check to a systemic protocol: every WS connection starts in ``pending_auth`` and the server refuses every non-``AUTH`` frame until the client presents a valid token. On success the server sends ``AUTH_OK`` and the client's ``WebSocketClient.authenticated`` flag flips to True; broadcasts skip peers still in ``pending_auth`` so STATE_UPDATE / INTERVENTION frames never leak to an unauthenticated origin.

    * ``MessageType.AUTH`` and ``MessageType.AUTH_OK`` added to ``cortex/libs/schemas/ws_message_types.py``; codegen regenerated ``cortex_schemas.d.ts`` so the TS dispatch sites can narrow against the new literals.
    * ``WebSocketClient.authenticated: bool = False`` is the new gate. Setting the flag is intentionally one-way per connection.
    * ``WebSocketServer._dispatch_message`` short-circuits to ``_handle_auth`` on ``AUTH``; every other type while ``pending_auth`` triggers ``close(code=1011, reason="auth required")`` + ``EventType.AUTH_REJECTED``.
    * ``_handle_auth`` validates via ``cortex.libs.auth.verify_token``, sends ``AUTH_OK`` on success, and replays the latest STATE_UPDATE to preserve the legacy "new connection sees current state" UX. Replay-safe: a second ``AUTH`` on an authed client re-ACKs without resetting anything.
    * The legacy F07 inline ``verify_token`` call on SHUTDOWN stays as defense-in-depth (will be made redundant in future cleanup once Phase H clients are deployed widely).
    * ``_broadcast`` skips unauthenticated peers so a connect-and-listen attack cannot harvest the daemon's state stream.

    Test plan: ``test_systemic_auth_ws.py`` adds six integration cases covering pre-auth refusal, successful handshake + subsequent message acceptance, wrong-token close, idempotent AUTH replay, AUTH_REJECTED event content, and a real-listener adversarial smoke test that asserts the daemon closes a non-AUTHing connection within 2 s.

    The existing ``test_auth_local_token.py`` SHUTDOWN test now sends AUTH first; ``test_ws_slow_client.py``, ``test_correlation_ids.py``, and the ``test_api_gateway.py::test_send_restore_broadcasts_restore_message`` test mark their mock clients ``authenticated=True`` since they exercise broadcast / slow-consumer behaviour, not the handshake. Full ``pytest cortex/tests/`` (excluding the legacy ``test_desktop_shell.py`` mock-pollution suite): 1302 passed. ``python -m cortex.scripts.generate_ts_schemas --check`` exits 0.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    78d9d57
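The AUTH-first gate can be reduced to a small in-memory state machine. The sketch below is an illustration only: `Client`, `AuthGate`, and the string return values are hypothetical, no real socket is involved, and re-ACKing an already-authenticated client without re-checking the token is an assumption about what "replay-safe" means.

```python
import hmac
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Client:
    authenticated: bool = False  # one-way flag per connection
    closed: bool = False
    outbox: list[dict[str, Any]] = field(default_factory=list)


class AuthGate:
    def __init__(self, expected_token: str) -> None:
        self._expected = expected_token.encode("utf-8")

    def dispatch(self, client: Client, message: dict[str, Any]) -> str:
        if message.get("type") == "AUTH":
            return self._handle_auth(client, message.get("payload") or {})
        if not client.authenticated:
            client.closed = True  # stands in for close(reason="auth required")
            return "closed"
        return "handled"  # normal dispatch would continue here

    def _handle_auth(self, client: Client, payload: dict[str, Any]) -> str:
        if client.authenticated:
            client.outbox.append({"type": "AUTH_OK"})  # idempotent re-ACK
            return "auth_ok"
        token = str(payload.get("auth_token", "")).encode("utf-8")
        if hmac.compare_digest(token, self._expected):
            client.authenticated = True
            client.outbox.append({"type": "AUTH_OK"})
            return "auth_ok"
        client.closed = True
        return "closed"


def broadcast(clients: list[Client], frame: dict[str, Any]) -> None:
    """Deliver only to authenticated, open peers."""
    for client in clients:
        if client.authenticated and not client.closed:
            client.outbox.append(frame)
```

A client that sends ``IDENTIFY`` before ``AUTH`` is closed on its first frame, which is why the desktop_shell and extension commits in this series reorder their handshakes.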
  • audit Phase-I: parallel WS broadcast with hard budget

    Replaces the serial `for client in clients: await send(...)` in `WebSocketServer._broadcast` with `asyncio.wait` over per-client send tasks under a hard 100 ms total budget. The previous serial loop made a four-client broadcast cost ~sum(client_latencies); one flaky client could stretch a single STATE_UPDATE to ~4 s, dropping every queued 30 Hz capture frame behind it.

    Changes:
    - Per-send timeout bumped 1 s → 2 s (`_BROADCAST_PER_CLIENT_TIMEOUT_S`) so transient network blips don't get classified as dead consumers.
    - Hard total budget capped at 100 ms (`_BROADCAST_BUDGET_S`). When the budget elapses, unfinished sends are cancelled and the affected clients are billed as drops for that frame WITHOUT being disconnected; the per-send timeout remains the disconnect path.
    - New `EventType.WS_BROADCAST_SLOW` event with `elapsed_ms`, `budget_ms`, `client_count`, and `dropped_for_budget` so support can correlate dropped frames with broadcast budget overruns.

    Each send is wrapped in its own `asyncio.create_task` so the budget cancellation only cancels the pending tasks; already-completed results are preserved (a naïve `asyncio.gather` cancellation would have lost them).

    Regression guard: cortex/tests/performance/test_broadcast_throughput.py spins up four fake clients and asserts (a) p95 latency < 100 ms with healthy clients, (b) four 40 ms clients run in parallel (~40 ms, not 160 ms), (c) budget overflow billed as a drop without disconnect, (d) per-send timeout actually disconnects.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    fc024ef
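The budgeted parallel broadcast can be sketched with `asyncio.wait`. This is a simplified illustration, not `WebSocketServer._broadcast`: clients are modelled as plain async callables keyed by name, and the returned dictionary is a hypothetical reporting shape.

```python
import asyncio
from typing import Awaitable, Callable

BROADCAST_BUDGET_S = 0.100           # hard total budget per frame
BROADCAST_PER_CLIENT_TIMEOUT_S = 2.0  # per-send timeout (disconnect path)

Send = Callable[[str], Awaitable[None]]


async def broadcast(
    clients: dict[str, Send],
    frame: str,
    budget_s: float = BROADCAST_BUDGET_S,
    per_client_timeout_s: float = BROADCAST_PER_CLIENT_TIMEOUT_S,
) -> dict[str, set[str]]:
    """Send to every client in parallel under a total time budget."""
    result: dict[str, set[str]] = {"delivered": set(), "dropped": set(), "disconnect": set()}
    if not clients:
        return result  # asyncio.wait rejects an empty set
    # One task per client so cancellation on budget overrun touches only
    # the sends that are still pending.
    tasks = {
        asyncio.create_task(asyncio.wait_for(send(frame), per_client_timeout_s)): name
        for name, send in clients.items()
    }
    done, pending = await asyncio.wait(tasks, timeout=budget_s)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # reap cancellations
    for task in done:
        if task.cancelled() or task.exception() is not None:
            result["disconnect"].add(tasks[task])  # per-send timeout or send error
        else:
            result["delivered"].add(tasks[task])
    # Budget overrun: billed as a drop for this frame, client stays connected.
    result["dropped"] = {tasks[task] for task in pending}
    return result
```

With the default values the 100 ms budget always elapses before the 2 s per-send timeout, so a slow client lands in `dropped` rather than `disconnect`; exercising the disconnect branch requires widening the budget past the per-send timeout, which matches the F22 test reconciliation noted earlier in this history.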
  • audit Phase-I: capture-loop mediapipe sub-sampling + colour-convert cache

    Three structural wins on the capture pipeline:

    1. MediaPipe FaceLandmarker sub-samples at `face_mesh_subsample_n` (default 2, i.e. 15 Hz at 30 Hz capture). Downstream state/telemetry loops run at 2 Hz, so 15 Hz mesh stays well above Nyquist while halving per-frame mediapipe cost. The FaceTracker caches its last result and replays it on skipped frames so consumers always see a well-formed FaceTrackingResult.
    2. BGR→RGB conversion is computed once per frame in CapturePipeline and threaded into FaceTracker.process_frame via a new rgb_frame keyword argument. Previously FaceTracker called cv2.cvtColor itself on every invocation, duplicating work the pipeline already had.
    3. BGR→GRAY conversion is computed once per frame in CapturePipeline and threaded into FrameQualityScorer.score via a new gray_frame keyword argument. Previously brightness and blur scoring each ran cvtColor independently, tripling the conversion cost.

    The mediapipe import in face_tracker.py is also deferred (Phase I.4 preview): `_ensure_mediapipe` performs the real import on first use. The module-level `mp` attribute is preserved so existing test patches that referenced it keep working with minor adjustments.

    Regression guard: cortex/tests/performance/test_capture_perf.py runs a 1000-frame synthetic harness through the cache-aware paths and asserts the wall-time stays inside a 5-second budget and that the sub-sample cache actually skips mediapipe on at least half the frames.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    cdf882d
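The sub-sample-and-replay behaviour from item 1 can be sketched independently of MediaPipe. `SubsampledTracker` and its injected `detect` callable are hypothetical; the real FaceTracker wraps FaceLandmarker and returns a FaceTrackingResult.

```python
from typing import Any, Callable


class SubsampledTracker:
    """Run an expensive detector every n-th frame and replay the cached result otherwise."""

    def __init__(self, detect: Callable[[Any], Any], subsample_n: int = 2) -> None:
        if subsample_n < 1:
            raise ValueError("subsample_n must be >= 1")
        self._detect = detect
        self._n = subsample_n
        self._frame_index = 0
        self._last_result: Any = None

    def process_frame(self, frame: Any) -> Any:
        # Frame 0 always runs the detector, so the cache is populated
        # before the first skipped frame needs it.
        if self._frame_index % self._n == 0:
            self._last_result = self._detect(frame)
        self._frame_index += 1
        return self._last_result
```

With `subsample_n=2` at 30 Hz capture the detector runs at 15 Hz, and every caller still receives a result on every frame.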
  • audit Debt-2: server-side capability-token gate on every HTTP route Promotes the F07/F08 single-endpoint token checks into a systemic FastAPI dependency that every mutating HTTP route on the API gateway must clear. ``/health`` keeps its unauthenticated entry so the supervisor liveness probe still works before the UI has the token. * ``cortex/services/api_gateway/auth.py`` exports the dependency pair (``require_capability_token`` raising 401, ``optional_capability_token`` for liveness). Both accept ``Authorization: Bearer <token>`` or the CORS-preflight-friendlier ``X-Cortex-Auth-Token`` header. * ``app.py`` mounts ``health_router`` openly and ``router`` with ``dependencies=[Depends(require_capability_token)]``. Adding a new mutating route automatically inherits the gate; a new liveness-only route lives on ``health_router`` and is visible in code review. * ``routes.py`` splits the single ``router`` into ``router`` + ``health_router``; the ``/health`` GET moves to the latter, every other route stays on ``router`` and now gates on the token. * ``EventType.AUTH_REJECTED`` and ``EventType.AUTH_TOKEN_ROTATED`` added to ``cortex/libs/logging/structured.py``; the rejection path emits the former with the request path + cid so log aggregators can alarm on hostile-localhost scanning spikes. * The F07 WS SHUTDOWN check stays as defense-in-depth; the F08 launcher /stop check stays (zero-cortex-imports invariant). Test plan: ``test_systemic_auth_http.py`` adds six cases proving the gate fires on missing/invalid token, accepts both header forms, leaves ``/health`` open, and emits the structured AUTH_REJECTED event. The existing ``test_api_gateway.py`` fixture now injects the bearer header so every existing test continues exercising the route handler (no behaviour change there). ``test_state_infer_envelope.py`` updated the same way. 
Threat model: closes cross-origin localhost (hostile webpage / extension in another tab that can speak the protocol but cannot read the mode-0600 auth file). Does not close malware-as-the-user — explicitly named in ``audit/findings.md`` Debt-2 as out of scope.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    0fe609a
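The header handling behind the HTTP gate can be shown without FastAPI. The sketch below is framework-free on purpose: `extract_token` and `check_request` are hypothetical helpers returning bare status codes, whereas the real ``require_capability_token`` is a FastAPI dependency that raises 401.

```python
import hmac
from typing import Mapping, Optional

OPEN_PATHS = frozenset({"/health"})  # liveness probe stays unauthenticated


def extract_token(headers: Mapping[str, str]) -> Optional[str]:
    """Accept `Authorization: Bearer <token>` or `X-Cortex-Auth-Token`."""
    lowered = {key.lower(): value for key, value in headers.items()}
    scheme, _, value = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    alt = lowered.get("x-cortex-auth-token", "").strip()
    return alt or None


def check_request(path: str, headers: Mapping[str, str], expected_token: str) -> int:
    """Return the HTTP status the gate would produce for this request."""
    if path in OPEN_PATHS:
        return 200
    token = extract_token(headers)
    if token is None:
        return 401
    # Constant-time comparison avoids leaking the token via timing.
    if not hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8")):
        return 401
    return 200
```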
  • merge: Phase G — shared schema codegen via pydantic2ts; closes Debt-1 (F42+F43+F44+F45) # Conflicts: # .github/workflows/ci.yml # cortex/apps/browser_extension/popup.tsx # cortex/services/api_gateway/websocket_server.py

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    178b8f7
  • audit Debt-1: CI + pre-commit drift gate (schema codegen)

    Locks in the drift contract introduced by the previous four commits. The Pydantic models in cortex/libs/schemas/ are the single source of truth; the committed cortex_schemas.d.ts must never go out of sync. After this commit, that property is enforced at two layers:

    1. ``.github/workflows/ci.yml`` adds a ``schema-codegen-check`` job that:
       - Sets up Python 3.11 and Node 20.
       - Installs ``json-schema-to-typescript`` globally (for the ``json2ts`` binary).
       - Installs ``pip install -e ./cortex[codegen]``.
       - Runs ``python -m cortex.scripts.generate_ts_schemas --check``.
       Exit 1 (drift) prints the diff and fails the job; PRs cannot merge until the developer regenerates and commits the result. Concurrency-cancel saves runner minutes on PR pushes.
    2. ``.pre-commit-config.yaml`` adds a local hook firing on any change to:
       - ``cortex/libs/schemas/*.py`` (Pydantic source side)
       - ``cortex/scripts/generate_ts_schemas.py`` (generator itself)
       - ``cortex/apps/browser_extension/types/generated/cortex_schemas.d.ts`` (so manual hand-edits are rejected)
       The hook executes the same ``--check`` command; pre-commit captures stdout and prints the diff on failure so the developer immediately sees what changed.

    Documentation:
    - ``CLAUDE.md`` gains a ``Schema Codegen (audit Debt-1)`` section with regeneration, install, and add-a-new-WS-type recipes. (CLAUDE.md is project-local and not committed; the master file at the repo root carries this section.)

    Manual verification of the gate (run during this commit):

      # Tamper:
      sed -i '' '1s|^|// TAMPER\n|' .../cortex_schemas.d.ts
      python -m cortex.scripts.generate_ts_schemas --check   # exit 1, prints diff
      # Restore:
      git checkout .../cortex_schemas.d.ts
      python -m cortex.scripts.generate_ts_schemas --check   # exit 0

    Final shape of Debt-1 closure:
    - F42 closed structurally: SuggestedAction.action_type is the generated Literal union; TS exhaustiveness in executeAction's switch.
    - F43 closed structurally: SuggestedAction.catalog_id present in the generated type; consumers can reference it.
    - F44 closed structurally: ActionExecuteResult.reversible aligned with Pydantic's canonical name.
    - F45 closed structurally: WSMessage.type is the generated MessageType union; daemon-side Pydantic rejects typos at construction; extension-side TS compiler rejects them at dispatch.

    The drift cannot recur without:
    - A developer with ``--no-verify`` bypassing pre-commit AND
    - Their PR being merged without the required CI check passing.

    Both require explicit, auditable action.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    1a6592d
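The core of a ``--check`` drift gate is a comparison between freshly generated output and the committed file, with a diff on mismatch. The following is a minimal sketch under assumptions: `check_drift` is a hypothetical helper operating on strings, and the actual generator's internals are not shown in this history.

```python
import difflib


def check_drift(generated: str, committed: str, name: str = "cortex_schemas.d.ts") -> tuple[int, str]:
    """Return (exit_code, diff_text): 0 and an empty diff when in sync, 1 otherwise."""
    if generated == committed:
        return 0, ""
    diff = "".join(
        difflib.unified_diff(
            committed.splitlines(keepends=True),
            generated.splitlines(keepends=True),
            fromfile=f"{name} (committed)",
            tofile=f"{name} (generated)",
        )
    )
    return 1, diff
```

Mirroring the manual tamper test above: prepending a ``// TAMPER`` line to the committed text yields exit code 1 with that line shown as a removal, and restoring the file yields exit code 0.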
  • audit Debt-1: migrate browser extension to generated schema types

    End-to-end close-out for F42/F43/F44/F45. The hand-written TypeScript interfaces in ``background.ts`` / ``popup.tsx`` / ``leetcode-observer.ts`` are replaced with imports from the generated ``./types/generated/cortex_schemas`` module. The Pydantic models in ``cortex/libs/schemas/`` are now the single source of truth for every schema-equivalent shape that crosses the daemon ↔ extension boundary.

    What lands here per-file:

    ``background.ts``:
    - Deletes the hand-written ``interface WSMessage`` and ``interface SuggestedAction``. Imports both from the generated file.
    - ``WSMessage`` is wrapped locally with an ``Omit<…, "payload">`` to promote ``payload`` to ``Record<string, unknown>`` (always emitted on the wire; the JSON-schema mode marks it optional only because it has a Python-side default factory).
    - ``SuggestedAction`` is wrapped locally to promote default-factory fields (``action_id``, ``target``, ``label``, ``reason``, ``category``, ``reversible``, ``metadata``) to required. Genuinely-optional fields (``tab_index``, ``group_id``, ``catalog_id``) stay optional.
    - ``ActionExecuteResult.undo_available`` renamed to ``reversible`` to match Pydantic's canonical name (F44 closure). All 33 construction sites updated; no consumer reads the old name.
    - ``executeAction``'s switch gets a ``const unhandled: never = action.action_type`` exhaustiveness check. A new ``action_type`` literal in the Pydantic catalogue now fails this file's tsc step until a matching case is added (F42 closure).

    ``popup.tsx``:
    - Imports ``SuggestedAction``, ``TabRecommendation``, ``TabRecommendations`` from the generated file.
    - ``tabRecs`` state is now ``TabRecommendations | null`` (was the ad-hoc ``{ tabs: Record<…>[]; summary: string }``).
    - ``synthesizeActions`` types its ``action_type`` synthesis against the generated ``SuggestedAction["action_type"]`` union, so a Pydantic rename fails compile (F42).

    ``leetcode-observer.ts``:
    - Imports ``LeetCodeContext``, ``LeetCodeStage``, ``SubmissionResult`` from the generated file; drops the hand-written ``LeetCodeContextPayload`` interface.
    - Bug fix surfaced by migration: ``normalizeResult`` was returning ``"TLE"`` / ``"MLE"`` strings the Pydantic schema does not accept (it requires ``"Time Limit Exceeded"`` / ``"Memory Limit Exceeded"``). The strings were silently dropped by the daemon. Fixed to emit the canonical literal; the typed return makes a future drift surface at compile time.
    - ``lastSubmissionResult`` typed as ``SubmissionResult | null``.

    Daemon-side additions required to make the migration end-to-end:

    ``cortex/libs/schemas/ws_message_types.py``:
    - Adds 15 ``LEETCODE_*`` members to the ``MessageType`` catalogue. These were already on the wire (emitted by ``LeetCodeAdapter`` via ``WebSocketServer.send_message``) but absent from the enum, which would have failed the new Pydantic validator at construction time. Each member is annotated with the action it carries so the wire contract is documented inline.

    ``cortex/tests/unit/test_ws_message_schema.py``:
    - New ``test_message_type_catalog_covers_leetcode_adapter_emissions`` pins the LEETCODE_* set into the catalog. Adding a new ``_CAPABILITIES`` entry to ``LeetCodeAdapter`` without extending ``MessageType`` fails this test.

    ``cortex/apps/browser_extension/types/generated/cortex_schemas.d.ts``:
    - Regenerated to include the LEETCODE_* members and the 4-letter alphabetised order.

    Verification:

      cd cortex/apps/browser_extension
      pnpm install --ignore-workspace
      ./node_modules/.bin/tsc --noEmit                          # 0 errors
      cd ../../..
      python -m cortex.scripts.generate_ts_schemas --check      # exit 0
      pytest cortex/tests/unit/test_ws_message_schema.py        # 13 passed
      pytest cortex/tests/unit/test_leetcode_adapter.py         # 9 passed
      pytest cortex/tests/unit/test_api_gateway.py              # 41 passed

    F42/F43/F44/F45 close structurally here: every consumer reads from the generated type, the CI gate (Commit 5) makes drift impossible, and the exhaustiveness check inside ``executeAction`` makes a missed ``action_type`` case fail compilation rather than ship.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    f146c8f
  • audit-w2: append UI consistency reconciliation report Records the 6 Wave-2 commits (warm-label tokens, FONT_MONO, window-chrome regression guard, raw-int spacing, a11y on settings+connections+onboarding, popup-toggle token routing), the per-dimension verdict for all 8 audit dimensions, the surfaces audited matrix, the verification runs (1150 unit + 35 UI + 31 vitest), and three residual-risk items (no loading skeleton on briefing/activity, no fade on functional notifications, pre-existing test_desktop_shell mock pollution).

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    c1aafe4
  • merge: Wave 2-B (audit-w2 4 commits) — pipeline/architecture consistency

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    6d9b1e9
  • audit Debt-1: regenerate cortex_schemas.d.ts with WSMessage + MessageType Re-runs ``python -m cortex.scripts.generate_ts_schemas`` so the committed ``cortex_schemas.d.ts`` includes the two new schemas introduced in Commit 2: ``WSMessage`` (Pydantic envelope) and ``MessageType`` (the 20-member dispatch-literal enum). No source-side changes: this commit is the generated output only. The drift gate now reports the file in sync. Verification:
    python -m cortex.scripts.generate_ts_schemas --check # exit 0
    grep "export type MessageType" cortex_schemas.d.ts # present
    grep "export interface WSMessage" cortex_schemas.d.ts # present
Commit 4 migrates ``background.ts`` / ``popup.tsx`` to import these generated types and drops the hand-written interfaces (closing F42/F43/F44/F45 on the consumer side). Commit 5 adds the CI + pre-commit drift gate.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    b053343
  • audit-w2: route popup toggle radius + transitions through tokens popup.tsx hardcoded borderRadius: 12 on the toggle track (half-height pill) and "#fff" on the thumb, plus three '0.2s ease' transition literals. The hex matched CX.textInverse and 200ms ease matched CX.durationNormal + CX.easeDefault, but the literals would drift if a future motion-curve or palette change lands in tokens.yaml. Promoted: - toggleTrack.borderRadius -> CX.radiusFull (still clamps to half-height since 9999 >> half of 24px) - toggleThumb.background -> CX.textInverse - toggleTrack/toggleThumb/connectBtn transition durations -> CX.durationNormal with CX.easeDefault curve Verified all 31 extension specs (vitest, jsdom) stay green.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    4bd1687
  • audit-w2: align Architecture.md port table + service list with code The repository map listed six service directories; cortex/services actually contains fifteen. Architecture.md mentioned ports 9472 and 9473 in prose but never the launcher agent's 9471, which is part of the documented kill chain (CLAUDE.md §13). Add an explicit ports table naming all three ports and what binds them, extend the service list to cover capture_service, kinematics_engine, telemetry_engine, context_engine, session_report, api_gateway, launcher, janitor, activity_tracker, handover, and throttle, and add a regression-guard test that re-reads the markdown after every commit to keep doc drift from re-opening this gap.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    3e25991
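
A doc-drift guard of the kind 3e25991 describes might look like the sketch below. It assumes only the three port numbers named in the commit; the function name ``missing_ports`` and the idea of scanning for standalone numbers are illustrative choices, not the repository's actual test.

```python
import re

REQUIRED_PORTS = {9471, 9472, 9473}  # the three ports named in the commit


def missing_ports(markdown: str, required: set[int] = REQUIRED_PORTS) -> set[int]:
    """Return the required ports that never appear as a standalone number."""
    # \b on both sides keeps a port from matching inside a longer number.
    found = {int(m) for m in re.findall(r"\b\d{4,5}\b", markdown)}
    return required - found
```

A real guard would read Architecture.md from disk and assert that ``missing_ports(text)`` is empty; returning the missing set keeps the failure message specific.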
  • audit Debt-1: promote WSMessage to Pydantic + introduce MessageType enum The WebSocket envelope was a dataclass with a free-form ``type: str``, which let F45 happen: typos in dispatch arms silently bypassed handlers and the extension's hand-written TS interface (``background.ts:23``) drifted independently. This commit makes the Pydantic model the source of truth for the codegen pipeline. Pieces: - ``cortex/libs/schemas/ws_message.py``: new ``WSMessage(BaseModel)`` with the exact field set from the legacy dataclass. ``use_enum_values=True`` keeps ``msg.type`` a plain string at runtime so existing ``if msg.type == "STATE_UPDATE"`` dispatch sites work unchanged. ``extra="ignore"`` keeps the wire forwards-compatible with future field additions. - ``cortex/libs/schemas/ws_message_types.py``: new ``MessageType(str, Enum)`` catalogue. Membership policy is documented inline: a literal is in the enum if the daemon either emits it or dispatches on it. Currently 20 members (11 inbound, 9 outbound + 2 daemon-internal). - ``cortex/services/api_gateway/websocket_server.py``: * ``WSMessage`` rebound to the Pydantic model so the api_gateway's public surface (``from cortex.services.api_gateway import WSMessage``) is the new model; the existing tests' construct/parse calls work unchanged. * Legacy dataclass preserved as ``WSMessageLegacy`` with an explicit ``to_pydantic()`` helper for the one-release deprecation window, per the migration plan. * ``_process_message`` dispatch arms use ``MessageType.X.value`` so a renamed enum member is a Python-side compile error. * Outbound ``_make_*`` and ``send_restore``/``broadcast_settings``/``request_context`` now pass ``MessageType`` members instead of raw strings. * ``ValidationError`` is now a parse failure mode and is caught alongside ``JSONDecodeError`` / ``KeyError``: clients sending unknown types see the same drop-and-log behaviour they would have seen for malformed JSON, but the daemon no longer routes on uncatalogued literals (F45 closure structural condition). - ``cortex/libs/schemas/__init__.py``: re-exports ``WSMessage`` and ``MessageType`` so they participate in the codegen walk. Tests (``cortex/tests/unit/test_ws_message_schema.py``, 12 cases): 1. Catalog membership covers every inbound dispatched literal. 2. Catalog membership covers every outbound emitted literal. 3. Construction via string and enum forms both produce wire-string types (use_enum_values contract). 4. Unknown ``type`` literal at construction → ValidationError. 5. Pydantic round-trip preserves every field. 6. Legacy dataclass → JSON → Pydantic round-trips structurally. 7. Explicit ``WSMessageLegacy.to_pydantic()`` helper covered. 8. Wire JSON shape matches between legacy and new (same key set). 9. Representative captures (STATE_UPDATE, INTERVENTION_TRIGGER, USER_ACTION, SHUTDOWN) replay through the new model without error. 10. Default payload is ``{}`` (legacy behaviour preserved). 11. Forward-compat field is ignored without crash. Verification:
    pytest cortex/tests/unit/test_ws_message_schema.py # 12 passed
    pytest cortex/tests/unit/test_api_gateway.py # 41 passed (no regression)
    pytest cortex/tests/ # 1053 passed (full suite)
The codegen file ``cortex_schemas.d.ts`` is intentionally NOT regenerated in this commit. ``python -m cortex.scripts.generate_ts_schemas --check`` now reports drift (new WSMessage + MessageType types), which is the expected state. Commit 3 regenerates; Commit 4 migrates the extension; Commit 5 wires the CI + pre-commit drift gate.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    9d5e11f
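
The envelope behaviour described in 9d5e11f (catalogued ``type``, plain string at runtime, drop-and-log on bad frames) can be approximated with the standard library alone. The sketch below is a stand-in, not the Pydantic ``WSMessage``: the ``Envelope`` dataclass, the ``parse`` helper, and the three-member enum excerpt are assumptions made for illustration.

```python
import json
import logging
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional

logger = logging.getLogger("ws_sketch")


class MessageType(str, Enum):
    """Excerpt only; the commit describes a 20-member catalogue."""

    STATE_UPDATE = "STATE_UPDATE"
    USER_ACTION = "USER_ACTION"
    SHUTDOWN = "SHUTDOWN"


@dataclass
class Envelope:
    """Stdlib stand-in for the Pydantic WSMessage; not the real model."""

    type: str
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Raises ValueError for uncatalogued literals, then stores the plain
        # wire string, mimicking what use_enum_values=True does in Pydantic.
        self.type = MessageType(self.type).value


def parse(raw: str) -> Optional[Envelope]:
    """Parse a wire frame; drop and log anything malformed or uncatalogued."""
    try:
        data = json.loads(raw)
        # Only known keys are read, so unknown extra fields are ignored.
        return Envelope(type=data["type"], payload=data.get("payload", {}))
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        logger.warning("dropping frame: %s", exc)
        return None
```

Because ``type`` is stored as a plain string, a comparison such as ``msg.type == "STATE_UPDATE"`` keeps working, while a typo in an incoming frame is rejected at parse time instead of silently missing every dispatch arm.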
  • audit-w2: persist real last-escalation age instead of 0-clamped sign-flipped delta F26's _persist_quiet_mode_history wrote ``self._quiet_mode_count_reset_at - time.monotonic()`` to the ``last_escalation_at_monotonic_delta`` field. Because reset_at is the monotonic timestamp of the last escalation (in the past), the expression was always non-positive and the surrounding ``max(0.0, ...)`` clamped it to 0. Rehydrate then stamped reset_at to ``time.monotonic()`` — i.e. "last escalation was just now" regardless of when it actually fired. The field is currently diagnostic only, so no consumer-visible behaviour changed, but any future caller trusting the value would have inherited the bug. Rename to ``last_escalation_age_seconds``, flip the sign, and accept both field names in the loader so an upgrade-in-place does not lose state.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    bb4be0a
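
The arithmetic behind bb4be0a is easy to check in isolation. The sketch below uses hypothetical function names and assumes the loader treats the legacy field as an age of whatever value it holds (always 0 under the old writer); only the two field names and the sign flip come from the commit message.

```python
def buggy_delta(reset_at: float, now: float) -> float:
    # reset_at lies in the past, so reset_at - now <= 0 and the clamp wins.
    return max(0.0, reset_at - now)


def age_seconds(reset_at: float, now: float) -> float:
    # Flipped sign: how long ago the last escalation fired.
    return max(0.0, now - reset_at)


def rehydrate_reset_at(record: dict, now: float) -> float:
    """Rebuild reset_at on a fresh monotonic clock, accepting both field names."""
    if "last_escalation_age_seconds" in record:
        age = float(record["last_escalation_age_seconds"])
    else:
        # Legacy field name (assumed handling): read it as an age.
        age = float(record.get("last_escalation_at_monotonic_delta", 0.0))
    return now - age
```

With an escalation at monotonic time 100 and a persist at 160, the old expression stores 0 while the corrected one stores 60. Persisting an age rather than a timestamp matters because monotonic clocks are not comparable across process restarts.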
  • audit-w2: re-consult cost kill switch between LLM retry attempts F20 consulted CostTracker.check_budget() only at call entry. F30's cancellation-cost path can bill mid-call, and a successful but token-heavy first attempt followed by a retry triggers the same gap: the loop continues to attempts 2 and 3 with no further budget check. Re-consult the ceiling at the top of every retry; on KILL serve the deterministic fallback stamped with budget_killed_on_retry so an operator can distinguish call-start kills from mid-retry kills.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    425993b
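
The retry-loop change in 425993b amounts to moving the budget check inside the loop. A minimal sketch follows, assuming a callable budget check and a three-attempt loop; the ``Budget`` enum, the ``budget_killed`` and ``retries_exhausted`` stamps, and the result dict are hypothetical, and only ``budget_killed_on_retry`` comes from the commit message.

```python
from enum import Enum
from typing import Callable


class Budget(Enum):
    OK = "ok"
    KILL = "kill"


def call_with_retries(
    attempt: Callable[[], str],
    check_budget: Callable[[], Budget],
    max_attempts: int = 3,
) -> dict:
    """Run attempt() up to max_attempts times, re-checking budget each time."""
    for n in range(1, max_attempts + 1):
        if check_budget() is Budget.KILL:
            # Attempt 1 is a call-start kill; later attempts are mid-retry kills.
            reason = "budget_killed" if n == 1 else "budget_killed_on_retry"
            return {"text": "fallback", "reason": reason, "attempts": n - 1}
        try:
            return {"text": attempt(), "reason": None, "attempts": n}
        except RuntimeError:
            continue  # transient failure: loop back to the budget check
    return {"text": "fallback", "reason": "retries_exhausted", "attempts": max_attempts}
```

Stamping the two kill paths differently is what lets an operator tell a call that never started from one that spent money on a first attempt and was cut off before the second.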
  • audit-w2: accessible names + tab order on settings, connections, onboarding F55 wired accessibility on the dashboard + overlay. The other three top-level surfaces (settings dialog, connections panel, onboarding wizard) still announced every control as 'button' / 'checkbox' to VoiceOver and let focus escape the window unpredictably. Extracts dashboard.py + overlay.py's defensive a11y wrappers into a new shared module (cortex/apps/desktop_shell/a11y.py — within the audit file budget) exposing set_accessible_name, set_accessible_description, set_tab_order, chain_tab_order. The helpers no-op cleanly under the lightweight mock PySide6 stubs that the legacy mock suite installs. Wired into: - settings.py: 16 controls (back, 6 checkboxes, slider, 2 spinboxes, combo, 4 debug checkboxes, close, apply) + a 15-step tab chain. - connections.py: back button + every Connect <browser> / Connect <editor> button collected into _tab_order_chain and chained. - onboarding.py: BYOK key input + region combo + save button + Open Connections + Get Started, plus per-step Grant buttons + status pills get programmatic accessible names. Tab chain pins focus on the BYOK card → Connect Extensions → Get Started. Adds cortex/tests/unit/test_a11y_coverage.py — 3 cases instantiate each panel under offscreen Qt and assert accessible names on every key interactive widget. F55 dashboard + overlay tests stay green.

    @StevenWang-CY StevenWang-CY committed May 18, 2026
    bdff047
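
The defensive a11y helpers described in bdff047 could be shaped like the sketch below. The function names match the commit message, but the bodies and signatures are guesses: in particular, ``chain_tab_order`` here takes the pairwise tab-order setter as a parameter (in Qt that would be the static ``QWidget.setTabOrder``) so the example runs without PySide6 installed.

```python
from typing import Any, Callable, Sequence


def set_accessible_name(widget: Any, name: str) -> bool:
    """Call widget.setAccessibleName(name) if it exists; report success."""
    setter = getattr(widget, "setAccessibleName", None)
    if not callable(setter):
        return False  # stub widget without the method: no-op
    try:
        setter(name)
    except Exception:
        return False  # a broken mock must not crash panel construction
    return True


def chain_tab_order(
    set_tab_order: Callable[[Any, Any], None], widgets: Sequence[Any]
) -> int:
    """Link consecutive widgets pairwise; return the number of links made."""
    links = 0
    for first, second in zip(widgets, widgets[1:]):
        set_tab_order(first, second)
        links += 1
    return links
```

Chaining n widgets yields n - 1 links, which is consistent with the 16 controls and 15-step tab chain reported for settings.py.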