audit Phase-J: empty states for biometrics + advanced tab
Before the first capture frame arrives, both the consumer biometrics
card and the advanced developer tab now render "Start a session to ..."
placeholders. Without this, the BPM / HRV / BLK numerics read as "--"
and the signal-quality bars sit at zero — both states are easily
mis-read as "Cortex is stuck" rather than "Cortex hasn't started yet".
The audit-w2 reconciliation noted the timeline panel already had a "No
events yet" empty state but the BPM card and the dev-debug widgets did
not. Phase J-3 closes that gap on both surfaces.
Contract:
* ``_ConsumerTab._bio_empty_state`` (QLabel) inside the biometrics
card carries "Start a session to see your biometrics." Hidden the
moment ``update_state`` is called for the first time.
* ``_AdvancedTab._empty_state`` carries the longer "Start a session
to populate signal quality, heart-rate trace, and state scores."
copy. Same flag-flip on first update.
* Both flags are sticky — once we've seen state, a transient WS
disconnect must not collapse the UI back to the empty state
because the cached numerics are more useful than a placeholder.
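A minimal sketch of the sticky-flag contract above, with Qt stripped out. The class name ``EmptyStateCard`` and the ``on_disconnect`` hook are hypothetical stand-ins for illustration, not the code in dashboard.py; only ``update_state`` and ``_has_received_state`` are names taken from this commit.

```python
from typing import Optional


class EmptyStateCard:
    """Hypothetical stand-in for a tab widget; no Qt involved."""

    def __init__(self, placeholder: str) -> None:
        self.placeholder = placeholder
        self.placeholder_visible = True       # shown until the first frame
        self._has_received_state = False      # sticky: never reset
        self.last_state: Optional[dict] = None

    def update_state(self, state: dict) -> None:
        self.last_state = state
        if not self._has_received_state:
            self._has_received_state = True
            self.placeholder_visible = False  # hidden on the first update

    def on_disconnect(self) -> None:
        # Deliberately leaves the flag alone: cached numerics stay on screen.
        pass


card = EmptyStateCard("Start a session to see your biometrics.")
card.update_state({"bpm": 62})
card.on_disconnect()
```

After the disconnect the placeholder stays hidden and the last payload is still available to render.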
Tests (test_dashboard_empty_state.py, 7 cases): placeholder visible
before first frame on both tabs; hidden after the first
``update_state`` on each; flag is sticky across disconnect/reconnect;
``DashboardWindow.update_state`` clears both at once; accessible names
present (sets up the J-5 a11y sweep).
Files: cortex/apps/desktop_shell/dashboard.py (+_bio_empty_state on
_ConsumerTab, +_empty_state on _AdvancedTab, +_has_received_state
flag on both); cortex/tests/unit/test_dashboard_empty_state.py (new).
814a165
audit-w2: thread F18 degraded/source through STATE_UPDATE WS payload
F18 added ``source`` and ``degraded`` to ``StateInferResponse`` and the
dashboard advanced tab reads both off the payload to toggle the
"classifier unavailable" banner. The dashboard is fed by the WS
STATE_UPDATE broadcast, not by ``/state/infer`` — but
``WebSocketServer._make_state_update`` never stamped the two fields, so
the banner could not fire through the WS path and F18 was silently
broken end-to-end.
Also fixes a brittle dashboard fallback check that conflated the
envelope ``source`` literal (``classifier``/``fallback``) with the
debug-overlay ``classifier_source`` field (``rule``/``ml``/``ensemble``);
on a healthy ``classifier_source="rule"`` payload the banner flag would
have flipped to True and the banner would have stayed visible.
Test ``test_ws_state_update_degraded.py`` (3 cases) pins the contract:
classifier estimates emit ``source=classifier, degraded=False``; the
fallback path (``classifier_source is None``) emits
``source=fallback, degraded=True``; the new fields land alongside the
existing debug fields without removing them.
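The contract pinned by the test can be sketched as below. ``make_state_update`` and ``banner_visible`` are hypothetical free functions written for illustration (the real logic lives on ``WebSocketServer`` and the dashboard), and the ``"FLOW"`` state value is a placeholder.

```python
from typing import Any, Optional


def make_state_update(state: str, classifier_source: Optional[str]) -> dict[str, Any]:
    """Build a STATE_UPDATE payload (illustrative helper, not the real method)."""
    is_fallback = classifier_source is None
    return {
        "state": state,
        # Envelope fields: two literals only, independent of the debug field.
        "source": "fallback" if is_fallback else "classifier",
        "degraded": is_fallback,
        # The debug-overlay field is kept alongside, not replaced.
        "classifier_source": classifier_source,
    }


def banner_visible(payload: dict[str, Any]) -> bool:
    """Dashboard side: key off the envelope, never off classifier_source."""
    return bool(payload.get("degraded")) or payload.get("source") == "fallback"


healthy = make_state_update("FLOW", "rule")
fallback = make_state_update("FLOW", None)
```

Keeping the two fields separate is what prevents a healthy ``rule`` payload from tripping the banner.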
a7bcf70
audit Phase-J: error toast with cid quote-back
Adds a top-bar Toast widget to the dashboard that surfaces daemon
errors with the F19 correlation id quoted back to the user. The cid
is rendered in mono font ("ref: <cid>") and is selectable via
``TextSelectableByMouse`` so the user can copy it into a support
ticket — the audit root cause was a cid users could see but not copy.
Components:
* cortex/apps/desktop_shell/components.py (new) — small ``Toast`` widget
with title / body / ref(cid) slots, auto-dismiss after 8 s (the
duration is a module constant so a future refactor can't silently
shrink it), close button for manual dismissal, accessible names on
every interactive sub-widget.
* DashboardWindow gains a ``show_error(title, body, cid)`` method that
forwards to the embedded Toast. The toast lives under the segmented
control so it is visible from either tab.
* DaemonBridge gains an ``error_occurred(str, str, str)`` Signal and
an ``on_error(title, body, cid)`` callback. The signal is routed
through CortexAppController._on_error_occurred to the dashboard's
show_error method.
Tests (test_dashboard_toast.py, 7 cases): toast renders title / body /
cid; cid label is selectable (the contract pin against stylesheet
refactors); auto-dismiss timer fires at the documented duration; close
button dismisses immediately; empty cid does not crash; bridge signal
round-trips to the toast; bridge defaults the cid arg to "" so daemon
threads that lack a bound cid can still surface an error.
Files: cortex/apps/desktop_shell/{components.py (new), dashboard.py,
controller.py}; cortex/tests/unit/test_dashboard_toast.py (new).
7cfbb35
audit Phase-J: onboarding refinement — Continuity callout + Why expanders
Adds two first-run polish affordances to the onboarding wizard:
1. Continuity Camera detection. When AVFoundation reports a paired
iPhone / iPad / Continuity Camera device, the Camera card surfaces
an inline callout — "We will skip your iPhone camera and use the
MacBook camera." The webcam.py selection logic already silently
skips Continuity Cameras (CLAUDE.md rules 5-6); this closes the
feedback gap so first-runners aren't left wondering whether Cortex
is about to grab the wrong feed.
2. "Why we need this" expander. Each of the four cards (Camera,
Accessibility, BYOK, Extensions) now carries a "Why?" chevron in
its header. Clicking toggles a collapsible rationale paragraph
with trust-building copy explaining why Cortex needs the
permission / setup and where the data lives (e.g. "video stream
never leaves your Mac", "Bedrock token stays in the macOS
Keychain"). Buttons get accessible names so VoiceOver announces
them semantically.
Tests (test_onboarding_hints.py, 6 cases): every card has the
expander; rationale copy contains the spec keyword; the Continuity
callout appears only when AVFoundation lists an iPhone-family device;
the detector matches iPhone / iPad / Continuity keywords; expander
buttons carry an accessible name. Existing test_onboarding_state.py
(F49 — back/forward navigation) still passes.
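A rough sketch of the keyword match the detector test describes. The function below takes a plain list of device names so it runs without AVFoundation; the case-insensitive comparison and the sample device names are assumptions of this sketch, not confirmed behaviour of ``_detect_continuity_camera``.

```python
from typing import Iterable

# Keywords named in the test description above.
_KEYWORDS = ("iphone", "ipad", "continuity")


def detect_continuity_camera(device_names: Iterable[str]) -> bool:
    """Return True if any device name looks like an iPhone-family camera."""
    return any(
        keyword in name.lower()
        for name in device_names
        for keyword in _KEYWORDS
    )


builtin_only = detect_continuity_camera(["FaceTime HD Camera"])
with_phone = detect_continuity_camera(["FaceTime HD Camera", "Sam's iPhone Camera"])
```

The callout would be shown only when the function returns True.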
Files: cortex/apps/desktop_shell/onboarding.py (+_WHY_COPY,
_detect_continuity_camera, expander helper in _make_section, callout
in _make_step, step_id threaded through _make_step / _make_llm_step
/ _make_section); cortex/tests/unit/test_onboarding_hints.py (new).
a2f0072
test: mark Phase-I throughput fixture clients authenticated for Debt-2
Phase H's broadcast filter skips peers in pending_auth. The Phase I
throughput harness creates fake clients to measure parallel-gather
behaviour; the contract under test is the throughput of legitimate
authenticated clients, not the auth gate. Updated the fixture to mark
clients authenticated=True so the broadcast actually reaches them.
8d67922
merge: Phase H — Debt-2 capability-token systemic auth (HTTP dep + WS AUTH-first + rotation UI)
# Conflicts:
# cortex/services/api_gateway/websocket_server.py
1e02c3f
audit Debt-2: append closure section to audit/execution-log.md
Documents the systemic capability-token client bootstrap: the
five-commit close-out (server HTTP dep, WS AUTH-first handshake,
desktop_shell client, browser-extension client, rotation UI), the
intentional retention of the F07/F08 single-endpoint gates as
defense-in-depth, the migration path (token file already on disk
from Wave-1), the threat model (cross-origin localhost closed;
malware-as-the-user out of scope per audit/findings.md), and the
reproducible verification commands for the new auth tests plus
the manual adversarial smoke test.
2b7da5c
audit Debt-2: capability-token rotation UI in Settings
Adds a user-visible escape hatch for the cross-origin-localhost
threat model: when a user suspects another account on the same Mac
read their auth.token file, the Settings panel's new "Security"
section lets them mint a fresh token in one click. The old value
becomes invalid immediately (the file is atomically replaced; the
in-memory cache on the desktop_shell side is refreshed via
``WebSocketBridge.refresh_auth_token``; the browser extension picks
up the new value via the native host's ``get_auth_token`` command
on its next connect after the daemon closes its socket).
* ``cortex.libs.auth.local_token.rotate_token`` mints a fresh
``secrets.token_hex(32)``, writes it atomically (sibling ``.tmp``
chmod 0600 then ``os.replace``), and returns the new value.
* ``cortex.libs.auth.__init__`` re-exports ``rotate_token`` alongside
the existing helpers.
* ``cortex/apps/desktop_shell/settings.py`` gains a "Security"
section with the rotation button + status label + an
``auth_token_rotated(str)`` signal. The button briefly disables
itself after a click to prevent stacked rotations on a fast
double-click; the status label confirms success or surfaces the
failure reason. ``EventType.AUTH_TOKEN_ROTATED`` (added in
Commit 1) emits on every rotation so log aggregators see the
audit trail.
* ``cortex/apps/desktop_shell/main.py`` wires the dialog's signal to
``WebSocketBridge.refresh_auth_token`` so the in-memory cache is
in lockstep with the new on-disk value and the active socket is
closed (the connect loop reconnects with the new token in its
AUTH frame).
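The write sequence described for ``rotate_token`` (sibling ``.tmp``, mode 0600, then ``os.replace``) might look roughly like this. The explicit ``token_path`` parameter is an assumption made so the sketch can run against a temporary directory; the real helper's signature is not shown in this log.

```python
import os
import secrets
import tempfile
from pathlib import Path


def rotate_token(token_path: Path) -> str:
    """Mint a fresh token and atomically replace the file at token_path."""
    new_token = secrets.token_hex(32)  # 32 random bytes, 64 hex characters
    tmp_path = token_path.with_name(token_path.name + ".tmp")
    # Create the sibling with mode 0600 before any secret bytes are written.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(new_token)
    os.chmod(tmp_path, 0o600)          # enforce the mode even if .tmp pre-existed
    os.replace(tmp_path, token_path)   # atomic when both paths share a filesystem
    return new_token


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "auth.token"
    first = rotate_token(path)
    second = rotate_token(path)
    on_disk = path.read_text()
    mode = path.stat().st_mode & 0o777
```

Readers of the file see either the old token or the new one, never a partially written value.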
Test plan: ``test_token_rotation.py`` covers three cases —
rotation produces a distinct value, the file is written atomically
with mode 0600 on POSIX, and ``verify_token`` rejects the old token
immediately after rotation. Full ``pytest cortex/tests/`` (excl.
desktop_shell mock-pollution): 1307 passed.
9066df1
test: reconcile F22 + Phase-I broadcast budget; exempt dev configs from bundle gate
Phase I bumped the per-send WS timeout from 1s to 2s AND added a 100ms
total-broadcast budget. F22's slow-consumer-close contract only fires
when the per-send timeout elapses; with the 100ms budget the slow task
gets cancelled first and lands in the 'slow but alive' branch
(intended Phase I behaviour for transient jitter). The F22 tests now
widen the budget locally so per-send fires first, preserving both
contracts: the 100ms budget is exercised by test_broadcast_throughput,
and the F22 disconnect path is exercised here.
Also: bundle-size test now exempts dev-time configs (vitest.config.ts,
plasmo.config.ts, tsconfig.json) — they don't ship in the runtime
bundle and are not entry points.
84cbc90
audit Debt-2: extension WS sends AUTH before IDENTIFY
The browser extension's background service worker already cached the
capability token (Wave-1 F07b/F08b) for SHUTDOWN payloads and
``/stop``/``/shutdown`` fetches. Debt-2 promotes that into the
WebSocket lifecycle: every ``onopen`` now fires an ``AUTH`` frame as
the FIRST outbound message (token from ``getAuthToken()``), with the
existing ``IDENTIFY`` frame deliberately moved AFTER the AUTH so the
daemon's systemic gate (Commit 2) accepts it.
* ``connect()``: the ``onopen`` callback fires-and-forgets a
``getAuthToken().then(send AUTH then IDENTIFY)`` sequence. If the
token fetch fails (cache miss + native-host unreachable), we let
the daemon close us; the reconnect loop will retry once the token
is available.
* ``handleMessage``: new ``case "AUTH_OK"`` no-op so the daemon's
ACK is recognised rather than falling into the unknown-type bucket.
* The existing ``DAEMON_HTTP_URL`` fetches (Step 3 of the kill chain:
``/shutdown`` and ``/stop``, which already attached the token from
Wave-1) remain; they now align with the same systemic auth gate the
WS protocol enforces.
Test plan: new ``__tests__/systemic_auth.spec.ts`` covers two cases —
the first outbound frame is ``AUTH`` carrying the cached token (index
0 in ``sock.sent``); ``IDENTIFY`` is the next frame (index >0); a
simulated ``AUTH_OK`` + follow-on ``STATE_UPDATE`` reaches the popup
broadcast path. Full ``vitest run``: 33/33 pass across 11 spec files.
5eaef88
merge: Phase I — performance work (capture sub-sample, parallel broadcast, lazy imports, content-script leetcode observer)
# Conflicts:
# cortex/services/api_gateway/websocket_server.py
15cf352
audit Debt-2: desktop_shell WS client AUTHs before IDENTIFY
The WebSocket bridge in the desktop_shell (the WS-mode client, used
when ``cortex.apps.desktop_shell.main`` runs against an out-of-process
daemon) now reads the capability token at startup, caches it, and
sends an ``AUTH`` frame as the FIRST message on every (re)connect —
ahead of ``IDENTIFY``. Without this, the AUTH-first gate that landed in
Commit 2 would close the desktop_shell's socket immediately on each
reconnect, breaking state-stream UX.
* ``WebSocketBridge.__init__`` loads the token via
``cortex.libs.auth.load_or_create_token`` and stores it on
``self._auth_token``. A read failure is logged (not raised) so an
install with a momentarily-unwritable Application Support dir does
not crash the UI; the AUTH frame is omitted in that path and the
daemon's gate refuses the socket, which is the safer fallback.
* ``_connect_loop`` sends ``{type: "AUTH", payload: {auth_token:
<hex>}}`` immediately after the socket opens, before ``IDENTIFY``.
* ``refresh_auth_token`` exposes a re-read + active-WS-close hook
for the Settings rotation affordance (Commit 5).
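The frame ordering can be sketched without a socket. ``handshake_frames`` is a hypothetical helper, and the ``IDENTIFY`` payload shape (``{"client": ...}``) is an assumption; only the ``AUTH`` frame shape comes from this commit.

```python
import json
from typing import Optional


def handshake_frames(auth_token: Optional[str], client_name: str) -> list[str]:
    """Frames to send, in order, right after the socket opens."""
    frames = []
    if auth_token is not None:
        frames.append(
            json.dumps({"type": "AUTH", "payload": {"auth_token": auth_token}})
        )
    # With no token the AUTH frame is omitted and the daemon's gate refuses us.
    frames.append(
        json.dumps({"type": "IDENTIFY", "payload": {"client": client_name}})
    )
    return frames


frames = handshake_frames("ab" * 32, "desktop_shell")
types_in_order = [json.loads(frame)["type"] for frame in frames]
```

Here ``types_in_order`` is ``["AUTH", "IDENTIFY"]``, matching the index-0 / index-1 assertion in the test plan.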
Test plan: ``test_desktop_controller_auth.py`` adds two cases —
``test_bridge_caches_token_at_init`` (reading the token file at
construction time, the contract rotation leans on) and
``test_bridge_sends_auth_first_after_connect`` (a fake
``websockets`` module captures every outbound frame; AUTH is index
0, IDENTIFY is index 1). The in-process ``controller.py`` path is
unaffected — it does not speak WS over the wire, so no AUTH frame
is needed.
f16a46a
audit Phase-I: lazy mediapipe + keyring imports for sub-2s startup
Defers two heavyweight imports until the code that actually needs them
runs. Combined with the mediapipe lazy-import already shipped in
commit 1 (capture-loop), this brings daemon import wall-time on a
developer M-series Mac from ~3-4 s anecdotal to ~1.3 s measured —
comfortably inside the 2 s target.
Changes:
- cortex/services/llm_engine/anthropic_planner.py: move `import keyring`
from module top to inside `_keychain_get_bedrock_token` (the only
call site). The function already had a try/except for missing
backends, so the lazy import composes cleanly.
- cortex/scripts/run_dev.py: add `--profile-startup` CLI flag. When
set, `record_milestone(label)` accumulates monotonic timestamps at
key points (entrypoint, config-loaded, server-built,
daemon-task-spawned, daemon-warmup-elapsed) and a formatted table
prints on exit.
Measurement on dev M-series laptop (Python 3.11, warm fs cache):
$ time .venv/bin/python -c "import cortex.scripts.run_dev"
real 0m1.306s
mediapipe lazy-import savings: ~563 ms (verified via `import mediapipe`
timing in a fresh interpreter)
keyring lazy-import savings: ~41 ms
sys.modules after run_dev import:
mediapipe loaded: False
keyring loaded: False
Regression guard: cortex/tests/performance/test_startup_latency.py
runs the import in a fresh subprocess and asserts mediapipe / keyring
are NOT in sys.modules after the import returns. A regression that
re-introduces an eager import fails loudly.
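The shape of that guard, reduced to the standard library: a fresh interpreter runs a module-like snippet and reports whether the deferred module is in ``sys.modules``. ``colorsys`` stands in for a heavyweight dependency here, and ``lazy_import_report`` is a hypothetical helper, not the code in test_startup_latency.py.

```python
import subprocess
import sys

# Child program: `colorsys` plays the role of mediapipe / keyring.
CHILD = """
import sys

def convert(r, g, b):
    import colorsys  # deferred: paid on first call, not at module import
    return colorsys.rgb_to_hsv(r, g, b)

print("colorsys" in sys.modules)
convert(1.0, 0.0, 0.0)
print("colorsys" in sys.modules)
"""


def lazy_import_report() -> list[str]:
    """Run CHILD in a fresh, isolated interpreter and return its printed lines."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


report = lazy_import_report()
```

The first printed value reflects the state right after import; the second reflects the state after the first call has triggered the deferred import.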
463dd34
audit Phase-I: lazy-load tab manager, content-script-only leetcode observer
Browser-extension bundle hygiene. The current chrome-mv3-prod build
already meets the audit Phase-I target — measured on the parent repo
build/ directory:
Uncompressed total: ~549 KB
Gzipped total: ~175 KB (target: < 250 KB)
Largest file: popup.100f6462.js — 169 KB uncompressed
This commit makes the bundle layout explicit and adds a regression
guard so the size budget is enforced going forward.
Changes:
- leetcode-observer.ts moved from the top level into contents/. Plasmo
already bundled it as a separate content-script entry, but the
top-level location invited future refactors to import it into
background.ts; under contents/ the convention makes its
content-script-only status visible at file-tree level. Comments
referencing the old path updated in background.ts and
activity-tracker.ts.
- tab-manager.ts is the only large module that ships in the
background bundle (background.ts imports its top-level functions
for hide/restore lifecycle hooks). Lazy-loading it via dynamic
import would require restructuring the hot path; the current bundle
is already 30% under target so the speedup is not worth the
hot-path indirection. Documented in test_bundle_size.py.
Regression guard: cortex/tests/performance/test_bundle_size.py pins
per-file source-size budgets for every Plasmo entry point (background,
popup, newtab, tabs/, contents/), plus a total-source-budget assert.
The bundler ratio (~2:1 source:gzip) is stable enough that source
budgets correlate with bundle budgets without needing a build.
Measurements (uncompressed source bytes vs budget):
- background.ts: 127,868 / 200,000 (64%)
- popup.tsx: 44,689 / 80,000 (56%)
- newtab.tsx: 24,636 / 80,000 (31%)
- tab-manager.ts: 14,122 / 60,000 (24%)
- activity-tracker.ts: 29,459 / 40,000 (74%)
- contents/ambient.ts: 13,189 / 40,000 (33%)
- contents/leetcode-observer.ts: 25,044 / 40,000 (63%)
- tabs/onboarding.tsx: 8,603 / 80,000 (11%)
Total entry-point source: 292,527 bytes — 49% of the 600 KB aggregate
budget (well inside the 250 KB compressed target).
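A small sketch of how a per-file budget check like this could be expressed. The numbers are copied from the measurements above; the helper names are hypothetical, and a real guard would presumably read sizes from disk rather than from a literal table.

```python
# (measured bytes, budget bytes), copied from the measurements above.
BUDGETS = {
    "background.ts": (127_868, 200_000),
    "popup.tsx": (44_689, 80_000),
    "newtab.tsx": (24_636, 80_000),
    "tab-manager.ts": (14_122, 60_000),
    "activity-tracker.ts": (29_459, 40_000),
    "contents/ambient.ts": (13_189, 40_000),
    "contents/leetcode-observer.ts": (25_044, 40_000),
    "tabs/onboarding.tsx": (8_603, 80_000),
}


def percent_used(size: int, budget: int) -> int:
    """Share of the budget consumed, rounded to a whole percent."""
    return round(100 * size / budget)


def over_budget(measurements: dict[str, tuple[int, int]]) -> list[str]:
    """Names of files whose measured size exceeds their budget."""
    return [name for name, (size, budget) in measurements.items() if size > budget]


violations = over_budget(BUDGETS)
tightest = max(BUDGETS, key=lambda name: percent_used(*BUDGETS[name]))
```

With the figures above nothing is over budget, and activity-tracker.ts is the file with the least headroom.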
d95ae74
audit Debt-2: WebSocket AUTH-first handshake
Promotes the F07 SHUTDOWN-only inline check to a systemic protocol:
every WS connection starts in ``pending_auth`` and the server refuses
every non-``AUTH`` frame until the client presents a valid token. On
success the server sends ``AUTH_OK`` and the client's
``WebSocketClient.authenticated`` flag flips to True; broadcasts skip
peers still in ``pending_auth`` so STATE_UPDATE / INTERVENTION frames
never leak to an unauthenticated origin.
* ``MessageType.AUTH`` and ``MessageType.AUTH_OK`` added to
``cortex/libs/schemas/ws_message_types.py``; codegen regenerated
``cortex_schemas.d.ts`` so the TS dispatch sites can narrow against
the new literals.
* ``WebSocketClient.authenticated: bool = False`` is the new gate.
Setting the flag is intentionally one-way per connection.
* ``WebSocketServer._dispatch_message`` short-circuits to ``_handle_auth``
on ``AUTH``; every other type while ``pending_auth`` triggers
``close(code=1011, reason="auth required")`` + ``EventType.AUTH_REJECTED``.
* ``_handle_auth`` validates via ``cortex.libs.auth.verify_token``,
sends ``AUTH_OK`` on success, and replays the latest STATE_UPDATE
to preserve the legacy "new connection sees current state" UX.
Replay-safe: a second ``AUTH`` on an authed client re-ACKs without
resetting anything.
* The legacy F07 inline ``verify_token`` call on SHUTDOWN stays as
defense-in-depth (will be made redundant in future cleanup once
Phase H clients are deployed widely).
* ``_broadcast`` skips unauthenticated peers so a connect-and-listen
attack cannot harvest the daemon's state stream.
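The gate above reduces to a small state machine. This is a sketch under stated assumptions: ``Client``, ``dispatch`` and ``broadcast`` are simplified stand-ins, the constant-time comparison via ``hmac.compare_digest`` is a guess at what ``verify_token`` does, closing with the same code on a wrong token is assumed, and the ``ACK`` frame is a placeholder for the real handlers.

```python
import hmac
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Client:
    authenticated: bool = False              # every connection starts in pending_auth
    sent: list = field(default_factory=list)
    closed_with: Optional[tuple] = None


def dispatch(client: Client, msg: dict, expected_token: str) -> None:
    if msg.get("type") == "AUTH":
        presented = str(msg.get("payload", {}).get("auth_token", ""))
        if hmac.compare_digest(presented.encode(), expected_token.encode()):
            client.authenticated = True      # one-way: never flipped back
            client.sent.append({"type": "AUTH_OK"})  # a repeat AUTH just re-ACKs
        else:
            client.closed_with = (1011, "auth required")
        return
    if not client.authenticated:
        client.closed_with = (1011, "auth required")
        return
    client.sent.append({"type": "ACK", "for": msg.get("type")})  # placeholder


def broadcast(clients: list, frame: dict) -> int:
    """Deliver only to authenticated, open peers; return the delivery count."""
    delivered = 0
    for client in clients:
        if client.authenticated and client.closed_with is None:
            client.sent.append(frame)
            delivered += 1
    return delivered


intruder, legit = Client(), Client()
dispatch(intruder, {"type": "IDENTIFY"}, "secret")
dispatch(legit, {"type": "AUTH", "payload": {"auth_token": "secret"}}, "secret")
reached = broadcast([intruder, legit, Client()], {"type": "STATE_UPDATE"})
```

Only the authenticated peer receives the broadcast; the peer that skipped AUTH is closed and the idle pending peer is skipped.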
Test plan: ``test_systemic_auth_ws.py`` adds six integration cases
covering pre-auth refusal, successful handshake + subsequent message
acceptance, wrong-token close, idempotent AUTH replay, AUTH_REJECTED
event content, and a real-listener adversarial smoke test that
asserts the daemon closes a non-AUTHing connection within 2 s. The
existing ``test_auth_local_token.py`` SHUTDOWN test now sends AUTH
first; ``test_ws_slow_client.py``, ``test_correlation_ids.py``, and
the ``test_api_gateway.py::test_send_restore_broadcasts_restore_message``
test mark their mock clients ``authenticated=True`` since they
exercise broadcast / slow-consumer behaviour, not the handshake.
Full ``pytest cortex/tests/`` (excluding the legacy
``test_desktop_shell.py`` mock-pollution suite): 1302 passed.
``python -m cortex.scripts.generate_ts_schemas --check`` exits 0.
78d9d57
audit Phase-I: parallel WS broadcast with hard budget
Replaces the serial `for client in clients: await send(...)` in
`WebSocketServer._broadcast` with `asyncio.wait` over per-client send
tasks under a hard 100 ms total budget. The previous serial loop made
a four-client broadcast cost ~sum(client_latencies); one flaky client
could stretch a single STATE_UPDATE to ~4 s, dropping every queued
30 Hz capture frame behind it.
Changes:
- Per-send timeout bumped 1 s → 2 s (`_BROADCAST_PER_CLIENT_TIMEOUT_S`)
so transient network blips don't get classified as dead consumers.
- Hard total budget capped at 100 ms (`_BROADCAST_BUDGET_S`). When the
budget elapses, unfinished sends are cancelled and the affected
clients are billed as drops for that frame WITHOUT being
disconnected — the per-send timeout remains the disconnect path.
- New `EventType.WS_BROADCAST_SLOW` event with `elapsed_ms`,
`budget_ms`, `client_count`, and `dropped_for_budget` so support
can correlate dropped frames with broadcast budget overruns.
Each send is wrapped in its own `asyncio.create_task` so the budget
cancellation only cancels the pending tasks — already-completed
results are preserved (a naïve `asyncio.gather` cancellation would
have lost them).
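A self-contained sketch of the budgeted fan-out, with ``asyncio.sleep`` standing in for a real send. The function name and the ``send_delays`` parameter are hypothetical, and the per-send timeout (the disconnect path) is intentionally not modelled here.

```python
import asyncio

BUDGET_S = 0.1  # mirrors the 100 ms total budget described above


async def broadcast(send_delays: dict[str, float], budget_s: float = BUDGET_S):
    """Send to every client in parallel; return (delivered, dropped) names.

    send_delays maps a client name to a simulated send latency in seconds.
    """
    tasks = {
        asyncio.create_task(asyncio.sleep(delay)): name
        for name, delay in send_delays.items()
    }
    done, pending = await asyncio.wait(list(tasks), timeout=budget_s)
    for task in pending:
        task.cancel()  # only unfinished sends are cancelled
    # Let the cancellations settle; finished results in `done` are untouched.
    await asyncio.gather(*pending, return_exceptions=True)
    delivered = sorted(tasks[task] for task in done)
    dropped = sorted(tasks[task] for task in pending)  # billed as drops only
    return delivered, dropped


delivered, dropped = asyncio.run(
    broadcast({"a": 0.01, "b": 0.01, "c": 0.01, "slow": 1.0})
)
```

The three fast clients complete inside the budget and the slow one is reported as a drop for this frame without being disconnected.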
Regression guard: cortex/tests/performance/test_broadcast_throughput.py
spins up four fake clients and asserts (a) p95 latency < 100 ms with
healthy clients, (b) four 40 ms clients run in parallel (~40 ms, not
160 ms), (c) budget overflow billed as a drop without disconnect,
(d) per-send timeout actually disconnects.
fc024ef
audit Phase-I: capture-loop mediapipe sub-sampling + colour-convert cache
Three structural wins on the capture pipeline:
1. MediaPipe FaceLandmarker sub-samples at `face_mesh_subsample_n`
(default 2, i.e. 15 Hz at 30 Hz capture). Downstream state/telemetry
loops run at 2 Hz, so 15 Hz mesh stays well above Nyquist while
halving per-frame mediapipe cost. The FaceTracker caches its last
result and replays it on skipped frames so consumers always see a
well-formed FaceTrackingResult.
2. BGR→RGB conversion is computed once per frame in CapturePipeline
and threaded into FaceTracker.process_frame via a new rgb_frame
keyword argument. Previously FaceTracker called cv2.cvtColor itself
on every invocation, duplicating work the pipeline already had.
3. BGR→GRAY conversion is computed once per frame in CapturePipeline
and threaded into FrameQualityScorer.score via a new gray_frame
keyword argument. Previously brightness and blur scoring each ran
cvtColor independently, so the same conversion ran more than once per
frame.
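The sub-sample-and-replay behaviour from item 1 can be sketched as follows. ``SubsampledTracker`` and its ``detect`` callable are hypothetical stand-ins for FaceTracker and the MediaPipe call; the frame values are plain integers for illustration.

```python
from typing import Callable, Optional


class SubsampledTracker:
    """Run an expensive detector every n-th frame and replay the cache otherwise."""

    def __init__(self, detect: Callable[[object], dict], subsample_n: int = 2) -> None:
        self._detect = detect
        self._n = max(1, subsample_n)
        self._frame_index = 0
        self._last: Optional[dict] = None
        self.detect_calls = 0

    def process_frame(self, frame: object) -> dict:
        # Always run on the very first frame so the cache is never empty.
        run_now = self._last is None or self._frame_index % self._n == 0
        self._frame_index += 1
        if run_now:
            self._last = self._detect(frame)
            self.detect_calls += 1
        return self._last


tracker = SubsampledTracker(lambda frame: {"frame": frame}, subsample_n=2)
results = [tracker.process_frame(i) for i in range(10)]
```

Over ten frames the detector runs five times, and every skipped frame still returns the most recent result.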
The mediapipe import in face_tracker.py is also deferred (Phase I.4
preview) — `_ensure_mediapipe` performs the real import on first use.
The module-level `mp` attribute is preserved so existing test patches
that referenced it keep working with minor adjustments.
Regression guard: cortex/tests/performance/test_capture_perf.py runs a
1000-frame synthetic harness through the cache-aware paths and asserts
the wall-time stays inside a 5-second budget and that the sub-sample
cache actually skips mediapipe on at least half the frames.
cdf882d
audit Debt-2: server-side capability-token gate on every HTTP route
Promotes the F07/F08 single-endpoint token checks into a systemic
FastAPI dependency that every mutating HTTP route on the API gateway
must clear. ``/health`` keeps its unauthenticated entry so the
supervisor liveness probe still works before the UI has the token.
* ``cortex/services/api_gateway/auth.py`` exports the dependency pair
(``require_capability_token`` raising 401, ``optional_capability_token``
for liveness). Both accept ``Authorization: Bearer <token>`` or the
CORS-preflight-friendlier ``X-Cortex-Auth-Token`` header.
* ``app.py`` mounts ``health_router`` openly and ``router`` with
``dependencies=[Depends(require_capability_token)]``. Adding a new
mutating route automatically inherits the gate; a new liveness-only
route lives on ``health_router`` and is visible in code review.
* ``routes.py`` splits the single ``router`` into ``router`` +
``health_router``; the ``/health`` GET moves to the latter, every
other route stays on ``router`` and now gates on the token.
* ``EventType.AUTH_REJECTED`` and ``EventType.AUTH_TOKEN_ROTATED``
added to ``cortex/libs/logging/structured.py``; the rejection path
emits the former with the request path + cid so log aggregators
can alarm on hostile-localhost scanning spikes.
* The F07 WS SHUTDOWN check stays as defense-in-depth; the F08
launcher /stop check stays (zero-cortex-imports invariant).
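The header handling can be sketched framework-free. ``extract_token`` and ``status_for`` are hypothetical helpers, not the FastAPI dependencies in auth.py; the constant-time comparison and the ``/state/infer`` path used in the demo are illustrative assumptions.

```python
import hmac
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str]) -> Optional[str]:
    """Accept `Authorization: Bearer <token>` or `X-Cortex-Auth-Token`."""
    lowered = {key.lower(): value for key, value in headers.items()}
    scheme, _, value = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    custom = lowered.get("x-cortex-auth-token", "").strip()
    return custom or None


def status_for(path: str, headers: Mapping[str, str], expected: str) -> int:
    """Return the HTTP status the gate would produce for this request."""
    if path == "/health":
        return 200  # liveness stays open so the supervisor probe works
    presented = extract_token(headers)
    if presented is None:
        return 401
    if not hmac.compare_digest(presented.encode(), expected.encode()):
        return 401
    return 200


bearer_ok = status_for("/state/infer", {"Authorization": "Bearer t0ken"}, "t0ken")
custom_ok = status_for("/state/infer", {"X-Cortex-Auth-Token": "t0ken"}, "t0ken")
missing = status_for("/state/infer", {}, "t0ken")
```

Both header forms clear the gate, while a request with no token is rejected with 401.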
Test plan: ``test_systemic_auth_http.py`` adds six cases proving the
gate fires on missing/invalid token, accepts both header forms, leaves
``/health`` open, and emits the structured AUTH_REJECTED event. The
existing ``test_api_gateway.py`` fixture now injects the bearer header
so every existing test continues exercising the route handler (no
behaviour change there). ``test_state_infer_envelope.py`` updated the
same way.
Threat model: closes cross-origin localhost (hostile webpage / extension
in another tab that can speak the protocol but cannot read the mode-0600
auth file). Does not close malware-as-the-user — explicitly named in
``audit/findings.md`` Debt-2 as out of scope.
0fe609a
merge: Phase G — shared schema codegen via pydantic2ts; closes Debt-1 (F42+F43+F44+F45)
# Conflicts:
# .github/workflows/ci.yml
# cortex/apps/browser_extension/popup.tsx
# cortex/services/api_gateway/websocket_server.py
178b8f7
audit Debt-1: CI + pre-commit drift gate (schema codegen)
Locks in the drift contract introduced by the previous four commits.
The Pydantic models in cortex/libs/schemas/ are the single source of
truth; the committed cortex_schemas.d.ts must never go out of sync.
After this commit, that property is enforced at two layers:
1. ``.github/workflows/ci.yml`` adds a ``schema-codegen-check`` job
that:
- Sets up Python 3.11 and Node 20.
- Installs ``json-schema-to-typescript`` globally (for the
``json2ts`` binary).
- Installs ``pip install -e ./cortex[codegen]``.
- Runs ``python -m cortex.scripts.generate_ts_schemas --check``.
Exit 1 (drift) prints the diff and fails the job; PRs cannot
merge until the developer regenerates and commits the result.
Concurrency-cancel saves runner minutes on PR pushes.
2. ``.pre-commit-config.yaml`` adds a local hook firing on any
change to:
- ``cortex/libs/schemas/*.py`` (Pydantic source side)
- ``cortex/scripts/generate_ts_schemas.py`` (generator itself)
- ``cortex/apps/browser_extension/types/generated/
cortex_schemas.d.ts`` (so manual hand-edits are rejected)
The hook executes the same ``--check`` command; pre-commit captures
stdout and prints the diff on failure so the developer immediately
sees what changed.
Documentation:
- ``CLAUDE.md`` gains a ``Schema Codegen (audit Debt-1)`` section
with regeneration, install, and add-a-new-WS-type recipes.
(CLAUDE.md is project-local and not committed; the master file at
the repo root carries this section.)
Manual verification of the gate (run during this commit):
# Tamper:
sed -i '' '1s|^|// TAMPER\n|' .../cortex_schemas.d.ts
python -m cortex.scripts.generate_ts_schemas --check # exit 1, prints diff
# Restore:
git checkout .../cortex_schemas.d.ts
python -m cortex.scripts.generate_ts_schemas --check # exit 0
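The ``--check`` behaviour exercised above (exit 1 plus a diff on drift, exit 0 when in sync) can be approximated with ``difflib``. ``check_drift`` is a hypothetical function operating on strings; how the real generator produces and compares its output is not shown in this log.

```python
import difflib


def check_drift(
    committed: str, generated: str, name: str = "cortex_schemas.d.ts"
) -> tuple[int, str]:
    """Return (exit_code, diff_text): 0 and "" when in sync, 1 and a diff otherwise."""
    if committed == generated:
        return 0, ""
    diff = difflib.unified_diff(
        committed.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile=f"{name} (committed)",
        tofile=f"{name} (generated)",
    )
    return 1, "".join(diff)


fresh = "export type X = 1;\n"
tampered = "// TAMPER\n" + fresh
in_sync_code, _ = check_drift(fresh, fresh)
drift_code, drift_diff = check_drift(tampered, fresh)
```

The tampered input yields a non-zero code and a diff whose removed line is the injected comment, mirroring the manual verification.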
Final shape of Debt-1 closure:
- F42 closed structurally: SuggestedAction.action_type is the generated
Literal union; TS exhaustiveness in executeAction's switch.
- F43 closed structurally: SuggestedAction.catalog_id present in the
generated type; consumers can reference it.
- F44 closed structurally: ActionExecuteResult.reversible aligned with
Pydantic's canonical name.
- F45 closed structurally: WSMessage.type is the generated MessageType
union; daemon-side Pydantic rejects typos at construction; extension-
side TS compiler rejects them at dispatch.
The drift cannot recur without:
- A developer with ``--no-verify`` bypassing pre-commit AND
- Their PR being merged without the required CI check passing.
Both require explicit, auditable action.
1a6592d
audit Debt-1: migrate browser extension to generated schema types
End-to-end close-out for F42/F43/F44/F45. The hand-written TypeScript
interfaces in ``background.ts`` / ``popup.tsx`` / ``leetcode-observer.ts``
are replaced with imports from the generated
``./types/generated/cortex_schemas`` module — the Pydantic models in
``cortex/libs/schemas/`` are now the single source of truth for every
schema-equivalent shape that crosses the daemon ↔ extension boundary.
What lands here per-file:
``background.ts``:
- Deletes the hand-written ``interface WSMessage`` and
``interface SuggestedAction``. Imports both from the generated file.
- ``WSMessage`` is wrapped locally with an ``Omit<…, "payload">`` to
promote ``payload`` to ``Record<string, unknown>`` (always emitted on
the wire; the JSON-schema mode marks it optional only because it has
a Python-side default factory).
- ``SuggestedAction`` is wrapped locally to promote default-factory
fields (``action_id``, ``target``, ``label``, ``reason``,
``category``, ``reversible``, ``metadata``) to required. Genuinely-
optional fields (``tab_index``, ``group_id``, ``catalog_id``) stay
optional.
- ``ActionExecuteResult.undo_available`` renamed to ``reversible`` to
match Pydantic's canonical name (F44 closure). All 33 construction
sites updated; no consumer reads the old name.
- ``executeAction``'s switch gets a ``const unhandled: never =
action.action_type`` exhaustiveness check. A new ``action_type``
literal in the Pydantic catalogue now fails this file's tsc step
until a matching case is added (F42 closure).
``popup.tsx``:
- Imports ``SuggestedAction``, ``TabRecommendation``,
``TabRecommendations`` from the generated file.
- ``tabRecs`` state is now ``TabRecommendations | null`` (was the
ad-hoc ``{ tabs: Record<…>[]; summary: string }``).
- ``synthesizeActions`` types its ``action_type`` synthesis against
the generated ``SuggestedAction["action_type"]`` union, so a Pydantic
rename fails compile (F42).
``leetcode-observer.ts``:
- Imports ``LeetCodeContext``, ``LeetCodeStage``, ``SubmissionResult``
from the generated file; drops the hand-written
``LeetCodeContextPayload`` interface.
- Bug fix surfaced by migration: ``normalizeResult`` was returning
``"TLE"`` / ``"MLE"`` strings the Pydantic schema does not accept
(it requires ``"Time Limit Exceeded"`` / ``"Memory Limit
Exceeded"``). The strings were silently dropped by the daemon. Fixed
to emit the canonical literal — typed return makes a future drift
surface at compile time.
- ``lastSubmissionResult`` typed as ``SubmissionResult | null``.
Daemon-side additions required to make the migration end-to-end:
``cortex/libs/schemas/ws_message_types.py``:
- Adds 15 ``LEETCODE_*`` members to the ``MessageType`` catalogue.
These were already on the wire (emitted by ``LeetCodeAdapter`` via
``WebSocketServer.send_message``) but absent from the enum, which
would have failed the new Pydantic validator at construction time.
Each member is annotated with the action it carries so the wire
contract is documented inline.
``cortex/tests/unit/test_ws_message_schema.py``:
- New ``test_message_type_catalog_covers_leetcode_adapter_emissions``
pins the LEETCODE_* set into the catalog. Adding a new
``_CAPABILITIES`` entry to ``LeetCodeAdapter`` without extending
``MessageType`` fails this test.
``cortex/apps/browser_extension/types/generated/cortex_schemas.d.ts``:
- Regenerated to include the LEETCODE_* members and the 4-letter
alphabetised order.
Verification:
cd cortex/apps/browser_extension
pnpm install --ignore-workspace
./node_modules/.bin/tsc --noEmit # 0 errors
cd ../../..
python -m cortex.scripts.generate_ts_schemas --check # exit 0
pytest cortex/tests/unit/test_ws_message_schema.py # 13 passed
pytest cortex/tests/unit/test_leetcode_adapter.py # 9 passed
pytest cortex/tests/unit/test_api_gateway.py # 41 passed
F42/F43/F44/F45 close structurally here: every consumer reads from
the generated type, the CI gate (Commit 5) makes drift impossible,
and the exhaustiveness check inside ``executeAction`` makes a missed
``action_type`` case fail compilation rather than ship.
f146c8f
audit-w2: append UI consistency reconciliation report
Records the 6 Wave-2 commits (warm-label tokens, FONT_MONO, window-
chrome regression guard, raw-int spacing, a11y on settings+connections+
onboarding, popup-toggle token routing), the per-dimension verdict for
all 8 audit dimensions, the surfaces audited matrix, the verification
runs (1150 unit + 35 UI + 31 vitest), and three residual-risk items
(no loading skeleton on briefing/activity, no fade on functional
notifications, pre-existing test_desktop_shell mock pollution).
c1aafe4
merge: Wave 2-B (audit-w2 4 commits) — pipeline/architecture consistency
6d9b1e9
audit Debt-1: regenerate cortex_schemas.d.ts with WSMessage + MessageType
Re-runs ``python -m cortex.scripts.generate_ts_schemas`` so the
committed ``cortex_schemas.d.ts`` includes the two new schemas
introduced in Commit 2: ``WSMessage`` (Pydantic envelope) and
``MessageType`` (the 20-member dispatch-literal enum).
No source-side changes — this commit is the generated output only.
The drift gate now reports the file in sync.
Verification:
python -m cortex.scripts.generate_ts_schemas --check # exit 0
grep "export type MessageType" cortex_schemas.d.ts # present
grep "export interface WSMessage" cortex_schemas.d.ts # present
Commit 4 migrates ``background.ts`` / ``popup.tsx`` to import these
generated types and drops the hand-written interfaces (closing F42/
F43/F44/F45 on the consumer side). Commit 5 adds the CI + pre-commit
drift gate.
b053343
audit-w2: route popup toggle radius + transitions through tokens
popup.tsx hardcoded borderRadius: 12 on the toggle track (half-height
pill) and "#fff" on the thumb, plus three '0.2s ease' transition
literals. The hex matched CX.textInverse and 200ms ease matched
CX.durationNormal + CX.easeDefault, but the literals would drift if a
future motion-curve or palette change lands in tokens.yaml.
Promoted:
- toggleTrack.borderRadius -> CX.radiusFull (still clamps to half-height
since 9999 >> half of 24px)
- toggleThumb.background -> CX.textInverse
- toggleTrack/toggleThumb/connectBtn transition durations -> CX.durationNormal
with CX.easeDefault curve
Verified all 31 extension specs (vitest, jsdom) stay green.
4bd1687
audit-w2: align Architecture.md port table + service list with code
The repository map listed six service directories; cortex/services
actually contains fifteen. Architecture.md mentioned ports 9472 and
9473 in prose but never the launcher agent's 9471, which is part of
the documented kill chain (CLAUDE.md §13). Add an explicit ports table
naming all three ports and what binds them. Extend the service list to
cover capture_service, kinematics_engine, telemetry_engine,
context_engine, session_report, api_gateway, launcher, janitor,
activity_tracker, handover, and throttle. Add a regression-guard test
that re-reads the markdown after every commit to keep doc drift from
re-opening this gap.
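A rough sketch of what such a doc-drift guard could look like, assuming the check is "every known port number appears somewhere in the markdown". ``REQUIRED_PORTS`` and ``missing_ports`` are illustrative names, not the ones used in the repository, and the real test may assert more than port presence.

```python
import re

# The three ports named in this commit message.
REQUIRED_PORTS = {9471, 9472, 9473}


def missing_ports(markdown: str) -> set[int]:
    """Return required ports that never appear as a 4-digit token."""
    found = {int(token) for token in re.findall(r"\b\d{4}\b", markdown)}
    return REQUIRED_PORTS - found
```

A test would read Architecture.md from disk, pass its text to ``missing_ports``, and assert that the result is empty.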
3e25991
audit Debt-1: promote WSMessage to Pydantic + introduce MessageType enum
The WebSocket envelope was a dataclass with a free-form ``type: str``,
which let F45 happen: typos in dispatch arms silently bypassed handlers
and the extension's hand-written TS interface (``background.ts:23``)
drifted independently. This commit makes the Pydantic model the source
of truth for the codegen pipeline.
Pieces:
- ``cortex/libs/schemas/ws_message.py`` — new ``WSMessage(BaseModel)``
with the exact field set from the legacy dataclass.
``use_enum_values=True`` keeps ``msg.type`` a plain string at runtime
so existing ``if msg.type == "STATE_UPDATE"`` dispatch sites work
unchanged. ``extra="ignore"`` keeps the wire forwards-compatible with
future field additions.
- ``cortex/libs/schemas/ws_message_types.py`` — new ``MessageType(str,
Enum)`` catalogue. Membership policy is documented inline: a literal
is in the enum if the daemon either emits it or dispatches on it.
Currently 20 members (11 inbound, 9 outbound + 2 daemon-internal).
- ``cortex/services/api_gateway/websocket_server.py``:
* ``WSMessage`` rebound to the Pydantic model so the api_gateway's
public surface (``from cortex.services.api_gateway import
WSMessage``) is the new model — the existing tests' construct/
parse calls work unchanged.
* Legacy dataclass preserved as ``WSMessageLegacy`` with an
explicit ``to_pydantic()`` helper for the one-release deprecation
window, per the migration plan.
* ``_process_message`` dispatch arms use ``MessageType.X.value``
so a renamed enum member raises ``AttributeError`` when the arm is
evaluated instead of silently never matching.
* Outbound ``_make_*`` and ``send_restore``/``broadcast_settings``/
``request_context`` now pass ``MessageType`` members instead of
raw strings.
* ``ValidationError`` is now a parse failure mode and is caught
alongside ``JSONDecodeError`` / ``KeyError``. Clients sending
unknown types see the same drop-and-log behaviour they would
have seen for malformed JSON, but the daemon no longer routes
on uncatalogued literals (the structural condition for closing F45).
- ``cortex/libs/schemas/__init__.py`` — re-exports ``WSMessage`` and
``MessageType`` so they participate in the codegen walk.
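The catalogue-plus-drop behaviour described in the pieces above can be sketched with the standard library alone. This is a stand-in, not the committed code: the real envelope is a Pydantic model that raises ``ValidationError``, whereas this sketch relies on the ``ValueError`` that ``Enum`` lookup raises. ``parse_type`` and the three-member enum are hypothetical.

```python
import json
import logging
from enum import Enum
from typing import Optional

log = logging.getLogger("ws_sketch")


class MessageType(str, Enum):
    # Illustrative subset of the catalogue.
    STATE_UPDATE = "STATE_UPDATE"
    USER_ACTION = "USER_ACTION"
    SHUTDOWN = "SHUTDOWN"


def parse_type(raw: str) -> Optional[str]:
    """Return the wire-string type, or None if the frame must be dropped."""
    try:
        data = json.loads(raw)
        # Enum lookup raises ValueError for an uncatalogued literal; the
        # Pydantic model would raise ValidationError at the same point.
        return MessageType(data["type"]).value
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        log.warning("dropping frame: %r", exc)
        return None
```

Returning ``.value`` mirrors the ``use_enum_values`` contract: callers keep comparing against plain strings, while a typo such as ``"STATE_UPDAET"`` is dropped instead of falling through the dispatch arms.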
Tests (``cortex/tests/unit/test_ws_message_schema.py``, 12 cases):
1. Catalog membership covers every inbound dispatched literal.
2. Catalog membership covers every outbound emitted literal.
3. Construction via string and enum forms both produce wire-string
types (use_enum_values contract).
4. Unknown ``type`` literal at construction → ValidationError.
5. Pydantic round-trip preserves every field.
6. Legacy dataclass → JSON → Pydantic round-trips structurally.
7. Explicit ``WSMessageLegacy.to_pydantic()`` helper covered.
8. Wire JSON shape matches between legacy and new (same key set).
9. Representative captures (STATE_UPDATE, INTERVENTION_TRIGGER,
USER_ACTION, SHUTDOWN) replay through the new model without error.
10. Default payload is ``{}`` (legacy behaviour preserved).
11. Forward-compat field is ignored without crash.
Verification:
pytest cortex/tests/unit/test_ws_message_schema.py # 12 passed
pytest cortex/tests/unit/test_api_gateway.py # 41 passed (no regression)
pytest cortex/tests/ # 1053 passed (full suite)
The codegen file ``cortex_schemas.d.ts`` is intentionally NOT regenerated
in this commit. ``python -m cortex.scripts.generate_ts_schemas --check``
now reports drift (new WSMessage + MessageType types) — that's the
expected state. Commit 3 regenerates; Commit 4 migrates the extension;
Commit 5 wires the CI + pre-commit drift gate.
9d5e11f
audit-w2: persist real last-escalation age instead of 0-clamped sign-flipped delta
F26's _persist_quiet_mode_history wrote
``self._quiet_mode_count_reset_at - time.monotonic()`` to the
``last_escalation_at_monotonic_delta`` field. Because reset_at is the
monotonic timestamp of the last escalation (in the past), the
expression was always non-positive and the surrounding
``max(0.0, ...)`` clamped it to 0. Rehydrate then stamped reset_at to
``time.monotonic()`` — i.e. "last escalation was just now" regardless
of when it actually fired. The field is currently diagnostic only, so
no consumer-visible behaviour changed, but any future caller trusting
the value would have inherited the bug. Rename to
``last_escalation_age_seconds``, flip the sign, and accept both
field names in the loader so an upgrade-in-place does not lose state.
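The sign error is easy to see with concrete numbers. The sketch below takes ``now`` as a parameter so it stays deterministic. The function names are illustrative, and the way the legacy field is treated on load (as an age that was always 0) is an assumption about the loader, not a copy of it.

```python
from typing import Any, Mapping


def buggy_delta(reset_at: float, now: float) -> float:
    # Original expression: a past timestamp minus "now" is never positive,
    # so the clamp always yields 0.0.
    return max(0.0, reset_at - now)


def escalation_age_seconds(reset_at: float, now: float) -> float:
    # Fixed expression: how long ago the last escalation fired.
    return max(0.0, now - reset_at)


def rehydrate_reset_at(record: Mapping[str, Any], now: float) -> float:
    """Rebuild the monotonic reset timestamp from a persisted record."""
    age = record.get("last_escalation_age_seconds")
    if age is None:
        # Legacy field name (assumed handling); stored values were always 0.
        age = record.get("last_escalation_at_monotonic_delta", 0.0)
    return now - max(0.0, float(age))
```

With an escalation at monotonic time 100 and a persist at 160, the old expression stores 0.0 and the new one stores 60.0, so a rehydrate at time 500 restores ``reset_at`` to 440 rather than 500.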
bb4be0a
audit-w2: re-consult cost kill switch between LLM retry attempts
F20 consulted CostTracker.check_budget() only at call entry. F30's
cancellation-cost path can bill mid-call, and a successful but
token-heavy first attempt followed by a retry triggers the same gap:
the loop continues to attempts 2 and 3 with no further budget check.
Re-consult the ceiling at the top of every retry; on KILL serve the
deterministic fallback stamped with budget_killed_on_retry so an
operator can distinguish call-start kills from mid-retry kills.
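A minimal sketch of the retry loop shape, assuming a two-valued budget decision and a fallback that accepts a reason stamp. ``BudgetDecision``, ``call_with_retries``, and the ``budget_killed`` / ``retries_exhausted`` stamps are hypothetical; only ``budget_killed_on_retry`` comes from this commit.

```python
from enum import Enum
from typing import Callable


class BudgetDecision(Enum):  # hypothetical; the real CostTracker may differ
    OK = "ok"
    KILL = "kill"


def call_with_retries(
    attempt: Callable[[], str],
    check_budget: Callable[[], BudgetDecision],
    fallback: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    for n in range(max_attempts):
        # Consult the ceiling before every attempt, not only the first.
        if check_budget() is BudgetDecision.KILL:
            reason = "budget_killed" if n == 0 else "budget_killed_on_retry"
            return fallback(reason)
        try:
            return attempt()
        except RuntimeError:
            continue  # retryable failure; the loop re-checks the budget
    return fallback("retries_exhausted")
```

If the first attempt fails after spending enough to cross the ceiling, the second iteration returns the fallback stamped ``budget_killed_on_retry`` without issuing another call.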
425993b
audit-w2: accessible names + tab order on settings, connections, onboarding
F55 wired accessibility on the dashboard + overlay. The other three
top-level surfaces (settings dialog, connections panel, onboarding
wizard) still announced every control as 'button' / 'checkbox' to
VoiceOver and let focus escape the window unpredictably.
Extracts dashboard.py + overlay.py's defensive a11y wrappers into a
new shared module (cortex/apps/desktop_shell/a11y.py — within the
audit file budget) exposing set_accessible_name,
set_accessible_description, set_tab_order, chain_tab_order. The
helpers no-op cleanly under the lightweight mock PySide6 stubs that
the legacy mock suite installs.
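One way such a defensive wrapper can no-op under stub widgets is duck typing on the Qt setter. The sketch below is an assumption about the shape of the helper, not the contents of ``a11y.py``; it only relies on ``QWidget.setAccessibleName`` existing on real widgets.

```python
from typing import Any


def set_accessible_name(widget: Any, name: str) -> bool:
    """Apply an accessible name if the widget supports it; never raise."""
    setter = getattr(widget, "setAccessibleName", None)
    if not callable(setter):
        return False  # stub widget without the method: no-op
    try:
        setter(name)
    except Exception:
        return False  # a misbehaving stub must not break panel construction
    return True
```

The boolean return value is an illustrative addition that makes the no-op path observable in tests.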
Wired into:
- settings.py: 16 controls (back, 6 checkboxes, slider, 2 spinboxes,
combo, 4 debug checkboxes, close, apply) + a 15-step tab chain.
- connections.py: back button + every Connect <browser> / Connect
<editor> button collected into _tab_order_chain and chained.
- onboarding.py: BYOK key input + region combo + save button + Open
Connections + Get Started, plus per-step Grant buttons + status
pills get programmatic accessible names. Tab chain pins focus on
the BYOK card → Connect Extensions → Get Started.
Adds cortex/tests/unit/test_a11y_coverage.py — 3 cases instantiate
each panel under offscreen Qt and assert accessible names on every
key interactive widget. F55 dashboard + overlay tests stay green.
bdff047