Related to #5: Fixed bounds checking in RadioButtonSelect component to ensure active…#12
Closed
little-huang wants to merge 1 commit into
Closed
Related to #5: Fixed bounds checking in RadioButtonSelect component to ensure active…#12little-huang wants to merge 1 commit into
little-huang wants to merge 1 commit into
Conversation
…Index and initialIndex are in valid range to avoid accessing out-of-bounds items.
Author
|
see: #5 |
Collaborator
|
Thank you for your support and quick pull request! We have just merge one of the solutions in #46 . We have just released a hotfix to resolve this issue. Please update to the latest version by running: Then, verify the installation by running Thank you again for your feedback! |
ranpox
pushed a commit
that referenced
this pull request
Aug 1, 2025
Methods should be verbs. Fixes #4.
halfaipg
pushed a commit
to AIPowerGrid/grid-code
that referenced
this pull request
Aug 16, 2025
Methods should be verbs. Fixes QwenLM#4.
DragonnZhang
pushed a commit
that referenced
this pull request
Apr 30, 2026
fix(scripts): Fix Windows build scripts compatibility with bun
doudouOUC
added a commit
that referenced
this pull request
May 18, 2026
Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5 review pass. Tests deferred to fold-in 10 (separate, larger commit). CRITICAL CORRECTNESS #7 — `provider.persist()` Promise.race could publish `persist_failed` to SSE while a non-cooperative provider was still committing credentials to disk. Added an independent tracker on the original persist promise: if the race timed out (`persistTimedOut === true`) AND the underlying persist later resolved successfully, audit a `lost_success_after_timeout` breadcrumb so operators see the inconsistency. Tightened the persist `@remarks` contract to require signal honoring end-to-end. Qwen provider already complies (fold-in 3 #10); this is forward-defense for future providers. #11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`, `createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event / data / state types) was re-exported from `src/daemon/index.ts` but NEVER from the published SDK entry `src/index.ts`. SDK consumers got `undefined` for everything except `client.auth.start()` (which traveled through the already-exported `DaemonClient`). Added the missing exports and pinned via `daemon-public-surface.test.ts`. #12 — `core/src/qwen/qwenOAuth2.ts:373`'s `debugLogger.debug('Device authorization result:', result)` writes the raw `device_code` (RFC 8628 bearer-equivalent credential) to stderr / journald, bypassing the `BrandedSecret` redaction layer. Pre-existing on main but PR 21 expanded the exposure surface. Sanitized to log only `{ ok, expires_in }` on success / `{ ok, error }` on error. #13 — `runPollTick` success-branch persist-failure × past-`expiresAt` classified as `expired_token` instead of `persist_failed`, routing operators toward "tell user to retry" (RFC 8628 expiry) when the actual root cause was disk I/O. Reclassified to `persist_failed` with a `persist_also_failed_past_expiry` audit hint to preserve the timing detail for incident response. SMALL CORRECTNESS #1 — `runPollTick` catch hint replaced with a STATIC bounded message ("provider.poll() failed; see daemon audit log for details"). The fold-in 8 truncated-prefix approach could still leak the first 256 chars of provider-templated raw text including secret material. Full raw still routed to audit channel for operator visibility. #5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred- cancel branch in `cancel()` now stamps it on the entry, and the persist-resolution `cancelled` event publish uses `entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers that suppress self-emitted events can now attribute the cancel correctly. #6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented "settle immediately, return current daemon view" contract) was treated as falsy by the `?` ternary, falling back to the default. `sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling computation uses `!== undefined` instead of truthy check. #8 — `EventBus.publish()` returns `undefined` for closed buses (it does NOT throw). `broadcastWorkspaceEvent` previously counted that path as success, hiding the all-buses-dropped operator alarm. Folded the closed-bus-as-failure check into the canonical `publishWorkspaceEvent` (see #X below). #9 — start-timeout Promise.race rejected with a plain `Error`, falling through `sendBridgeError` to a generic 500. Switched to `UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502 (matching the envelope every other IdP start failure uses). STRUCTURAL #3 — Three identical `transitionTerminal + publish + audit` expired_token blocks in `runPollTick`/`sweep`/(removed by #13) deduplicated into a private `expireEntry()` helper. Future event- shape changes are now a one-edit operation. #X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent` was kept distinct only to avoid the merge conflict; once PR 16 landed, it became a fold-in candidate. Folded the closed-bus + all-failed-stderr-escalation operator-visibility features (PR 21's S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped `broadcastWorkspaceEvent` from the bridge interface + impl + test mocks. PR 21's deviceFlowEventSink now calls `bridge.publishWorkspaceEvent` — single canonical workspace fan-out. DOC #16 — Added a "Cross-client take-over" paragraph to `docs/users/qwen-serve.md` explaining that two clients on the same daemon for the same provider get the per-provider singleton with `attached: true`/`false` distinguishing them; no separate event fires (both eventually observe the same `auth_device_flow_authorized`). 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
doudouOUC
added a commit
that referenced
this pull request
May 18, 2026
Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5 review pass. Tests deferred to fold-in 10 (separate, larger commit). CRITICAL CORRECTNESS #7 — `provider.persist()` Promise.race could publish `persist_failed` to SSE while a non-cooperative provider was still committing credentials to disk. Added an independent tracker on the original persist promise: if the race timed out (`persistTimedOut === true`) AND the underlying persist later resolved successfully, audit a `lost_success_after_timeout` breadcrumb so operators see the inconsistency. Tightened the persist `@remarks` contract to require signal honoring end-to-end. Qwen provider already complies (fold-in 3 #10); this is forward-defense for future providers. #11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`, `createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event / data / state types) was re-exported from `src/daemon/index.ts` but NEVER from the published SDK entry `src/index.ts`. SDK consumers got `undefined` for everything except `client.auth.start()` (which traveled through the already-exported `DaemonClient`). Added the missing exports and pinned via `daemon-public-surface.test.ts`. #12 — `core/src/qwen/qwenOAuth2.ts:373`'s `debugLogger.debug('Device authorization result:', result)` writes the raw `device_code` (RFC 8628 bearer-equivalent credential) to stderr / journald, bypassing the `BrandedSecret` redaction layer. Pre-existing on main but PR 21 expanded the exposure surface. Sanitized to log only `{ ok, expires_in }` on success / `{ ok, error }` on error. #13 — `runPollTick` success-branch persist-failure × past-`expiresAt` classified as `expired_token` instead of `persist_failed`, routing operators toward "tell user to retry" (RFC 8628 expiry) when the actual root cause was disk I/O. Reclassified to `persist_failed` with a `persist_also_failed_past_expiry` audit hint to preserve the timing detail for incident response. SMALL CORRECTNESS #1 — `runPollTick` catch hint replaced with a STATIC bounded message ("provider.poll() failed; see daemon audit log for details"). The fold-in 8 truncated-prefix approach could still leak the first 256 chars of provider-templated raw text including secret material. Full raw still routed to audit channel for operator visibility. #5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred- cancel branch in `cancel()` now stamps it on the entry, and the persist-resolution `cancelled` event publish uses `entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers that suppress self-emitted events can now attribute the cancel correctly. #6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented "settle immediately, return current daemon view" contract) was treated as falsy by the `?` ternary, falling back to the default. `sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling computation uses `!== undefined` instead of truthy check. #8 — `EventBus.publish()` returns `undefined` for closed buses (it does NOT throw). `broadcastWorkspaceEvent` previously counted that path as success, hiding the all-buses-dropped operator alarm. Folded the closed-bus-as-failure check into the canonical `publishWorkspaceEvent` (see #X below). #9 — start-timeout Promise.race rejected with a plain `Error`, falling through `sendBridgeError` to a generic 500. Switched to `UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502 (matching the envelope every other IdP start failure uses). STRUCTURAL #3 — Three identical `transitionTerminal + publish + audit` expired_token blocks in `runPollTick`/`sweep`/(removed by #13) deduplicated into a private `expireEntry()` helper. Future event- shape changes are now a one-edit operation. #X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent` was kept distinct only to avoid the merge conflict; once PR 16 landed, it became a fold-in candidate. Folded the closed-bus + all-failed-stderr-escalation operator-visibility features (PR 21's S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped `broadcastWorkspaceEvent` from the bridge interface + impl + test mocks. PR 21's deviceFlowEventSink now calls `bridge.publishWorkspaceEvent` — single canonical workspace fan-out. DOC #16 — Added a "Cross-client take-over" paragraph to `docs/users/qwen-serve.md` explaining that two clients on the same daemon for the same provider get the per-provider singleton with `attached: true`/`false` distinguishing them; no separate event fires (both eventually observe the same `auth_device_flow_authorized`). 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
doudouOUC
added a commit
that referenced
this pull request
May 18, 2026
Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5 review pass. Tests deferred to fold-in 10 (separate, larger commit). CRITICAL CORRECTNESS #7 — `provider.persist()` Promise.race could publish `persist_failed` to SSE while a non-cooperative provider was still committing credentials to disk. Added an independent tracker on the original persist promise: if the race timed out (`persistTimedOut === true`) AND the underlying persist later resolved successfully, audit a `lost_success_after_timeout` breadcrumb so operators see the inconsistency. Tightened the persist `@remarks` contract to require signal honoring end-to-end. Qwen provider already complies (fold-in 3 #10); this is forward-defense for future providers. #11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`, `createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event / data / state types) was re-exported from `src/daemon/index.ts` but NEVER from the published SDK entry `src/index.ts`. SDK consumers got `undefined` for everything except `client.auth.start()` (which traveled through the already-exported `DaemonClient`). Added the missing exports and pinned via `daemon-public-surface.test.ts`. #12 — `core/src/qwen/qwenOAuth2.ts:373`'s `debugLogger.debug('Device authorization result:', result)` writes the raw `device_code` (RFC 8628 bearer-equivalent credential) to stderr / journald, bypassing the `BrandedSecret` redaction layer. Pre-existing on main but PR 21 expanded the exposure surface. Sanitized to log only `{ ok, expires_in }` on success / `{ ok, error }` on error. #13 — `runPollTick` success-branch persist-failure × past-`expiresAt` classified as `expired_token` instead of `persist_failed`, routing operators toward "tell user to retry" (RFC 8628 expiry) when the actual root cause was disk I/O. Reclassified to `persist_failed` with a `persist_also_failed_past_expiry` audit hint to preserve the timing detail for incident response. SMALL CORRECTNESS #1 — `runPollTick` catch hint replaced with a STATIC bounded message ("provider.poll() failed; see daemon audit log for details"). The fold-in 8 truncated-prefix approach could still leak the first 256 chars of provider-templated raw text including secret material. Full raw still routed to audit channel for operator visibility. #5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred- cancel branch in `cancel()` now stamps it on the entry, and the persist-resolution `cancelled` event publish uses `entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers that suppress self-emitted events can now attribute the cancel correctly. #6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented "settle immediately, return current daemon view" contract) was treated as falsy by the `?` ternary, falling back to the default. `sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling computation uses `!== undefined` instead of truthy check. #8 — `EventBus.publish()` returns `undefined` for closed buses (it does NOT throw). `broadcastWorkspaceEvent` previously counted that path as success, hiding the all-buses-dropped operator alarm. Folded the closed-bus-as-failure check into the canonical `publishWorkspaceEvent` (see #X below). #9 — start-timeout Promise.race rejected with a plain `Error`, falling through `sendBridgeError` to a generic 500. Switched to `UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502 (matching the envelope every other IdP start failure uses). STRUCTURAL #3 — Three identical `transitionTerminal + publish + audit` expired_token blocks in `runPollTick`/`sweep`/(removed by #13) deduplicated into a private `expireEntry()` helper. Future event- shape changes are now a one-edit operation. #X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent` was kept distinct only to avoid the merge conflict; once PR 16 landed, it became a fold-in candidate. Folded the closed-bus + all-failed-stderr-escalation operator-visibility features (PR 21's S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped `broadcastWorkspaceEvent` from the bridge interface + impl + test mocks. PR 21's deviceFlowEventSink now calls `bridge.publishWorkspaceEvent` — single canonical workspace fan-out. DOC #16 — Added a "Cross-client take-over" paragraph to `docs/users/qwen-serve.md` explaining that two clients on the same daemon for the same provider get the per-provider singleton with `attached: true`/`false` distinguishing them; no separate event fires (both eventually observe the same `auth_device_flow_authorized`). 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
doudouOUC
added a commit
that referenced
this pull request
May 18, 2026
* feat(serve): auth device-flow route
Implements issue #4175 Wave 4 PR 21. Brokers OAuth 2.0 Device
Authorization Grant (RFC 8628) through the `qwen serve` daemon so a
remote SDK client can trigger a Qwen-account login whose tokens land
on the **daemon** filesystem, not on the client. The daemon polls the
IdP itself; the client's only job is to display the verification URL +
user code.
Runtime locality (#4175 §11): the daemon NEVER spawns a browser or
calls `open(url)` — even when running locally. Static-source grep
test fails the build on `node:child_process` / `open` / `xdg-open` /
`shell.openExternal` / `execa` / `shelljs` / `process.spawn` and
their dynamic-import / require variants.
- `POST /workspace/auth/device-flow` — strict mutation gate; returns
201 fresh / 200 idempotent take-over with `attached: true`. Per
per-`providerId` singleton: a second POST while pending takes over
rather than allocating a new `device_code`.
- `GET /workspace/auth/device-flow/:id` — public state read. Pending
entries echo `userCode/verificationUri/expiresAt/intervalMs`;
terminal entries (5-min grace) drop them and surface
`status/errorKind/hint`.
- `DELETE /workspace/auth/device-flow/:id` — strict; idempotent
(terminal → 204 no-op; unknown → 404).
- `GET /workspace/auth/status` — pending flows + supported providers
snapshot. v1 stub for `providers: []` (populated in fold-in 1).
`DeviceFlowRegistry` (`packages/cli/src/serve/auth/deviceFlow.ts`)
is the in-memory state holder:
- per-`providerId` singleton with idempotent take-over
- workspace-wide cap of 4 active flows (abuse defense)
- 5-min terminal grace so SDK reconnects can still observe results
- TTL sweeper evicts grace-expired entries every 30s
- in-flight `Promise` map coalesces concurrent `start()` calls so two
parallel POSTs don't double-allocate IdP `device_code`
- `transitionTerminal` returns `boolean` so caller-side emit/audit
guard prevents sweeper × poll-tick double-fire
- `dispose()` wired into `runQwenServe.close()`'s shutdown drain;
cancels `provider.poll()` mid-flight via `cancelController`,
records `lost_success` audit when an IdP-minted token is dropped
by transition
`DeviceFlowProvider` interface accepts `start({signal})` +
`poll(state, {signal})`. `QwenOAuthDeviceFlowProvider` wraps the
existing `QwenOAuth2Client.requestDeviceAuthorization` /
`pollDeviceToken` primitives directly (NOT
`authWithQwenDeviceFlow`, which calls `open(url)`). PKCE is
provider-required by Qwen but optional in the interface for future
non-PKCE providers. `success.persist()` writes to disk FIRST, then
updates the in-process client — a failed disk write no longer
leaves the daemon with a zombie in-memory token. Maps RFC 8628
errors via an anchored regex (`^Device token poll failed:
(expired_token|access_denied|invalid_grant)`) so an
`error_description` containing one of those literals can't
mis-classify an unrelated upstream error.
`BrandedSecret<T extends string>` holds the `device_code` and PKCE
verifier. Earlier draft used `new String()` wrapper which leaked
through `+` / template literals (`Symbol.toPrimitive` →
`valueOf` returned the primitive). Final shape: frozen plain object
+ `WeakMap` indirection + 4-way redaction
(`toString` / `toJSON` / `Symbol.toPrimitive` / numeric coercion →
`'[redacted]'` or `NaN`) + `unique symbol` brand. 6 leak-path
tests: `JSON.stringify` / `String()` / concat / template / `+x` /
reveal-roundtrip.
5 new daemon events (workspace-scoped, fanned out to every active
session bus via `bridge.broadcastWorkspaceEvent`):
- `auth_device_flow_started` — `{deviceFlowId, providerId, expiresAt}`
(no userCode/verificationUri — see PR 21 design §3)
- `auth_device_flow_throttled` — `{deviceFlowId, intervalMs}`,
emitted only on upstream `slow_down` interval bumps
- `auth_device_flow_authorized` — `{deviceFlowId, providerId,
expiresAt?, accountAlias?}`; `accountAlias` is best-effort
non-PII (never email/phone)
- `auth_device_flow_failed` — `{deviceFlowId, errorKind, hint?}`
with `errorKind ∈ {expired_token, access_denied, invalid_grant,
upstream_error, persist_failed}`
- `auth_device_flow_cancelled` — `{deviceFlowId}` (DELETE on pending)
Workspace-scoped reducer `reduceDaemonAuthEvent` produces
`DaemonAuthState { flows: Partial<Record<ProviderId, ...>> }` —
parallel to `reduceDaemonSessionEvent`. Session reducer no-ops on
auth events (workspace-scoped state belongs in its own reducer).
`bridge.broadcastWorkspaceEvent` is intentionally distinct from PR
16's `publishWorkspaceEvent` to avoid merge conflict; collapses to
the shared helper as a fold-in once #4249 lands (~25 LoC).
`@qwen-code/sdk` (`packages/sdk-typescript/`):
- 4 new `DaemonClient` methods: `startDeviceFlow`, `getDeviceFlow`,
`cancelDeviceFlow`, `getAuthStatus` — typed against the wire
shapes, errors mapped through the existing `DaemonHttpError`.
- High-level `client.auth` getter (lazy `DaemonAuthFlow` singleton)
exposes a `start(...).awaitCompletion()` shape mirroring `gh auth
login`'s UX: print code first, let the SDK consumer decide where
to open the browser. `awaitCompletion` polls GET on the
daemon-supplied `intervalMs`, honors `slow_down` bumps, and
fall-back-recovers from 404 (entry evicted post-grace).
POST + DELETE flow through PR 15's `mutate({strict: true})` —
401 `token_required` on token-less loopback defaults. GET routes
use only the global `bearerAuth`. Every state transition
(`started/authorized/failed/cancelled/expired/lost_success`)
records a structured stderr breadcrumb (`[serve] auth.device-flow:
provider=... deviceFlowId=abc12... clientId=... status=...`)
since `mutate()` doesn't carry an audit hook — events alone aren't
enough since SDK can silently drop them; stderr → journald/docker
logs is the unfalsifiable record.
`auth_device_flow` advertised unconditionally on
`/capabilities.features`. Supported providers list lives on
`/workspace/auth/status` to keep the registry descriptor uniform.
- `packages/core/src/qwen/qwenOAuth2.ts`:
- exports `cacheQwenCredentials` (was a private function; needed
by the daemon's device-flow registry)
- `cacheQwenCredentials` now calls `SharedTokenManager.clearCache()`
after writing, folding what was previously a paired call site at
L820+L829. Idempotent change.
- file mode `0o600` on `oauth_creds.json` (was default 0o666 +
umask). Mirrors opencode's `auth/index.ts`.
- `packages/cli/src/serve/runQwenServe.ts`: device-flow registry
`dispose()` wired into the shutdown drain (BEFORE
`bridge.shutdown()`).
- `auth/deviceFlow.test.ts` — 21 tests: BrandedSecret leak paths,
state machine (slow_down / success / error), terminal grace,
concurrent-start coalescing, dispose, cancel idempotency, static-
source grep against browser-spawn primitives.
- `server.test.ts` — 10 device-flow integration tests:
POST 201/200 take-over, strict 401, 400 `unsupported_provider`,
GET / DELETE / `/workspace/auth/status`, 502 `upstream_error`
mapping, sweeper-driven auto-expiry with controlled clock,
capability advertisement.
- `daemonEvents.test.ts` — 5 SDK reducer tests: type guards, per-
provider state projection, `failed` always → `status: 'error'`
(errorKind carries the kind, including new `persist_failed`),
session reducer no-ops on auth events.
369/369 serve + SDK tests pass; typecheck + `eslint
--max-warnings 0` clean across 14 PR 21 files.
- [x] Independently mergeable (depends only on merged PR 4 / PR 7 /
PR 12 / PR 15)
- [x] Backward compatible (4 new routes + 1 capability tag + 5 typed
events + 4 SDK helpers; existing routes/events untouched)
- [x] Default off (capability advertised but no client is forced to
use it; CLI `qwen` OAuth flow unchanged)
- [x] `qwen serve` Stage 1 routes / SDK behavior preserved
- [x] Gradual migration (v1 only `qwen-oauth`; future providers
register through the `DeviceFlowProvider` interface)
- [x] Reversible (revert removes 4 routes + 1 tag + 5 events with no
schema migration)
- [x] Tests-first (28 new tests across 3 layers)
- Inline `bridge.broadcastWorkspaceEvent` → fold-in to PR 16 (#4249)
`publishWorkspaceEvent` once that lands
- `/workspace/auth/status` vs PR 12 `/workspace/providers` boundary
— separate route in v1; merge alternative discussed
- Wave 4 PRs 17/19/20 should adopt the same mutate-strict +
workspace event-fan-out pattern
5 items from pre-PR specialist passes parked for a focused
follow-up: `DeviceFlowEntry` discriminated union, single-source SDK
status / ProviderId unions, `awaitCompletion` memoization,
broadcast-100%-fail stderr elevation, SDK 404 →
`not_found_or_evicted` errorKind.
Refs: #4175
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 round-1 review feedback
Eleven items from copilot-pull-request-reviewer's round-1 pass on
#4255 — 4 inline threads + 7 from the PR-level review summary.
## Adopted (11 items, code/doc changes)
- **`lastSeenAt` → `lastSeenEventId`** (`events.ts`,
`DaemonDeviceFlowReducerState`). The field was set from
`rawEvent.id` (SSE event id) but documented as "epoch ms" — a real
semantic mismatch that would mislead consumers into time-based
logic against a monotonic counter. Rename + tighten the JSDoc to
describe it as an event-id counter; reducer cases updated.
- **`DEVICE_FLOW_EXPIRY_GRACE_MS = 30_000` extracted** in
`DaemonAuthFlow.ts` (was a magic number on `start.expiresAt +
30_000`). `AwaitCompletionOptions.timeoutMs` doc now describes the
actual grace-past-expiry behavior + the rationale (clock skew +
daemon sweeper interval + network latency) instead of the wrong
"defaults to expiresAt - Date.now()" claim.
- **Explicit `chmod 0o600`** in `cacheQwenCredentials` after every
write. `fs.writeFile`'s `mode` only applies on file creation; a
pre-existing `oauth_creds.json` written under a broader umask kept
its old permissions across upgrades. The chmod now tightens it on
every write; chmod failure (Windows / hardened FS) surfaces via
`debugLogger.warn` instead of silently dropping the invariant.
- **`SharedTokenManager.clearCache()` failure now logs**
`debugLogger.warn` (was a silent `try { } catch { }`). In
production a swallowed clearCache means in-process callers serve
stale credentials until the SharedTokenManager mtime watcher
catches up — a recoverable degradation worth a log line.
- **Protocol doc** lists `persist_failed` in the
`auth_device_flow_failed.errorKind` union (was added to the type
but missed in the doc).
- **`pollDeviceToken({signal})`** plumbed through
`IQwenOAuth2Client` interface + `QwenOAuth2Client` impl + the Qwen
device-flow provider. Cancel / dispose during a slow IdP response
now aborts the in-flight HTTP socket immediately instead of
waiting for the upstream timeout. Two new registry tests assert
`cancel()` / `dispose()` propagate abort to the signal observed by
`provider.poll`.
- **`revealSecret` error message** clarified: was "secret has been
GC-evicted" (impossible — WeakMap doesn't evict reachable keys).
Now points at the actual reachable failure modes (forged shape /
serialize+reparse losing the WeakMap binding).
- **`transitionTerminal` JSDoc** clarifies that the PRIMARY guard
against late timer secret leaks is the `entry.status !== 'pending'`
check at the top of `runPollTick`; secret-clearing here is
defense-in-depth.
- **`DeviceFlowErrorKind` JSDoc'd per variant** so consumers can tell
when each fires (RFC 8628 distinctions + `persist_failed` vs
`upstream_error` boundary).
- **Stale "PR 16 / PR 21 §3" temporal references** in
`DaemonAuthFlow.ts:124` rephrased to be timeless ("workspace-scoped
events fan out through whatever session buses happen to be live"
— no PR number references that rot when those PRs merge).
## Not adopted (4 items, replied to in-thread)
- **`authWithQwenDeviceFlow` browser-launch separation** — correct
architectural advice but out of #4255 scope (would refactor a CLI
auth UX module that PR 21 only touched additively). Tracked as a
Wave 5 follow-up.
- **Copyright header year range** — repo-wide convention "2025"; not
introduced by this PR.
- **Spread `...(x ? {x} : {})` → `x: x ?? undefined`** — the two are
not semantically equivalent. The current form omits the key
entirely on falsy `x`; the suggested form always includes the key.
Tests assert object shape and would break under the change.
- **Eager `client.auth` getter** — public API boundary. Lazy
construction matches `DaemonSessionClient` precedent + saves the
module load for SDK consumers that never touch auth.
Refs: #4175 #4255
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-1 review feedback
15 items from @wenshao's review batches on #4255. Catches a handful
of real bugs that the earlier round (commit 3d9f082f5) didn't
surface.
## Critical fixes
- **C1 — `pollUntilTerminal` providerId pass-through**
(`DaemonAuthFlow.ts:185`). The synthetic 404 fallback hardcoded
`providerId: 'qwen-oauth'`; the parent `awaitCompletion` already
receives the real providerId via `start.providerId` but
`pollUntilTerminal`'s parameter type stripped it. Add the field to
the param type, propagate.
- **C2 — open `errorKind` allowlist** (`events.ts`). The closed
5-value union in the type guard silently dropped any `failed`
event whose errorKind the daemon added without mirroring SDK-side
(e.g. a future `rate_limited`). The flow's reducer state would
never transition to terminal, leaving SDK consumers stuck on
`pending` forever. Open the union with `(string & {})` and accept
any non-empty string in the runtime guard. Updated test asserts
forward-compat behavior + still rejects the truly-malformed
empty-string case.
- **C3 — `persist()` timeout + signal**
(`deviceFlow.ts`). A wedged disk I/O (NFS stall, encrypted-volume
contention) without bounds would pin the entry in `pending` until
the upstream `expires_in` elapsed (potentially minutes). The
registry now passes its `cancelController.signal` AND arms a hard
`DEVICE_FLOW_PERSIST_TIMEOUT_MS = 30_000` timer; persist failure
surfaces as `persist_failed` immediately. The
`DeviceFlowPollResult` `success` variant signature changed to
`persist({signal})`.
- **C4 — cancel × success race rollback**
(`deviceFlow.ts` + Qwen provider). Today, if `cancel()`
transitions while `persist()` is in flight, the credentials get
written but the flow's status is `cancelled`. User sees cancelled,
daemon disk has a valid token. `DeviceFlowPollResult.success`
gains an optional `unpersist()` callback the registry calls when
`transitionTerminal(authorized)` fails — the Qwen provider wires
it to `clearQwenCredentials()`. Rollback failure is audited but
not propagated (re-running auth would overwrite anyway).
- **C5 — don't `unref()` the `awaitCompletion` sleep timer**
(`DaemonAuthFlow.ts`). On a standalone Node CLI/script doing just
`client.auth.start().awaitCompletion()`, the unref'd between-poll
timer was the only event-loop handle, so Node could exit before
the user finished authorization. The poll wait is foreground work
the caller explicitly awaits — keep it ref'd.
## Information-leak fixes
- **S1 — sanitize `persist_failed` hint**. `err.message` from
`cacheQwenCredentials` embeds the full `~/.qwen/oauth_creds.json`
path. Broadcast via SSE, that path leaks the daemon's home layout
to every connected session subscriber. Replace user-facing hint
with `"credentials could not be written to the daemon filesystem
— check disk space and permissions"`; full err goes to stderr
audit only.
- **S2 — sanitize upstream `pollDeviceToken` hint**. The class
embedded the entire raw IdP response body (which can be an HTML
error page from a reverse proxy) into the thrown message. Same
broadcast leak path. Replace upstream-error hint with
`"unexpected response from identity provider"`; RFC 8628 errors
use `"Qwen IdP returned ${kind}"`.
## Cleanup / forward-compat
- **D1 — drop duplicate `clearCache()`** at `qwenOAuth2.ts:840`. The
paired call became redundant once `cacheQwenCredentials` folded
the clearCache in (PR #4255 fold-in 1). The fold-in 1 message
said this would be done; the duplicate slipped through.
- **S3 — drop unused `DeviceFlowNotFoundError`** (`deviceFlow.ts`).
Exported but never imported; route handlers do inline 404 JSON.
- **S4 — single-source SDK status / errorKind unions**
(`types.ts`). `DaemonAuthDeviceFlowSdkStatus` /
`DaemonAuthDeviceFlowSdkErrorKind` were parallel literal copies
of the canonical events.ts definitions — drift waiting to happen.
Now imported + aliased as type-only re-exports.
- **S5 — broadcast 100% fail elevates to stderr**
(`httpAcpBridge.ts`). Per-session bus failures stay debug-only,
but a broadcast where EVERY session bus refused is operationally
interesting (clients won't see the event). Track success / fail
counts; `writeStderrLine` when `successCount === 0`.
- **S6 — `this.disposed` check after `await provider.start()`**
(`deviceFlow.ts`). `dispose()` mid-start would orphan the freshly-
inserted entry (`schedulePoll` guards on `disposed` so no poll
fires; the entry never transitions). Throw post-await if disposed.
- **W1 — thread `signal` into `requestDeviceAuthorization`**
(`qwenOAuth2.ts` + Qwen provider). `start()` had the same
cancellation gap that `pollDeviceToken` had — a slow
device-authorization request couldn't be aborted during shutdown.
Now plumbed end-to-end.
- **W2 — split `invalid_request` from `unsupported_provider`**
(`server.ts`). Conflating them surfaced misleading remediation
hints to SDK consumers branching on `code` ("this provider isn't
supported here" when the real cause was a serializer dropping the
field). Bad-shape now returns `code: 'invalid_request'`;
unknown-but-well-formed stays `unsupported_provider`.
- **W3 — drop never-populated `accountAlias`**
(Qwen provider). The field was wired through types / events /
reducer / audit but the Qwen IdP's token response doesn't carry
one (no `name` / `email` / `sub`). Returning only `{expiresAt}`
makes the field type-honestly absent rather than always-undefined.
Future provider with an alias-bearing response can populate it.
- **W4 — `DaemonAuthFlow` JSDoc accuracy**. Doc claimed "first
attempts to consume an SSE event stream … falls back to GET-based
polling"; actual is GET-only with SSE as a real-time hint for
clients already subscribed to a session stream.
- **W5 — clearer unit arithmetic** in interval normalization. The
`(_INTERVAL_MS / 1000) * 1000` cancelation hid the s↔ms boundary;
expanded form makes both branches unit-explicit.
## Test changes
- `daemonEvents.test.ts` updated to match the now-OPEN errorKind
union (forward-compat assertion + empty-string still rejected).
- `deviceFlow.test.ts` `FakeProvider.poll` aligned with the new
`persist({signal})` signature + optional `unpersist`.
## Validation
- `npm run typecheck --workspace packages/cli --workspace
packages/sdk-typescript --workspace packages/core` — clean
- `npx vitest run packages/cli/src/serve/
packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 368/368
- `npx eslint --max-warnings 0` over the 11 PR 21 surface files —
clean
Refs: #4175 #4255
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-2 review feedback
10 new threads from @wenshao's second deep-review pass on #4255.
Verified status: 5 real issues, 1 improvement, 3 stale (already
fixed; comments lagged), 1 false alarm (typecheck demonstrably
clean).
## Critical fixes
- **fold-in 2 C4 REVERSED**: when `provider.poll()` returns success
AND `cancel()` / `dispose()` transitioned the entry mid-`persist()`,
the registry now FORCES the entry to `authorized` and keeps the
on-disk credentials. The earlier rollback (`unpersist()`) wasted
the user's IdP approval because the RFC 8628 `device_code` is
single-use — re-running the flow would force them through the
whole browser-prompt + paste-code dance again for a click whose
intent was likely "stop the wait" rather than "undo my already-
completed approval". Aligns with gh CLI / Auth0 SDK / git-
credential-manager. Audit captures the race via `hint:
'lost_success_kept ...'`. `DeviceFlowPollResult.success.unpersist`
field + Qwen provider's `clearQwenCredentials` rollback removed.
- **#1 GET /workspace/auth/device-flow/:id strict gate**: this GET
surfaces `userCode` / `verificationUri` for pending entries, which
on the loopback no-token default were readable by any local
process. POST + DELETE were already strict; aligning GET closes
the information-disclosure asymmetry. `/workspace/auth/status`
stays bearer-only (its `pendingDeviceFlows` entries intentionally
omit `userCode`).
- **#2 `inFlightStarts` hard timeout**: a hung `provider.start()`
(network partition, unresponsive IdP) used to leave the per-
`providerId` slot in `inFlightStarts` occupied forever, blocking
every subsequent POST until daemon restart. New
`DEVICE_FLOW_START_TIMEOUT_MS = 30_000` arms a timer that
`cancelController.abort()`s the start; the rejected promise
unwinds through the `try/finally` clearing the slot.
- **#10 chain-completing the C3 persist-timeout**: the earlier C3
fix armed a 30s timer that fired `cancelController.abort()` then
`await result.persist({signal})`, but the chain ended at the
registry boundary — `cacheQwenCredentials` didn't take a signal,
so `fs.writeFile` couldn't be aborted. Now `cacheQwenCredentials`
accepts an optional `{signal}` and threads it into
`fs.writeFile(..., {signal})` (Node native). The Qwen provider's
`persist({signal})` forwards the entry's
`cancelController.signal` end-to-end.
## Improvement (#4): 404 fallback errorKind
`pollUntilTerminal`'s 404 catch used to synthesize
`{status: 'expired'}` for ALL evicted entries — conflating "your
flow expired during your disconnect", "the daemon was restarted",
and "your deviceFlowId was wrong". Now returns
`status: 'error'` + `errorKind: 'not_found_or_evicted'` + a `hint`
so SDK consumers branching on errorKind can distinguish.
## Information leak (#9): start() path raw IdP message
S2 (fold-in 2) sanitized `poll()`'s upstream-error hint, but
`start()` still embedded the raw `err.message` (full IdP response,
potentially HTML from a reverse proxy / WAF) into the
`UpstreamDeviceFlowError` that flowed to SDK clients via the 502.
Now uses static messages for the SDK-visible errors; raw detail
goes through `writeStderrLine` for operator audit only. Mirrors
S2's approach.
## Stale comments cleaned (#5, #7)
`qwenDeviceFlowProvider.ts:177` claimed
`cacheQwenCredentials` "doesn't currently take a signal — that's
a follow-up". After #10 above, that's no longer true; the comment
is replaced with the actual end-to-end signal-threading note.
## Not adopted (1 false alarm)
- Thread on `types.ts:330` claimed type-only-import-after-
declarations breaks `tsc` and fails `daemonEvents.test.ts:670`
with TS2345. Demonstrably false: `npx tsc -p
packages/sdk-typescript/tsconfig.json --noEmit` exits 0;
`daemonEvents.test.ts` is the post-fold-in-2 file with the
open-allowlist assertion (test 28/28 passes). The reviewer may
have been looking at a transient state during their analysis.
## Validation
- `npm run typecheck --workspace packages/cli --workspace
packages/sdk-typescript --workspace packages/core` — clean
- `npx vitest run packages/cli/src/serve/
packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398
pass
- `npx eslint --max-warnings 0` over the PR 21 surface — clean
Refs: #4175 #4255
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-3 review feedback
5 new threads from the third deep-review pass on #4255. 3 real
issues fixed; 1 stale (already done in fold-in 3); 1 deferred as
non-blocking design suggestion.
- **A — `expiresIn` / `interval` non-finite guard**
(`deviceFlow.ts`). The provider contract types both as `number`,
but a misbehaving / future provider could hand `undefined` /
`NaN` / `Infinity`. `Math.max(0, NaN) * 1000` is `NaN`, then
`now() + NaN` is `NaN`, then `now >= NaN` is always `false` —
the sweeper would NEVER evict the entry, pinning an upstream
`device_code` slot until daemon restart. Same hazard on
`interval * 1000` (NaN → `setTimeout(NaN)` fires immediately,
Infinity → scheduler clamps to TIMEOUT_MAX). Now both fields go
through `Number.isFinite(x) && x > 0`; missing/bad values fall
back to RFC 8628's recommended ceilings (10 min for expiry, 5s
for interval).
- **D — typed `app.locals` accessor**
(`deviceFlow.ts` + writer/reader call sites). The
`app.locals['deviceFlowRegistry']` string key was shared between
`createServeApp` (writer) and `runQwenServe` (reader); a typo on
either side would compile cleanly and the shutdown dispose call
would silently no-op, leaving polling timers running until the
`unref()` rescue. New `setDeviceFlowRegistry(app, registry)` /
`getDeviceFlowRegistry(app)` pair gives both call sites
type-checked access; the string literal is encapsulated in one
module.
- **E — `UnsupportedDeviceFlowProviderError` docstring**
(`deviceFlow.ts`). After fold-in 2's W2 fix split
`invalid_request` from `unsupported_provider`, the route layer
screens unknown ids against `DEVICE_FLOW_SUPPORTED_PROVIDERS`
before reaching the registry — so this error is now reachable
ONLY on a daemon-internal invariant violation (id is declared
supported but not registered in the runtime provider map).
Docstring + thrown message updated to reflect that this branch
signals a programmer error, not user input.
- **B** claimed `cacheQwenCredentials(credentials)` doesn't forward
signal to `fs.writeFile`. Verified: fold-in 3 (#10) at
`qwenDeviceFlowProvider.ts:204` calls
`cacheQwenCredentials(credentials, { signal: persistOpts.signal })`
and the core helper threads it into `fs.writeFile(..., {mode,
signal})`. The reviewer was looking at the comment block above
(lines 174-181) without scrolling to the actual call site.
- **C — SDK `cancelDeviceFlow` lossy 204/404 collapse**.
Suggested returning `{existed: boolean; alreadyTerminal: boolean}`
instead of resolving void on both 204 and 404. Real signal-loss
but tagged "[非阻塞]" by the reviewer; changing requires a
daemon route shape change (200 + body instead of 204) which is
better as a focused follow-up PR. Acknowledged in-thread;
deferred to a fold-in PR after #4255 lands.
- `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}`
- `npx vitest run packages/cli/src/serve/
packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398
- `npx eslint --max-warnings 0` over the PR 21 surface — clean
Refs: #4175 #4255
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-4 review feedback
4 threads from the fourth review pass on #4255. 3 adopted + 1
deferred (out-of-scope rename of PR 15's `mutate` helper).
## Adopted
### #1 — `persistInFlight` flag suppresses cancel × persist event-stream UX trap
When `provider.poll()` returns success and we await `persist()`, a
concurrent `cancel()` would synchronously transition the entry to
`cancelled` and emit `auth_device_flow_cancelled` — then `persist()`
resolves and (per fold-in 3 C4) force-overrides to `authorized` +
emits `auth_device_flow_authorized`. The reducer state correctly
last-write-wins on `authorized`, but DIRECT event-stream consumers
(close-dialog handlers, telemetry, UI cleanup) race onto an unmounted
UI when the second event lands.
Now: while persist is in-flight, `cancel()` and the sweeper SKIP the
state transition + event emit. They register intent (set
`cancelRequestedDuringPersist=true` for cancel; sweeper just no-ops)
and let the persist resolution decide:
- persist succeeds → `authorized` (IdP wins per fold-in 3 C4)
- persist fails AND cancel was requested → `cancelled`
- persist fails AND `now >= expiresAt` → `expired` / `expired_token`
- persist fails otherwise → `error` / `persist_failed`
Result: at most one terminal event per flow. Imperative SSE
consumers no longer see oscillating terminal states. Audit captures
the race (`hint: 'lost_success_kept ...'`) for incident-response
correlation.
### #2 — `revealSecret` → `unsafeRevealSecret` rename
The earlier JSDoc claimed "the `unsafeReveal_` naming is intentional:
greppable in code review, easy to allowlist in lint rules, hard to
invoke by accident" — but the actual function was named
`revealSecret`. The promised safety properties didn't exist; a code
reviewer wouldn't single out `revealSecret` as suspicious, and a
`no-restricted-syntax` ESLint rule wouldn't flag it.
Renamed to `unsafeRevealSecret` so the JSDoc-promised "greppable" /
"lintable" property is now actually true. Two call sites in the
Qwen provider + 4 test references updated. Internal symbol; not
exposed through the SDK package.
### #4 — `QwenOAuthPollError` typed class replaces substring regex
The earlier RFC 8628 error mapper used an anchored regex against the
thrown error message text — an implicit cross-file string contract
between `qwenOAuth2.ts` (throws) and `qwenDeviceFlowProvider.ts`
(matches). If `qwenOAuth2.ts` ever changed its message format, ALL
RFC 8628 errors (`expired_token` / `access_denied` / `invalid_grant`)
would silently fall through to `upstream_error` — wrong errorKind
flowing through telemetry with no test or type-system check to catch
the drift.
Now `QwenOAuth2Client.pollDeviceToken` throws a structured
`QwenOAuthPollError extends Error` with `oauthError` / `description`
/ `status` fields. The provider branches on `instanceof
QwenOAuthPollError` and reads `.oauthError` directly via a
dedicated `mapRfc8628OAuthCode(code)` switch. The drift hazard is
gone: a future code change that touches the typed class will
fail tsc until both sides are updated. Message format preserved
for any pre-existing log-parsing / substring matchers.
## Not adopted
### #3 — `mutate({strict:true})` semantic awkwardness on GET
Reviewer correctly noted that `mutate` is named for state-changing
routes, but `GET /workspace/auth/device-flow/:id` uses it for an
information-disclosure defense (only reachable code path is reading
state). Suggested rename: `mutate` → `strictHttpGate`.
Deferred: the rename touches PR 15's helper which has many call
sites in `server.ts` and is shared infrastructure for Wave 4 PRs
17/19/20. PR 21 is the first / only consumer of the strict-on-GET
form so far; widening the rename to a Wave 4 follow-up keeps the
fold-in scope tight. Replied in-thread.
## Validation
- `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}`
- `npx vitest run packages/cli/src/serve/
packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 544/544
- `npx eslint --max-warnings 0` over the PR 21 surface — clean
Refs: #4175 #4255
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-5 review feedback
Five small adopt items from the round-5 review pass; one stale thread
already addressed in b5b77ee90 (fold-in 5).
#2 — `as const` + derived type for DEVICE_FLOW_SUPPORTED_PROVIDERS so
adding/removing a provider id requires touching exactly ONE site.
Mirrors `SERVE_ERROR_KINDS` / `ServeErrorKind` in `status.ts`.
#3 — Clarify `DEVICE_FLOW_EXPIRY_GRACE_MS` JSDoc to distinguish the
daemon's 30s SWEEP cadence (what the grace tracks) from the 5-min
TERMINAL_GRACE_MS reconnect window (which awaitCompletion does NOT
need to wait through).
#4 — Add `@remarks` block on `DeviceFlowProvider.poll()` warning
future provider authors that thrown `err.message` flows verbatim
into the SSE-broadcast `auth_device_flow_failed` hint, and must be
sanitized. Two equally-correct paths documented (typed `error`
result vs sanitized thrown message).
#5 — Truncate raw IdP detail in `qwenDeviceFlowProvider.ts` stderr
audit lines to 2 KiB. WAFs / reverse proxies can return MB-sized
HTML error pages, and container log aggregators (Loki, Fluent Bit,
Stackdriver) typically truncate or drop lines past 4-32 KiB —
losing the useful prefix downstream. 2 KiB retains structured JSON
envelopes while staying well below every aggregator's per-line cap.
#6 — Track latest `originatorClientId` on per-provider singleton
take-over via new `entry.lastOriginatorClientId` field +
`recordTakeover()` helper. When a second SDK client posts
`POST /workspace/auth/device-flow` for an already-pending provider
(or one being created in `inFlightStarts`) with a different
`initiatorClientId`, an audit breadcrumb records the take-over so
incident response can correlate "client A started, client B took
over at 12:34". Event-routing intentionally still uses the original
`initiatorClientId` (events are workspace-broadcast and changing
the originator field mid-flow would break SDK reducers that key on
it). Two new tests cover the differing-id audit + same-id no-op.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-6 review feedback
Six "Critical" findings from a gpt-5.5 /review pass — all real
liveness/correctness defects in the daemon's auth device-flow path
and the SDK's awaitCompletion polling loop.
#1 — Make `provider.start()` timeout authoritative via `Promise.race`
in `DeviceFlowRegistry.doStart`. The earlier shape only ABORTED the
signal on timeout; a provider that ignores `signal` (non-abortable
I/O, future implementer who forgets to thread it to `fetch`) would
leave the `await` hanging until daemon restart, pinning the
`inFlightStarts` slot for that providerId. Race against a rejecting
timer makes the timeout authoritative regardless of provider
cooperation; abort still fires for cooperative cleanup.
#2 — Same shape for `result.persist()` in the success branch of
`runPollTick`. A future provider whose persist performs
non-abortable steps (mkdir/chmod/mv outside the abortable
fs.writeFile path) would otherwise hang the poll tick until process
restart. Race against rejecting timer; rejection maps to
`persist_failed`.
#3 — Clamp `expiresIn` and `interval` upper bounds. Previous
`Number.isFinite + > 0` guards stopped NaN/Infinity but a finite
extreme like `1e12` was still accepted — pinning the per-provider
singleton for ~30,000 years (`expires_in`) or scheduling a
TIMEOUT_MAX-clamped poll that never fires within `expiresAt`
(`interval`). Two new constants (`DEVICE_FLOW_MAX_EXPIRES_IN_SEC =
3600`, `DEVICE_FLOW_MAX_INTERVAL_MS = 60_000`) cap the worst case.
#4 — Extract `getDeviceFlowOrSynthetic404(...)` helper in
`DaemonAuthFlow.ts` and route BOTH the loop body and the
timeout-ceiling final read through it. Previously the ceiling read
went directly through `client.getDeviceFlow` and a 404 at the
boundary (entry evicted just as the timeout fired) would reject with
`DaemonHttpError(404)` instead of returning the structured `{ status:
'error', errorKind: 'not_found_or_evicted' }` that the rest of
`awaitCompletion` promises.
#5 — Validate `AwaitCompletionOptions.timeoutMs` and `pollOverrideMs`
with `Number.isFinite + > 0`. NaN slipped past the previous `??
default` form (NaN is truthy-ish in that position) and produced a
`ceiling` of `NaN` (loop runs forever — `now >= NaN` always false)
or a `setTimeout(NaN)` (Node clamps to 1ms — tight polling loop).
Sanitize to `undefined` so the documented defaults take effect.
#6 — Thread `signal` into `DaemonClient.getDeviceFlow` and forward
to `fetchWithTimeout` (which already composes caller + timeout
signals). awaitCompletion now passes `opts.signal` from both GET
sites. Without this, an `awaitCompletion` caller that aborts mid-
poll could not cancel an in-flight stalled GET; it would have to
wait for the daemon-side `fetchTimeoutMs` (30s default) to fire.
Four new tests in `deviceFlow.test.ts` pin the new behaviors:
hanging-start timeout (#1), hanging-persist → persist_failed (#2),
extreme-expiresIn clamp (#3), extreme-interval clamp (#3).
FakeProvider gained a `startHangs` flag for the non-cooperative
provider scenario.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-7 review feedback
Two findings from a DeepSeek /review pass; both small but legitimate
defense-in-depth gaps.
#1 — `runPollTick`'s catch block forwarded `err.message` verbatim
into the SSE-broadcast `hint`. The provider's `@remarks` contract
(fold-in 6 #4) requires throwers to sanitize, but if violated the
unbounded raw payload would reach every SSE subscriber. Added
`DEVICE_FLOW_POLL_HINT_MAX_LEN = 256` + `truncatePollHint()`,
applied to the catch's `result.hint`. Full raw `err.message` is
still routed to the audit trail (`audit?.record({hint: 'provider.poll()
threw (raw): ...'})`) so operator visibility for incident response
is preserved. Belt-and-suspenders: the contract is now structurally
enforced rather than relying on every future provider author to
read the JSDoc.
#2 — `updateMatchingFlow` (and the `started`/`authorized` handlers
in `reduceDaemonAuthEvent`) unconditionally overwrote state without
comparing `rawEvent.id` against the existing flow's
`lastSeenEventId`. The field's JSDoc documented it as a monotonic
counter to prevent stale frames from overwriting newer state, but
the code didn't enforce that contract. SSE reconnect with
`Last-Event-ID < terminal-frame-id` would replay older frames; if
any of them were for the same `deviceFlowId` (e.g. a delayed
`failed` arriving after `authorized`) the stale frame would
overwrite the terminal. Daemon-side `transitionTerminal` makes the
exact reachable scenario thin, but the documented contract should
match the code.
Threaded `rawEventId` into `updateMatchingFlow` and added the gate
there + in the `started` and `authorized` handlers (the two cases
that don't go through `updateMatchingFlow`). Synthetic frames
without an envelope `id` (`rawEventId === undefined`) bypass the
gate — they originate inside SDK reducer machinery and aren't
subject to replay ordering.
Three new tests pin the contracts:
- `runPollTick catch truncates the SSE hint and preserves raw on
the audit (fold-in 8 #1)` — `pollThrowsWith` flag on FakeProvider
models a non-conforming provider; SSE hint < 400 chars + contains
'truncated'; audit hint contains the full 4_000-char raw.
- `reduceDaemonAuthEvent rejects out-of-order frames (fold-in 8 #2
monotonicity)` — stale `failed`(id=7) does NOT overwrite
`authorized`(id=10); stale `started`(id=4) for a different flow
also rejected.
- `reduceDaemonAuthEvent passes synthetic frames (no envelope id)
through the gate` — SDK-internal frames without `id` are honored.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-8 review feedback
Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5
review pass. Tests deferred to fold-in 10 (separate, larger commit).
CRITICAL CORRECTNESS
#7 — `provider.persist()` Promise.race could publish `persist_failed`
to SSE while a non-cooperative provider was still committing
credentials to disk. Added an independent tracker on the original
persist promise: if the race timed out (`persistTimedOut === true`)
AND the underlying persist later resolved successfully, audit a
`lost_success_after_timeout` breadcrumb so operators see the
inconsistency. Tightened the persist `@remarks` contract to require
signal honoring end-to-end. Qwen provider already complies (fold-in
3 #10); this is forward-defense for future providers.
#11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`,
`createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event /
data / state types) was re-exported from `src/daemon/index.ts` but
NEVER from the published SDK entry `src/index.ts`. SDK consumers got
`undefined` for everything except `client.auth.start()` (which
traveled through the already-exported `DaemonClient`). Added the
missing exports and pinned via `daemon-public-surface.test.ts`.
#12 — `core/src/qwen/qwenOAuth2.ts:373`'s
`debugLogger.debug('Device authorization result:', result)` writes
the raw `device_code` (RFC 8628 bearer-equivalent credential) to
stderr / journald, bypassing the `BrandedSecret` redaction layer.
Pre-existing on main but PR 21 expanded the exposure surface.
Sanitized to log only `{ ok, expires_in }` on success / `{ ok,
error }` on error.
#13 — `runPollTick` success-branch persist-failure × past-`expiresAt`
classified as `expired_token` instead of `persist_failed`, routing
operators toward "tell user to retry" (RFC 8628 expiry) when the
actual root cause was disk I/O. Reclassified to `persist_failed`
with a `persist_also_failed_past_expiry` audit hint to preserve the
timing detail for incident response.
SMALL CORRECTNESS
#1 — `runPollTick` catch hint replaced with a STATIC bounded message
("provider.poll() failed; see daemon audit log for details"). The
fold-in 8 truncated-prefix approach could still leak the first 256
chars of provider-templated raw text including secret material. Full
raw still routed to audit channel for operator visibility.
#5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred-
cancel branch in `cancel()` now stamps it on the entry, and the
persist-resolution `cancelled` event publish uses
`entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers
that suppress self-emitted events can now attribute the cancel
correctly.
#6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented
"settle immediately, return current daemon view" contract) was
treated as falsy by the `?` ternary, falling back to the default.
`sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling
computation uses `!== undefined` instead of truthy check.
#8 — `EventBus.publish()` returns `undefined` for closed buses (it
does NOT throw). `broadcastWorkspaceEvent` previously counted that
path as success, hiding the all-buses-dropped operator alarm.
Folded the closed-bus-as-failure check into the canonical
`publishWorkspaceEvent` (see #X below).
#9 — start-timeout Promise.race rejected with a plain `Error`,
falling through `sendBridgeError` to a generic 500. Switched to
`UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502
(matching the envelope every other IdP start failure uses).
STRUCTURAL
#3 — Three identical `transitionTerminal + publish + audit`
expired_token blocks in `runPollTick`/`sweep`/(removed by #13)
deduplicated into a private `expireEntry()` helper. Future event-
shape changes are now a one-edit operation.
#X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline
comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent`
was kept distinct only to avoid the merge conflict; once PR 16
landed, it became a fold-in candidate. Folded the closed-bus +
all-failed-stderr-escalation operator-visibility features (PR 21's
S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped
`broadcastWorkspaceEvent` from the bridge interface + impl + test
mocks. PR 21's deviceFlowEventSink now calls
`bridge.publishWorkspaceEvent` — single canonical workspace fan-out.
DOC
#16 — Added a "Cross-client take-over" paragraph to
`docs/users/qwen-serve.md` explaining that two clients on the same
daemon for the same provider get the per-provider singleton with
`attached: true`/`false` distinguishing them; no separate event
fires (both eventually observe the same `auth_device_flow_authorized`).
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao round-9 review feedback
Two small non-blocking items from the round-9 pass; defensive shape +
docs only. The 4 deferred test-coverage threads (#1-4 of round-8) are
still tracked for fold-in 10.
#6 — `lastSeenEventId` typed `number` with `?? 0` defaults in the
`auth_device_flow_started` reducer case. The daemon-side `EventBus`
assigns ids ≥ 1 so the `0` sentinel has no real-traffic meaning, but
the monotonic gate (`rawEventId <= flow.lastSeenEventId`) would
reject any future SDK-internal synthetic frame using `id: 0`.
Changed the field type to `number | undefined` and dropped the
`?? 0` from the started case. The `updateMatchingFlow` /
`auth_device_flow_authorized` guards already short-circuit on
`existing.lastSeenEventId !== undefined`, so undefined is safe
end-to-end. Existing 34 reducer tests still pass unchanged.
#7 — Added `@remarks` block to `DeviceFlowErrorKind.persist_failed`'s
JSDoc explaining the lost-success retry UX. When fold-in 9 #7's
`lost_success_after_timeout` audit fires (non-conforming provider
violates signal contract; disk write succeeds after registry
published `persist_failed`), a naive SDK retry hits the IdP a
second time with a fresh `device_code` and prompts the user
twice — but the first credential set is already valid. JSDoc now
documents the mitigation: SDK consumers writing retry logic on
`persist_failed` should call `client.auth.getStatus()` BEFORE
re-prompting; operators can grep stderr/audit for
`lost_success_after_timeout` to detect occurrences.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* test(serve): fold-in 10 — auth device-flow test bundle (#4255)
Lands the four deferred test-coverage items the round-8 review
flagged (and round-9 re-surfaced) as a hard merge prerequisite.
Net +41 tests across registry / SDK helper / client HTTP /
HTTP route layers.
#1 — `deviceFlow.test.ts` `persist failure paths` describe (3
tests, +3). The success arm's three terminal mappings — pure
`persist_failed`, `cancelled` (cancel during persist), and
`persist_failed` past `expiresAt` (the fold-in 9 #13
reclassification with `persist_also_failed_past_expiry` audit
hint) — were 0-covered. Now pinned. Test #2 also asserts the
fold-in 9 #5 cancellerClientId routing on the deferred
`cancelled` event.
#2 — new `DaemonAuthFlow.test.ts` (+14 tests). Mock DaemonClient
with sequenced `getDeviceFlow` replies. Covers happy-path
polling → `authorized`; `slow_down`-driven `intervalMs` bump
firing `onThrottled`; `signal.abort()` rejection; `signal`
propagation through `client.getDeviceFlow` (fold-in 7 #6);
`timeoutMs` ceiling final-read; `timeoutMs:0` immediate-return
(round-9 #6); NaN/Infinity → `sanitizePositiveMs` fallback to
default ceiling (fold-in 7 #5); 404 → synthetic
`error`/`not_found_or_evicted` (fold-in 3 #4) at BOTH the loop
body AND the timeoutMs ceiling read (fold-in 7 #4); non-404
DaemonHttpError rethrown; `cancel()` and top-level
`status()`/`cancel()` wrappers forward correctly.
#3 — `DaemonClient.test.ts` `device-flow methods` describe
(+11 tests). POSTs `/workspace/auth/device-flow` happy path +
clientId header + body shape; 200/201 acceptance; non-2xx →
`DaemonHttpError`. GETs URL-encode the deviceFlowId; forward
`opts.signal` to `fetchWithTimeout`'s composed signal (fold-in
7 #6 — verified by aborting caller signal and observing the
fetch's signal flip to `aborted`); 404 throws. DELETEs
swallow 204 + 404 (idempotent, mirrors `closeSession`); non-
204/404 throws. `getAuthStatus` plain GET. `client.auth`
lazy-instantiated singleton.
#4 — `server.test.ts` 5 supplementary contract tests (+5).
The existing 8 `it()`s cover happy paths + take-over + 401
POST + DELETE pending/terminal/unknown + 502 upstream + sweeper.
This commit plugs gaps: 400 `invalid_request` for missing /
non-string providerId (fold-in W2 split this from
`unsupported_provider`); 409 `too_many_active_flows` (via
injected fake registry); 401 `token_required` on DELETE
without bearer; the asymmetric GET posture
(`/workspace/auth/device-flow/:id` IS strict-gated to prevent
peer-process userCode shoulder-surf; `/workspace/auth/status`
stays read-only because its `pendingDeviceFlows` entries
intentionally redact `userCode`).
Validation: cli serve 631/631 (+8 from #1, #4); sdk 384/384
(+25 from #2, #3, +/- some pre-existing churn). Typecheck +
lint clean.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fix(qwen): atomic temp+chmod+rename in cacheQwenCredentials (PR #4255 round-11 #2)
gpt-5.5 /review flagged a real correctness/security gap: the
post-write `chmod` ordering left a window where freshly-written
credentials could land in a broadly-readable existing
`oauth_creds.json` before the chmod tightened it. On POSIX, a
chmod failure additionally degraded to a debug warning while the
broadly-readable tokens stayed on disk.
New shape mirrors the standard atomic-write idiom:
1. Write `${filePath}.tmp.${pid}.${randomUUID()}` with `mode: 0o600`.
The temp path doesn't exist beforehand, so the `mode` flag
actually applies on creation (it doesn't on existing files,
which was the root of the original race).
2. Defensive `chmod` on the temp file. POSIX failure is now a
HARD ERROR (refuses to publish broad-perm credentials to the
canonical filename). Windows logs a debug breadcrumb and
proceeds, since chmod is a no-op on most NTFS volumes (perms
go through ACLs).
3. Atomic `fs.rename` over `filePath`. The canonical path is
ALWAYS at `0o600` from the moment it contains the new tokens;
readers see either the old creds or the new creds, never a
partially-written or broadly-readable state.
4. Best-effort `fs.unlink` of the temp file on any failure path
so failed writes don't leave `.tmp.<pid>.<uuid>` litter on
disk.
Test mock in `qwenOAuth2.test.ts` extended with `chmod` + `rename`
no-op stubs so the existing 158 core/qwen tests still pass; no test
behavior change beyond the mock surface.
Validation: typecheck clean (cli + core + sdk-typescript); core
qwen 158/158; cli serve 643/643; sdk 384/384.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): address PR #4255 wenshao + gpt-5.5 round-12 review feedback
Eight findings from a wenshao + gpt-5.5 /review pass: 1 critical
correctness, 2 real defensive defects, 4 edge cases / minor
hardening, 1 test gap. All adopted.
CRITICAL CORRECTNESS
#1 CzSpN — `dispose()` race: after `await provider.poll(...)` the
post-await guard checked only `entry.status !== 'pending'`, NOT
`this.disposed`. `dispose()` clears the registry maps and aborts
the entry's signal but doesn't mutate `entry.status`, so a
provider whose poll already resolved (or doesn't honor abort) could
enter the success branch and call `result.persist({...})` —
committing credentials on a shutting-down daemon. Added the
`if (this.disposed) return;` guard symmetric with the top-of-method
check.
REAL DEFENSIVE DEFECTS
#2 Cy_ZG — sync-throw escape: the `result.persist({signal})` call
happens BEFORE the `new Promise` constructor that captures it
(`persistTracker` is closed-over inside the constructor). A
non-conforming provider whose persist throws synchronously (e.g.
top-of-function validation) would escape past the outer
`try/catch (await new Promise(...))` and become an
`unhandledRejection` since `runPollTick` is fire-and-forget via
`void`. Wrapped the persist invocation in a try/catch that routes
the sync-throw into the same `persistError` branch.
#3 CzSpe — runtime provider map: provider validation hardcoded
`DEVICE_FLOW_SUPPORTED_PROVIDERS` even though `deps.deviceFlowProviders`
is the documented extension hook for tests/future providers.
Switched both POST validation and `/workspace/auth/status`
`supportedDeviceFlowProviders` to derive from
`deviceFlowProviderMap.keys()` — single source of truth matches
the registry's `resolveProvider`.
EDGE CASES / MINOR HARDENING
#4 Cy_Y9 — `slow_down` re-clamp: `intervalMs += SLOW_DOWN_BUMP_MS`
can push past `DEVICE_FLOW_MAX_INTERVAL_MS` (the bound that keeps
`setTimeout` from clamping to TIMEOUT_MAX). Wrapped in
`Math.min(MAX_INTERVAL_MS, ...)` symmetric with the doStart clamp.
#5 Cy_ZF — `expiresInSec` lower bound: `0.5` was finite-positive
and produced `expiresAt = now() + 500 ms` — first poll (clamped at
≥1 s) fires AFTER expiresAt → flow expires before any user could
authorize. Added `DEVICE_FLOW_MIN_EXPIRES_IN_SEC = 30` (RFC 8628
§3.2 calls 5–30 minutes "reasonable"; sub-30s is non-compliant).
#6 CzHOK — take-over response privacy: `initiatorClientId` was
echoed to ANY take-over POST caller, including those with no
`X-Qwen-Client-Id` header. Bearer-gated already, but the
asymmetry "anonymous caller learns who started it" violated the
no-header-as-privacy-signal contract. Now only echoed when the
caller's id matches the entry's initiator.
#7 CzSpd — production audit visibility: production audit sink
dropped `line.hint`, but the registry uses hints for operator-only
breadcrumbs (`provider.poll() threw (raw)...`,
`lost_success_after_timeout`, `persist_also_failed_past_expiry`,
take-over correlation, `deferred (persist in flight; ...)`). The
documented troubleshooting trail was invisible in production
stderr. Now included with a 1 KiB bound + JSON-quoted so multi-
word hints stay parseable.
TEST GAP
#8 Cy_ZH — `lost_success_after_timeout` audit: the
fold-in 9 #7 split-brain detector for non-cooperative providers
had no test pinning it. Added a controllable `latePersist` Promise
+ test that drives poll → success → enters persist race → fires
PERSIST_TIMEOUT (registry publishes persist_failed) → resolves
persist late → asserts the lost_success audit fires.
Validation: typecheck + lint clean; cli serve 644/644 (+1 from
the new test); sdk-typescript 384/384.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* fixup(serve): close concurrent multi-provider cap bypass (PR #4255 round-13 #1)
gpt-5.5 /review caught a real workspace-wide cap bypass:
`countActive()` only counted entries already installed in
`byProvider`, but the cap check at the top of `start()` runs
before any provider's `inFlightStarts` slot completes
`provider.start()`. A burst of fresh starts for
`DEVICE_FLOW_MAX_CONCURRENT + 1` distinct providers all run
synchronously to the cap check (each `start()` is async but
runs to its first await — the await happens AFTER the cap
check), all observe `count === 0` (no `byProvider` entries
installed yet), and all pass — eventually installing more
than the documented four pending flows.
Fix: include `inFlightStarts.size` in `countActive()`. The
two maps are disjoint by construction (the existing-pending
fast-path catches any provider with both), so simple
addition cannot double-count. The second concurrent caller
sees count=1, the third count=2, …, and the (MAX+1)th caller
is rejected with `TooManyActiveDeviceFlowsError`.
Test: `caps at DEVICE_FLOW_MAX_CONCURRENT under CONCURRENT
distinct-provider starts`. Fires `MAX+1` concurrent starts
via `Promise.allSettled`, asserts exactly `MAX` fulfilled +
exactly 1 rejected with the typed error. Pre-fix this test
fails (all `MAX+1` succeed); post-fix it passes.
Validation: typecheck clean across all 4 workspaces;
deviceFlow.test.ts 35/35 (was 34); cli serve 645/645.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
8 tasks
xaelistic
pushed a commit
to xaelistic/qwen-code
that referenced
this pull request
Jun 7, 2026
Methods should be verbs. Fixes QwenLM#4.
chiga0
pushed a commit
that referenced
this pull request
Jun 8, 2026
…server integration tests - Add 'session.close.reason' attribute to telemetry event so operators can distinguish reaper-initiated closes from client-initiated ones in dashboards - Add test verifying channel idle timer fires after reaper closes the last session on a channel (design doc test #12) - Add server.test.ts integration tests: health endpoint reflects session count changes, DELETE /session passes no close opts - Update fakeBridge.closeSession signature to accept the new CloseSessionOpts third parameter Generated with AI Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…Index and initialIndex are in valid range to avoid accessing out-of-bounds items.
TLDR
This PR fixes the
TypeError: Cannot read properties of undefined (reading 'value')error in the RadioButtonSelect component by adding proper bounds checking for array access. The fix ensures safe handling of edge cases like empty arrays, out-of-bounds initial indices, and dynamic array changes.Dive Deeper
The RadioButtonSelect component was vulnerable to runtime errors when:
initialIndexwas greater than or equal toitems.lengthitemsarray was emptyactiveIndexbecame out-of-bounds due to dynamic array changesKey changes:
items[activeIndex]in all user input handlersThe fixes maintain the component's existing behavior while preventing all boundary condition crashes.
Reviewer Test Plan
Testing Matrix
Linked issues / bugs