fix: write session traces in headless serve mode#886
Conversation
Session tracing was implemented only at the terminal entrypoints — the TUI
worker and `run` construct the tracer and feed it events, but `serve` had no
trace wiring at all. Sessions driven over HTTP (e.g. the VS Code chat panel,
which spawns `altimate serve` and POSTs to `/session/{id}/prompt_async`)
never wrote `ses_*.json` files to `~/.local/share/altimate-code/traces/`:
the only hook in the shared session pipeline is `Tracer.active?.logSpan`,
and `Tracer.active` was never set in serve mode.
- Extract the TUI worker's inline event-stream→trace logic into a shared
`TraceConsumer` (`altimate/observability/trace-consumer.ts`): config
loading, per-session trace map, MAX_TRACES eviction, event feeding,
reset/flush
- Rewire `cli/cmd/tui/worker.ts` to use the shared consumer (behavior
unchanged)
- Add `subscribeTraceConsumer()` — an in-process `/event` subscription
mirroring the worker's loop — and start it in `ServeCommand`, so serve
sessions produce the same trace files as the terminal
- Tests: full event-sequence → completed trace file, interleaved sessions,
malformed events, disabled config, flush/reset semantics
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughA shared TraceConsumer centralizes session trace lifecycle, event routing, disk rehydration/eviction, and span logging; TUI delegates tracing to it and serve subscribes the in-process event stream. Tests and invariants are updated to validate rehydration, eviction, non-finalization-on-idle, and robustness. ChangesTraceConsumer and integration
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI/Main
participant Consumer as TraceConsumer
participant SDK as SDK_EventStream
participant Trace as Trace
CLI->>Consumer: subscribeTraceConsumer(directory)
Consumer->>SDK: subscribe(fetchFn)
SDK->>Consumer: stream event
Consumer->>Consumer: handleEvent(event)
Consumer->>Trace: getOrCreateTrace(sessionID) / append span
Trace->>Trace: record span / persist
CLI->>Consumer: stop()
Consumer->>Trace: flush() and await endTrace()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Real serve sessions emit trailing events after `session.status: idle` — the user message is re-published with its generated `summary` once title generation completes. Finalizing immediately on idle deleted the trace from the map, and the straggler then lazily re-created an EMPTY trace for the same sessionID whose finalization overwrote the rich `<sessionId>.json` file (observed end-to-end in docker code-server: 42KB trace clobbered down to 749 bytes). Fix: schedule finalization 2s after idle instead of finalizing inline; any event for the session during the grace window pushes the deadline back and is absorbed into the live trace. As a bonus the generated session title now lands in trace metadata. flush()/reset()/eviction clear pending timers. Verified: chat session in docker code-server (glob tool call + 2 generations) produces a complete 42KB trace with title, model, tool span, and token totals. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The worker's inline event handling moved into the shared altimate/observability/trace-consumer.ts — point the cross-module event-type literal invariant at the new consumer (and cover the full set it consumes) instead of cli/cmd/tui/worker.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
Resolves the worker.ts conflict from main's trace-data-loss work (#895, #901-905), which landed in parallel with this branch's trace extraction. Both lines of work fix the same class of bug (traces losing data / being clobbered) from different angles: - this branch: extract the worker's inline event→trace logic into a shared TraceConsumer and wire it into `altimate serve`, so the VS Code chat panel (headless serve) writes traces at all (AI-6950). - main #895+: stop finalizing on session.status=idle (idle fires every turn, not at session end), rehydrate-from-disk on cache-miss re-creation, and a streamGeneration guard against concurrent stream teardown. Resolution: main's superior logic is the source of truth. The shared TraceConsumer is now a faithful 1:1 port of main's *current* worker logic (no idle finalize, rehydrateFromFile, streamGeneration guard, logUserMessage / setPrompt-vs-setTitle split, synthetic/ignored skip). The worker delegates to it; `serve` subscribes to it. This branch's earlier 2s grace-window heuristic is dropped entirely — main's rehydrate-on-cache-miss is the deterministic fix it was a stand-in for. Source-grep regression guards that asserted the trace logic lived in worker.ts are repointed at trace-consumer.ts (worker-trace-clearing, tracing-followups #902, bridge-merge-invariants), with the guard intent preserved. Verified: typecheck clean; trace-consumer, worker-trace-clearing, tracing-*, and bridge-merge-invariants suites all green; host-level `serve` smoke test writes a rich 42KB trace (session/user-message/generation spans).
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
There was a problem hiding this comment.
Pull request overview
This PR adds trace-file generation for sessions executed via the headless HTTP server (altimate serve) by extracting the TUI worker’s event-stream→trace logic into a shared TraceConsumer, then wiring that consumer into serve mode.
Changes:
- Introduces a shared
TraceConsumer(+ helpersubscribeTraceConsumer) to build per-session traces from bus events and write them via configured exporters. - Rewires the TUI worker to delegate all tracing/event handling and lifecycle operations (
loadConfig,reset,flush) toTraceConsumer. - Updates and adds tests to validate the new consumer behavior and to keep invariant checks aligned with the new source of event-literal usage.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/opencode/src/altimate/observability/trace-consumer.ts | Adds the shared event-stream consumer and an in-process subscriber for serve mode. |
| packages/opencode/src/cli/cmd/serve.ts | Starts the in-process trace consumer subscription in headless serve mode. |
| packages/opencode/src/cli/cmd/tui/worker.ts | Replaces inline tracing logic with the shared TraceConsumer. |
| packages/opencode/test/altimate/trace-consumer.test.ts | New unit tests for the extracted consumer’s event handling and file output. |
| packages/opencode/test/altimate/tracing-followups.test.ts | Updates source-contract tests to target TraceConsumer instead of the worker. |
| packages/opencode/test/cli/tui/worker-trace-clearing.test.ts | Updates regression tests to validate both worker delegation and consumer behavior. |
| packages/opencode/test/upstream/bridge-merge-invariants.test.ts | Updates event-type literal invariants to point at the new consumer location. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // altimate_change start — trace: session tracing in headless serve | ||
| // Sessions driven over HTTP (e.g. the VS Code chat panel) have no TUI | ||
| // worker observing the event stream, so traces were never written in | ||
| // serve mode. Subscribe the shared trace consumer to the in-process | ||
| // event stream so serve sessions produce the same trace files as the | ||
| // terminal entrypoints. | ||
| subscribeTraceConsumer({ directory: process.cwd() }) | ||
| // altimate_change end | ||
|
|
||
| await new Promise(() => {}) | ||
| await server.stop() |
| ;(async () => { | ||
| await consumer.loadConfig() | ||
| while (!signal.aborted) { | ||
| const events = await Promise.resolve(sdk.event.subscribe({}, { signal })).catch(() => undefined) | ||
|
|
||
| if (!events) { | ||
| await sleep(250) | ||
| continue | ||
| } | ||
|
|
||
| for await (const event of events.stream) { | ||
| await consumer.handleEvent(event) | ||
| } | ||
|
|
||
| if (!signal.aborted) { | ||
| await sleep(250) | ||
| } | ||
| } | ||
| })().catch((error) => { | ||
| Log.Default.error("trace event stream error", { | ||
| error: error instanceof Error ? error.message : error, | ||
| }) | ||
| }) | ||
|
|
||
| return { | ||
| stop: async () => { | ||
| abort.abort() | ||
| await consumer.flush() | ||
| }, | ||
| } |
| Trace.setActive(trace) | ||
| this.sessionTraces.set(sessionID, trace) |
| * The per-event logic here is a 1:1 port of the worker's inline handling so | ||
| * both front-ends behave identically: | ||
| * - traces are NOT finalized on `session.status: idle` — idle fires after | ||
| * every turn, not at session end; finalization happens on flush() | ||
| * (shutdown) and on MAX_TRACES eviction. Long-lived sessions keep their | ||
| * Trace in cache across turns. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
packages/opencode/test/altimate/trace-consumer.test.ts (1)
263-264: ⚡ Quick winReplace fixed sleep with deterministic completion in the reset test.
setTimeout(100)is timing-sensitive and can flake under slower CI. Prefer polling/assert-retry for file existence/content or exposing an awaitable reset-finalization path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/test/altimate/trace-consumer.test.ts` around lines 263 - 264, The test currently uses a fixed sleep (await new Promise(r => setTimeout(r, 100))) to wait for reset finalization; replace that with a deterministic wait by either (A) polling/assert-retry until the observable side-effect is present (e.g., check file existence or expected file content in a retry loop with short backoff and a timeout) or (B) modify the production/test API to expose an awaitable reset-finalization Promise and call that (e.g., await traceConsumer.resetFinalized() or await endTrace()). Update the test around endTrace/reset finalizes to remove the fixed setTimeout and use the chosen deterministic completion check so CI cannot flake.packages/opencode/src/cli/cmd/serve.ts (1)
26-33: 💤 Low valueConsider capturing the subscription for graceful shutdown.
The return value of
subscribeTraceConsumer()provides astop()method that aborts the subscription and flushes traces. Discarding it meansflush()is never called when the server terminates, potentially losing final trace state.This is consistent with the existing pattern (line 36
server.stop()is already unreachable), but capturing the subscription would enable proper cleanup if graceful shutdown handling is added later:- subscribeTraceConsumer({ directory: process.cwd() }) + const tracing = subscribeTraceConsumer({ directory: process.cwd() })A follow-up could add SIGINT/SIGTERM handlers to call
tracing.stop()andserver.stop().🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/cli/cmd/serve.ts` around lines 26 - 33, Capture the subscription returned by subscribeTraceConsumer({ directory: process.cwd() }) (it exposes a stop() method) into a variable (e.g., tracing) instead of discarding it, and ensure that tracing.stop() is invoked during shutdown/cleanup alongside server.stop() so traces are flushed; locate the subscribeTraceConsumer call and add the variable assignment and a place in your existing shutdown flow to call tracing.stop() where server.stop() is handled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/opencode/test/upstream/bridge-merge-invariants.test.ts`:
- Around line 167-169: The consumer-literal invariant for TraceConsumer is
missing "session.status" so tests won't catch producer/consumer drift; update
the invariant array used by the TraceConsumer test (the literal list defined
near TraceConsumer in bridge-merge-invariants.test.ts) to include
"session.status" alongside "message.updated", "message.part.updated", and
"session.updated" so the test asserts that TraceConsumer handles that event
literal.
---
Nitpick comments:
In `@packages/opencode/src/cli/cmd/serve.ts`:
- Around line 26-33: Capture the subscription returned by
subscribeTraceConsumer({ directory: process.cwd() }) (it exposes a stop()
method) into a variable (e.g., tracing) instead of discarding it, and ensure
that tracing.stop() is invoked during shutdown/cleanup alongside server.stop()
so traces are flushed; locate the subscribeTraceConsumer call and add the
variable assignment and a place in your existing shutdown flow to call
tracing.stop() where server.stop() is handled.
In `@packages/opencode/test/altimate/trace-consumer.test.ts`:
- Around line 263-264: The test currently uses a fixed sleep (await new
Promise(r => setTimeout(r, 100))) to wait for reset finalization; replace that
with a deterministic wait by either (A) polling/assert-retry until the
observable side-effect is present (e.g., check file existence or expected file
content in a retry loop with short backoff and a timeout) or (B) modify the
production/test API to expose an awaitable reset-finalization Promise and call
that (e.g., await traceConsumer.resetFinalized() or await endTrace()). Update
the test around endTrace/reset finalizes to remove the fixed setTimeout and use
the chosen deterministic completion check so CI cannot flake.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d4fd89e9-9c10-41b7-8c0a-ac1930f33804
📒 Files selected for processing (7)
packages/opencode/src/altimate/observability/trace-consumer.tspackages/opencode/src/cli/cmd/serve.tspackages/opencode/src/cli/cmd/tui/worker.tspackages/opencode/test/altimate/trace-consumer.test.tspackages/opencode/test/altimate/tracing-followups.test.tspackages/opencode/test/cli/tui/worker-trace-clearing.test.tspackages/opencode/test/upstream/bridge-merge-invariants.test.ts
| import { describe, expect, test, beforeEach, afterEach } from "bun:test" | ||
| import fs from "fs/promises" | ||
| import path from "path" | ||
| import os from "os" | ||
| import { TraceConsumer } from "../../src/altimate/observability/trace-consumer" | ||
| import { FileExporter, type TraceFile } from "../../src/altimate/observability/tracing" | ||
|
|
||
| let tmpDir: string | ||
|
|
||
| beforeEach(async () => { | ||
| tmpDir = path.join(os.tmpdir(), `trace-consumer-${Date.now()}-${Math.random().toString(36).slice(2)}`) | ||
| await fs.mkdir(tmpDir, { recursive: true }) | ||
| }) | ||
|
|
||
| afterEach(async () => { | ||
| await fs.rm(tmpDir, { recursive: true, force: true }).catch(() => {}) | ||
| }) |
There was a problem hiding this comment.
Use tmpdir() fixture with await using instead of module-level os.tmpdir() lifecycle hooks.
This new test file should follow the repository temp-dir convention to guarantee scoped cleanup and consistency with the tracing test suite.
As per coding guidelines, packages/opencode/test/**/*.test.ts must use tmpdir from fixture/fixture.ts with await using, and based on learnings this is required for new files under packages/opencode/test/altimate/.
Sources: Coding guidelines, Learnings
| file: "altimate/observability/trace-consumer.ts", | ||
| types: ["message.updated", "message.part.updated", "session.updated"], | ||
| }, |
There was a problem hiding this comment.
Include session.status in the TraceConsumer consumer-literal invariant list.
TraceConsumer processes session.status events, so this invariant should also assert that literal to prevent silent producer/consumer drift on that path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/opencode/test/upstream/bridge-merge-invariants.test.ts` around lines
167 - 169, The consumer-literal invariant for TraceConsumer is missing
"session.status" so tests won't catch producer/consumer drift; update the
invariant array used by the TraceConsumer test (the literal list defined near
TraceConsumer in bridge-merge-invariants.test.ts) to include "session.status"
alongside "message.updated", "message.part.updated", and "session.updated" so
the test asserts that TraceConsumer handles that event literal.
There was a problem hiding this comment.
Declining this one: TraceConsumer does not consume session.status — the only occurrences of that string are in explanatory comments ("DO NOT finalize on session.status=idle"), not a handled event literal. The invariant test matches a quoted "session.status" string, so adding it to the list would fail CI (there is no such literal in the file). The consumed literals (message.updated, message.part.updated, session.updated, session.deleted) are all registered.
There was a problem hiding this comment.
@anandgupta42, you're right — I was mistaken. session.status appears only in guard-comment prose (e.g. "DO NOT finalize on session.status=idle") and is not an event literal that TraceConsumer handles. Adding it to the invariant array would cause a spurious CI failure. The four actually-consumed literals (message.updated, message.part.updated, session.updated, session.deleted) are already covered. Apologies for the noise.
(ᵔᴥᵔ)
✏️ Learnings added
Learnt from: anandgupta42
Repo: AltimateAI/altimate-code PR: 886
File: packages/opencode/test/upstream/bridge-merge-invariants.test.ts:167-169
Timestamp: 2026-06-08T08:38:51.662Z
Learning: In `packages/opencode/src/altimate/observability/trace-consumer.ts`, the phrase `session.status` appears only in explanatory comments (e.g. "DO NOT finalize on `session.status=idle`") and is NOT a consumed event literal. The actual event literals handled by `TraceConsumer` are: `"message.updated"`, `"message.part.updated"`, `"session.updated"`, and `"session.deleted"`. Do not suggest adding `"session.status"` to invariant lists for this file.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: sahrizvi
Repo: AltimateAI/altimate-code PR: 895
File: packages/opencode/test/altimate/tracing-rehydrate.test.ts:0-0
Timestamp: 2026-06-05T17:51:12.678Z
Learning: In `packages/opencode/test/altimate/` (tracing tests), NEW test files should follow the documented convention and use `await using tmp = await tmpdir()` (imported from `fixture/fixture.ts`) with per-test scoping, not a module-level `os.tmpdir()` + `beforeEach`/`afterEach` pattern. The existing sibling tests (`tracing-display-crash`, `tracing-rename-race`, `tracing-adversarial-*`) predate this convention and still use the manual pattern; migrating them is a separate cleanup PR. Do not suppress `tmpdir()` fixture suggestions for brand-new files added to this directory.
Learnt from: kulvirgit
Repo: AltimateAI/altimate-code PR: 405
File: test/sanity/phases/resilience.sh:214-223
Timestamp: 2026-03-30T20:48:47.930Z
Learning: In `test/sanity/phases/resilience.sh`, the sanity tests run in a Docker container against the npm-installed `altimate-code` binary, not the TypeScript source tree. TypeScript source files (`.ts`) are compiled into the binary and cannot be `require()`-d directly. Module-load checks in this phase should be structured as runtime smoke tests (e.g., invoking the CLI or checking the binary's behavior), not as direct `require('./src/...')` TypeScript imports. Suggesting direct `.ts` module imports for these tests is incorrect.
Learnt from: CR
Repo: AltimateAI/altimate-code PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-05-02T23:15:06.220Z
Learning: Applies to packages/opencode/**/*.test.{ts,tsx} : Migration tests should read the per-folder layout without relying on `_journal.json`
There was a problem hiding this comment.
4 issues found across 7 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/opencode/src/altimate/observability/trace-consumer.ts">
<violation number="1" location="packages/opencode/src/altimate/observability/trace-consumer.ts:138">
P1: `Trace.setActive(trace)` sets a single process-global active tracer. In serve mode, multiple sessions can run concurrently (e.g., `POST /session/:id/prompt_async` doesn't await completion), so this global will point at whichever session was most recently created. Any code using `Tracer.active?.logSpan(...)` during an earlier session's execution will attribute spans to the wrong session trace, causing cross-session trace corruption. The active trace should be scoped per async context (e.g., `AsyncLocalStorage`) or serve mode should avoid setting the global active tracer.</violation>
<violation number="2" location="packages/opencode/src/altimate/observability/trace-consumer.ts:249">
P2: `session.updated` title is silently dropped when it arrives before message events, because the handler only looks up cached traces instead of falling back to `getOrCreateTrace`.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| if (e.type === "session.updated") { | ||
| const info = e.properties?.info as Record<string, any> | undefined | ||
| if (info?.id && info?.title) { | ||
| const trace = this.sessionTraces.get(info.id) |
There was a problem hiding this comment.
P2: session.updated title is silently dropped when it arrives before message events, because the handler only looks up cached traces instead of falling back to getOrCreateTrace.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/altimate/observability/trace-consumer.ts, line 249:
<comment>`session.updated` title is silently dropped when it arrives before message events, because the handler only looks up cached traces instead of falling back to `getOrCreateTrace`.</comment>
<file context>
@@ -0,0 +1,353 @@
+ if (e.type === "session.updated") {
+ const info = e.properties?.info as Record<string, any> | undefined
+ if (info?.id && info?.title) {
+ const trace = this.sessionTraces.get(info.id)
+ if (trace) trace.setTitle(String(info.title))
+ }
</file context>
worker.ts is an upstream-shared file; the traceConsumer.reset() call (which replaced main's inline sessionTraces clear) must be wrapped so the upstream merge tooling protects it. Caught by script/upstream/analyze.ts --markers --strict (the Marker Guard CI check).
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
1 similar comment
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
🤖 Multi-Model Consensus Code ReviewReviewed by 8 participants — Claude + GPT 5.4, Gemini 3.1 Pro, Kimi K2.5, MiniMax M2.7, GLM-5, Qwen 3.6 Plus, MiMo-V2-Pro. Convergence: 1 round, verdict unanimous (6 APPROVE on the synthesized findings + 1 wording-caveat that accepts the severity/verdict). DeepSeek V3.2 excluded (provider error). Code Review Summary — PR #886 "fix: write session traces in headless serve mode"Verdict: REQUEST CHANGES The core change is high quality: the TUI worker's ~130 lines of inline event→trace logic are extracted into a shared, well-tested Major Issues1. subscribeTraceConsumer({ directory: process.cwd() }) // { stop } discarded
await new Promise(() => {}) // blocks forever
await server.stop() // dead code
Precise impact (verified, not "all traces lost"): incremental The codebase already has the right pattern: const traceSub = subscribeTraceConsumer({ directory: process.cwd() })
const shutdown = async () => { await traceSub.stop(); await server.stop(); process.exit(0) }
process.once("SIGINT", () => void shutdown())
process.once("SIGTERM", () => void shutdown())
await new Promise(() => {})Note: 2. stop: async () => { abort.abort(); await consumer.flush() }
const loopPromise = (async () => { /* existing while loop */ })()
return { stop: async () => { abort.abort(); await loopPromise.catch(() => {}); await consumer.flush() } }3. The event-stream loop dies permanently on a mid-stream error (Bug/Resilience) — for await (const event of events.stream) { await consumer.handleEvent(event) }The try {
for await (const event of events.stream) { await consumer.handleEvent(event) }
} catch (err) {
Log.Default.warn("trace event stream disconnected", { error: err instanceof Error ? err.message : String(err) })
}While here, add exponential backoff to the 250ms reconnect (cap ~10–30s) to avoid hot-looping when the stream is durably down (Kimi/GLM-5/MiMo). 4. The serve integration path is untested (Testing) — Minor Issues5. Config-load failure silently falls back to default tracing (Bug) — 6. Silent 7. 8. reset() fire-and-forget test relies on a fixed Nits9. Unguarded concurrent 10. 11. 12. Positive Observations
Missing Tests (summary)
|
Addresses the multi-model consensus review on PR #886. The `TraceConsumer` extraction was sound, but the `serve` integration had lifecycle gaps. `cli/cmd/serve.ts`: - Capture the `subscribeTraceConsumer` handle and wire `SIGINT`/`SIGTERM`/ `beforeExit` to `stop()` so traces are finalized on shutdown instead of left un-`completed`. Mirrors the existing pattern in `cli/cmd/run.ts`. `altimate/observability/trace-consumer.ts`: - `stop()` now drains the event loop (bounded by a 1s timeout) before `flush()`, so finalization can't race an in-flight `handleEvent`. - The `for await` stream loop catches mid-stream errors and reconnects with exponential backoff (250ms -> 30s) instead of dying permanently; backoff sleeps are abortable so shutdown isn't delayed. - Finalize and evict per-session state on `session.deleted` (a real end-of-session signal) so long-lived serve processes don't accumulate `sessionUserMsgIds` / cached traces. - `loadConfig` failure now fails safe (`enabled = false`) instead of silently tracing to the default dir, and logs at debug. - `getOrCreateTrace` and `handleEvent` catch blocks log at debug (still never throw) for headless diagnosability. - Auth header uses `Buffer.from(...).toString("base64")` (UTF-8 safe) instead of `btoa`. - Added a test seam: `subscribeTraceConsumer` accepts an optional injected `consumer` and `subscribe` source (production behaviour unchanged via defaults). Tests: - New `subscribeTraceConsumer` coverage: `stop()` drains + finalizes; a mid-stream throw reconnects rather than killing the loop. - New `TraceConsumer` coverage: incremental snapshots land on disk BEFORE any flush (the real serve path), and `session.deleted` finalizes + releases state. - De-flaked the `reset()` test (poll instead of fixed `setTimeout`). - Registered the `session.deleted` literal in `bridge-merge-invariants`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
✅ Review fixes applied — commit a701fdfAddressed the consensus review in one commit on this branch: Major
Minor Nit Nits 9 (concurrent Verified locally: marker guard ✅, typecheck ✅, |
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
There was a problem hiding this comment.
2 issues found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/opencode/src/altimate/observability/trace-consumer.ts">
<violation number="1" location="packages/opencode/src/altimate/observability/trace-consumer.ts:249">
P2: `session.updated` title is silently dropped when it arrives before message events, because the handler only looks up cached traces instead of falling back to `getOrCreateTrace`.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
❌ Tests — Failures DetectedTypeScript — 15 failure(s)
Next StepPlease address the failing cases above and re-run verification. |
Follow-up to the consensus-review fixes, addressing cubic-dev-ai comments: - serve.ts: SIGINT/SIGTERM shutdown now exits with signal-conventional codes (130 / 143) instead of 0, so a signalled stop isn't masked as a successful run. beforeExit still exits 0. Matches cli/cmd/run.ts. - trace-consumer.ts (loadConfig): a config-load error no longer disables tracing — it keeps `enabled` true and falls back to Trace.create()'s default exporter (the original "must not prevent the host from working" intent), but now logs a warning so the fallback isn't silent. - trace-consumer.ts (getOrCreateTrace): drop the dead `Trace.setActive(trace)` call. It sets a process-global active trace that nothing reads and that is a footgun in multi-session serve (last-event-wins); per-session routing is via the sessionTraces map. Verified: marker guard, typecheck, and trace-consumer / tracing-followups / worker-trace-clearing / bridge-merge-invariants suites all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
1 similar comment
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
Five-persona pre-release review of #886 (session traces in headless serve). No P0/HOLD; these are the actionable polish items: - trace-consumer.ts (flush): finalize sessions concurrently (Promise.allSettled) instead of sequentially. Each endTrace() can wait on an HttpExporter (5s timeout), so a sequential loop over N sessions could exceed a container's shutdown grace period and be SIGKILL'd mid-write. Now bounded to the slowest single trace. - trace-consumer.ts (reconnect): log the in-process stream reconnect at debug, not warn — it's automatic and not user-actionable, and warn-per-backoff spiked log aggregators during an outage. - serve.ts: comment clarifying that subscribeTraceConsumer's `directory` is the SDK workspace context, not the trace output dir (flagged by 4 reviewers). - docs: ide.md now lists session tracing as an IDE feature; trace.md crash- recovery notes that `serve` (not just `run`) finalizes traces on shutdown. Deferred (filed): #915 (container trace persistence), #916 (serve trace-dir startup log). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Sessions run through the headless server (
altimate serve) never write trace files. The VS Code "Altimate Code" chat panel drives all its sessions this way — it spawnsaltimate serve --port <n>and POSTs to/session/{id}/prompt_async— so chat sessions produce noses_*.jsonunder~/.local/share/altimate-code/traces/, while terminal sessions (TUI,run) do.Root cause
Session tracing is implemented at the client-entrypoint layer, not in the session engine:
cli/cmd/run.tsbuilds exporters and callsstartTrace()/endTrace()itselfcli/cmd/tui/worker.tssubscribes to the event stream and feeds per-session traces inlinecli/cmd/serve.tshad no trace wiring at allThe only hook in the shared session pipeline is
Tracer.active?.logSpan(...)(session/prompt.ts), andTracer.activeis only ever set by the two terminal entrypoints — in serve mode it is permanentlynull, so every span silently no-ops.Fix
altimate/observability/trace-consumer.ts(new) — extracts the TUI worker's inline event-stream→trace logic into a sharedTraceConsumer: config loading (tracing.*), per-session trace map,MAX_TRACESeviction, event feeding (message.updated,message.part.updated,session.updated,session.status),reset()/flush(). Also exportssubscribeTraceConsumer()— a standalone in-process/eventsubscription mirroring the worker's loop, for hosts that don't have one.cli/cmd/tui/worker.ts— rewired to the shared consumer; behavior unchanged (same events, same eviction, same shutdown flush).cli/cmd/serve.ts— startssubscribeTraceConsumer({ directory: process.cwd() })afterServer.listen(), so serve sessions write the same trace files as the terminal.session.status: idle(the user message is re-published with its generatedsummaryonce title generation completes). Finalizing inline on idle let the straggler re-create an empty trace whose finalization overwrote the rich<sessionId>.json. The consumer now schedules finalization 2s after idle; any event for the session during the window pushes the deadline back and is absorbed into the live trace (which also lands the generated title in trace metadata).Known limitation / future improvement
The 2s finalize grace window is a heuristic. There is no bus signal for "no more events for session X" —
session.status: idleis the engine's done-signal, but title generation completes after idle and re-publishes the user message. If that LLM call ever takes longer than the window, the straggler would still re-create an empty trace and overwrite the rich file. A deterministic guard is possible without engine changes: remember finalized sessions' known message IDs and only re-create a trace for events referencing an unseen message ID (a genuine new turn), dropping known-ID re-publishes. Deferred to keep this PR minimal.Verification
bun typecheckclean; full tracing suite (107 tests) passes unchangedtest/altimate/trace-consumer.test.ts(7 tests): full event sequence → completed trace file with generation/tool spans + metadata, interleaved sessions, post-idle straggler doesn't clobber the trace, malformed events never throw,tracing.enabled=falsewrites nothing,flush()/reset()semanticsmainthe traces dir stays empty; on this branchses_*.jsonis writtencodercom/code-server, fixed binary at/usr/local/bin/altimate, chat panel opened, agentic message sent (glob tool call + 2 generations). Resulting trace: 42KB,completed, spanssession/system-prompt/generation/tool(glob)/generation, auto-generated title in metadata, 62.8k tokens.Left: the chat session that was driven through the panel (glob tool call, model listing, 3*7). Right: a verification command run afterwards in the same container's integrated terminal —
lsof the traces directory proving the chat session'sses_*.jsonexists locally, plus anodeone-liner printing the trace's contents to show it's complete, not an empty shell:🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Tests