v15.10.10

Latest

@github-actions released this 09 Jun 21:36 · 23 commits to main since this release · fd51b8b

@oh-my-pi/pi-ai

Added

  • Exported wrapFetchForCch so non-streaming OAuth callers (e.g. the web-search provider) can patch the Claude Code billing-header cch attestation into their request bodies instead of shipping the cch=00000 placeholder.

Fixed

  • Fixed an unbounded, zero-backoff Codex WebSocket reconnect loop on websocket_connection_limit_reached: the no-content reconnect path never consulted the retry budget and never waited, hammering the endpoint forever when the limit is account-scoped. Reconnects are now budgeted and delayed like every other WS retry path, falling back to a single SSE replay when exhausted.
  • Fixed the Codex whitespace-loop breaker not observing degenerate frames that arrive after their item closed (or before it opened) — those frames count as stream progress, so the idle watchdogs never fired and the turn hung forever, which is exactly the failure mode the breaker exists for. Whitespace-loop recovery now also refuses to replay the turn once a toolcall_end was delivered, surfacing the error instead of re-emitting the same tool calls.
  • Fixed the two remaining Codex retry paths (WS mid-stream reconnect and the empty-content SSE fallback) leaking blockless native output items (e.g. web_search_call) from the failed attempt into the replayed turn's providerPayload and append baseline.
  • Fixed Codex WebSocket failure handling closing whatever connection currently occupies the session slot, including a concurrent caller's in-flight CONNECTING handshake, whose rejection (websocket closed before open) was classified as fatal and disabled WebSockets for the whole session. Failure cleanup now skips CONNECTING sockets and the pool re-joins replacement handshakes (bounded).
  • Fixed the Codex request transformer not repairing orphan custom_tool_call_output items (only function_call_output was folded into an assistant note) — a compaction splice that dropped an apply_patch call while keeping its result produced a hard 400 on the default GPT-5 Codex toolset.
  • Fixed processResponsesStream finalizing reasoning items via a bare itemId content scan instead of the routed entry: with id-less reasoning items (local hosts), every output_item.done matched the FIRST thinking block — the second item's text clobbered it and the second block was never finalized or signed.
  • Fixed processResponsesStream dropping tool calls and message text whose output_item.added event was lost (lossy proxies): toolcall_end was emitted with a dangling contentIndex while the call never entered message.content, so the agent loop silently never executed it. The done handler now synthesizes the missing block; still-open tool-call blocks are also final-parsed at response.completed so the toolUse override cannot hand the agent stale {} arguments.
  • Fixed response.incomplete with incomplete_details.reason: "content_filter" being reported as a token-cap truncation (stopReason: "length") — the agent loop's length recovery then asked the model to "shorten" a filtered prompt. Content-filtered turns now surface as errors; usage is also populated from response.failed events, and an unknown terminal status degrades to "stop" with a logged anomaly instead of throwing away a fully-streamed response.
  • Fixed Copilot premiumRequests accounting being dropped from failed/cancelled responses: populateResponsesUsageFromResponse replaced usage wholesale and the error path threw before the success-path re-apply. The populate now preserves the field.
  • Fixed deduplicateToolCallIds suffixing the whole composite Responses id (callId|itemId) — normalizeResponsesToolCallId extracts the first segment as the wire call_id at encode time, so both copies collapsed back onto one call_id and the request carried duplicate call/output pairs. The suffix and length budget now apply per segment.
  • Gated native history payload replay on api + model id in both Responses providers: after a mid-session model switch, reasoning items carrying encrypted content minted by the previous model were replayed verbatim under the new model. Replay now falls back to block re-encode (which already strips foreign signatures), matching transformMessages' same-model trust rule.
  • Fixed Azure OpenAI Responses requests omitting store: false while requesting reasoning.encrypted_content (stateless-only per OpenAI), replaying custom tool calls paired with mismatched function_call_output items (customCallIds was never threaded through), letting the SDK's internal retries (maxRetries 5) silently re-POST inside the explicit first-event deadline, and sending a prompt_cache_key when the caller opted out via cacheRetention: "none".
  • Fixed strict-pairing Responses backends (Azure, Copilot) silently discarding tool results whose call is absent from history — the result is now folded into an assistant note (same shape as orphan-output repair) so the model keeps the information.
  • Fixed the OpenAI Responses first-event watchdog staying armed across the onResponse notification callback (a slow callback aborted an already-connected stream), Copilot transient-model retries re-attempting on an already-aborted signal (instant dead retry surfacing the scheduler's AbortError), Codex reasoningSummary: null being coerced to "auto" (the documented omit-summary contract was unreachable), nested Codex error codes (response.error.code) being invisible to the connection-limit/previous-response recovery matchers, and the session id leaking unredacted into PI_CODEX_DEBUG logs via the x-client-request-id header.
  • Fixed processResponsesStream (shared by openai-responses and azure-openai-responses) ignoring the terminal response.incomplete event: a max-output-tokens-truncated response ended with stopReason: "stop", zero usage, and no cost instead of "length" with the reported token counts. response.incomplete is now handled alongside response.completed and counts as stream progress for the idle watchdogs.
  • Fixed custom tool-call content blocks keeping the transient partialJson accumulation buffer (and a potentially stale arguments.input) after response.output_item.done in the shared Responses stream processor — the function_call branch already cleaned these up.
  • Fixed two OpenAI Codex stream-retry paths (whitespace-loop recovery and retryable provider errors) leaking native output items from the abandoned attempt into the replayed turn's providerPayload — stale reasoning items completed before the failure were re-sent as history input on subsequent requests alongside the retry's own items.
  • Fixed the Codex WebSocket queue wiping already-received frames when a transport error arrived: a response.completed queued just before an eager server close was discarded, turning a finished response into a spurious websocket closed failure and a full request replay. Errors now append behind pending data frames.
  • Fixed concurrent getOrCreateCodexWebSocketConnection callers (prewarm racing the first request) tearing down each other's in-flight handshake — closing a CONNECTING socket rejected the other caller with a fatal websocket closed before open, disabling WebSockets for the entire session. Callers now join the pending handshake.
  • Stopped the Codex connection-limit recovery from replaying a turn over SSE after a toolcall_end had already been delivered to the consumer (canSafelyReplayWebsocketOverSse guard was bypassed, re-emitting the same tool calls); the error now surfaces instead.
  • Extended the Codex whitespace-only argument-delta circuit breaker to custom_tool_call_input.delta frames, which counted as stream progress and could keep a degenerate response alive forever with no cap on buffer growth.
  • Fixed Codex stream failures during transport open reporting a synthetic request dump (empty URL/body) instead of the real request, and a response.created event resetting the recorded time-to-first-token.
  • Fixed the Codex WebSocket connect watchdog timer leaking (pinning the event loop for up to 10s) when the request signal aborted before or during the handshake.
  • Fixed OpenRouter-hosted Anthropic adaptive reasoning models (Claude Fable/Mythos 5 and Opus 4.6+) so the catalog exposes xhigh; Fable/Mythos and Opus 4.7+ requests now map user high/xhigh onto OpenRouter's Anthropic xhigh/max effort scale.
  • Fixed an unknown Anthropic stop_reason failing the whole turn after the response had fully streamed. mapStopReason threw on unrecognized values, and because the reason arrives on the trailing message_delta, the error was unretryable. The live model_context_window_exceeded stop reason (default on Sonnet 4.5+) hit this path; it now maps to length, and any future unknown reason degrades to a logged anomaly plus a normal stop instead of an error.
  • Stopped clamping API-key Anthropic requests to Claude Code's 64k output cap. The CLAUDE_CODE_MAX_OUTPUT_TOKENS clamp exists to match the OAuth wire fingerprint, but buildParams applied it unconditionally, silently halving the output budget of 128k-output models (e.g. Opus 4.8) for API-key callers. OAuth requests keep the clamp.
  • Stopped a successful strict-tools fallback from shipping errorMessage on a stopReason: "stop" assistant message. After a grammar-too-large 400 triggered the non-strict retry, the original 400 text was kept on the final message even when the retry succeeded — consumers that treat errorMessage presence as failure (e.g. balance probes) misclassified the turn, and the stale text suppressed later refusal explanations. The fallback is now logged instead.
  • Fixed model-supplied User-Agent headers being silently dropped on non-OAuth Anthropic requests. enforcedHeaderKeys filtered the header out of modelHeaders in every branch but only the OAuth branch set one back; the Cloudflare-gateway, bearer-gateway, and X-Api-Key branches now forward the caller's value verbatim.
  • Stopped sending the fast-mode-2026-02-01 beta header once a session has learned the endpoint+model rejects fast mode (fastModeDisabled provider state), matching the already-dropped speed param.
  • Stopped buildAnthropicHeaders defaulting API-key requests onto the full Claude Code OAuth beta list (oauth-2025-04-20, claude-code-20250219, …). The claudeCodeBetas default is now OAuth-gated, matching the streaming path — the web-search header builder was the only caller hitting the default, so API-key search requests now carry just their own betas (e.g. web-search-2025-03-05). An empty anthropic-beta header is omitted entirely instead of being sent as an empty string.
  • Fixed image-bearing developer messages being upgraded to mid-conversation system turns on Opus 4.8+/Fable/Mythos 5. System content is text-only on the wire, so a developer turn carrying image blocks in an upgrade-eligible position produced a 400; it now stays a user message.
  • Fixed a spliced reconnect's second envelope overwriting the completed Anthropic message: message_delta was not gated by the terminal-stop flag (content events and duplicate message_start were), so the splice's stop_reason/usage replaced the finished turn's — a tool_use turn could be relabeled stop, and the harness then never executed the streamed tool calls. Post-terminal deltas are now logged as envelope anomalies and skipped.
  • Fixed a ping arriving before message_start consuming the Anthropic first-event watchdog: the stall was then classified as a terminal mid-stream idle timeout instead of a retryable first-event timeout. Pings no longer count as the first item but still refresh the idle deadline once content is flowing.
  • Fixed Anthropic-compatible proxies that omit usage/delta objects from message_start/message_delta/content_block_* envelopes crashing the turn with an unretryable TypeError; the missing payloads now degrade to logged envelope anomalies like every other malformed-frame case.
  • Fixed applyPromptCaching placing cache_control on thinking/redacted_thinking blocks — Anthropic rejects that with a 400. A thinking-only assistant turn inside the trailing cache window (e.g. followed by the synthetic Continue. pad) no longer receives a breakpoint.
  • Fixed consecutive assistant params reaching the wire when an empty user/developer turn between two assistant turns was dropped by the converter (e.g. an empty "nudge" submission after a length-truncated reply); Anthropic 400s on non-alternating assistant turns, and the broken triple replayed on every subsequent request. A user: "Continue." separator is now inserted, mirroring the trailing-prefill fallback.
  • Fixed supportsAdaptiveThinkingDisplay misparsing bare dated Opus ids: claude-opus-4-20250514 (Opus 4.0) was parsed with the date stamp 20250514 as its minor version, so it compared as ≥ 4.7, which silently dropped the interleaved-thinking-2025-05-14 beta for API-key Opus 4.0 requests.
  • Fixed output_config.effort shipping without the effort-2025-11-24 beta on thinking-off requests against adaptive-only Claude models (the effort:"low" pin), and the mid-conversation system role shipping without mid-conversation-system-2026-04-07 on API-key and OAuth-utility requests; both betas are now added whenever the request can carry the corresponding field.
  • Fixed GitHub Copilot anthropic-messages requests going out with no Content-Type and no anthropic-version header — the copilot branch builds its headers from scratch and Bun's fetch does not default Content-Type for string bodies. Both headers are now pinned to match every other branch.
  • Fixed Anthropic client/provider retry multiplication: with the first-event watchdog disabled (PI_STREAM_FIRST_EVENT_TIMEOUT_MS=0), the client's internal maxRetries: 5 reactivated and stacked with the provider loop's 3 retries — up to 24 wire attempts with double backoff. The provider now pins per-request maxRetries: 0 unconditionally.
  • Fixed AnthropicMessagesClient spreading fetchOptions after the core request fields, letting a caller-supplied signal/method/body silently disconnect the timeout controller or corrupt the request. Transport extras (TLS) still pass through; core fields now always win.
  • Fixed Foundry mTLS/CA material being cached for the process lifetime when the env vars point at files: the cache key now folds in the file mtime so on-disk certificate rotation takes effect.
  • Fixed the Claude Code fingerprint version drifting across surfaces: the usage endpoint (claude-cli/2.1.160) and OAuth bootstrap (claude-code/2.1.160) pinned a stale version while /v1/messages reported 2.1.165; both now derive from claudeCodeVersion.
  • Fixed a system prompt that merely mentions x-anthropic-billing-header: mid-text suppressing the entire Claude Code system-block injection (billing header, instruction, and cch attestation); the resumed-session guard now anchors with startsWith.
  • Fixed lone surrogates in cross-API tool-call arguments reaching Anthropic's strict UTF-8 validation: replayed OpenAI/Google-origin tool_use.input string leaves are now deep-sanitized with toWellFormed(), while same-API Anthropic arguments stay byte-identical to keep prompt-cache prefixes stable.
  • Bounded the many-image resize fan-out to 4 concurrent decodes (it previously decoded every oversized image at once, two encode pipelines each — multi-GB transient memory at the 20+-image threshold that activates the feature).
  • Fixed mergeHeaders merging case-sensitively on the Copilot/client-options path, where a miscased user-configured header (e.g. authorization next to the synthesized Authorization) survived as two keys that the Headers constructor joins comma-separated on the wire.
  • Hardened the Anthropic stream lifecycle: prologue failures (e.g. a malformed Copilot credential in buildCopilotDynamicHeaders) and error-finalization failures now surface as an error event instead of an unhandled rejection that left stream.result() hanging forever; the spurious "cch billing placeholder not patched" warning no longer fires when the placeholder only appears in user content.
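
The "up to 24 wire attempts" figure in the retry-multiplication fix above follows from multiplying the attempts of the two layers. A minimal worked sketch, assuming each layer makes one initial attempt plus its configured retries; worstCaseWireAttempts is a hypothetical helper for illustration, not part of @oh-my-pi/pi-ai:

```typescript
// Hypothetical helper (not a package export): worst-case number of HTTP
// requests when an outer retry loop wraps a client that also retries.
function worstCaseWireAttempts(clientMaxRetries: number, providerRetries: number): number {
  const attemptsPerProviderTry = clientMaxRetries + 1; // initial request + client retries
  const providerTries = providerRetries + 1; // initial try + provider-loop retries
  return attemptsPerProviderTry * providerTries;
}

// Client maxRetries: 5 stacked under the provider loop's 3 retries.
const stackedAttempts = worstCaseWireAttempts(5, 3); // (5 + 1) * (3 + 1) = 24

// With per-request maxRetries pinned to 0, only the provider loop retries.
const pinnedAttempts = worstCaseWireAttempts(0, 3); // (0 + 1) * (3 + 1) = 4
```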
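
The dated-Opus-id misparse fixed above can be illustrated with a small parser that refuses to read a date stamp as a version number. This is a sketch under stated assumptions: parseOpusVersion and isAtLeast are hypothetical names, the id shape claude-opus-<major>[-<minor>][-<YYYYMMDD>] is assumed from the example in the notes, and the actual supportsAdaptiveThinkingDisplay logic may differ.

```typescript
interface OpusVersion {
  major: number;
  minor: number;
}

// Hypothetical parser: treats any 8-digit segment as a release date
// (YYYYMMDD) rather than a version component.
function parseOpusVersion(modelId: string): OpusVersion | null {
  const prefix = "claude-opus-";
  if (!modelId.startsWith(prefix)) return null;
  const segments = modelId.slice(prefix.length).split("-");
  const versionParts = segments.filter((s) => /^\d+$/.test(s) && s.length < 8);
  if (versionParts.length === 0) return null;
  return {
    major: Number(versionParts[0]),
    minor: Number(versionParts[1] ?? "0"), // a bare major means minor 0
  };
}

// True when v is at or above major.minor.
function isAtLeast(v: OpusVersion, major: number, minor: number): boolean {
  return v.major > major || (v.major === major && v.minor >= minor);
}

const datedOpus = parseOpusVersion("claude-opus-4-20250514"); // { major: 4, minor: 0 }
const datedIsAdaptive = datedOpus !== null && isAtLeast(datedOpus, 4, 7); // false
```

Filtering on segment length is one of several possible guards; an explicit date pattern would work equally well.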
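
For the Codex WebSocket queue fix above (errors now append behind pending data frames), the ordering rule might look roughly like the following. FrameQueue and its methods are hypothetical names for illustration only; the real queue is not shown in these notes.

```typescript
type QueueItem =
  | { kind: "frame"; data: string }
  | { kind: "error"; error: Error };

// Hypothetical sketch: a transport error is queued behind frames that were
// already received, instead of wiping them.
class FrameQueue {
  private items: QueueItem[] = [];

  pushFrame(data: string): void {
    this.items.push({ kind: "frame", data });
  }

  pushError(error: Error): void {
    // Append only; frames received earlier stay deliverable.
    this.items.push({ kind: "error", error });
  }

  // Returns the next frame, undefined when empty, or throws the queued
  // error once every earlier frame has been consumed.
  next(): string | undefined {
    const item = this.items.shift();
    if (item === undefined) return undefined;
    if (item.kind === "error") throw item.error;
    return item.data;
  }
}

const frameQueue = new FrameQueue();
frameQueue.pushFrame("response.completed");
frameQueue.pushError(new Error("websocket closed"));
const firstDelivered = frameQueue.next(); // the completed frame still arrives first
```

With this ordering, a consumer that stops reading after response.completed never observes the trailing close error.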
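
The mergeHeaders fix above comes down to keying the merge on the lowercased header name. A minimal sketch, assuming later sources should win and the surviving key keeps the casing of the winning source; mergeHeadersCaseInsensitive is a hypothetical stand-in, not the package's mergeHeaders:

```typescript
type HeaderMap = Record<string, string>;

// Hypothetical sketch: a later header replaces any earlier header whose name
// differs only by case, so exactly one key per header reaches the wire.
function mergeHeadersCaseInsensitive(...sources: HeaderMap[]): HeaderMap {
  // lowercased name -> [name as written by the winning source, value]
  const byLowerName = new Map<string, [string, string]>();
  for (const source of sources) {
    for (const [name, value] of Object.entries(source)) {
      byLowerName.set(name.toLowerCase(), [name, value]);
    }
  }
  const result: HeaderMap = {};
  byLowerName.forEach(([name, value]) => {
    result[name] = value;
  });
  return result;
}

const mergedHeaders = mergeHeadersCaseInsensitive(
  { Authorization: "Bearer synthesized", "Content-Type": "application/json" },
  { authorization: "Bearer user-configured" },
);
// mergedHeaders has a single authorization entry instead of two miscased keys.
```

Which side should win is a policy choice this sketch does not settle; the point is only that both spellings collapse onto one key before the Headers constructor sees them.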

@oh-my-pi/pi-coding-agent

Added

  • Added a read-only view op to the todo tool that echoes the current list without mutating state, so the agent can recover exact task text instead of guessing it from memory.

Changed

  • Rewrote the bash tool's coreutils guidance (tool prompt and system prompt) around an explicit litmus: pipelines that compute a new fact (wc -l, sort | uniq -c, comm, diff) are legitimate bash, while commands that merely move, page, or trim bytes a dedicated tool can fetch remain banned — output trimming destroys data the artifact:// capture would have saved.

Fixed

  • Fixed the model selector dropping an immediate Enter when cached models were available but the selector's offline refresh was still pending.
  • Fixed dynamic import(...) inside functions passed to the browser tool's tab.evaluate/page.evaluate failing with __omp_import__ is not defined. The eval/browser JS runtime rewrites dynamic-import callees to the worker-injected __omp_import__ helper, but puppeteer serializes evaluate callbacks with Function.prototype.toString() and re-runs them inside the page, where the helper does not exist. The rewriter now substitutes a guarded shim that falls back to native dynamic import when the helper is absent, so serialized code works in the page realm while in-worker imports keep resolving against the session cwd.
  • Transcript block freezing is now unconditional instead of gated on ED3-risk terminal detection: every finalized block replays its frozen snapshot once it crosses out of the live region, on all terminals including Windows, because the rewritten renderer's committed scrollback is immutable everywhere. Still-mutating blocks (pending tools, streaming messages, async thinking renderers) anchor the live region and keep repainting until they finalize, which structurally fixes stale/duplicated output from late async expansions (#1823).
  • Fixed the edit tool's post-edit diff preview occasionally echoing a context line twice with out-of-order numbering. Block-boundary context injection classified space-prefixed diff rows as old-file-only, so an unchanged line sitting in a net-offset region (old N / new N+k) was missing from the new file's visibility window; findBlockContextLines then re-surfaced it under its post-edit number and the row was spliced in after the adjacent change run. New-file boundary lines are now translated back to pre-edit numbers (the compact-preview renumbering contract) and merged into a single old-numbered insertion pass — also fixing closers below a net-offset edit being dropped or renumbered incorrectly.
  • Fixed the Anthropic web-search provider claiming the Claude Code identity on API-key requests: the CC billing header + system instruction were injected whenever the model wasn't Haiku 3.5, regardless of auth mode. Injection is now OAuth-gated like the streaming path, and OAuth search requests patch the billing header's cch attestation (via wrapFetchForCch) instead of shipping the cch=00000 placeholder.
  • Fixed long streamed content appearing cut off mid-run: scrolled-off rows were erased from the viewport without ever being appended to terminal history. The transcript's commit boundary (deriveLiveCommitState) was all-or-nothing per block — one perpetually rewriting row (a task tool's ticking progress tree, per-agent cost/tool counters, spinner stats) suspended scrollback commits for the entire block, so once the block outgrew the viewport its static head (e.g. a task's prompt/context markdown) was neither committed nor on screen until the tool sealed, and was lost outright if the session ended mid-run. A stable-prefix ratchet now promotes leading rows that stayed visibly identical for a full 30-frame window as commit-safe, so the settled head reaches native scrollback while only the genuinely volatile tail stays deferred; a rewrite above the promoted run retreats the boundary and the engine audit recommits (duplication, never loss).
  • Fixed local tiny-title worker stdout/stderr leaking raw native model output such as </title> and cache/status lines into the interactive TUI scrollback (#2206).
  • Fixed task-agent discovery advertising Claude Code custom agents from .claude/agents/*.md as OMP subagents; direct task-agent discovery now only loads OMP-native .omp agent roots, while Claude marketplace plugin agents keep their existing provider path (#2209).
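
The stable-prefix ratchet described in the cut-off-content fix above can be sketched as a per-row counter of consecutive unchanged frames. This is an illustration under assumptions, not the deriveLiveCommitState implementation: StablePrefixRatchet is a hypothetical name, rows are compared as plain strings, and the example uses a 3-frame window where the notes state 30.

```typescript
// Hypothetical sketch: leading rows that stayed identical for `window`
// consecutive frames are reported as commit-safe.
class StablePrefixRatchet {
  private readonly window: number;
  private previousRows: string[] = [];
  private stableFrames: number[] = [];

  constructor(window: number) {
    this.window = window;
  }

  // Feed one rendered frame; returns how many leading rows are commit-safe.
  observe(rows: string[]): number {
    const counts: number[] = [];
    for (let i = 0; i < rows.length; i++) {
      const unchanged = i < this.previousRows.length && this.previousRows[i] === rows[i];
      counts.push(unchanged ? (this.stableFrames[i] ?? 0) + 1 : 0);
    }
    this.previousRows = rows.slice();
    this.stableFrames = counts;

    // The boundary stops at the first row that has not been stable long enough.
    let boundary = 0;
    while (boundary < counts.length && (counts[boundary] ?? 0) >= this.window) boundary++;
    return boundary;
  }
}

const ratchet = new StablePrefixRatchet(3);
const boundaries = [
  ratchet.observe(["task prompt", "progress 1"]),
  ratchet.observe(["task prompt", "progress 2"]),
  ratchet.observe(["task prompt", "progress 3"]),
  ratchet.observe(["task prompt", "progress 4"]), // head stable for 3 frames: promoted
  ratchet.observe(["TASK PROMPT", "progress 5"]), // rewrite above the boundary: retreat
];
```

The ticking second row never accumulates stability, so only the static head is promoted, and a rewrite of the head pulls the boundary back to zero.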

Removed

  • Removed the clearOnShrink setting and its PI_CLEAR_ON_SHRINK environment variable: the rewritten renderer always clears shrunken rows exactly, so the flicker/perf tradeoff the setting controlled no longer exists. Existing config entries are ignored.
  • Removed the prompt-submit native-scrollback reconciliation checkpoint and the eager streaming render mode from the interactive controllers — the renderer's append-only contract made both obsolete.

@oh-my-pi/pi-tui

Fixed

  • Fixed committed transcript rows silently vanishing when a component re-laid-out content the engine had already scrolled into native history — a TTSR stream rewind truncating a streamed block, or the image budget demoting a committed inline image to its one-line fallback, shifted every row below by the height delta and the engine kept committing from the stale index, skipping that many rows of everything after (missing interruption banners, half-cut images in scrollback). The engine now audits its committed prefix every ordinary frame: an in-place edit or restyle keeps its alignment (stale styling in history remains the accepted artifact), while any shift re-anchors the commit index at the first moved row and recommits from there — history keeps the stale copy and gains a fresh one. Duplication, never loss. The detector (findCommittedPrefixResync, exported for the stress harness's shadow ledger) samples the prefix tail SGR-stripped so theme restyles and single-row edits never trigger spurious recommits.
  • Fixed budget-demoted inline images shrinking their transcript block: the text fallback is now height-preserving once a graphic has rendered (reserved rows plus the fallback line), so demotion never shifts content below a committed image.
  • Fixed stale trailing cells bleeding into committed history on combining-heavy rows: the native width model can over-count Arabic/combining clusters, classifying a short-rendering row as full-width and skipping the trailing erase — the previous occupant's cells then scrolled into scrollback baked into the committed row. Non-ASCII row rewrites now erase the line before writing.
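
The committed-prefix audit in the first fix above distinguishes in-place edits from shifts. One way to approximate that distinction is sketched below, assuming a tail-sampling heuristic: if at most one SGR-stripped row in the sampled tail differs, the prefix is treated as still aligned; otherwise the first differing row is reported as the re-anchor point. findFirstMovedRow, the sample size, and the one-mismatch tolerance are all assumptions for illustration and are not taken from findCommittedPrefixResync.

```typescript
// Matches SGR (color/style) escape sequences such as "\x1b[31m".
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

function stripSgr(row: string): string {
  return row.replace(SGR_PATTERN, "");
}

// Hypothetical detector: returns the index of the first moved row, or null
// when the committed prefix still lines up with the current frame.
function findFirstMovedRow(committed: string[], current: string[], sampleSize = 4): number | null {
  const differs = (i: number): boolean =>
    stripSgr(committed[i] ?? "") !== stripSgr(current[i] ?? "");

  // Sample only the tail of the committed prefix.
  const start = Math.max(0, committed.length - sampleSize);
  let mismatches = 0;
  for (let i = start; i < committed.length; i++) {
    if (differs(i)) mismatches++;
  }
  // Zero or one differing sampled rows reads as a restyle or in-place edit.
  if (mismatches <= 1) return null;

  for (let i = 0; i < committed.length; i++) {
    if (differs(i)) return i;
  }
  return null;
}

const committedRows = ["a", "b", "c", "d", "e"];
const restyled = findFirstMovedRow(committedRows, ["a", "\x1b[31mb\x1b[0m", "c", "d", "e"]); // null
const shifted = findFirstMovedRow(committedRows, ["a", "inserted", "b", "c", "d", "e"]); // 1
```

In the shifted case a caller would recommit from row 1 onward, which duplicates rows in history but never drops them.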

Changed

  • Rewrote the render core around an append-only native-scrollback contract. Committed rows are immutable: rows enter terminal history exactly once, in order, when the component-reported commit boundary (NativeScrollbackLiveRegion) marks them final, and the visible window repaints in place with relative moves. The engine no longer probes the terminal's scroll position or guesses whether a destructive rebuild is safe — the entire ED3-risk/defer/checkpoint machinery (viewport probes, eager streaming mode, dirty-scrollback reconciliation, deferred shrink/mutation intents, streaming high-water rebuilds, ConPTY-specific defer paths) is deleted. ED3 (CSI 3 J) now fires only on explicit user gestures: session replace, resize outside multiplexers, and resetDisplay(). This structurally removes the yank / flash / duplicated-rows / invisible-until-resize failure families tracked across #1610, #1635, #1651, #1682, #1719, #1746, #1799, #1823, #1962, #1974, #2000, #2011, #2154.
  • A frame that shrinks into its committed prefix re-anchors the visible window at the new tail and restarts commit bookkeeping; previously committed rows stay in history (history is never rewritten without a gesture).
  • Overlays now composite into the visible window slice only and freeze commits while visible, so overlay pixels can never enter native scrollback and closing an overlay no longer triggers a destructive history rebuild.
  • Inline-image budget demotion now deletes the demoted image's graphics by id and lets the window diff repaint the text fallback — no more mid-session destructive full replay when the image cap is exceeded.
  • The render-stress harness now validates the contract with a shadow commit ledger (an independent reimplementation of the ledger math fed only by observed frames and bytes), asserting scrollback equals the committed prefix row-for-row and that tape growth matches physical scroll exactly, across randomized op sequences, resizes, overlays, and multiplexer scenarios. The ghostty-web virtual terminal additionally survives libghostty-vt 0.4's WASM allocator traps via an event-log replay/compaction recovery, and strips non-spacing combining marks on input (a margin-aligned combining cluster deterministically corrupts that engine; mark placement through it was already unverifiable).

Removed

  • Removed the probe/defer API surface: TUI.setEagerNativeScrollbackRebuild(), TUI.refreshNativeScrollbackIfDirty(), TUI.setClearOnShrink()/getClearOnShrink(), RenderRequestOptions.allowUnknownViewportMutation, NativeScrollbackRefreshOptions, Terminal.isNativeViewportAtBottom(), Terminal.hasEagerEraseScrollbackRisk(), and the eagerEraseScrollbackRisk/submitPinsViewportToTail capability fields with their detectors.
  • Removed the PI_TUI_ED3_SAFE, PI_CLEAR_ON_SHRINK, and PI_TUI_DEBUG environment variables (the levers they tuned no longer exist; PI_DEBUG_REDRAW now logs the commit-ledger state per frame).

What's Changed

  • fix(tui): suppress tiny-title worker output by @roboomp in #2207
  • fix(agent): skip Claude Code custom agents in task discovery by @roboomp in #2210

Full Changelog: v15.10.9...v15.10.10