v16.0.5

github-actions released this 17 Jun 13:48
· 580 commits to main since this release
7251ca1

@oh-my-pi/pi-agent-core

Breaking Changes

  • Changed AgentOptions.getApiKey and AgentLoopConfig.getApiKey to receive the active Model and return an API key or ApiKeyResolver, so credential routing stays model-scoped and retry context is no longer exposed through the agent-core API
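
A minimal sketch of what a model-scoped getApiKey callback could look like after this change. The Model fields, the ApiKeyResolver shape (assumed here to be an async function), and makeGetApiKey are illustrative assumptions, not the package's actual type definitions.

```typescript
// Hypothetical shapes for illustration; the real pi-agent-core types may differ.
interface Model {
  provider: string;
  id: string;
}

// Assumption: a resolver is a function that yields a key on demand.
type ApiKeyResolver = () => Promise<string | undefined>;

// The callback receives the active model and returns a key or a resolver.
type GetApiKey = (model: Model) => string | ApiKeyResolver | undefined;

// Routes credentials by the active model's provider, so each model only
// ever sees the key that belongs to it.
function makeGetApiKey(keys: Record<string, string>): GetApiKey {
  return (model) => keys[model.provider];
}
```

Callers that need rotation or retry-time refresh would return a resolver instead of a plain string; how the agent invokes that resolver is not specified in these notes.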

Added

  • Added agent-loop deadline support for graceful wall-clock session stops.
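
One way a wall-clock deadline can stop a loop gracefully is to check it between turns rather than interrupting a turn in progress. The sketch below assumes that behavior; Deadline, deadlineAfter, and runTurns are hypothetical names, not the agent-core API.

```typescript
// Hypothetical helper names; a sketch of a between-turns deadline check.
interface Deadline {
  expired(): boolean;
}

// Builds a deadline `seconds` from now. The clock is injectable for testing.
function deadlineAfter(seconds: number, now: () => number = Date.now): Deadline {
  const stopAt = now() + seconds * 1000;
  return { expired: () => now() >= stopAt };
}

// Runs turns in order until they are exhausted or the deadline has passed.
// The check happens before each turn, so a started turn always completes.
function runTurns(turns: Array<() => void>, deadline: Deadline): number {
  let completed = 0;
  for (const turn of turns) {
    if (deadline.expired()) break;
    turn();
    completed++;
  }
  return completed;
}
```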

Changed

  • Changed Gemini repetition-loop detection to live in the pi-ai stream layer instead of the agent loop. The agent no longer runs its own Gemini-gated verbatim repetition check (detectRepetition/truncateRepetition); loops now surface as a retryable transient stream error that the standard auto-retry path discards and re-samples, rather than a committed contentful error message.

Fixed

  • Fixed PI_DIALECT=minimax being ignored by the owned tool-calling env selector. (#2759)

@oh-my-pi/pi-ai

Added

  • Added antigravityEndpointMode stream option with auto, production, and sandbox values to control Antigravity endpoint routing
  • Added seedApiKeyResolver for reusing a pre-resolved request key while preserving resolver-driven auth retry and credential rotation
  • Added optional contextSnapshot property to AssistantMessage with token usage metadata via new ContextSnapshot interface (promptTokens, nonMessageTokens, and optional lastMessageTimestamp)
  • Added LITELLM_BASE_URL guidance to the LiteLLM login prompt so non-default proxy endpoints are discoverable. (#2726)
  • Added a Gemini thinking-loop guard that watches streamed thinking deltas for degenerate reasoning loops — verbatim tail repetition and near-duplicate paragraph cycling — and terminates the stream with a retryable, empty-content error message (worded as a transient stream stall) so the turn is discarded and re-sampled instead of committing a runaway transcript. Gated to Gemini models across every transport (OpenRouter, direct Google, Vertex) and disarmed once visible answer text or a tool call starts; disable with PI_NO_THINKING_LOOP_GUARD=1.
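
The verbatim tail-repetition half of the guard can be illustrated with a small detector that checks whether the accumulated thinking text ends in the same chunk repeated several times. This is a sketch only: the function name and all thresholds are assumptions, and the near-duplicate paragraph check is not covered.

```typescript
// Returns true when `text` ends with one chunk repeated at least `minRepeats`
// times back to back. Chunk sizes and the repeat threshold are illustrative.
function hasVerbatimTailRepetition(
  text: string,
  minChunk = 20,
  maxChunk = 200,
  minRepeats = 3,
): boolean {
  for (let size = minChunk; size <= maxChunk; size++) {
    // Not enough text left to hold `minRepeats` copies of this chunk size.
    if (text.length < size * minRepeats) break;
    const chunk = text.slice(text.length - size);
    let repeats = 1;
    let end = text.length - size;
    // Walk backwards while the preceding window equals the tail chunk.
    while (end >= size && text.slice(end - size, end) === chunk) {
      repeats++;
      end -= size;
    }
    if (repeats >= minRepeats) return true;
  }
  return false;
}
```

In a streaming setting the detector would be run on the accumulated thinking deltas and, per the note above, stopped once visible answer text or a tool call begins.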

Changed

  • Changed the Antigravity (google-antigravity) request builder to mirror the captured antigravity/hub client: gemini-3.x requests now send thinkingConfig.thinkingBudget per tier, a fixed per-model maxOutputTokens, a default functionCallingConfig.mode: "VALIDATED" tool mode (auto/unset tool choice only), a role: "user" system instruction, a structured requestId (agent/<id>/<ts>/<trajectoryId>/<step>), and labels (model_enum, trajectory_id, last_step_index, last_execution_id, used_claude*) tracked across the conversation via provider session state.

Fixed

  • Fixed Gemini usage-tier mapping so gemini-3.5-flash is treated as Flash and gemini-3.1-pro plus gemini-pro-agent are treated as Pro in usage accounting
  • Fixed Antigravity stream state handling so a request’s last_execution_id is committed only after a successful completion and cleared between retry attempts
  • Fixed streamSimple() Gemini streams to run through the thinking-loop guard for custom API and pi-native transports, so degenerate thinking loops now abort with the same retryable empty-content error path as other Gemini stream paths
  • Fixed Antigravity model streaming and usage fetch paths to retry on transient 429/5xx errors by failing over to the alternate endpoint before surfacing an error
  • Fixed Antigravity endpoint tracking to prefer a previously successful endpoint in auto mode for subsequent requests
  • Fixed Antigravity and Gemini CLI model requests failing with an opaque error when Google requires account verification. Cloud Code Assist 403 VALIDATION_REQUIRED responses now surface the validation_url and the signed-in account email when available, so users see an actionable account-verification message instead of the raw API error body.
  • Fixed MiniMax M3 in-band tool calls by adding a MiniMax dialect that parses <minimax:tool_call> wrappers instead of falling back to generic XML. (#2759)
  • Fixed GitHub Copilot OAuth for Business seats by storing the login-discovered API endpoint and routing model enablement plus chat requests to that endpoint. (#2876)
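
The Antigravity failover and endpoint-preference fixes above can be sketched as a small router: in auto mode it tries the last endpoint that succeeded, then the alternate one, and only fails over on transient errors. EndpointRouter and TransientError are hypothetical names, and trying production first when no endpoint has succeeded yet is an assumption.

```typescript
// Hypothetical types and names; a sketch of the described behavior only.
type Endpoint = "production" | "sandbox";
type EndpointMode = "auto" | Endpoint;

// Stands in for transient failures such as 429 or 5xx responses.
class TransientError extends Error {}

class EndpointRouter {
  private lastGood: Endpoint | undefined = undefined;
  private readonly mode: EndpointMode;

  constructor(mode: EndpointMode) {
    this.mode = mode;
  }

  // Forced modes use a single endpoint; auto prefers the last successful one.
  candidates(): Endpoint[] {
    if (this.mode !== "auto") return [this.mode];
    const first: Endpoint = this.lastGood ?? "production"; // assumption
    const second: Endpoint = first === "production" ? "sandbox" : "production";
    return [first, second];
  }

  // Tries each candidate in order; non-transient errors surface immediately.
  async request<T>(send: (endpoint: Endpoint) => Promise<T>): Promise<T> {
    let lastError: unknown;
    for (const endpoint of this.candidates()) {
      try {
        const result = await send(endpoint);
        this.lastGood = endpoint; // remember for subsequent requests
        return result;
      } catch (error) {
        if (!(error instanceof TransientError)) throw error;
        lastError = error; // transient: fall through to the alternate
      }
    }
    throw lastError;
  }
}
```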

@oh-my-pi/pi-catalog

Added

  • Added enableGeminiThinkingLoopGuard to OpenAI compatibility options to allow explicit opt-in or opt-out of the Gemini thinking-loop guard for OpenAI-compatible model aliases
  • Added LITELLM_BASE_URL as the LiteLLM provider discovery base URL fallback, with discovery caches scoped by the resolved proxy URL and explicit provider baseUrl config kept at higher precedence. (#2726)
  • Added ThinkingConfig.effortBudgets (per-effort thinking-budget contract baked into collapsed variants) and ANTIGRAVITY_MODEL_WIRE_PROFILES (maxOutputTokens + model_enum per Antigravity wire id) to mirror the captured Antigravity Cloud Code Assist client request shape.

Changed

  • Defaulted enableGeminiThinkingLoopGuard from Gemini family detection for both OpenAI completions and responses compatibility specs so Gemini models now enable the thinking-loop guard automatically
  • Updated the default Gemini CLI user-agent version fallback to 0.46.0.
  • Changed the Antigravity (google-antigravity, daily-cloudcode-pa) gemini-3.x collapse families to the budget thinking transport with the client's per-tier thinkingBudget (3.5 Flash low/medium/high = 1000/4000/10000, 3.1 Pro low/high = 1001/10001) and corrected 3.5 Flash effort→wire routing (medium → gemini-3.5-flash-low, high → gemini-3-flash-agent). Split the shared CCA collapse table so google-gemini-cli (cloudcode-pa) keeps the google-level thinkingLevel transport for official Gemini CLI parity. Stale collapsed snapshots (bundled catalog, recycled gemini-3-flash alias) self-heal from the hand table at collapse time, and the model cache schema is bumped to v7 to invalidate pre-budget Antigravity rows.
  • Changed the Antigravity user-agent to the antigravity/hub/<version> format (default 2.1.4) to match the captured client.
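
For reference, the per-tier budgets listed in the Antigravity collapse-family change above can be written as a lookup table. The numbers come from that note; the table shape, key names, and thinkingBudgetFor helper are illustrative and do not reflect the catalog's actual schema.

```typescript
type Effort = "low" | "medium" | "high";

// Budgets as stated in the release note; the structure is an assumption.
const ANTIGRAVITY_THINKING_BUDGETS: Record<string, Partial<Record<Effort, number>>> = {
  "gemini-3.5-flash": { low: 1000, medium: 4000, high: 10000 },
  "gemini-3.1-pro": { low: 1001, high: 10001 },
};

// Returns the thinkingBudget for a family and effort, or undefined when the
// family has no tier for that effort (3.1 Pro lists no medium tier).
function thinkingBudgetFor(family: string, effort: Effort): number | undefined {
  return ANTIGRAVITY_THINKING_BUDGETS[family]?.[effort];
}
```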

Fixed

  • Fixed off effort routing for claude-opus-4-5 and claude-opus-4-6 to use their base model IDs when thinking is disabled
  • Fixed gemini-2.5-flash effort routing so all non-off effort levels resolve to gemini-2.5-flash-thinking
  • Fixed shared variant alias provider resolution so resolveBareVariantAlias reports all matching providers when model aliases are present in both CCA collapse tables
  • Routed google-antigravity default baseUrl to the stable primary daily endpoint in the catalog generator and all fallback snapshots, resolving connection drops on heavy queries.
  • Fixed MiniMax M3 dialect selection so MiniMax-family OpenAI-compatible models use the MiniMax tool-call dialect instead of generic XML. (#2759)
  • Fixed GitHub Copilot dynamic discovery to honor plan-specific API endpoints stored in structured OAuth credentials. (#2876)

@oh-my-pi/pi-coding-agent

Added

  • Added tui.tight setting (default false) to enable tight layout by removing the 1-character horizontal padding from terminal output.
  • Added a providers.antigravityEndpoint setting (auto, production, sandbox) to control google-antigravity routing for chat, search, image, and discovery calls
  • Added automatic endpoint-mode support for google-antigravity provider calls so users can force production-only or sandbox-only usage
  • Added images.describeForTextModels option (default true) to control automatic image description for attachments sent to models without vision input
  • Added automatic vision fallback prompts to describe images for text-only models
  • Added advisor.immuneTurns setting (default 1) to limit how often advisor concern/blocker notes can interrupt the primary agent.
  • Added a main-session session_stop extension event with continuation feedback and an 8-continuation loop cap (#2834).
  • Added --max-time <seconds> so CLI sessions can stop after a wall-clock deadline.

Changed

  • Changed google-antigravity usage report lookups to honor the selected antigravity endpoint mode when resolving the reporting base URL
  • Changed context usage reporting to always return numeric token counts and percentages, so the status line and footer now show estimated values instead of ? immediately after compaction
  • Changed context usage reporting to use anchored snapshots and pending-prompts estimates, which now keeps /context, status line, and model selector token counts in sync

Fixed

  • Fixed Matplotlib figure display to emit PNG output immediately when display(fig) is called, even if the figure is closed before the end-of-cell flush
  • Fixed persisted tool-result image payloads in details.images to externalize and resolve through the session blob store, so generated-image details survive resume without stale blob refs or truncation
  • Fixed duplicate Matplotlib image output by skipping the automatic end-of-request figure flush for figures that were already displayed through display(fig)
  • Fixed google-antigravity image generation and web search requests to fail over to the alternate antigravity endpoint on 429/server/network failures instead of stopping at the first endpoint
  • Fixed context usage breakdown to use a completed assistant usage anchor from the current turn instead of a pending prompt snapshot so totals no longer overcount when a large in-turn tool step returns usage
  • Fixed side-channel turns and advisor requests to keep using credential resolvers during retries, so Google Resource exhausted 429s can rotate to the next account instead of surfacing a terminal error banner
  • Fixed context token accounting to keep branch-local anchors during branching so sibling-branch messages no longer pollute context estimates
  • Fixed context usage consistency so /context, status line, and idle compaction logic now report the same used-token totals
  • Fixed status-line context cache invalidation when assistant reasoning signature data grows so displayed context usage updates accurately
  • Fixed the status-line context% reading inflated during long tool turns and then dropping sharply on the next message even though no compaction ran. While a request was in flight, getContextBreakdown summed a cl100k estimate of the entire tail on top of the stale turn-start prompt and never re-anchored to completed in-turn steps; it now prefers the real provider prompt-token count of any step that resolves at or after the pending cutoff. The status-line memo also keys on a contextUsageRevision that bumps when the in-flight snapshot is set/cleared, so a mid-turn estimate is invalidated on turn end/abort instead of surviving into idle until the next message
  • Fixed image attachment handling for text-only models by saving attachments to local:// and injecting generated descriptions so they are no longer lost when the target model cannot process images
  • Fixed the ssh tool rejecting valid Windows identity files before invoking OpenSSH by skipping Unix mode-bit key validation on native Windows (#2850).
  • Fixed web_search/omp q aborting before any provider ran when the global Settings singleton was not initialized; executeSearch now reads providers.antigravityEndpoint once and tolerates an uninitialized settings store instead of throwing
  • Fixed the new git.enabled and images.describeForTextModels settings declaring section groups (Git, Vision) that were not registered in TAB_GROUPS, so they now render in their intended settings-panel sections
  • Fixed Python display(fig) for Matplotlib figures to emit PNG output immediately, even when user code closes the figure before the end-of-cell flush.
  • Fixed persisted tool-result image payloads stored in details.images to externalize and resolve through the session blob store, so generated-image details survive resume without stale blob refs or truncation.
  • Fixed the tools.format setting schema so minimax can be selected as an owned tool-calling dialect, and taught auto mode to route tool-less MiniMax-family models to the MiniMax owned dialect. (#2759)
  • Fixed WSL2 TUI stutter by adding a git.enabled setting and skipping footer/status-line git probes when disabled or when no git-backed status segment is visible (#2847).
  • Fixed JSON-mode startup notices (export/resume/session-picker messages) writing to stdout before the JSON event stream; they now route to stderr so stdout remains newline-delimited JSON.

@oh-my-pi/collab-web

Fixed

  • Preserved assistant soft line breaks and Markdown paragraph/list indentation in the collab web transcript renderer so tree-shaped prose no longer collapses into one paragraph.
  • Changed collab web transcript wrapping to keep Korean/CJK words intact before falling back to emergency breaks for long URLs or identifiers.

@oh-my-pi/pi-mnemopi

Fixed

  • Capped sleep_consolidation episodic rows at maxEpisodeChars (default 100KB, MNEMOPI_MAX_EPISODE_CHARS) so raw session transcripts cannot be stored and extracted as multi-megabyte episodes. (#2869)
  • Skipped regex-only entity and pattern fact extraction for oversized raw transcripts so progress/log noise cannot flood MEMORIA with junk facts. (#2868)
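
The two mnemopi fixes above can be pictured as one pre-processing step. The sketch assumes the cap truncates rather than drops the episode, and that "oversized" uses the same threshold as maxEpisodeChars; neither detail is stated in the notes, and planEpisode is a hypothetical name.

```typescript
interface EpisodePlan {
  content: string; // text that would be stored as the episodic row
  runRegexExtraction: boolean; // whether regex-only fact extraction runs
}

// Caps the stored text and disables regex extraction for oversized input.
function planEpisode(raw: string, maxEpisodeChars: number): EpisodePlan {
  const oversized = raw.length > maxEpisodeChars;
  return {
    content: oversized ? raw.slice(0, maxEpisodeChars) : raw,
    runRegexExtraction: !oversized,
  };
}
```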

@oh-my-pi/omp-stats

Added

  • New Projects view summarizing usage, cost, and reliability per project folder (backed by the existing /api/stats/folders endpoint).
  • System-aware light/dark theme toggle — follows the OS by default, and an explicit choice persists across reloads.

Changed

  • Redesigned the local stats dashboard with an OMP-themed product shell, dedicated per-section views, accessible loading/empty/error states, and flicker-free navigation between screens and time ranges.

Fixed

  • The 1h time-range chart rendered an empty/single-point line; it now buckets at 5-minute granularity for a real trend.
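
Bucketing at 5-minute granularity gives a 1h window up to 12 points instead of one. A minimal sketch of that aggregation, assuming samples carry a millisecond timestamp and a numeric value to sum (the Sample shape and function name are illustrative):

```typescript
const BUCKET_MS = 5 * 60 * 1000;

interface Sample {
  timestampMs: number;
  value: number;
}

// Groups samples by the start of their 5-minute bucket and sums the values.
function bucketByFiveMinutes(samples: Sample[]): Map<number, number> {
  const buckets = new Map<number, number>();
  for (const s of samples) {
    const start = Math.floor(s.timestampMs / BUCKET_MS) * BUCKET_MS;
    buckets.set(start, (buckets.get(start) ?? 0) + s.value);
  }
  return buckets;
}
```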

@oh-my-pi/pi-tui

Added

  • Added tight layout support (setTuiTight/getPaddingX) to dynamically remove 1-character horizontal padding from Text, Markdown, Box, and TruncatedText components.

Changed

  • Coalesced byte-adjacent SGR sequences in emitted lines into a single CSI … m. The component tree styles each span as <set>text<reset>, so adjacent spans emit runs of back-to-back SGR sequences (e.g. a CSI 39 m fg-reset immediately followed by the next span's CSI 38;2;r;g;b m); merging the run is behavior-preserving because SGR parameters apply left-to-right regardless of framing. On a real transcript this drops ~30-40% of all SGR sequences, cutting the per-frame byte volume and SGR-dispatch count a slow terminal engine (e.g. xterm.js/WebGL under a large viewport) must process. Each emitted sequence is capped at 16 parameter tokens so a long adjacent run is split across several valid CSIs instead of overflowing a terminal's parameter buffer (xterm.js caps at 32 and silently truncates, corrupting colors). A run is never extended past a parameter list that ends in an incomplete semicolon-form extended color (38/48/58;2 missing a channel or ;5 missing the index), so a following code can't be absorbed as the missing component. Disable with PI_NO_SGR_COALESCE=1.
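
The coalescing rule can be sketched as follows. This is an illustration of the technique described above, not the pi-tui implementation: it merges whole parameter lists only, handles the semicolon form of extended colors, and treats an empty list (CSI m) as an explicit 0. All function and constant names are assumptions.

```typescript
const SGR_RUN = /(?:\x1b\[[0-9;]*m){2,}/g; // two or more adjacent SGR sequences
const SGR_ONE = /\x1b\[([0-9;]*)m/g;
const MAX_PARAMS = 16; // cap from the note above

// True when the list stops partway through 38/48/58;2;r;g;b or 38/48/58;5;n,
// in which case nothing may be appended to it.
function endsInIncompleteColor(params: string[]): boolean {
  let i = 0;
  while (i < params.length) {
    const p = params[i];
    if (p === "38" || p === "48" || p === "58") {
      const kind = params[i + 1];
      const need = kind === "2" ? 5 : kind === "5" ? 3 : 2;
      if (i + need > params.length) return true;
      i += need;
    } else {
      i += 1;
    }
  }
  return false;
}

// Merges each run of adjacent SGR sequences into as few sequences as the
// parameter cap and the incomplete-color rule allow.
function coalesceSgr(line: string): string {
  return line.replace(SGR_RUN, (run) => {
    const lists = Array.from(run.matchAll(SGR_ONE), (m) =>
      m[1] === "" ? ["0"] : m[1].split(";"),
    );
    const out: string[] = [];
    let current: string[] = [];
    let sealed = false; // previous list ended in an incomplete color
    for (const list of lists) {
      const fits = current.length + list.length <= MAX_PARAMS;
      if (current.length > 0 && (sealed || !fits)) {
        out.push(`\x1b[${current.join(";")}m`);
        current = [];
      }
      current = current.concat(list);
      sealed = endsInIncompleteColor(list);
    }
    if (current.length > 0) out.push(`\x1b[${current.join(";")}m`);
    return out.join("");
  });
}
```

Keeping each original parameter list intact when splitting means an extended color is never cut across two emitted sequences, at the cost of occasionally emitting a sequence with fewer than 16 parameters.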

Fixed

  • Fixed image cache invalidation when terminal image protocol, Kitty placeholder mode, or cell dimensions change, preventing stale rendered output
  • Fixed direct inline-image placements leaving the cursor inside the reserved image block, which let following chat rows overwrite the middle of rendered screenshots (#2863).
  • Fixed inline-image replay after startup or resume fallback paints by invalidating cached image rows when the terminal image protocol, Kitty placeholder mode, or cell dimensions change.

What's Changed

  • fix(catalog): route google-antigravity default baseUrl to primary daily endpoint by @cagedbird043 in #2860
  • fix(scripts): check current Gemini CLI version source by @lyc-aon in #2843
  • feat(stats): redesign the omp stats dashboard by @lyc-aon in #2841
  • perf(tui): coalesce byte-adjacent SGR sequences in emitted lines by @DarkPhilosophy in #2848
  • Keep JSON mode stdout clean during startup by @usr-bin-roygbiv in #2813
  • feat(alibaba-coding-plan): add endpoint selection for China/International/Custom by @21307369 in #2802
  • fix(ai): surface Google OAuth account-verification URL on VALIDATION_… by @igasmi in #2806
  • fix(litellm): support LITELLM_BASE_URL by @alexanderkirilin in #2816
  • fix(collab-web): align transcript wrapping with TUI by @chan1103 in #2638
  • fix(minimax): add MiniMax tool-call dialect by @alexanderkirilin in #2817
  • Add session_stop extension stop hook by @cexll in #2845
  • Add max-time deadline support by @usr-bin-roygbiv in #2815
  • fix(mnemopi): cap sleep consolidation episodes by @roboomp in #2873
  • fix(mnemopi): guard regex extraction for large transcripts by @roboomp in #2874
  • fix(auth): route GitHub Copilot Business endpoint by @roboomp in #2881

Full Changelog: v16.0.4...v16.0.5