Release v0.5.0 · junghan0611/pi-shell-acp

Changed — pi-shell-acp session model lock

pi-shell-acp sessions are now locked to their starting model once the conversation is anchored. The lock has two layers:

pi-extensions/model-lock.ts is the primary UX guard. Once a conversation is anchored (agent_start, resume/fork, reload with messages, or startup with existing messages), model_select transitions that touch pi-shell-acp are immediately reverted to the previous model. This covers pi-shell-acp -> native, native -> pi-shell-acp, and pi-shell-acp/X -> pi-shell-acp/Y. Native-to-native switching remains free.
ensureBridgeSession is the bridge fallback/direct-call guard. If a live pi-shell-acp bridge session is asked to serve a different model, it throws ModelSwitchLockedError before closing the old ACP child, invalidating persisted state, or bootstrapping a new backend session.

Fresh startup/new sessions with no messages stay unlocked until the first prompt. Pre-turn model selector changes and CLI --model overrides are configuration, not violations. Resume/fork sessions lock immediately because their model identity was already anchored by the original session.
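The anchoring and revert rules above can be sketched as two pure functions. This is a minimal illustration, not the code in pi-extensions/model-lock.ts: the type names, function names, and the `provider`/`id` shape of a model reference are assumptions made for the example.

```typescript
// Hypothetical names throughout; only the policy itself comes from the release notes.
type SessionStartReason = "startup" | "new" | "resume" | "fork" | "reload";

interface ModelRef {
  provider: string; // e.g. "pi-shell-acp" or a native provider id (assumed shape)
  id: string;
}

const LOCKED_PROVIDER = "pi-shell-acp";

// Decide whether the session is already anchored at session start.
// A later agent_start would set the anchored flag to true unconditionally.
function anchoredAtStart(
  reason: SessionStartReason,
  hasMessages: boolean,
  wasLocked: boolean,
): boolean {
  if (reason === "resume" || reason === "fork") return true; // identity already anchored
  if (reason === "reload") return wasLocked || hasMessages; // preserve or reconstruct
  return hasMessages; // startup / new: free until the first prompt
}

// Decide whether a model_select transition must be reverted.
function shouldRevert(anchored: boolean, from: ModelRef, to: ModelRef): boolean {
  if (!anchored) return false; // pre-turn selection is configuration, not a violation
  if (from.provider === to.provider && from.id === to.id) return false; // same-model no-op
  // Any transition that touches pi-shell-acp on either side is reverted.
  return from.provider === LOCKED_PROVIDER || to.provider === LOCKED_PROVIDER;
}
```

Under these assumptions, native-to-native switching falls through to `false`, while pi-shell-acp -> native, native -> pi-shell-acp, and pi-shell-acp/X -> pi-shell-acp/Y all return `true` once the session is anchored.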

Wire-level evidence that the bridge fallback matters: before this change, a live pi session that switched from Claude sonnet to Codex gpt-5.4 produced:

[pi-shell-acp:shutdown]  closeRemote=true invalidatePersisted=true closedRemote=ok childExit=exited
[pi-shell-acp:bootstrap] path=new backend=codex acpSessionId=019e2481-...

The Claude backend was reaped and a fresh Codex backend bootstrapped, while pi JSONL still pointed at the original Sonnet conversation. With the bridge fallback active, the same direct/reuse-path flow produces:

[pi-shell-acp:model-switch] path=reuse outcome=locked
                            fromModel=claude-sonnet-4-6 toModel=gpt-5.4
                            reason=pi_shell_acp_session_locked_to_starting_model

There is no shutdown line and no bootstrap path=new. The next prompt reuses the original ACP session (path=reuse backend=claude).

The refusal is not transcript-clean. pi-core (AgentSession.setModel() in packages/coding-agent/src/core/agent-session.ts) mutates agent.state.model and calls appendModelChange() before the extension or provider boundary can refuse. Extension-side revert therefore leaves model_change as X -> Y -> X; bridge fallback leaves the attempted X -> Y record. A fully clean refusal requires a pi-core model-switch preflight/hook that this repo intentionally does not patch.

Surface changes

pi-extensions/model-lock.ts + package.json
- New extension-side model lock. It tracks when the session is anchored with session_start, agent_start, and existing message entries.
- startup / new with no messages: unlocked until first prompt.
- resume / fork: immediately locked.
- reload: preserves an already locked module state or reconstructs lock from existing message entries.
- Defensive fallback: if reading entries fails, lock rather than silently allowing a handoff.
- Reentry guard prevents loops when the extension calls pi.setModel(previousModel) to revert.
acp-bridge.ts
- New exported ModelSwitchLockedError carrying { sessionKey, fromBackend, toBackend, fromModel, toModel }.
- ModelSwitchOutcome type: "respawn" → "locked". The earlier "respawn" outcome is retired.
- ensureBridgeSession reuse-path mismatch (previously: close + invalidate persisted + startNewBridgeSession) now logs path=reuse outcome=locked reason=pi_shell_acp_session_locked_to_starting_model and throws ModelSwitchLockedError.
- The lock fires above isSessionCompatible so it catches same-backend AND cross-backend switches identically. An earlier prototype that lived inside the existingCompatible branch silently let cross-backend switches fall through to the incompatible-fallback and spawn a fresh session — the wire-level evidence above is exactly that hole.
- enforceRequestedSessionModel (bootstrap path) is unchanged. Bootstrap is the lifetime starting point, not a mid-life switch.
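As a rough illustration of the bridge-side refusal, the sketch below throws before any teardown or bootstrap work would happen. The error's field names follow the entry above; `LiveSession`, `ensureSessionSketch`, and the error message text are hypothetical stand-ins, not the actual acp-bridge.ts implementation.

```typescript
// Field names follow the changelog entry; everything else is a hypothetical sketch.
interface ModelSwitchDetails {
  sessionKey: string;
  fromBackend: string;
  toBackend: string;
  fromModel: string;
  toModel: string;
}

class ModelSwitchLockedError extends Error {
  details: ModelSwitchDetails;

  constructor(details: ModelSwitchDetails) {
    super(
      `session ${details.sessionKey} is locked to ${details.fromModel}; ` +
        `refusing switch to ${details.toModel}`,
    );
    this.name = "ModelSwitchLockedError";
    this.details = details;
    // Keep instanceof checks working when compiled to older JS targets.
    Object.setPrototypeOf(this, ModelSwitchLockedError.prototype);
  }
}

interface LiveSession {
  sessionKey: string;
  backend: string;
  model: string;
}

// Hypothetical stand-in for the reuse-path check: the mismatch test runs first,
// so same-backend and cross-backend switches are refused identically and
// nothing is closed, invalidated, or bootstrapped before the throw.
function ensureSessionSketch(
  live: LiveSession | undefined,
  sessionKey: string,
  backend: string,
  model: string,
): LiveSession {
  if (live && (live.backend !== backend || live.model !== model)) {
    throw new ModelSwitchLockedError({
      sessionKey,
      fromBackend: live.backend,
      toBackend: backend,
      fromModel: live.model,
      toModel: model,
    });
  }
  // No live session: this is the bootstrap path, which the lock does not touch.
  return live ?? { sessionKey, backend, model };
}
```

A caller can distinguish the refusal from other failures with `err instanceof ModelSwitchLockedError` and keep using the original session afterwards.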
run.sh
- New check-model-lock deterministic gate. scripts/check-model-lock.ts covers the 18-case policy matrix: four provider quadrants, same-model no-op, pre-turn free selection, post-agent_start lock, resume/fork immediate lock, reload with entries, reload preserving prior lock, and defensive lock on entry-read failure.
- smoke-model-switch rewritten and generalized to four-argument form (backend_a model_a backend_b model_b). Three cases now run: within-backend Claude (sonnet → opus), within-backend Codex (gpt-5.4 → gpt-5.5), and cross-backend (Claude sonnet → Codex gpt-5.4). Pass criteria assert outcome=locked, exactly one [pi-shell-acp:bootstrap] path=new backend=<backend_a> line, ModelSwitchLockedError instanceof check, no outcome=respawn anywhere, no path=new backend=<backend_b> on cross-backend, and a successful post-refusal turn on the original session.
Docs
- AGENTS.md / README.md / VERIFY.md now describe the two-layer lock: extension-side revert as the normal path, bridge-side refusal as fallback, and the transcript-dirty caveat.

Scenarios covered by this guard

Fresh startup/new before the first prompt: free. This preserves CLI --model override and pre-turn model selector configuration.
After first prompt: any switch touching pi-shell-acp is reverted by the extension.
Resume/fork: locked immediately, even before the next prompt.
Reload: lock is preserved or reconstructed from existing message entries.
Native-to-native switches: free.
Direct bridge/reuse-path mismatch: refused by ensureBridgeSession. This is the fallback for direct calls or missing/failed extension coverage and prevents the silent-respawn hole.
Bootstrap-time model resolution (enforceRequestedSessionModel after new/resume/load): unaffected — bootstrap is the lifetime starting point, not a mid-life switch.
entwurf resume model override: already blocked separately by the Identity Preservation Rule (no model parameter on the entwurf resume surface).
Different-process reopen of a saved JSONL under a different --model: out of scope by design. Saved persistent records do not carry modelId; lock applies only to live bridge sessions in this process.

Migration

Operators who switched models mid-session by relying on the old respawn behavior must now open a new pi session for the new model once the current session is anchored. There is no in-process knob; this is the policy.
Tooling that grepped for outcome=respawn on the model-switch log line must look for outcome=locked instead. The legacy outcome value is gone; any occurrence in fresh logs after the upgrade is a regression signal.
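For tooling that needs to be updated, a small classifier over log lines might look like the following. It is a sketch under one stated assumption: that each model-switch entry arrives as a single line of whitespace-separated key=value tokens (the example above is wrapped for display). The function names are hypothetical.

```typescript
// Parse whitespace-separated key=value tokens from one log line.
function parseFields(line: string): Map<string, string> {
  const fields = new Map<string, string>();
  for (const token of line.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields.set(token.slice(0, eq), token.slice(eq + 1));
  }
  return fields;
}

// Classify a line: "locked" is the 0.5.0 outcome, "regression" flags the retired
// outcome=respawn value, and "other" covers unrelated lines.
function classifyModelSwitchLine(line: string): "locked" | "regression" | "other" {
  if (!line.includes("[pi-shell-acp:model-switch]")) return "other";
  const outcome = parseFields(line).get("outcome");
  if (outcome === "locked") return "locked";
  if (outcome === "respawn") return "regression";
  return "other";
}
```

Any line classified as `"regression"` in logs produced after the upgrade is the signal described above.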

Changed — 0.5.0 declaration: bridge does not implement compaction

The bridge no longer implements compaction. ACP backends compact natively; the pi session survives that. The bridge boundary stays explicit. This pays back the 0.4.x debt where both Claude (DISABLE_AUTO_COMPACT=1 + DISABLE_COMPACT=1) and Codex (-c model_auto_compact_token_limit=9223372036854775807) auto-compaction were disabled at the bridge surface — a deliberate, temporary expedient while the bridge surface was being shaped, now removed.

| Layer | Default | Knob |
| --- | --- | --- |
| pi JSONL compaction | blocked: pi-side summary does not reduce the backend transcript | `PI_SHELL_ACP_ALLOW_PI_COMPACTION=1` opts back in |
| backend-native compaction | always allowed (no bridge knob) | none; configure the backend through its own native interface if you need to alter it; the bridge intentionally does not surface backend-specific compaction names |
| legacy `PI_SHELL_ACP_ALLOW_COMPACTION` | n/a | fail-fast at spawn intent with a next-action message pointing at `PI_SHELL_ACP_ALLOW_PI_COMPACTION` |
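The knob table can be read as a small decision function over the environment. The sketch below is illustrative only: the function names and the error text are hypothetical, and it assumes any non-empty value of the legacy variable counts as "set" (the release notes only show the `=1` case).

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical stand-in for the legacy-knob guard: fail fast with a next-action message.
function assertLegacyKnobUnsetSketch(env: Env): void {
  const legacy = env["PI_SHELL_ACP_ALLOW_COMPACTION"];
  // Assumption: any non-empty value counts as set.
  if (legacy !== undefined && legacy !== "") {
    throw new Error(
      "PI_SHELL_ACP_ALLOW_COMPACTION is retired; use PI_SHELL_ACP_ALLOW_PI_COMPACTION=1 " +
        "for pi-side compaction. Backend-native compaction is always allowed.",
    );
  }
}

// pi-side compaction is blocked by default and allowed only by the explicit opt-in.
// Backend-native compaction has no entry here because the bridge has no knob for it.
function piCompactionAllowed(env: Env): boolean {
  assertLegacyKnobUnsetSketch(env);
  return env["PI_SHELL_ACP_ALLOW_PI_COMPACTION"] === "1";
}
```

Passing the environment in as a parameter keeps the sketch deterministic, which matches the "no spawn, no network" character of the deterministic gate described later in this entry.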

Surface changes

acp-bridge.ts
- Claude bridgeEnvDefaults no longer ships DISABLE_AUTO_COMPACT / DISABLE_COMPACT at all. The adapter carries identity-isolation pins only (CLAUDE_CONFIG_DIR).
- Claude overlay settings.json now includes an explicit empty hooks: {} map. This keeps operator hooks hidden while matching the Claude SDK's configured-hooks shape; LIVE A/B probes showed that omitting the key made organic auto-compact consume the triggering turn for a meta-summary instead of answering the user prompt.
- Codex resolveCodexAcpLaunch no longer emits -c model_auto_compact_token_limit=9223372036854775807 at all. The bridge does not inject the threshold pin anywhere.
- resolveBridgeEnvDefaults(backend) returns the adapter's identity-isolation pins as-is — no compaction option, no filtering. The earlier disableBackendCompaction option, the isBackendCompactionDisabledByOperator() reader, the codexAutoCompactArgs() helper, the CODEX_DISABLE_AUTO_COMPACT_ARGS constant, and the COMPACTION_GUARD_ENV_KEYS filter set are all removed.
- resolveAcpBackendLaunch calls assertLegacyCompactionKnobUnset() on entry. Every spawn path (Claude, Codex, Gemini) crosses this surface, so the legacy single knob is rejected before any ACP child can launch on stale semantics. The error message points at PI_SHELL_ACP_ALLOW_PI_COMPACTION (the only remaining bridge knob) and tells the operator that backend-native compaction is always allowed — there is no longer a bridge knob to opt out.
- Identity-isolation env (CLAUDE_CONFIG_DIR, CODEX_HOME, CODEX_SQLITE_HOME, GEMINI_CLI_HOME, GEMINI_SYSTEM_MD) is unrelated to compaction and ships unconditionally — pinned at check-backends as a hard contract.
index.ts
- session_before_compact cancels by default and emits an honest message: "pi-side compact does not reduce the backend transcript; backend-native compaction is handled by the ACP backend itself; send /compact as a backend prompt or let the backend auto-compact". The PI_SHELL_ACP_ALLOW_PI_COMPACTION=1 opt-back-in path is documented in the same message.
run.sh
- check-backends assertions inverted to the 0.5.0 contract: default Codex launch must NOT contain model_auto_compact_token_limit; default Claude env must NOT contain DISABLE_AUTO_COMPACT/DISABLE_COMPACT; legacy PI_SHELL_ACP_ALLOW_COMPACTION=1 must throw at spawn intent. 137 assertions ok at this initial declaration. (See the 0.5.0 maintainer cleanup entry below for the post-cleanup count.)
- Organic compact path closed for Claude (2026-05-13) and Codex (2026-05-14). Initial Claude organic-context-full probes reproduced Claude SDK compaction on a saturated Sonnet session and showed the pi mapping survived, but also exposed a prompt-sacrifice failure when the Claude overlay omitted the hooks key. Adding hooks: {} fixed the turn shape: organic auto-compact now emits the compact status and then answers the triggering user prompt; explicit /compact still produces the expected compact-boundary turn and the next prompt answers from compacted context. Codex later passed both the lowered-threshold cheap stand-in and the real GPT-5.4 native-window saturation probe (used 244k → 84k, substantive compacting turn, sentinel preserved). The bridge still forwards backend output as-is and does not hydrate or rewrite transcript. Gemini context-pressure remains unverified.
- New ./run.sh smoke-compaction-policy [--step=NN] runner that wraps scripts/compaction-policy-smoke.ts. Originally six steps total (01/02/05 deterministic, 03/04/06 live); step 01 was retired in the later maintainer cleanup, leaving five steps with 02/05 forming the deterministic gate (no spawn, no network) and 03/04/06 the live release-evidence probe — under LIVE=1 they drive a real ACP child per backend via runEntwurfSync + runEntwurfResumeSync (same infrastructure as cross-cwd-resume-smoke), plant a unique sentinel, send literal /compact as a backend prompt (NOT pi-host /compact — entwurf delivers the string as a normal user message into the ACP child), then send a recall prompt and assert the sentinel survives. Same taskId across all three turns, so persisted-mapping reuse is also covered. The probe uses a dual-classifier for backend-compact evidence: a text classifier over the (b)-turn reply (compacted / summarized / context reduced) AND a wire classifier over the bridge stderr's [pi-shell-acp:usage] lines (explicit used=0 compact_boundary, or >=50% used drop). Pass requires positive evidence from EITHER classifier plus sentinel recall — survival alone is necessary but not sufficient. The dual shape exists because each backend signals compaction on a different ACP wire surface: codex-acp emits "Context compacted" in the assistant text, while claude-agent-acp suppresses the textual ack and posts an explicit used=0 synthetic usage_update via the SDK's compact_boundary event (acp-agent.js:477-498). Text-only or wire-only would mis-judge them; both run together and either suffices. Cost a few cents per backend. This is NOT a product surface — there is no user-facing /acp-compact command; the probe is release evidence, not a feature. Step 06 (Gemini) is exploratory — Gemini ACP does not advertise /compact and the probe records the actual observation, not a release claim. 
Step 05 verifies the wrapper throw directly (5a resolveAcpBackendLaunch) and verifies at source level that the production spawn entry (createBridgeProcess) carries the same assertLegacyCompactionKnobUnset() guard — bypass between the two paths was a reviewer-found regression and the smoke now guards against it.
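The dual-classifier pass rule described for the live probe can be sketched as follows. This is not the code in scripts/compaction-policy-smoke.ts: the function names are hypothetical, and the wire classifier assumes the `used` values have already been parsed out of the usage lines into a number array, since the exact line format is not reproduced here.

```typescript
// Text classifier: looks for the acknowledgement phrases named in the notes.
function textEvidence(reply: string): boolean {
  return /compacted|summarized|context reduced/i.test(reply);
}

// Wire classifier over successive `used` token counts (assumed pre-parsed):
// an explicit drop to 0 or a drop of at least 50% counts as compaction evidence.
function wireEvidence(used: number[]): boolean {
  for (let i = 1; i < used.length; i++) {
    const prev = used[i - 1] ?? 0;
    const cur = used[i] ?? 0;
    if (prev <= 0) continue; // nothing to compare against yet
    if (cur === 0) return true; // explicit used=0 compact boundary
    if ((prev - cur) / prev >= 0.5) return true; // >=50% used drop
  }
  return false;
}

// Pass requires evidence from either classifier AND sentinel recall.
function probePasses(
  compactReply: string,
  used: number[],
  recallReply: string,
  sentinel: string,
): boolean {
  const compacted = textEvidence(compactReply) || wireEvidence(used);
  return compacted && recallReply.includes(sentinel);
}
```

With the saturation figures reported in the Evidence section (244089 → 84549), the relative drop is about 65%, so the wire classifier alone would count as positive evidence under this sketch; sentinel recall is still required for a pass.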

Migration

PI_SHELL_ACP_ALLOW_COMPACTION=1 in 0.4.x meant two things at once: pi-side compact was allowed AND the backend guards were stripped (so backend-native compact could run). 0.5.0 keeps just the pi-side opt-in; backend-native compaction is now always allowed, so there is no second bridge knob.

0.4.x PI_SHELL_ACP_ALLOW_COMPACTION=1 → 0.5.0 PI_SHELL_ACP_ALLOW_PI_COMPACTION=1. Backend-native compaction is already allowed by default in 0.5.0 (no knob needed), so the only piece of the old broad semantic that still needs an opt-in is the pi-side one. Setting just ALLOW_PI_COMPACTION=1 reproduces the full 0.4.x ALLOW_COMPACTION=1 behavior.
If you need to alter a specific backend's auto-compaction, configure that backend through its own native interface. The bridge intentionally does not surface backend-specific compaction names; historical recipes are preserved below only as restoration context.
Bridge will refuse to spawn while PI_SHELL_ACP_ALLOW_COMPACTION=1 is still set. The throw at spawn intent names PI_SHELL_ACP_ALLOW_PI_COMPACTION and explains that backend-native compaction is now bridge-knob-free. No silent acceptance.

Docs

README §Compaction policy rewritten around the declaration; the backend-auto-compaction matrix row inverted; model_auto_compact_token_limit reference settings row updated; roadmap 0.5.0 line restated as declaration rather than guard split.
AGENTS / README Claude overlay notes now call out the explicit empty hooks: {} shape: operator hooks remain hidden, but Claude SDK organic compaction gets the configured-empty settings form that keeps the triggering turn clean.
VERIFY §1A.4 compaction-policy note rewritten; new 0.5.0 compaction policy evidence row at L3 backed by smoke-compaction-policy; the 0.4.x long-session fact-retention baseline annotated as needing a 0.5.0 re-baseline; cross-vendor §13 paragraph adjusted to reflect that the no-excuse-for-forgetting framing is 0.4.x-specific.
New demo/compaction-policy-smoke/README.md documenting the six-step surface (later updated to five — see the maintainer cleanup entry below).

Changed — 0.5.0 maintainer cleanup: backend-specific compaction knob references retired

The 0.5.0 declaration ("bridge does not implement compaction") was validated end-to-end on 2026-05-14 by three Codex passes: Pattern A (LIVE step 04, our automated probe; cross-confirmed by GLG-direct agent-shell + pi-shell-acp + codex-acp dialogue), Pattern B cheap induction (lowered threshold; native auto-compact path reachable end-to-end through the bridge with sentinel preserved across two consecutive organic compacts), and Pattern B real saturation (default GPT-5.4 threshold, used 244k → 84k, substantive compacting turn, sentinel preserved). After that validation, the maintainer pass removed the remaining places where pi-shell-acp's code and operator-facing docs named backend-specific compaction knobs.

Reason — symmetry / consistency, not loss of knowledge. Knowing the names is itself an awareness of backend internals and inconsistent with the bridge thesis. Even a negative assertion ("our argv must NOT contain X") presumes we know X exists, and an operator-facing recipe ("for Codex inline -c X=… via Y") teaches an asymmetric "how to disable compact per backend" hint that quietly re-anchors the bridge as something that owns the compaction concern. The 0.4.x→0.5.0 transition needed those strings while the policy was being shaped and verified. Once the policy is verified, they are debt.

Removed

scripts/compaction-policy-smoke.ts step 01 (spawn intent has no backend compaction guard). The step's negative assertion enumerated DISABLE_AUTO_COMPACT, DISABLE_COMPACT, and model_auto_compact_token_limit directly. LIVE steps 03/04/06 cover the same regression surface — if the bridge ever re-injects a backend-side compaction guard, backend-native compaction stops working end-to-end and those live probes turn red. ALL_STEPS is now ["02","03","04","05","06"]; REGISTRY drops the "01" entry; the import of resolveBridgeEnvDefaults (only used by step 01) is dropped from the smoke driver.
run.sh check-backends: the explicit assert.ok(!codexLaunch.args.some(arg => arg.includes('model_auto_compact_token_limit')), ...) line was removed. The deepEqual against the expected argv list is the single source of truth; anything not in that expected list is not pinned. The Claude env assertions were also generalized from two exact compaction-name negative assertions to one identity-isolation key-set assertion, paired with the same key-set assertion for Codex. Count is 136 after the maintainer cleanup.
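The key-set style of assertion mentioned above can be illustrated with a short sketch. It shows why an exact key-set check makes per-name negative assertions unnecessary: any unexpected key fails the check without the test ever naming it. The helper name, the grouping of carriers per backend, and the path values are assumptions for the example, not the contents of run.sh.

```typescript
// Hypothetical helper: the env must contain exactly the expected keys, no more, no fewer.
function assertExactKeySet(env: Record<string, string>, expected: string[]): void {
  const actual = Object.keys(env).sort();
  const wanted = [...expected].sort();
  const same =
    actual.length === wanted.length && actual.every((key, i) => key === wanted[i]);
  if (!same) {
    throw new Error(`env keys [${actual.join(", ")}] != expected [${wanted.join(", ")}]`);
  }
}

// Assumed grouping of the identity-isolation carriers by backend.
const CLAUDE_IDENTITY_KEYS = ["CLAUDE_CONFIG_DIR"];
const CODEX_IDENTITY_KEYS = ["CODEX_HOME", "CODEX_SQLITE_HOME"];
```

If a future change re-added a compaction-related variable to a backend's env defaults, `assertExactKeySet` would fail on the extra key even though the test source contains no compaction names.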
acp-bridge.ts inline comments at resolveCodexAcpLaunch, the codex overlay TOML header, and the codex env block: generalized from "bridge does not pin model_auto_compact_token_limit" to "bridge does not pin any codex-side compaction knob". Behavior unchanged; only the comment surface stopped naming codex internals.
README.md "Operating-surface contract — Codex backend" table: the model_auto_compact_token_limit row was removed (the bridge does not pin it, and the row's only operator-facing content was a per-backend recipe — which is precisely what the cleanup retires).
README.md, AGENTS.md, CONTRIBUTING.md, VERIFY.md compaction-policy paragraphs: the "for Claude DISABLE_AUTO_COMPACT=1 … for Codex inline -c model_auto_compact_token_limit=… via CODEX_ACP_COMMAND" recipe collapsed to "configure that backend through its own native interface — the bridge intentionally does not surface backend-specific compaction names".
demo/compaction-policy-smoke/README.md: "Six steps" → "Five steps" with an explicit retirement note for step 01; the backend-specific recipe paragraph generalized.

Restoration recipe

If a future need ever requires reintroducing per-backend guard awareness — for a regression test, for a release-evidence probe targeting a specific backend behavior, or because a backend changes its compaction semantics in a way that defeats live-probe detection — the historical source is this CHANGELOG itself. Earlier entries in this 0.5.0 release block (above) still name the exact backend-specific strings:

acp-bridge.ts resolveCodexAcpLaunch "no longer emits -c model_auto_compact_token_limit=9223372036854775807" — codex argv guard.
check-backends "default Codex launch must NOT contain model_auto_compact_token_limit; default Claude env must NOT contain DISABLE_AUTO_COMPACT/DISABLE_COMPACT" — both guard names.
Migration "If you need a specific backend's auto-compaction off, export the backend's own native env/argv from your shell (DISABLE_AUTO_COMPACT=1 for Claude; for Codex, inline -c model_auto_compact_token_limit=… via CODEX_ACP_COMMAND, or export CODEX_HOME)" — the recipe shape.

These history entries are intentionally left in place. The retirement is a thesis-alignment choice, not a loss of knowledge.

Not removed

Identity-isolation env carriers (CLAUDE_CONFIG_DIR, CODEX_HOME, CODEX_SQLITE_HOME, GEMINI_CLI_HOME, GEMINI_SYSTEM_MD) keep their per-backend names. They are unrelated to compaction; they are the bridge's identity/overlay surface, which is per-backend by design.
LIVE step 04's PI_ENTWURF_ACP_FOR_CODEX=1 env extras and the Codex/Claude probe-time references inside scripts/compaction-policy-smoke.ts remain — those are spawn-routing and live-probe surfaces, not bridge-side compaction policy.
The 0.4.x→0.5.0 transition fact entries above stay as-is. They are the restoration source.

Evidence

demo/compaction-policy-smoke/probes/2026-05-14-codex-step04-A/ — Pattern A pass (explicit /compact; text + sentinel signal).
demo/compaction-policy-smoke/probes/2026-05-14-codex-B-threshold/ — Pattern B cheap stand-in (lowered-threshold organic auto-compact, sentinel preserved across two consecutive compacts, bridge mapping survives).
demo/compaction-policy-smoke/probes/2026-05-14-codex-B-saturation/ — Pattern B real native-window saturation (13 turns drove used 17k → 244k ≈ 94.5% on GPT-5.4; codex-rs native default auto_compact_token_limit fired organic auto-compact on turn 12, wire used 244089 → 84549 = 65% drop crossing the 50% classifier threshold; substantive 982-word answer in the compact turn; post-compact sentinel recall preserved; bridge mapping intact across all 13 turns). Codex GPT-5.4 native threshold ≈ 245k versus Claude Sonnet 4.6 ≈ 120k — same probe shape, honestly asymmetric backend defaults, same thesis.

Gemini axis closed as an honest ACP asymmetry, not as a pass (5/14, evidence triangulated across source, native CLI cross-check, and PM sibling review):

ACP command registry source: gemini-cli/packages/cli/src/acp/acpCommandHandler.ts:23-31 registers memory, extensions, init, restore, about, help only. compress/compact/summarize are NOT in the ACP registry. CLI body (packages/cli/src/ui/commands/compressCommand.ts:10-13) implements compress with aliases summarize, compact — but this is a TUI-only surface.
Organic compression on ACP path: gemini-cli/packages/core/src/core/client.ts:673-677 — every turn start calls tryCompressChat(prompt_id, false); on success it yields GeminiEventType.ChatCompressed. But gemini-cli/packages/cli/src/acp/acpSession.ts switch has no ChatCompressed case → default: break silently drops the event. Compression may happen, but the ACP wire never sees it.
Context-pressure final surface: if compression is insufficient, ContextWindowWillOverflow → acpSession.ts:369-371 → stopReason: 'max_tokens'.
GLG direct CLI cross-check (5/14): Native Gemini CLI /compress reduced 93620 → 12936 tokens in a real session, confirming the CLI mechanism is real and works outside ACP. The asymmetry — /compress exists, but only outside ACP — is recorded as honest negative, not paved over.
PM sibling review (gpt-5.5 medium, 5/14): explicitly corrected an earlier "Gemini axis closed" framing to "closed as honest ACP asymmetry, not as a pass". The release-grade phrasing committed: "Native Gemini CLI supports /compress (alias /compact, /summarize), but Gemini ACP does not expose that command. Organic compression may happen inside Gemini CLI, but ACP does not surface ChatCompressed on the wire today. If pressure remains, ACP surfaces max_tokens. pi-shell-acp does not inject backend-specific Gemini compression knobs."
No LIVE saturation probe for Gemini: Gemini Pro 1M+ window saturation is cost-disproportionate (Codex 258k probe was already at the upper end of cheap), and inducing compression by injecting Gemini-specific knobs (compressionThreshold, contextManagement) into the overlay would violate the 0.5.0 maintainer cleanup thesis (bridge does not surface backend-specific compaction names). Source + native CLI cross-check + PM review is the release-grade evidence chain here.
Operator-facing UX at max_tokens: "Gemini ACP reached context pressure; native CLI has /compress but ACP does not expose it here. Start a fresh session or reduce context."
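One way to picture the operator-facing behaviour is a mapping from backend and stop reason to a notice. The sketch below is hypothetical: the function name and the `"gemini"` backend identifier are assumptions (chosen by analogy with `backend=claude` and `backend=codex` in the log lines earlier), and only the message text is taken from the entry above.

```typescript
// Message text quoted from the release notes; everything else is illustrative.
const GEMINI_CONTEXT_PRESSURE_NOTICE =
  "Gemini ACP reached context pressure; native CLI has /compress but ACP does not expose it here. " +
  "Start a fresh session or reduce context.";

// Return a notice for the operator, or undefined when no special handling applies.
function contextPressureNotice(backend: string, stopReason: string): string | undefined {
  // Assumption: the Gemini backend is identified by the string "gemini".
  if (backend === "gemini" && stopReason === "max_tokens") {
    return GEMINI_CONTEXT_PRESSURE_NOTICE;
  }
  return undefined;
}
```

The mapping deliberately contains no compression thresholds or Gemini-specific settings, in line with the cleanup thesis that the bridge does not surface backend-specific compaction names.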
