fix(gateway): hide transcript-only history artifacts by dev111-actor · Pull Request #79172 · openclaw/openclaw

dev111-actor · 2026-05-08T01:23:37Z

Summary

Problem: chat.history and sessions.get could surface transcript-only OpenClaw assistant rows (delivery-mirror / gateway-injected) as normal assistant messages.
Why it matters: those artifacts are durable transcript/audit rows, but consumer history should not display or return them as model output, and hidden rows should not consume the visible history limit.
What changed: added a shared transcript-artifact predicate, dropped those rows in chat display projection, over-read bounded history tails before visible limiting, and filtered sessions.get after the same over-read.
What did NOT change (scope boundary): raw transcript writes, provider replay cleanup, channel routing, and the broader idempotency/bootstrap tracks of the umbrella issue "Umbrella: duplicate transcript, replay, and context assembly across channels" (#69208) are unchanged.
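
As a rough illustration of the shared predicate described under "What changed", here is a minimal sketch. The row shape and the names `TranscriptRow`, `TRANSCRIPT_ONLY_MODELS`, and `isTranscriptOnlyArtifact` are assumptions made for this example and are not taken from the diff; only the `provider:"openclaw"` and `model` values come from this description. The sketch reflects the predicate as described here, including the blanket `gateway-injected` match that the review below flags as too broad.

```typescript
// Hypothetical row shape; the real transcript message type is not shown in this PR text.
interface TranscriptRow {
  role: "user" | "assistant" | "system";
  provider?: string;
  model?: string;
  content: string;
}

// Artifact model names taken from the PR description.
const TRANSCRIPT_ONLY_MODELS = new Set<string>(["delivery-mirror", "gateway-injected"]);

// Returns true for OpenClaw assistant rows whose model is one of the artifact models.
function isTranscriptOnlyArtifact(row: TranscriptRow): boolean {
  return (
    row.role === "assistant" &&
    row.provider === "openclaw" &&
    row.model !== undefined &&
    TRANSCRIPT_ONLY_MODELS.has(row.model)
  );
}
```

Keeping the check in one function is what would let chat.history and sessions.get agree on which rows are hidden.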

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related Umbrella: duplicate transcript, replay, and context assembly across channels #69208
Related fix: filter delivery-mirror from all consumer paths (LLM context, webchat, API) #40716
This PR fixes a bug or regression

Real behavior proof (required for external PRs)

Behavior or issue addressed: transcript-only OpenClaw assistant artifacts should be hidden from chat.history and sessions.get, with limits applied after hidden rows are removed.
Real environment tested: Ubuntu 24.04 workspace, Node via repo nvm/pnpm, real OpenClaw gateway test harness exercising WS chat.history, sessions.get, and HTTP/SSE session history paths against JSONL transcript files.
Exact steps or command run after this patch: pnpm test src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts src/gateway/server.sessions.store-rpc.test.ts src/gateway/sessions-history-http.test.ts; pnpm exec oxfmt --check --threads=1 src/config/sessions/transcript-artifacts.ts src/gateway/chat-display-projection.ts src/gateway/server-methods/chat.ts src/gateway/server-methods/sessions.ts src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts src/gateway/server.sessions.store-rpc.test.ts src/gateway/sessions-history-http.test.ts docs/gateway/protocol.md CHANGELOG.md; git diff --check; pnpm check:changed.
Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied terminal output: Test Files 3 passed (3); Tests 104 passed (104) and Test Files 1 passed (1); Tests 12 passed (12) from the targeted gateway run; copied terminal output: All matched files use the correct format.; copied terminal output: pnpm check:changed ... node scripts/check-changed.mjs exited 0; copied terminal output: git diff --check exited 0.
Observed result after fix: regression tests prove chat.history and sessions.get return the older visible user row plus the real assistant row when the raw tail also contains hidden delivery-mirror and gateway-injected rows, preserving visible limit behavior and raw transcript sequence metadata.
What was not tested: a full live provider/channel delivery run with real Telegram/Control UI credentials, and the full repository pnpm check / pnpm test sweeps.
Before evidence (optional but encouraged): source proof on current main showed chat.history reading only the requested raw tail before projection and sessions.get returning raw recent messages without filtering transcript-only OpenClaw assistant artifacts.

Root Cause (if applicable)

Root cause: transcript-only OpenClaw assistant rows are persisted in the same JSONL message stream as real provider assistant output, but the history/API consumers did not consistently treat those rows as internal artifacts.
Missing detection / guardrail: there was no gateway history regression that combined both artifact models with a small visible history limit.
Contributing context (if known): provider replay already drops these artifacts, so this gap was limited to consumer history/API presentation paths on current main.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/gateway/server-methods/server-methods.test.ts, src/gateway/server.chat.gateway-server-chat-b.test.ts, src/gateway/server.sessions.store-rpc.test.ts, src/gateway/sessions-history-http.test.ts.
Scenario the test should lock in: raw transcript tails containing provider:"openclaw" with model:"delivery-mirror" or model:"gateway-injected" still produce the requested visible chat.history / sessions.get rows after filtering.
Why this is the smallest reliable guardrail: it exercises the shared projection helper plus the two public consumer handlers without needing a real provider response.
Existing test that already covers this (if any): replay history tests covered provider replay, but not these history/API consumers.
If no new test is added, why not: N/A.

User-visible / Behavior Changes

chat.history and sessions.get no longer return transcript-only OpenClaw assistant artifacts as visible assistant messages. Requested history limits now apply to visible rows after those artifacts are removed.

Diagram (if applicable)

Before:
[raw transcript tail] -> [delivery-mirror/gateway-injected rows included] -> [visible history limit may be consumed]

After:
[raw transcript tail] -> [drop transcript-only OpenClaw assistant artifacts] -> [limit visible rows] -> [consumer-safe history]
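
The before/after flow above can be sketched as follows. This is a simplified in-memory model, not the gateway code: `Row`, `naiveTail`, `visibleTail`, and the `overRead` factor are hypothetical names and values chosen for illustration, and the real handlers read from JSONL transcript files.

```typescript
// Hypothetical row shape; `seq` stands in for the raw transcript sequence metadata.
interface Row {
  seq: number;
  role: "user" | "assistant";
  provider?: string;
  model?: string;
}

const isArtifact = (row: Row): boolean =>
  row.role === "assistant" &&
  row.provider === "openclaw" &&
  (row.model === "delivery-mirror" || row.model === "gateway-injected");

// Before: take the raw tail only, so hidden rows use up the limit.
function naiveTail(rows: Row[], limit: number): Row[] {
  return limit > 0 ? rows.slice(-limit) : [];
}

// After: over-read a bounded raw tail, drop artifacts, then apply the visible limit.
// The over-read factor is an assumption; the PR text does not state the actual bound.
function visibleTail(rows: Row[], limit: number, overRead = 4): Row[] {
  if (limit <= 0) return [];
  const rawTail = rows.slice(-(limit * overRead));
  return rawTail.filter((row) => !isArtifact(row)).slice(-limit);
}

const rows: Row[] = [
  { seq: 1, role: "user" },
  { seq: 2, role: "assistant", provider: "openclaw", model: "delivery-mirror" },
  { seq: 3, role: "assistant", provider: "openclaw", model: "gateway-injected" },
  { seq: 4, role: "assistant", provider: "anthropic", model: "example-model" },
];

const before = naiveTail(rows, 2).map((row) => row.seq); // seq 3 and 4: an artifact fills a slot
const after = visibleTail(rows, 2).map((row) => row.seq); // seq 1 and 4: both visible rows
```

Because rows are filtered rather than renumbered, the surviving rows keep their original `seq` values.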

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Ubuntu 24.04
Runtime/container: repo Node/pnpm environment
Model/provider: transcript fixtures using OpenClaw artifact rows plus Anthropic-shaped visible assistant rows
Integration/channel (if any): Gateway WS/HTTP history surfaces
Relevant config (redacted): default gateway test harness config

Steps

Seed a session JSONL with a visible user row, an artifact row with provider:"openclaw" and model:"delivery-mirror", an artifact row with provider:"openclaw" and model:"gateway-injected", and a real assistant row.
Request chat.history or sessions.get with limit: 2.
Verify the two returned rows are the visible user row and the real assistant row, not either OpenClaw artifact.
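
The steps above could be approximated with an in-memory fixture like the one below. Everything except the `provider`/`model` artifact values is an assumption for illustration: the field names, the `"anthropic"` provider string, the `"example-model"` name, and the `readVisible` helper are hypothetical, and the real test harness calls the gateway handlers rather than a local function.

```typescript
// Hypothetical fixture row shape for a session JSONL file.
interface FixtureRow {
  role: "user" | "assistant";
  provider?: string;
  model?: string;
  text: string;
}

const fixture: FixtureRow[] = [
  { role: "user", text: "hello" },
  { role: "assistant", provider: "openclaw", model: "delivery-mirror", text: "mirrored copy" },
  { role: "assistant", provider: "openclaw", model: "gateway-injected", text: "injected tail" },
  { role: "assistant", provider: "anthropic", model: "example-model", text: "real reply" },
];

// Step 1: serialize as JSONL, one JSON object per line.
const jsonl = fixture.map((row) => JSON.stringify(row)).join("\n") + "\n";

// Stand-in for a history read: parse, drop artifact rows, then apply the limit.
function readVisible(jsonlText: string, limit: number): FixtureRow[] {
  const parsed = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as FixtureRow);
  const visible = parsed.filter(
    (row) =>
      !(
        row.role === "assistant" &&
        row.provider === "openclaw" &&
        (row.model === "delivery-mirror" || row.model === "gateway-injected")
      ),
  );
  return limit > 0 ? visible.slice(-limit) : [];
}

// Steps 2 and 3: request two rows and check that neither is an artifact.
const result = readVisible(jsonl, 2);
console.log(result.map((row) => row.text)); // the user row, then the real assistant row
```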

Expected

Transcript-only OpenClaw assistant artifacts are omitted and do not consume the visible history limit.

Actual

Matches expected after this patch.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: focused gateway tests for display projection, WS chat.history, RPC sessions.get, and HTTP/SSE session history behavior; formatting; whitespace; changed gate.
Edge cases checked: both delivery-mirror and gateway-injected artifact models; visible limit applied after hidden rows; raw sequence metadata preserved across hidden rows.
What you did not verify: full live channel cross-delivery with real external credentials, and full repository sweeps.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: callers that expected raw sessions.get artifacts may see fewer rows.
- Mitigation: this matches the existing consumer-boundary intent, keeps raw JSONL unchanged for audit/debugging, and over-reads before filtering so visible history limits stay useful.

clawsweeper · 2026-05-08T01:26:43Z

Codex review: needs changes before merge.

Summary
The PR adds a shared transcript-artifact predicate, filters delivery-mirror and gateway-injected OpenClaw assistant rows from chat.history and sessions.get, and updates gateway tests, protocol docs, and the changelog.

Reproducibility: yes. Source inspection shows current main returns raw recent delivery-mirror/gateway-injected rows from chat.history and sessions.get, and the patch's new predicate would also drop chat.inject messages because they use the same gateway-injected marker.

Real behavior proof
Sufficient (live_output): The PR body supplies copied after-fix terminal output from real gateway harness tests for the changed history surfaces, though that proof does not cover the chat.inject regression found in review.

Next step before merge
There is a narrow, source-backed repair: keep the consumer-history filter but avoid hiding visible gateway-injected messages used by chat.inject and non-agent WebChat replies.

Security
Needs attention: The diff can hide persisted operator/gateway-injected assistant content from normal history views, creating an audit-visibility regression.

Review findings

[P2] Do not hide every gateway-injected assistant row — src/config/sessions/transcript-artifacts.ts:1

Review details

Best possible solution:

Land a rebased consumer-history filter that preserves raw transcripts and hides true internal artifacts, while distinguishing visible gateway-injected messages from transcript-only rows before filtering.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main returns raw recent delivery-mirror/gateway-injected rows from chat.history and sessions.get, and the patch's new predicate would also drop chat.inject messages because they use the same gateway-injected marker.

Is this the best way to solve the issue?

No, not as written. Filtering at the shared Gateway history boundary is the right shape, but the predicate must not treat every gateway-injected assistant row as hidden because current code uses that marker for visible durable assistant messages too.

Full review comments:

[P2] Do not hide every gateway-injected assistant row — src/config/sessions/transcript-artifacts.ts:1
This marks all model: "gateway-injected" assistant messages as transcript-only. Current chat.inject appends messages with that exact marker and immediately calls projectChatDisplayMessage(appended.message) before broadcasting; with this predicate, the projected message becomes undefined, and the persisted injected/non-agent WebChat final reply also disappears from later history reads. Please add a narrower discriminator for truly internal injected tails, or keep visible gateway-injected rows out of this filter.
Confidence: 0.9
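
One possible shape for the narrower discriminator requested above is sketched here. It is purely illustrative: as noted under the remaining risks, current gateway-injected rows carry no such field, so `transcriptOnly` and `shouldHideFromHistory` are hypothetical names, and introducing a marker like this would also require changes on the write path that are outside this sketch.

```typescript
// Hypothetical assistant row shape.
interface AssistantRow {
  role: "assistant";
  provider: string;
  model: string;
  // Hypothetical marker set only on internal injected tails; not present in current rows.
  transcriptOnly?: boolean;
}

function shouldHideFromHistory(row: AssistantRow): boolean {
  if (row.provider !== "openclaw") return false;
  // delivery-mirror rows are treated as internal in every case.
  if (row.model === "delivery-mirror") return true;
  // gateway-injected rows are hidden only when explicitly marked as internal,
  // so chat.inject and non-agent WebChat replies stay visible.
  return row.model === "gateway-injected" && row.transcriptOnly === true;
}
```

With this shape, an unmarked gateway-injected row defaults to visible, which errs toward preserving audit visibility.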

Overall correctness: patch is incorrect
Overall confidence: 0.88

Security concerns:

[medium] Preserve audit visibility for injected assistant rows — src/config/sessions/transcript-artifacts.ts:1
The new predicate omits every gateway-injected row from history surfaces, but current main uses that marker for chat.inject and non-agent WebChat final replies that are deliberately persisted and broadcast as visible assistant messages.
Confidence: 0.86

Acceptance criteria:

node scripts/run-vitest.mjs src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts src/gateway/server.sessions.store-rpc.test.ts src/gateway/sessions-history-http.test.ts src/gateway/server-methods/chat.directive-tags.test.ts
node scripts/crabbox-wrapper.mjs run --shell -- "pnpm check:changed"

What I checked:

Current chat.history leak path: Current main reads only the requested raw tail with readRecentSessionMessagesAsync(... maxMessages: max ...) before display projection, and the display projection does not drop delivery-mirror or gateway-injected artifacts. (src/gateway/server-methods/chat.ts:1754, ea16a5e9e10c)
Current sessions.get leak path: Current main returns { messages } directly from readRecentSessionMessagesWithStatsAsync, so transcript-only OpenClaw assistant rows are not filtered from the sessions API. (src/gateway/server-methods/sessions.ts:2063, ea16a5e9e10c)
Existing replay contract: Provider replay already treats provider:"openclaw" assistant rows with model:"delivery-mirror" or model:"gateway-injected" as transcript records rather than provider model output, supporting a consumer-boundary filter for true internal artifacts. (src/agents/pi-embedded-runner/replay-history.ts:230, ea16a5e9e10c)
Patch predicate over-matches gateway-injected: The PR classifies every gateway-injected assistant row as transcript-only, which is the changed line that creates the review finding. (src/config/sessions/transcript-artifacts.ts:1, dbcdb380d571)
Gateway-injected rows are also visible durable messages: Current main uses gateway-injected for visible non-agent WebChat final replies and chat.inject transcript appends, then projects the appended message for immediate broadcast. (src/gateway/server-methods/chat.ts:2562, ea16a5e9e10c)
Existing chat.inject regression coverage: Current tests assert chat.inject broadcasts a defined final message, which would fail if the new projection drops the just-appended gateway-injected message. (src/gateway/server-methods/chat.directive-tags.test.ts:1043, ea16a5e9e10c)

Likely related people:

steipete: GitHub path history ties this area to unified chat display projection work and recent replay-control pruning, and related comments scoped the Track B assistant-artifact filtering work. (role: recent area contributor / artifact contract owner; confidence: high; commits: 5f2273e81efc, daef8e73fc92, 64d4f99d2641; files: src/gateway/chat-display-projection.ts, src/agents/pi-embedded-runner/replay-history.ts, src/gateway/server-methods/chat.ts)
BunsDev: Authored recent merged chat display projection work preserving visible assistant text while hiding internal/commentary rows, directly adjacent to this PR's display-filter change. (role: recent projection contributor; confidence: medium; commits: 3110c621df14; files: src/gateway/chat-display-projection.ts, src/gateway/server-methods/server-methods.test.ts, src/gateway/server.chat.gateway-server-chat-b.test.ts)
BradGroux: Authored the related umbrella and a closed superseded PR for this exact transcript-only OpenClaw assistant artifact history slice, clarifying scope boundaries for the current PR. (role: umbrella tracker / prior slice author; confidence: medium; commits: a4a2492d6483; files: src/gateway/server-methods/chat.ts, src/gateway/server-methods/sessions.ts, src/gateway/sessions-history-http.test.ts)
vincentkoc: Commented that a narrow repair had been pushed to the earlier canonical branch for the same assistant-artifact consumer-path work before that branch was superseded. (role: prior canonical PR repair owner; confidence: medium; commits: 3b99a1a159da; files: src/gateway/server-methods/chat.ts, src/gateway/server-methods/sessions.ts, src/config/sessions/transcript.ts)

Remaining risk / open question:

The branch is merge-conflicting with current main, which has recent Gateway/session-history refactors that final validation must cover after rebase.
Current gateway-injected rows do not carry a discriminator separating visible injected assistant messages from truly internal transcript-only tails.

Codex review notes: model gpt-5.5, reasoning high; reviewed against ea16a5e9e10c.

fix(gateway): hide transcript-only history artifacts

dbcdb38

openclaw-barnacle Bot added the docs, app: web-ui, gateway, size: M, and proof: supplied labels May 8, 2026

clawsweeper Bot added the proof: sufficient label May 8, 2026

hclsys mentioned this pull request May 9, 2026

sessions.get returns wrapper-only transcript while chat.history returns full conversation; sessionFile field is misleading #79854

Open

clawsweeper Bot mentioned this pull request May 12, 2026

chat.history leaks system-level memory injection blocks to WebChat UI #64613

Open

fix(gateway): hide transcript-only history artifacts #79172
dev111-actor wants to merge 1 commit into openclaw:main from dev111-actor:fix/transcript-artifact-history-sanitization
