[AI-assisted] fix(agents): normalize malformed assistant content #82748

Merged
steipete merged 2 commits into openclaw:main from IWhatsskill:fix/malformed-assistant-content-43795 on May 17, 2026

Conversation

@IWhatsskill
Contributor

@IWhatsskill IWhatsskill commented May 16, 2026

Summary

  • Problem: malformed assistant content replayed from provider/session history can be an object or null, then reach transport conversion code that expects iterable content and fail with v.content is not iterable.
  • Why it matters: one bad assistant replay entry can break the run-attempt path instead of being treated as owned malformed provider/history input.
  • What changed: replay history now normalizes object-shaped assistant content into a single content block and normalizes null/other malformed non-array shapes to []; transport conversion also has a final defensive guard for malformed assistant content.
  • What did NOT change: no provider routing, config shape, gateway delivery, channel behavior, network calls, or credential handling.
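A minimal sketch of the normalization rules listed above, for illustration only. The function name `normalizeAssistantContentSketch` and the `ContentBlock` shape are assumptions and are not taken from the PR diff; in particular, wrapping a string as one text block is an assumed reading of the existing string handling.

```typescript
// Hypothetical block shape; the real OpenClaw content types may differ.
type ContentBlock = Record<string, unknown>;

// Illustrative only: mirrors the rules described in the summary, not the PR's code.
function normalizeAssistantContentSketch(content: unknown): ContentBlock[] {
  // Valid arrays pass through unchanged.
  if (Array.isArray(content)) {
    return content as ContentBlock[];
  }
  // Assumption: string content is wrapped as a single text block.
  if (typeof content === "string") {
    return [{ type: "text", text: content }];
  }
  // Object-shaped content is kept as one block; it is not stringified or logged.
  if (typeof content === "object" && content !== null) {
    return [content as ContentBlock];
  }
  // null, undefined, numbers, and other malformed shapes become an empty array.
  return [];
}
```

With this shape, every caller downstream can iterate the result without checking it again, which is the property the transport conversion relies on.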

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: malformed assistant replay content no longer reaches transport conversion as a non-iterable value.
  • Real environment tested: local OpenClaw checkout on Windows using the bundled Codex Node runtime; no external provider credentials were used.
  • Exact steps or command run after this patch:
node --import tsx ../../openclaw/proofs/PR-2026-05-17-malformed-assistant-content-runtime-proof.mjs
node scripts/run-vitest.mjs src/agents/pi-embedded-runner/replay-history.test.ts src/agents/transport-message-transform.test.ts
node scripts/run-vitest.mjs src/agents/pi-embedded-runner/sanitize-session-history.tool-result-details.test.ts src/agents/openai-transport-stream.test.ts src/agents/anthropic-transport-stream.test.ts
pnpm check:changed
git diff --check
  • Evidence after fix:
OpenClaw malformed assistant content runtime proof
Path: normalizeAssistantReplayContent -> transformTransportMessages
Input: assistant content object + assistant content null from a provider-compatible replay shape
{
  "assistantSummaries": [
    {
      "index": 0,
      "contentIsArray": true,
      "contentLength": 1,
      "firstBlockType": "text",
      "textPreserved": "runtime object payload"
    },
    {
      "index": 1,
      "contentIsArray": true,
      "contentLength": 0,
      "firstBlockType": null,
      "textPreserved": null
    }
  ]
}
RESULT: PASS - malformed assistant replay content is normalized before transport conversion

Test Files  2 passed (2)
Tests       33 passed (33)

Test Files  6 passed (6)
Tests       378 passed (378)

pnpm check:changed exited 0
git diff --check exited 0
  • Observed result after fix: the non-test runtime proof executes the real normalizeAssistantReplayContent -> transformTransportMessages path; object-shaped assistant content is preserved as one content block; null/other malformed non-array content becomes an empty content array; the transport conversion path completes without the iterable crash.
  • What was not tested: a live Feishu/GptGod provider run or any external provider API call.
  • Before evidence: the linked issue and source path show the malformed replay shape; I did not run a live failing provider reproduction.
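The `assistantSummaries` entries in the evidence above could be produced by a small helper along these lines. This is a sketch under assumptions: `summarizeAssistantContent` and the `AssistantSummary` type are hypothetical names, and the actual proof script is not shown in this PR conversation.

```typescript
// Field names follow the JSON printed in the evidence; the types are assumed.
type AssistantSummary = {
  index: number;
  contentIsArray: boolean;
  contentLength: number | null;
  firstBlockType: string | null;
  textPreserved: string | null;
};

// Hypothetical helper: summarize already-normalized assistant content values.
function summarizeAssistantContent(contents: unknown[]): AssistantSummary[] {
  return contents.map((content, index) => {
    const blocks: unknown[] | null = Array.isArray(content) ? content : null;
    const first = blocks !== null && blocks.length > 0 ? blocks[0] : null;
    const firstBlock =
      typeof first === "object" && first !== null
        ? (first as { type?: unknown; text?: unknown })
        : null;
    const blockType = firstBlock?.type;
    const blockText = firstBlock?.text;
    return {
      index,
      contentIsArray: blocks !== null,
      contentLength: blocks !== null ? blocks.length : null,
      firstBlockType: typeof blockType === "string" ? blockType : null,
      textPreserved: typeof blockText === "string" ? blockText : null,
    };
  });
}
```

Reporting only the shape, the first block type, and the preserved text keeps the proof output small while still showing that the object payload survived and the null payload became an empty array.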

Root Cause (if applicable)

  • Root cause: replay normalization handled strings and arrays, but object/null assistant content could pass through unchanged.
  • Missing detection / guardrail: the downstream image-sanitizer only processed assistant content arrays, while transport conversion assumed assistant content was already iterable.
  • Contributing context (if known): prior fixes covered string content, but did not fully close non-array malformed assistant content.
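The missing guardrail can be pictured as a final shape check at the point where transport conversion iterates assistant content. The sketch below is an assumption about the general approach, not the PR's implementation: `assistantBlocksForTransport` and `collectAssistantText` are hypothetical names, and falling back to an empty list for non-arrays is one plausible reading of "guards malformed non-arrays".

```typescript
type AssistantMessage = { role: "assistant"; content: unknown };

// Hypothetical last-line guard: always hand the conversion loop an array.
function assistantBlocksForTransport(message: AssistantMessage): unknown[] {
  return Array.isArray(message.content) ? message.content : [];
}

// Example consumer: collect text from blocks without assuming the content shape.
function collectAssistantText(message: AssistantMessage): string[] {
  const texts: string[] = [];
  for (const block of assistantBlocksForTransport(message)) {
    if (typeof block === "object" && block !== null) {
      const text = (block as { text?: unknown }).text;
      if (typeof text === "string") {
        texts.push(text);
      }
    }
  }
  return texts;
}
```

Because replay normalization runs first, this guard should rarely trigger; it only ensures that a shape which slips past replay cannot crash the conversion loop.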

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/pi-embedded-runner/replay-history.test.ts
    • src/agents/transport-message-transform.test.ts
  • Scenario the test should lock in: object-shaped assistant content is normalized to an array block; null assistant content is normalized to []; transport conversion keeps valid arrays and guards malformed non-arrays.
  • Why this is the smallest reliable guardrail: the bug is in local replay/transport message shaping, so focused tests exercise the failing seam without provider credentials.
  • Existing test that already covers this (if any): none for object/null assistant content.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

No intended change for valid messages. Malformed assistant replay/provider content is normalized instead of crashing transport conversion.

Diagram (if applicable)

Before:
assistant content object/null -> replay keeps malformed shape -> transport iterates content -> crash

After:
assistant content object/null -> replay/transport normalize shape -> transport conversion continues
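The before/after flow can be reproduced in isolation. In the sketch below, all names are hypothetical and the functions stand in for the real replay and transport code: the unsafe variant iterates content directly and throws a TypeError for object or null content in a standard ES2015+ runtime, while the safe variant normalizes the shape first.

```typescript
type ReplayMessage = { role: "assistant"; content: unknown };

// "Before": iterate content directly, as code that assumes an array would.
function countBlocksUnsafe(message: ReplayMessage): number {
  const blocks: unknown[] = [];
  for (const block of message.content as Iterable<unknown>) {
    blocks.push(block);
  }
  return blocks.length;
}

// Assumed normalization step, following the rules in the PR description.
function toIterableContent(content: unknown): unknown[] {
  if (Array.isArray(content)) return content;
  if (typeof content === "string") return [{ type: "text", text: content }];
  if (typeof content === "object" && content !== null) return [content];
  return [];
}

// "After": normalize the shape, then run the same loop.
function countBlocksSafe(message: ReplayMessage): number {
  const blocks: unknown[] = [];
  for (const block of toIterableContent(message.content)) {
    blocks.push(block);
  }
  return blocks.length;
}
```

Calling `countBlocksUnsafe` with `content: null` raises the kind of "is not iterable" error reported in the issue, whereas `countBlocksSafe` returns 0 for null content and 1 for a single object block.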

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

  • OS: Windows
  • Runtime/container: local OpenClaw checkout, bundled Codex Node runtime
  • Model/provider: N/A, no live provider call
  • Integration/channel (if any): N/A
  • Relevant config (redacted): N/A

Steps

  1. Run the focused replay and transport regression tests.
  2. Run adjacent OpenAI/Anthropic transport and sanitizer tests.
  3. Run pnpm check:changed and git diff --check.

Expected

  • Malformed assistant content is normalized before the iterable transport path.

Actual

  • Tests pass and the changed files pass check:changed and git diff --check.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What I personally verified:

  • Verified scenarios: object-shaped assistant content and null assistant content through a non-test runtime proof that imports the real replay and transport functions; valid array content, valid string content, and direct transport conversion through focused regression tests.
  • Edge cases checked: object content is not stringified or logged; null/other malformed shapes become []; existing string/array behavior remains covered.
  • What I did not verify: live Feishu/GptGod provider reproduction, live gateway run, or external provider API call.

Review Conversations

N/A - no review conversations yet.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A.

Risks and Mitigations

  • Risk: the defensive transport guard could make malformed provider/history input less noisy than a hard crash.
    • Mitigation: the normalization is narrow, preserves object-shaped content as a block, turns only malformed non-array/non-string shapes into empty content, adds focused regression coverage, and does not log or stringify untrusted payload content.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 16, 2026
@clawsweeper
Contributor

clawsweeper Bot commented May 16, 2026

Codex review: needs real behavior proof before merge.

Summary
Review failed before ClawSweeper could summarize the requested change.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Real behavior proof
Not applicable: Real behavior proof was not assessed because the Codex review failed.

Next step before merge
Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

Remaining risk / open question:

  • No close action taken because the review did not complete.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9f112a1a7a40.

@clawsweeper clawsweeper Bot added the P1 High-priority user-facing bug, regression, or broken workflow. label May 16, 2026
@IWhatsskill
Contributor Author

IWhatsskill commented May 16, 2026

@clawsweeper re-review

@clawsweeper
Contributor

clawsweeper Bot commented May 16, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 16, 2026
@IWhatsskill IWhatsskill marked this pull request as ready for review May 16, 2026 22:16
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 16, 2026
@steipete steipete force-pushed the fix/malformed-assistant-content-43795 branch from 745ea7a to 7dd55c6 Compare May 17, 2026 01:53
steipete added a commit to IWhatsskill/openclaw that referenced this pull request May 17, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 17, 2026
@steipete steipete force-pushed the fix/malformed-assistant-content-43795 branch from 7dd55c6 to 608fa05 Compare May 17, 2026 01:56
steipete added a commit to IWhatsskill/openclaw that referenced this pull request May 17, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 17, 2026
@openclaw-barnacle openclaw-barnacle Bot added channel: whatsapp-web Channel integration: whatsapp-web extensions: memory-core Extension: memory-core labels May 17, 2026
@clawsweeper clawsweeper Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 17, 2026
@steipete steipete force-pushed the fix/malformed-assistant-content-43795 branch from c67d031 to fcd4895 Compare May 17, 2026 02:21
@openclaw-barnacle openclaw-barnacle Bot removed channel: whatsapp-web Channel integration: whatsapp-web extensions: memory-core Extension: memory-core labels May 17, 2026
@steipete
Contributor

Verification for head fcd4895:

Behavior addressed: malformed assistant replay content shaped as a single object or null is normalized before transport conversion, and zero-usage empty/null stop turns still get the replay sentinel instead of regressing to silent empty content.
Real environment tested: local macOS checkout plus Blacksmith Testbox through Crabbox.
Exact steps or command run after this patch:

  • node scripts/run-vitest.mjs src/agents/pi-embedded-runner/replay-history.test.ts src/agents/transport-message-transform.test.ts
  • node --import tsx --input-type=module with direct normalizeAssistantReplayContent/transformTransportMessages object/null replay proof
  • /Users/steipete/Projects/agent-scripts/skills/codex-review/scripts/codex-review --mode branch --base origin/main
  • node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox --blacksmith-org openclaw --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job check --blacksmith-ref main --idle-timeout 90m --ttl 240m --timing-json -- CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_REMOTE_RUN=1 pnpm check:changed
  • gh run watch 25979009591 --repo openclaw/openclaw --exit-status

Evidence after fix: focused Vitest passed 34 tests; direct runtime proof printed result: PASS; Codex review reported no accepted/actionable findings; Testbox provider=blacksmith-testbox id=tbx_01krsvt8ag597z3th36jje93kd exit=0; GitHub CI run 25979009591 passed.
Observed result after fix: object assistant content becomes one text block, null content becomes an empty block array except zero-usage stop replay turns, which are converted to the existing fallback sentinel before the next user turn.
What was not tested: a live paid provider replay against the original reporter account/session was not available; the normalization path was exercised with the real replay/transport code locally and in CI/Testbox.

@steipete steipete merged commit ab595de into openclaw:main May 17, 2026
115 of 117 checks passed

Labels

agents Agent runtime and tooling P1 High-priority user-facing bug, regression, or broken workflow. proof: supplied External PR includes structured after-fix real behavior proof. size: S