Skip to content

fix(agents): avoid duplicate generated media fallback#89220

Merged
omarshahine merged 2 commits into
mainfrom
fix/generated-media-targetless-dedupe
Jun 1, 2026
Merged

fix(agents): avoid duplicate generated media fallback#89220
omarshahine merged 2 commits into
mainfrom
fix/generated-media-targetless-dedupe

Conversation

@omarshahine
Copy link
Copy Markdown
Contributor

@omarshahine omarshahine commented Jun 1, 2026

Summary

  • Treat targetless current-chat message-tool media telemetry as delivered for generated-media completion dedupe.
  • Keep fallback delivery for targetless telemetry that names a different provider/account/thread.
  • Add regression coverage for both the observed duplicate-send shape and the mismatched-provider safety case.

Verification

  • ./node_modules/.bin/oxfmt --check --threads=1 src/agents/subagent-announce-delivery.ts src/agents/subagent-announce-delivery.test.ts
  • node scripts/run-oxlint.mjs src/agents/subagent-announce-delivery.ts src/agents/subagent-announce-delivery.test.ts
  • node scripts/run-vitest.mjs src/agents/subagent-announce-delivery.test.ts -> 92 tests passed
  • git diff --check
  • .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main -> clean, no accepted/actionable findings

Real behavior proof

Behavior addressed: generated-media completion fallback no longer sends a duplicate The generated image is ready. copy when the message tool already delivered media to the current chat but telemetry omitted an explicit to target.

Real environment tested: local macOS Messages/iMessage private API with the LaunchAgent gateway running the rebuilt dist that included this patch.

Exact steps or command run after this patch: Sent this real iMessage: Ok generate a new image now of a Lobster ready to launch like a Rocket into space. Then inspected /opt/homebrew/bin/imsg history --chat-id 3 --limit 30 --attachments --json and ~/.openclaw/logs/openclaw.log.

Evidence after fix: iMessage DB shows inbound request id 5805 at 2026-06-01T20:55:49.931Z, followed by exactly one outgoing generated-image reply id 5806 at 2026-06-01T20:57:14.450Z with one PNG attachment (upload.png, 2360236 bytes) and text 🦞 Here’s the rocket-launch lobster image. The next latest generated-image run before this patch showed duplicate fallback messages such as The generated image is ready.; the post-fix run has no such follow-up fallback after id 5806.

Observed result after fix: requester confirmed the generated-image turn delivered once. Gateway log for the same turn shows the final source reply was suppressed under sourceReplyDeliveryMode: message_tool_only after media delivery (textChars=0 media=1) and the dispatch completed successfully in about 84s.

What was not tested: CI cannot exercise real Messages.app delivery; live proof is from the local macOS iMessage setup and the requester's observation.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S maintainer Maintainer-authored PR labels Jun 1, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 1, 2026

Codex review: needs maintainer review before merge. Reviewed June 1, 2026, 5:05 PM ET / 21:05 UTC.

Summary
The PR changes generated-media fallback dedupe so targetless current-chat message-tool media telemetry can count as delivered while mismatched provider or target telemetry still triggers fallback delivery.

PR surface: Source +15, Tests +100. Total +115 across 2 files.

Reproducibility: yes. Current-main source shows target media URLs are marked as targeted before target matching, which makes a targetless target record able to block global media evidence and allow a duplicate fallback.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Maintainer should confirm the targetless telemetry contract and decide whether to land with the remaining message-delivery risk.

Risk before merge

  • [P2] The PR changes when generated-media recovery fallback is suppressed, so an overly broad targetless telemetry match could skip a recovery send instead of only removing a duplicate.

Maintainer options:

  1. Accept the focused delivery contract (recommended)
    A maintainer can land after accepting that targetless current-chat media evidence is valid only when the existing provider/account/thread matcher still passes.
  2. Ask for owner verification first
    Hold for an agent-delivery owner to confirm the targetless telemetry shape if the iMessage proof is not enough to settle the fallback-suppression risk.

Next step before merge

  • No automated repair is needed; the remaining action is maintainer review/landing because the PR has a protected maintainer label and a message-delivery fallback-risk decision.

Security
Cleared: The diff is limited to agent delivery dedupe logic and tests, with no concrete security or supply-chain concern found.

Review details

Best possible solution:

Land the narrow dedupe fix after maintainer review accepts the targetless current-chat telemetry contract and the mismatched-provider safety coverage.

Do we have a high-confidence way to reproduce the issue?

Yes. Current-main source shows target media URLs are marked as targeted before target matching, which makes a targetless target record able to block global media evidence and allow a duplicate fallback.

Is this the best way to solve the issue?

Yes. The PR fixes the dedupe decision at the generated-media fallback boundary and reuses the existing sourceDeliveryTargetsMatch contract, with positive and mismatched-provider regression coverage.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f8491b0fcf12.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The updated PR body includes after-fix real iMessage DB and gateway log evidence showing exactly one generated-media reply and no duplicate fallback.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The updated PR body includes after-fix real iMessage DB and gateway log evidence showing exactly one generated-media reply and no duplicate fallback.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P2: This is a bounded user-visible duplicate generated-media delivery bug with focused tests and live proof, not a release-blocking outage.
  • merge-risk: 🚨 message-delivery: The diff changes fallback suppression for generated-media sends, where a wrong match could drop a recovery delivery or a missed match could duplicate a message.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The updated PR body includes after-fix real iMessage DB and gateway log evidence showing exactly one generated-media reply and no duplicate fallback.
  • proof: sufficient: Contributor real behavior proof is sufficient. The updated PR body includes after-fix real iMessage DB and gateway log evidence showing exactly one generated-media reply and no duplicate fallback.
Evidence reviewed

PR surface:

Source +15, Tests +100. Total +115 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 1 21 6 +15
Tests 1 100 0 +100
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 2 121 6 +115

What I checked:

  • Current-main fallback path: Current main suppresses generated-media direct fallback only when expected media is found in delivered media evidence; otherwise it sends the generated-media direct fallback. (src/agents/subagent-announce-delivery.ts:956, f8491b0fcf12)
  • Current-main targetless bug source: Current main adds every target record's media URL to targetedUrls before checking whether the target matches, so a targetless target record can prevent global media evidence from counting as delivered. (src/agents/subagent-announce-delivery.ts:1096, f8491b0fcf12)
  • PR implementation: The PR validates target records first, treats missing-to records as delivered only when filling in deliveryTarget.to still satisfies sourceDeliveryTargetsMatch, and keeps mismatched records in targetedUrls so fallback remains available. (src/agents/subagent-announce-delivery.ts:1111, 971ed40839b8)
  • Regression coverage: The PR adds focused tests for targetless current-chat media suppressing duplicate fallback and for targetless media with a different provider still falling back. (src/agents/subagent-announce-delivery.test.ts:2357, 971ed40839b8)
  • Target matching contract: sourceDeliveryTargetsMatch requires channel/to and rejects mismatched provider, account, or thread data; the PR reuses that existing contract for targetless telemetry safety. (src/infra/outbound/source-delivery-plan.ts:135, f8491b0fcf12)
  • Real behavior proof: The PR body was updated with an after-fix local macOS Messages/iMessage run showing one outgoing generated-image reply with one PNG attachment and no follow-up fallback, plus gateway log evidence for message_tool_only suppression after media delivery. (971ed40839b8)

Likely related people:

  • steipete: git blame and git log attribute the current generated-media fallback helper and source-delivery matching contract in this checkout to Peter Steinberger's commit 9153aab. (role: recent area contributor; confidence: high; commits: 9153aab037bd; files: src/agents/subagent-announce-delivery.ts, src/infra/outbound/source-delivery-plan.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 1, 2026
@omarshahine
Copy link
Copy Markdown
Contributor Author

Live proof from the latest iMessage run has been added to the PR body.

  • Request: Ok generate a new image now of a Lobster ready to launch like a Rocket into space.
  • iMessage DB: inbound id 5805 at 2026-06-01T20:55:49.931Z; exactly one outgoing generated-image reply id 5806 at 2026-06-01T20:57:14.450Z.
  • Attachment: one PNG (upload.png, 2360236 bytes) with text 🦞 Here’s the rocket-launch lobster image.
  • Regression check: no follow-up The generated image is ready. fallback after id 5806; earlier pre-fix runs did show that duplicate fallback.
  • Gateway log: final source reply was suppressed after media delivery (sourceReplyDeliveryMode: message_tool_only, textChars=0 media=1) and the turn completed successfully in about 84s.

CI on this PR is green at 971ed40839b891541e270c74a33f1cbb917abd5f.

@omarshahine omarshahine self-assigned this Jun 1, 2026
@omarshahine omarshahine merged commit 12798eb into main Jun 1, 2026
242 of 255 checks passed
@omarshahine omarshahine deleted the fix/generated-media-targetless-dedupe branch June 1, 2026 23:46
@omarshahine
Copy link
Copy Markdown
Contributor Author

Landed via squash merge.

  • PR head: 971ed40839b891541e270c74a33f1cbb917abd5f
  • Merge commit: 12798eb789bd61c873dcc59baa0599aba733a116
  • Gate: GitHub check rollup green before merge, including Real behavior proof, check-prod-types, check-test-types, and check-additional-extension-package-boundary.
  • Behavior: generated-media completion fallback no longer sends a duplicate current-chat copy after targetless message-tool media delivery.

Thanks @omarshahine.

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 2, 2026
Treat targetless current-chat message-tool media telemetry as delivered for generated-media completion dedupe while preserving fallback delivery for mismatched provider/account/thread evidence.

Real behavior proof was added from the live iMessage generated-image run: inbound id 5805, exactly one outgoing media reply id 5806, and no follow-up generated-image fallback.

Co-authored-by: omarshahine <10343873+omarshahine@users.noreply.github.com>
Reviewed-by: @lobster
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
Treat targetless current-chat message-tool media telemetry as delivered for generated-media completion dedupe while preserving fallback delivery for mismatched provider/account/thread evidence.

Real behavior proof was added from the live iMessage generated-image run: inbound id 5805, exactly one outgoing media reply id 5806, and no follow-up generated-image fallback.

Co-authored-by: omarshahine <10343873+omarshahine@users.noreply.github.com>
Reviewed-by: @lobster
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant