Skip to content

fix(adapters): notify agent of failed image attachments so reply aligns with warning#857

Open
howie wants to merge 17 commits into
openabdev:mainfrom
howie:fix/776-agent-reply-alignment
Open

fix(adapters): notify agent of failed image attachments so reply aligns with warning#857
howie wants to merge 17 commits into
openabdev:mainfrom
howie:fix/776-agent-reply-alignment

Conversation

@howie
Copy link
Copy Markdown
Contributor

@howie howie commented May 19, 2026

Replaces #826, which was auto-closed by the stale-PR workflow. The Discord Discussion URL was already present in the original PR body.

Summary

Follow-up to #793, addressing the minor UX observation noted in the final review comment on #776 (antigenius0910, 2026-05-14):

When validation fails on the image, the user sees two sequential bot messages:

  1. The synthetic :warning: from the slack adapter
  2. The actual agent reply, which is unaware an image was attempted and responds as if the user just sent text-only (e.g. "I don't see any image attached…")

Root cause

download_and_encode_image rejects the bad attachment and the adapter sends the user a :warning: message. But extra_blocks contains no trace of the failed attachment, so the agent prompt is identical to a text-only message — the agent has no way to know an image was attempted and silently dropped.

Fix

After sending the user-visible warning, push a ContentBlock::Text system note into extra_blocks:

[Attachment validation failed]: the user attempted to attach `photo.png` but
the file(s) could not be processed (unsupported format, corrupt body, or size
limit exceeded). The user has been notified separately — acknowledge the
failure and proceed without apologising for not seeing the file.

AdapterRouter::pack_arrival_event partitions Text blocks before the typed prompt (stable insertion order), so the note appears after any STT transcript but before the user's message — the correct position for meta-context.

Changes

File Change
src/media.rs Add format_failed_attachment_note(filenames: &[&str]) -> String helper + 3 unit tests
src/slack.rs Inject note into extra_blocks after the existing :warning: send (~4 lines)
src/discord.rs Introduce failed_image_files Vec; split catch-all Err(e) into variant-specific arms (mirroring Slack); send Discord-flavored user warning (without Slack-specific files:read scope hint); inject note

Before / After (Slack thread, missing files:read)

Before (#793): user sees :warning:, then agent: "I don't see any image attached — could you share it again?"

After: user sees :warning:, then agent: "I see your image couldn't be processed due to a validation error. If you've added the files:read scope, try sharing it again in a new thread."

Design notes

  • format_failed_attachment_note lives in src/media.rs next to MediaFetchError — single source of truth shared by both adapters.
  • push (not insert(0, …)) because STT transcripts are the primary content and use insert(0, …) to win front position; failed-image notes are meta-info and correctly trail any transcript.
  • Size-exceeded entries already carry the reason inline ("big.png (exceeds 10000000 byte limit)"), which renders fine inside backticks in the agent note.
  • The agent note goes to the model, not Slack mrkdwn — no sanitize_slack_filename needed.

Dependency

This PR depends on #793. The failed_image_files Vec and MediaFetchError variants used here were introduced in that PR and do not exist in main yet. Please merge #793 first (or rebase this branch onto main after it lands — the diff is clean).

Test plan

  • Unit tests: cargo test --locked media::tests format_failed_attachment_note — 3 new tests (single file, multiple files, size-suffix preserved)
  • Existing test suite: cargo test --locked — no regressions expected (additive change only)
  • Manual E2E (mirror antigenius0910's Pass 2 rig):
    1. Bot token missing files:read → upload image → user sees :warning: (unchanged); agent reply now acknowledges the failure rather than asking "where's the image?"
    2. Follow-up text-only message in same thread succeeds normally

Closes #776 (UX follow-up noted in final review comment).

Discord Discussion URL: https://discord.com/channels/1491295327620169908/1491969620754567270/1505114156376784946

howie and others added 17 commits May 11, 2026 19:48
…odel

Fixes openabdev#776.

When a Slack bot token lacks the `files:read` OAuth scope, Slack serves the
workspace login HTML page (~55 KB) at HTTP 200 with a `text/html` Content-Type
instead of the requested file binary.  `download_and_encode_image` previously
accepted this response because:

1. It never inspected the HTTP response `Content-Type` header.
2. On `resize_and_compress` failure for a body ≤ 1 MB it fell back to
   forwarding the raw bytes under the Slack-reported MIME (`image/png`),
   bypassing any format check.

The result: a `ContentBlock::Image { media_type: "image/png", data: <base64 HTML> }`
flowed through to Anthropic, which 400'd with "Could not process image".
Because claude-agent-acp persists the user message into the session JSONL before
the API reply, the bad block replayed on every subsequent turn in that Slack
thread until an operator manually deleted the JSONL inside the pod.

Changes:
- Add `MediaFetchError` enum to `src/media.rs` so callers can distinguish
  "not an image, skip silently" (`NotAnImage`) from "claimed image, got
  unexpected bytes" (`UnsupportedResponseType`, `InvalidImageBody`).
- Add `validate_image_response(content_type, body)` pure helper that:
  - Rejects any HTTP response whose Content-Type (stripped of params,
    lowercased) is not in `{image/png, image/jpeg, image/gif, image/webp}`.
  - Sniffs magic bytes via `image::ImageReader::with_guessed_format()` (no
    new dependencies) and rejects anything that doesn't decode as one of the
    four supported formats.
- Change `download_and_encode_image` signature from `-> Option<ContentBlock>`
  to `-> Result<ContentBlock, MediaFetchError>`, capturing the Content-Type
  header before consuming the response with `.bytes()`.
- Remove the ≤ 1 MB resize-error fallback that was the direct bug path.
- Update `src/slack.rs` call site: on validation failure, collect filenames
  and post one aggregated user-visible warning to the Slack thread:
  ":warning: I couldn't access the file(s) you shared (`<name>`). This often
  means the bot is missing the `files:read` OAuth scope. Please ask an admin
  to reinstall the app with that scope."
- Update `src/discord.rs` call site: `warn!` log on failure (Discord URLs are
  signed-public so the Slack scope hint is not applicable there). Preserve the
  existing `is_video_file` fallback for `Err(NotAnImage)`.
- Add 12 unit tests for `validate_image_response` including the exact bug
  repro case (HTML body labeled `image/png`, first 8 bytes `3c21444f43545950`).

Out of scope / follow-up issues:
- Secondary defense: deferring claude-agent-acp JSONL persistence until after
  model returns 200 (requires changes in the claude-agent-acp Node project).
- Startup preflight calling Slack `auth.test` to warn loudly on missing scopes.
- Same Content-Type/magic-byte hardening for `download_and_transcribe` and
  `download_and_read_text_file`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Remove dead hinted field from UnsupportedResponseType (always None)
- Eliminate double reader.format() call with fmt@ binding
- Deduplicate hex_prefix() in resize error path (compute once, reuse)
- Promote strip_mime_params to media::strip_mime_params (pub crate),
  slack.rs delegates to it -- single source of truth for MIME stripping

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Critical: change Content-Type check from allow-list to block-list (Codex
finding). The allow-list rejected application/octet-stream before magic-byte
check ran, silently dropping valid images from CDNs. Only text/* is now
rejected early; everything else falls through to magic-byte verification.

Also:
- Soften Slack warning message: no longer attributes all failures to
  files:read scope; now mentions format support as a second cause
- Add SizeExceeded to Slack user notification (was silent)
- Log failures from send_message() instead of using let _ =
- Log discarded io::Error from with_guessed_format
- Fix doc comments: download_and_encode_image (SizeExceeded fires pre-HTTP),
  validate_image_response (Content-Type check short-circuits, not sequential)
- Replace inline "Validate Content-Type..." comment with WHY explanation
- Restore doc comment on strip_mime_params wrapper in slack.rs
- Add tests: octet-stream acceptance (Codex regression fix), JSON body
  rejection by magic bytes, missing Content-Type + invalid body

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex adversarial review found that user-controlled filenames embedded in
the mrkdwn warning message could inject Slack markup (backtick break-out,
<!here> mentions, <@uid> pings). Replace backticks and angle brackets with
safe ASCII equivalents before embedding in the message.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
F1: validate_gif_body now decodes only the first frame instead of
    collect_frames() — avoids full in-memory decode of large animated GIFs.

F2: remove duplicate validate_gif_body call from resize_and_compress;
    download_and_encode_image already runs validate_image_response before
    calling resize, so the second call was redundant.

F3: add MediaFetchError::ProcessingFailed(image::ImageError) for the case
    where body passed validation but resize/compress failed — previously
    returned the misleading InvalidImageBody variant for a validated image.

F4: extend Slack warning message to mention "file is too large" so the
    message is accurate when SizeExceeded failures are included.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Behavior:
- slack: add explicit ProcessingFailed arm -> push to failed_image_files
  and log "post-processing failed" (not "download failed")
- slack: extract sanitize_slack_filename() pub(crate) fn; add 4 unit tests
  for backtick/angle-bracket injection prevention

API:
- validate_image_response: change return type Result<ImageFormat> -> Result<()>
  (sole caller only checked Ok/Err; format detection ran twice)

Docs:
- validate_image_response: add block-list vs allow-list design rationale
- validate_gif_body: add doc comment explaining first-frame-only and cursor
  independence; log original error via debug! before mapping to InvalidImageBody
- ProcessingFailed variant: expand doc to clarify semantic difference from
  InvalidImageBody and expected caller behavior
- download_and_encode_image: add ProcessingFailed to error listing

Tests:
- validate_rejects_mixed_case_text_content_type: pin .to_lowercase() normalization

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Restore ≤1MB raw-byte fallback when resize_and_compress fails after
  validate_image_response passes. The body is confirmed valid by magic-byte
  check, so forwarding original bytes is safe. Prevents regression for
  formats the image crate can detect but not fully decode (e.g. animated WebP).

- Add HttpStatus 4xx match arm in Slack handler to push filename into
  failed_image_files. HTTP 4xx (401/403) indicates a persistent permission
  problem (similar root cause to openabdev#776) and the user should be notified.
  5xx and Network errors remain log-only (transient).
…t reply aligns with warning

When download_and_encode_image rejects an attachment, the Slack/Discord
adapter already sends the user a visible ⚠️ message. Without this
change, the agent sees no attachment and may reply "I don't see any image
attached" -- contradicting the warning the user just received.

- Add media::format_failed_attachment_note() as a shared helper (single
  source of truth) for the ContentBlock::Text injected into extra_blocks.
- Slack: push the note after the existing warning send (pack_arrival_event
  partitions Text blocks before the typed prompt; stable iteration order
  means push trails any STT transcript, which is correct).
- Discord: introduce failed_image_files Vec, split the catch-all Err arm
  into variant-specific arms matching Slack, send a Discord-flavored user
  warning (without the Slack files:read scope hint), and inject the note.
- Add three unit tests for format_failed_attachment_note in media::tests.

Addresses the minor UX observation from the openabdev#776 final review comment
(antigenius0910, 2026-05-14). Depends on PR openabdev#793 (failed_image_files Vec
and MediaFetchError variants introduced there).

Fixes openabdev#776 (UX follow-up).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- format_failed_attachment_note: &[&str] -> &[String] to remove
  intermediate Vec<&str> at both call sites
- Remove inaccurate "Caller must guarantee non-empty" doc precondition
- discord.rs: ProcessingFailed(ref e) -> ProcessingFailed(e) (spurious ref)
- discord.rs: sanitize filenames before embedding in user warning
  (backtick closes code span; <> enables mention injection in Discord)
- slack.rs: trim 4-line ordering comment to 3-line essential WHY

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ize helper

- Move send_message into the correct thread: warn_channel was built from
  msg.channel_id before get_or_create_thread ran, routing the warning to
  the parent channel while the agent reply landed in the newly-created
  thread. Now send_message uses thread_channel (resolved first).
- Extract sanitize_discord_filename (parity with sanitize_slack_filename)
  and add 3 tests covering normal, backtick, and injection-attempt cases.
- Add channel_id to the warning-send failure log for debuggability.
- Add comment on catch-all Err arm: Network/HTTP drops are intentional.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l path

- format_failed_attachment_note replaces backtick with quote in filenames
  to prevent broken Markdown code-spans in the LLM prompt.
- Add test: format_failed_attachment_note_escapes_backtick_in_filename.
- Document the get_or_create_thread failure early-return: agent dispatch
  and user-visible warning are both dropped, consistent with pre-existing
  thread-fail behavior for any message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses two regression surfaces flagged in reviewer Pass-2 follow-up:

1. media.rs: resize-fallback threshold (restored by 6d7fd5f, missing from
   this branch before the rebase).  Extract encode_validated_image() to make
   the <=1 MB fallback testable without HTTP I/O.  Add two new tests:
   - resize_fail_under_1mb_falls_back_to_original_bytes
   - resize_fail_over_1mb_returns_processing_failed

2. slack.rs / discord.rs: HTTP 4xx silently dropped by catch-all Err(e) arm.
   Add shared media::failed_attachment_entry() helper (pure fn, no logging)
   that maps MediaFetchError -> Option<String>: 4xx returns Some (user
   notified, agent receives format_failed_attachment_note); 5xx/network
   returns None (logged only).  Both adapters now call the helper instead of
   duplicating the multi-arm match, and discord.rs gains parity with the 4xx
   arm that 6d7fd5f added only to slack.rs.

New tests in media::tests (5) and discord::tests (2) covering:
- failed_attachment_entry: NotAnImage, SizeExceeded, 401, 403, 502
- discord parity: 403 notifies, 500 is logged-only

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Extract sanitize_attachment_filename into media.rs as canonical helper;
  sanitize_slack_filename and sanitize_discord_filename delegate to it
- Collapse multi-arm image-error matches in slack.rs and discord.rs to
  two arms (NotAnImage no-op + unified Err(e) via failed_attachment_entry)
- Remove two redundant discord parity tests that duplicated media.rs coverage
- Add wildcard comment to failed_attachment_entry clarifying opt-in default

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- failed_attachment_entry: SizeExceeded returns bare filename only
- failed_attachment_entry: replace wildcard with explicit Network/HttpStatus arms
- encode_validated_image: narrow WebP fallback to mime == image/webp only
- encode_validated_image: elevate fallback log from debug to warn
- discord.rs: move agent-note push to after thread_channel is resolved
- discord.rs: remove duplicate doc on sanitize_discord_filename
- slack.rs: clarify push comment re STT insert ordering dependency
- media.rs: document format_failed_attachment_note sanitizes internally
- media.rs: document NotAnImage => None is deliberate skip not transient
- Tests: update SizeExceeded assertion; add non-WebP resize-fail test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@howie howie marked this pull request as ready for review May 19, 2026 14:30
@howie howie requested a review from thepagent as a code owner May 19, 2026 14:30
@shaun-agent
Copy link
Copy Markdown
Contributor

shaun-agent commented May 19, 2026

OpenAB PR Screening

This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Click 👍 if you find this useful. Human review will be done within 24 hours. We appreciate your support and contribution 🙏

Screening report done. comment posted and project item moved to `PR-Screening`.

GitHub comment: #857 (comment)
Project action: PVTI_lADOEFbZWM4BUUALzgtMgAU is now PR-Screening in https://github.com/orgs/openabdev/projects/1

Intent

PR #857 fixes a user-visible attachment failure mismatch in Slack and Discord: the adapter warns that an image failed, but the agent reply may still act like no image was attempted.

Feat

Classification: fix.

Adds model-visible failed-attachment context after adapter warnings so the agent can acknowledge the failed image instead of asking where it is.

Who It Serves

Discord users, Slack users, maintainers, and reviewers.

Rewritten Prompt

Implement a follow-up to #793 so failed Slack/Discord image validation both warns the user and appends a model-visible note to extra_blocks, preserving successful image/audio/text behavior and STT ordering. Add focused formatter tests and verify at least one failed-image E2E before merge.

Merge Pitch

Worth advancing after #793 lands. Risk is low conceptually, but the dependency on #793 and the larger src/media.rs diff need reviewer attention.

Best-Practice Comparison

OpenClaw and Hermes Agent mostly do not apply directly. The relevant shared principle is explicit delivery context: failed attachments should be visible in the prompt/run context, not only as an adapter-side warning.

Implementation Options

Conservative: merge after #793 with targeted tests and one manual E2E.

Balanced: merge after #793, confirm adapter insertion behavior where practical, and do platform E2E coverage if credentials are available.

Ambitious: generalize failed attachment notes across all attachment types and adapters.

Comparison Table

Option Speed Complexity Reliability Maintainability User Impact Fit for OpenAB now
Conservative High Low Medium Medium Medium Good
Balanced Medium Medium High High High Best
Ambitious Low High High Medium High Too broad

Recommendation

Use the balanced path. Merge/rebase after #793, keep this PR focused on failed image UX, require targeted tests plus at least one manual failed-image E2E, and split broader attachment-failure normalization into follow-up work.

@chaodu-agent
Copy link
Copy Markdown
Collaborator

CHANGES REQUESTED 🟡 — Solid follow-up to #793. The agent-note injection and user-warning are well-designed. A few items to address before merge.

1. What problem does this solve?

After #793 rejects a bad image attachment, the user sees a :warning: but the agent has no idea an image was attempted — it responds as if the user sent text-only ("I don't see any image attached…"). This PR injects a [Attachment validation failed] note into extra_blocks so the agent acknowledges the failure gracefully.

2. How does it solve it?
  • Changes download_and_encode_image return type from Option<ContentBlock> to Result<ContentBlock, MediaFetchError> with a rich error enum.
  • Adds validate_image_response() — Content-Type block-list + magic-byte sniffing via image::ImageReader::with_guessed_format().
  • On validation failure: collects filenames into failed_image_files, sends user-visible warning to the thread, and pushes a ContentBlock::Text system note into extra_blocks for the agent.
  • Both Discord and Slack adapters get the same treatment.
  • Filename sanitization (sanitize_attachment_filename) prevents backtick/angle-bracket injection in both mrkdwn and Discord markdown.
3. What was considered?
  • Strict Content-Type allow-list: rejected because CDNs serve valid images as application/octet-stream.
  • Keep the old raw-byte fallback after resize failure: preserved the original poisoning path.
  • Fix only Slack: left Discord exposed.
  • The resize-failure fallback is now restricted to WebP ≤1MB only (animated WebP that the image crate cannot decode).
4. Traffic Light

🟡 NIT — Dependency on unmerged #793: PR body states "depends on #793" which is still OPEN. The diff includes the full MediaFetchError enum and validate_image_response that #793 introduced. If #793 lands first, this PR's diff will shrink dramatically. If this is meant to supersede #793 entirely (since it includes all of #793's changes), the PR description should clarify that and #793 should be closed. As-is, the relationship is ambiguous.

🟡 NIT — sanitize_discord_filename is a trivial wrapper: The function just delegates to media::sanitize_attachment_filename. Consider calling media::sanitize_attachment_filename directly in discord.rs (like Slack's sanitize_slack_filename does) or removing the indirection. Both wrappers exist solely for naming clarity but add no logic — pick one pattern.

🟢 INFO — Good security posture: Filename sanitization from day one (backtick → single-quote, angle brackets → parens) prevents injection in code-spans and Slack mention syntax. The text/ Content-Type block-list + magic-byte validation is the right layered approach.

🟢 INFO — Test coverage is excellent: 13+ validate_image_response tests including the exact #776 reproduction (HTML body labeled as image/png), corrupt GIF, truncated PNG header, empty body, and the application/octet-stream CDN case.

🟢 INFO — Correct insertion order: push (not insert(0, …)) for the failed-attachment note is correct — STT transcripts use insert(0, …) to win front position; meta-info correctly trails.


Verdict: 🟡 Clarify the relationship with #793 (supersedes or depends-on?) and simplify the wrapper indirection. No correctness issues found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(slack): unauthenticated file fetch returns HTML, gets forwarded to model as image, poisons session

4 participants