feat(gateway): feishu image and text file attachment support #731
thepagent merged 3 commits into openabdev:main
- Gateway downloads images via /im/v1/messages/{id}/resources/{key}?type=image
- resize_and_compress: max 1200px, JPEG quality 75, GIF pass-through
- Text files: whitelist extensions, 512KB cap, base64 encoded
- parse_message_event supports text/image/file/post message types
- post type: extracts text + img nodes (for @mention + paste image)
- GatewayEvent.content.attachments: backward compatible via serde(default)
- Core: decode attachments to ContentBlock::Image / ContentBlock::Text
- Empty text + empty attachments events are not forwarded
- Updated docs/feishu.md with Image & File Attachments section
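The 1200px limit in the list above can be illustrated with a minimal sketch of the dimension calculation only. It is not the PR's `resize_and_compress` implementation: the function name `fit_within_max` is hypothetical, and the sketch assumes the limit applies to the longest side, that aspect ratio is preserved, and that smaller images are not upscaled.

```rust
/// Longest side allowed after resizing (the commit message says "max 1200px").
const MAX_DIMENSION: u32 = 1200;

/// Hypothetical helper: compute target dimensions that fit within
/// `MAX_DIMENSION` while keeping the aspect ratio.
/// Assumption: images already within the limit are returned unchanged.
fn fit_within_max(width: u32, height: u32) -> (u32, u32) {
    let longest = width.max(height);
    if longest <= MAX_DIMENSION {
        return (width, height);
    }
    // Scale each side by MAX_DIMENSION / longest, using u64 so the
    // intermediate product cannot overflow. Clamp to at least 1 pixel.
    let scale = |side: u32| -> u32 {
        let scaled = (side as u64 * MAX_DIMENSION as u64) / longest as u64;
        scaled.max(1) as u32
    };
    (scale(width), scale(height))
}
```

For example, a 2400x1200 screenshot would be reduced to 1200x600 under these assumptions, while an 800x600 image would pass through at its original size.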
OpenAB PR Screening

This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Screening report

## Intent

PR #731 adds Feishu inbound attachment handling so Feishu users can send images and supported text files to OpenAB agents. The operator-visible problem is that Feishu currently lags Discord attachment behavior: image/file messages either cannot reach the model as usable content or require separate manual workarounds. The PR also handles Feishu

## Feat

Feature: Feishu gateway support for inbound image and text file attachments. Behavioral changes:

## Who It Serves

Primary beneficiaries:
## Rewritten Prompt

Implement inbound attachment support for the Feishu gateway. Add an additive

For images, resize to a maximum dimension of 1200px, JPEG-compress at quality 75, base64 encode, and forward as image attachments. For text files, only accept known text extensions, enforce a 512KB limit, base64 encode, and forward as text-file attachments. If attachment download fails, preserve any usable text content. If a message has neither text nor valid attachments, do not send an event. In core gateway handling, deserialize attachments and convert them into model content blocks without changing the

## Merge Pitch

This is worth advancing because it closes a real Feishu usability gap and brings the gateway closer to Discord parity. Screenshots and pasted images are common agent inputs, especially in chatops and support workflows.

Risk profile is moderate. The main risk is not the schema addition, which is backward-compatible, but gateway responsibility expanding into media download, image processing, size control, and error handling. Reviewers will likely focus on resource limits, dependency impact from the

## Best-Practice Comparison

Relevant OpenClaw principles:
Relevant Hermes Agent principles:
Overall, the PR follows the strongest relevant principle from both systems: platform-specific gateways should normalize platform-specific inputs before handing them to core agent execution.

## Implementation Options

Option 1: Conservative gateway-only image support
Option 2: Balanced attachment parity
Option 3: Generic cross-gateway attachment pipeline
Option 4: Durable media ingestion service

## Comparison Table
## Recommendation

Advance the balanced attachment parity path, with careful review around limits, error logging, MIME/extension checks, and dependency impact. The current PR appears scoped well for merge discussion because it solves a concrete Feishu gap without requiring a broader gateway redesign. Any generic cross-gateway attachment abstraction should be split into follow-up work after Feishu behavior is proven and reviewed against Discord's existing implementation.
Add pre-download size check via Content-Length header in both download_feishu_image and download_feishu_file to avoid buffering oversized responses before rejection. Post-download fallback check retained for cases where Content-Length is absent or misreported.
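The two-stage check described in this commit can be sketched as a pair of pure functions. This is an illustration under stated assumptions, not the code in `download_feishu_image` or `download_feishu_file`: the names `SizeCheck`, `check_content_length`, and `check_body_len` are hypothetical, and the header is modeled as an optional string so the sketch stays independent of any HTTP client.

```rust
/// Hypothetical outcome of a size check.
#[derive(Debug, PartialEq)]
enum SizeCheck {
    Accept,
    Reject,
}

/// Pre-download gate: reject early only when the Content-Length header is
/// present, parses as a number, and exceeds the cap. An absent or malformed
/// header is accepted here and left to the post-download fallback.
fn check_content_length(header: Option<&str>, max_bytes: u64) -> SizeCheck {
    match header.and_then(|v| v.trim().parse::<u64>().ok()) {
        Some(len) if len > max_bytes => SizeCheck::Reject,
        _ => SizeCheck::Accept,
    }
}

/// Post-download fallback: measure the bytes that were actually received,
/// which covers a missing or misreported Content-Length.
fn check_body_len(body: &[u8], max_bytes: u64) -> SizeCheck {
    if body.len() as u64 > max_bytes {
        SizeCheck::Reject
    } else {
        SizeCheck::Accept
    }
}
```

The early gate only saves work when the server reports a size, so the fallback stays as the authoritative check. With the 512KB text-file cap, a response declaring `Content-Length: 1048576` would be rejected before the body is buffered.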
Added Content-Length early gate to both
No behavior change from the user's perspective.
- GIF filename: use .gif extension when format is GIF (was always .jpg)
- WS path: align token error handling with webhook (if-let-Ok pattern)
- Post parser: explicit 'at' tag arm with comment (mentions via envelope)
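The GIF filename fix in the first item can be pictured as choosing the extension from the output format instead of hard-coding `.jpg`. The sketch below is an assumption about the shape of that logic: `OutputFormat` and `attachment_filename` are hypothetical names, and using the image key as the filename stem is a guess made for illustration.

```rust
/// Hypothetical output formats: GIFs pass through unchanged,
/// everything else is re-encoded as JPEG.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OutputFormat {
    Jpeg,
    Gif,
}

/// Build an attachment filename whose extension matches the actual bytes.
/// Assumption: the filename stem is the Feishu image key.
fn attachment_filename(image_key: &str, format: OutputFormat) -> String {
    let ext = match format {
        OutputFormat::Gif => "gif",
        OutputFormat::Jpeg => "jpg",
    };
    format!("{image_key}.{ext}")
}
```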
LGTM ✅ Well-structured feature addition bringing Feishu to parity with Discord attachment handling. All NITs addressed in

Four-Question Framework Review

1. What problem does this solve?

Feishu users could only send text messages to the bot. Images, text files, and rich-text posts (with pasted images) were silently dropped (

2. How does it solve it?

Architecture: Deferred download via

Key implementation choices:

3. What was considered?

4. Is this the best approach?

Yes. The design mirrors the existing Discord attachment pattern exactly. The

🟢 INFO: Things done well

🟡 NITs: All resolved in bee3fa8
Summary
Adds image and text file attachment support for the Feishu gateway adapter. Images are downloaded, resized, compressed, and forwarded to the AI agent as `ContentBlock::Image`. Text files are downloaded and forwarded as `ContentBlock::Text`. This brings Feishu to feature parity with Discord's attachment handling.

Changes

| File | Change |
| --- | --- |
| `gateway/src/schema.rs` | `Content` gains `attachments: Vec<Attachment>`. New `Attachment` struct with type/filename/mime_type/data/size. Backward compatible via `#[serde(default)]`. |
| `gateway/Cargo.toml` | `image` crate for resize/compress |
| `gateway/src/adapters/feishu.rs` | `resize_and_compress()`: 1200px max, JPEG quality 75. `download_feishu_image()`: resources API + compress + base64. `download_feishu_file()`: text files only (512KB cap). `parse_message_event()` returns `(GatewayEvent, Vec<MediaRef>)`, accepts `text`/`image`/`file`/`post` types. Callers (WS + webhook) do async download after parse. Empty text + empty attachments → event not sent. |
| `gateway/src/main.rs` | `Content.attachments` field |
| `src/gateway.rs` | `attachments` from GatewayEvent. Convert `image` → `ContentBlock::Image`, `text_file` → base64 decode → `ContentBlock::Text` wrapped in code fence. Pass as `extra_blocks` to `handle_message()`. |
| `docs/feishu.md` | |

Design decisions
Gateway-side download: Feishu attachments require `tenant_access_token` (gateway has it, core doesn't). Gateway downloads, compresses, and base64 encodes. Core just decodes. Same principle as Discord/Slack (whoever holds the auth token does the download).

Compress before transmit: `resize_and_compress` (1200px, JPEG 75) reduces typical images from 2-5MB to 200-400KB. Base64 overhead (~33%) is negligible at this size. No WebSocket pressure.

`post` type support: Feishu sends @mention + pasted image as `msg_type: "post"` (rich text). Parser extracts text nodes as prompt and `img` nodes as image attachments. This is the only way to send @mention + image in a group chat.

Text files only for `file` type: Only known text extensions (`.txt`, `.py`, `.rs`, `.md`, `.json`, etc.) are downloaded, capped at 512KB. Binary files (`.pdf`, `.zip`) are silently ignored to avoid sending garbage to the model.

Graceful degradation: If image download fails, text portion is still forwarded. If both text and attachments are empty (e.g. unsupported file type), event is not sent.
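The text-file filter and the "do not send empty events" rule can be sketched as two small predicates. This is a hedged illustration, not the adapter's code: the function names are hypothetical, the extension list is only the subset named in the description (the real whitelist is longer, per the "etc."), and the case-insensitive match and whitespace trimming are assumptions.

```rust
use std::path::Path;

/// 512KB cap for text files, as stated in the description.
const MAX_TEXT_FILE_BYTES: usize = 512 * 1024;

/// Illustrative subset only: the description lists these and then "etc.".
const TEXT_EXTENSIONS: &[&str] = &["txt", "py", "rs", "md", "json"];

/// Hypothetical check: accept a file only if its extension is whitelisted
/// and its size is within the cap.
/// Assumption: extension matching is case-insensitive.
fn is_supported_text_file(filename: &str, size: usize) -> bool {
    let ext_ok = Path::new(filename)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| {
            let lower = e.to_ascii_lowercase();
            TEXT_EXTENSIONS.contains(&lower.as_str())
        })
        .unwrap_or(false);
    ext_ok && size <= MAX_TEXT_FILE_BYTES
}

/// Hypothetical check: forward an event only if it carries some text or at
/// least one attachment that survived download and filtering.
/// Assumption: whitespace-only text counts as empty.
fn should_forward(text: &str, attachment_count: usize) -> bool {
    !text.trim().is_empty() || attachment_count > 0
}
```

Under these assumptions, a message with text and a failed image download still passes `should_forward`, while a lone `.pdf` upload produces no text and no attachments and is dropped.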
Schema backward compatible: `attachments` uses `#[serde(default)]`. Old gateway (no attachments) works with new core. New gateway works with old core (attachments ignored).

Known limitations

`post` message containing both.

Testing

`cargo test` gateway: 96 passed
`cargo test` core: 197 passed

End-to-end tested on Feishu with vision-capable model.
Breaking Changes
None. `attachments` field is additive with `#[serde(default)]`. Existing text-only messages are unaffected.

Prior Art

/im/v1/messages/{id}/resources/{key}
/im/v1/messages/{id}/resources/{key}
/im/v1/messages/{id}/resources/{key}
`post` msg_type parsing

Discord Discussion URL
https://discord.com/channels/1491295327620169908/1500160821567684660