feat(gateway): media proxy colocate mode — filesystem store replaces base64 inline#858
All PRs must reference a prior Discord discussion to ensure community alignment before implementation. Please edit the PR description to include a link like: This PR will be automatically closed in 3 days if the link is not added.
OpenAB PR Screening
This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.

Screening report
Screened PR #858, posted the GitHub comment, and moved the project item to `PR-Screening`. GitHub comment: #858 (comment)

Intent
PR #858 is trying to replace inline base64 attachment transport between Gateway and Core with co-located filesystem transport. The operator-visible problem is that large or non-image media currently bloats WebSocket messages, creates memory pressure, introduces inconsistent adapter behavior, and forces practical size/type limits.

Feat
Feature work. Gateway gains a media store under Current implementation appears partial: it adds

Who It Serves
Primary beneficiaries are agent runtime operators and deployers running co-located Gateway/Core deployments. Secondary beneficiaries are Discord, LINE, Telegram, and Feishu users who need reliable media/file delivery without silent drops or payload limits caused by base64 transport.

Rewritten Prompt
Implement co-located filesystem media transport for OpenAB Gateway attachments. Add a Gateway media store that writes inbound platform media to Update LINE, Telegram, and Feishu inbound media handling so adapters download authenticated media, store it through the shared media store, and emit attachments with Acceptance criteria: a Gateway/Core deployment sharing

Merge Pitch
This should move forward because base64-over-WS is the wrong long-term transport for authenticated platform media. The direction is sound: it reduces bandwidth, avoids large-frame backpressure, allows non-text/non-image files, and matches the co-located deployment model OpenAB already assumes in many gateway setups.
Risk is medium-high until the missing integration lands. The likely reviewer concern is not the schema addition, but incomplete behavior: absolute path trust boundaries, same-HOME assumptions, cleanup races, file lifetime vs Core read timing, and whether separated Gateway/Core deployments degrade safely.

Best-Practice Comparison
OpenClaw applies well here. The PR intentionally mirrors OpenClaw's local media directory pattern, but it should also carry over the reliability details: durable-enough handoff semantics, explicit cleanup policy, clear run/read logs on failures, and delivery routing that does not assume a path is valid outside the local execution boundary.
Hermes Agent only partially applies. Its in-process memory approach is less relevant because OpenAB's agent/Core boundary is external and Gateway media often requires platform-authenticated download. The useful Hermes comparison is the preference for self-contained handoff data and clean lifecycle ownership; OpenAB should define which process owns media cleanup and how Core reports missing/expired files.

Implementation Options
Conservative: keep base64 as the default transport, land only schema fields plus media store behind an opt-in flag, and add tests/docs. This reduces merge risk but does not solve current adapter payload pressure yet.
Balanced: complete co-located path mode for the adapters listed in the PR, make Core prefer
Ambitious: implement both co-located path mode and remote HTTP proxy mode now, with signed/short-lived media URLs, explicit media IDs instead of raw absolute paths, structured cleanup state, and adapter-wide migration. This gives the best architecture for separated deployments but is too large for this PR unless split aggressively.

Comparison Table

Recommendation
Take the balanced path, but do not merge this PR while it is only schema plus Sequencing: merge the complete co-located implementation first, then add separated-deployment proxy support, then remove the base64 decode path after at least one release cycle with telemetry/log evidence that path mode is stable.
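The "absolute path trust boundaries" concern raised in the screening report can be illustrated with a containment check on the reading side. This is only a sketch, not code from the PR: the function name `resolve_inside_root` is hypothetical, and it assumes Core knows the media root directory it is willing to read from.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: accept `candidate` only if it resolves to a location
/// under `media_root`. Both paths are canonicalized first, so `..` segments
/// and symlinks are resolved before the prefix comparison.
pub fn resolve_inside_root(media_root: &Path, candidate: &Path) -> io::Result<PathBuf> {
    let root = media_root.canonicalize()?;
    // Fails with NotFound if the file is missing (for example, already evicted).
    let resolved = candidate.canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "attachment path is outside the media root",
        ))
    }
}
```

A missing file and an out-of-root path surface as different error kinds here, which would let the reader log "expired" separately from "rejected".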
…cate base64 inline

Replace base64-over-WebSocket media transport with local filesystem store. Gateway downloads media from platform APIs and writes to ~/.openab/media/inbound/<uuid>, passing the file path to Core via the WS event. Core reads bytes directly from disk: zero encoding overhead, no WS payload bloat.

Key changes:
- gateway/src/store.rs: file store with 2-min TTL eviction (OpenClaw pattern)
- gateway/src/media.rs: shared image resize/compress + MediaKind enum
- gateway/src/schema.rs: Attachment gains optional 'path' field
- gateway/src/adapters/telegram.rs: inbound photo/voice/audio/document support
- src/gateway.rs: Core reads from path (colocate) with base64 fallback

Security: UUID-only filenames (no path traversal), platform tokens never reach Core, TTL auto-eviction prevents disk exhaustion, colocate trust boundary documented.

Supersedes #757 (base64 inline approach). Closes #690.
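The "path with base64 fallback" behavior on the Core side might look roughly like the following. This is a sketch under assumptions: the `Attachment` struct here is a hypothetical two-field mirror of the schema (only `path` and `data` are named in the PR), `load_attachment` is an invented name, and the small decoder exists only to keep the example free of external crates. It also assumes "fallback" means "use `data` when `path` is absent", which the PR text does not spell out.

```rust
use std::fs;
use std::io;

/// Hypothetical mirror of the gateway attachment: `path` is the new optional
/// field, `data` stands for the legacy inline base64 payload.
pub struct Attachment {
    pub path: Option<String>,
    pub data: Option<String>,
}

/// Minimal standard-alphabet base64 decoder (stops at the first '=').
/// Returns None on a character outside the alphabet.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len() / 4 * 3);
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break,
            b'\r' | b'\n' => continue,
            _ => return None,
        };
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Prefer the on-disk file when `path` is set; otherwise decode inline `data`.
pub fn load_attachment(att: &Attachment) -> io::Result<Vec<u8>> {
    if let Some(path) = &att.path {
        return fs::read(path);
    }
    match &att.data {
        Some(b64) => decode_base64(b64)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "invalid base64 payload")),
        None => Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "attachment has neither path nor data",
        )),
    }
}
```

Whether a failed disk read should also fall through to `data` is a design choice this sketch leaves out; doing so would hide eviction races behind a silent fallback.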
efb9f2a to 4a8e7a3
Prevents future callers from accidentally writing unbounded files. Matches AUDIO_MAX_DOWNLOAD as the largest allowed media type.
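A size guard of this kind could be sketched as below. The names are assumptions: `STORE_MAX_BYTES` is a hypothetical constant (the PR mentions a 20MB cap and `AUDIO_MAX_DOWNLOAD`, but not how the store's own limit is named), the signature of `store_media` is guessed, and `random_name` is a standard-library stand-in for the UUID filenames the PR describes.

```rust
use std::collections::hash_map::RandomState;
use std::fs;
use std::hash::{BuildHasher, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Assumed value: the commits describe a 20MB hard cap.
pub const STORE_MAX_BYTES: usize = 20 * 1024 * 1024;

/// Stand-in for a UUID: two values from the standard library's randomly
/// seeded hasher, rendered as hex. The name never contains caller input,
/// so it cannot carry path separators or `..` segments.
fn random_name() -> String {
    let a = RandomState::new().build_hasher().finish();
    let b = RandomState::new().build_hasher().finish();
    format!("{:016x}{:016x}", a, b)
}

/// Write `bytes` into `dir` under a generated name, refusing oversized input
/// before anything touches the disk.
pub fn store_media(dir: &Path, bytes: &[u8]) -> io::Result<PathBuf> {
    if bytes.len() > STORE_MAX_BYTES {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "media exceeds store cap",
        ));
    }
    fs::create_dir_all(dir)?;
    let path = dir.join(random_name());
    fs::write(&path, bytes)?;
    Ok(path)
}
```

Checking the length inside the store, rather than only in each adapter, means a new adapter that forgets its own download limit still cannot write an unbounded file.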
14daca4 to 7847a0a
LINE adapter:
- Support image and audio message types (same pattern as Telegram)
- Download via LINE Content API, resize images, store to filesystem
- Derive audio extension from content_type (mp3/ogg/m4a)
- Empty event guard

media_store.rs:
- Add 20MB hard cap inside store_media() as defense-in-depth
- Future callers cannot accidentally write unbounded files

Addresses review feedback from 普渡法師 on PR #858.
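Deriving the audio extension from `content_type` might be done with a small lookup like the one below. Only the three target extensions (mp3/ogg/m4a) come from the commit message; the function name, the exact MIME strings matched, and the `bin` fallback are assumptions made for illustration.

```rust
/// Hypothetical mapping from an audio Content-Type to a file extension.
/// Unknown types fall back to "bin" (an assumption, not the PR's behavior).
pub fn audio_extension(content_type: &str) -> &'static str {
    // Drop parameters such as "; codecs=opus" and normalize case.
    let mime = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    match mime.as_str() {
        "audio/mpeg" | "audio/mp3" => "mp3",
        "audio/ogg" | "audio/opus" => "ogg",
        "audio/mp4" | "audio/m4a" | "audio/x-m4a" => "m4a",
        _ => "bin",
    }
}
```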
Ready for review 🙏 Changes since last review:
Both findings from 普渡法師 addressed. Requesting review from @thepagent and @wangyuyan-agent.
14daca4 to cc85846
Feishu, Google Chat, and WeCom adapters now use store::store_media() instead of base64 encoding. All media flows through the same ~/.openab/media/inbound/<uuid> path — consistent across all platforms. No adapter left on base64 inline.
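Because every adapter writes into the same inbound directory, a single sweep can enforce the 2-minute TTL for all platforms. The following is a sketch of such a sweep, not the PR's eviction task: `evict_expired` and `MEDIA_TTL` are hypothetical names, and using file modification time as the age source is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// The 2-minute TTL is taken from the PR description.
pub const MEDIA_TTL: Duration = Duration::from_secs(120);

/// Remove regular files in `dir` whose modification time is more than `ttl`
/// before `now`. Returns the number of files removed.
pub fn evict_expired(dir: &Path, ttl: Duration, now: SystemTime) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if !meta.is_file() {
            continue;
        }
        // duration_since fails when mtime is in the future; treat that as age zero.
        let age = now
            .duration_since(meta.modified()?)
            .unwrap_or(Duration::ZERO);
        if age > ttl {
            match fs::remove_file(entry.path()) {
                Ok(()) => removed += 1,
                // Another sweep may have removed the file already.
                Err(e) if e.kind() == io::ErrorKind::NotFound => {}
                Err(e) => return Err(e),
            }
        }
    }
    Ok(removed)
}
```

Passing `now` as a parameter keeps the function testable without sleeping; a periodic task would call it with `SystemTime::now()`.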
cc85846 to 74f3d94
Replace base64 references with filesystem store description. Add Telegram inbound media section (images, documents, audio/voice).
ec16ae8 to 8f1ccec
…ence

Covers architecture, platform support matrix, processing pipeline, size limits, storage security, and future HTTP proxy roadmap.
47e2afb to aadf507
…pressure

With colocate mode, files go to disk, not into the WS payload. The 512KB limit was a base64-era constraint. The limit is now unified at 20MB (same as the store cap). Core decides how much to read/truncate.
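One way the reading side could "decide how much to read" is a bounded read that also reports whether the file was cut short. This is an illustrative sketch only; `read_capped` is a hypothetical name and the PR does not say how Core applies its own limit.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Read at most `limit` bytes from `path`. The boolean is true when the file
/// on disk is larger than `limit`, i.e. the returned bytes are a prefix.
pub fn read_capped(path: &Path, limit: u64) -> io::Result<(Vec<u8>, bool)> {
    let file = File::open(path)?;
    let total = file.metadata()?.len();
    let mut buf = Vec::new();
    // `take` stops the reader after `limit` bytes even if more are available.
    file.take(limit).read_to_end(&mut buf)?;
    Ok((buf, total > limit))
}
```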
df7ab3d to 81ef91a
Gateway needs write access to ~/.openab/media/inbound/ for media proxy colocate mode (PR #858). Both core and gateway now share the PVC.
* feat: add openab-telegram chart (colocated OAB + gateway + cloudflared)

Single-pod Helm chart for Telegram deployments:
- OAB agent, gateway, and cloudflared tunnel as colocated containers
- Shared emptyDir for /tmp, PVC for agent persistence
- Only 2 required --set flags: telegramBotToken, cloudflareTunnelToken
- Follows the reference architecture from docs/refarch/telegram-cloudflare-tunnel.md

Closes #872

* feat(openab-telegram): add release channel (beta/stable) support

- channel: stable (default) strips -beta.* from appVersion for both images
- channel: beta uses appVersion as-is for core, strips prerelease for gateway (gateway has no beta tags)
- Explicit image.tag / gateway.tag override still takes precedence

* fix(openab-telegram): pin gateway to v0.5.0, simplify helper

Gateway has independent release cadence from core, so there is no appVersion derivation. Just use the pinned tag directly.

* feat(openab-telegram): add existingSecret support + credential management README

- existingSecret: reference a pre-created K8s Secret (skips chart Secret creation)
- README documents 3 credential options: --set, --from-literal, --from-env-file
- Secrets from external managers (AWS SM) can flow to K8s without touching disk

* fix(openab-telegram): address review findings

- Pin cloudflared to 2026.5.0 (was 'latest')
- Change agent.command default to 'openab' (generic, not kiro-specific)
- Fix NOTES.txt webhook curl to respect existingSecret

* fix(openab-telegram): mount shared PVC in gateway container

Gateway needs write access to ~/.openab/media/inbound/ for media proxy colocate mode (PR #858). Both core and gateway now share the PVC.

* docs(openab-telegram): add ASCII architecture diagram to README

* docs(openab-telegram): add Prerequisites section with CLI-only tunnel setup

* docs(openab-telegram): make README fully headless/CLI-only

- Cloudflare tunnel setup via API token (no browser)
- Ingress config via local config.yml
- Webhook setup moved to Prerequisites (before helm install)
- Post-install only has agent auth (device flow)
- Fixed agent command to 'openab'

* chore: bump gateway tag to v0.5.1

* refactor: use floating channel tags for agent image

Instead of regex-stripping beta suffix from appVersion, resolve image tag directly from channel value (stable/beta). Requires PR #878 to publish the floating tags.

* chore: update appVersion to 0.8.3, fix channel comments

* fix: retain PVC on helm uninstall

Agent auth credentials and state live in the PVC. Without this, uninstall+reinstall requires re-authentication.

* docs: add tunnel ingress config step to NOTES.txt

* fix: default agent command to kiro-cli acp

* docs: rewrite NOTES.txt as structured AI-friendly post-install guide

* feat: support cloudflare-api-token for automated ingress config

Optional third key in the K8s Secret enables AI agents to configure tunnel ingress via the Cloudflare API without external credentials. NOTES.txt extracts all needed values from the secret itself.

* docs: add remote-mode ingress config and AI-assisted install prompt

---------

Co-authored-by: chaodu-agent <chaodu-agent@users.noreply.github.com>
Co-authored-by: Pahud Hsieh <pahud@Pahuds-MacBook-Neo.local>
What problem does this solve?
The gateway currently uses base64-over-WebSocket for media transport. This has fundamental limitations:
How does it solve it?
Media Proxy (Colocate Mode): Gateway downloads media and writes to ~/.openab/media/inbound/<uuid>. The file path is passed to Core via the WS event. Core reads bytes directly from disk.
Architecture
Why colocate mode?
Gateway runs as a sidecar in the same pod as Core, so they share $HOME. Simplest, fastest path: no HTTP proxy, no shared PVC config, just filesystem I/O.
Prior Art
~/.openclaw/media/inbound/<uuid> with 2-min TTL, the same pattern we adopt
Security Considerations
Documented in gateway/src/store.rs comments:
store_media() (defense-in-depth)
Platform Support Matrix
Implementation
- gateway/src/store.rs
- gateway/src/media.rs: MediaKind enum (new)
- gateway/src/schema.rs: Attachment.path: Option<String> field
- gateway/src/adapters/telegram.rs
- gateway/src/adapters/feishu.rs
- gateway/src/adapters/googlechat.rs
- gateway/src/adapters/wecom.rs
- gateway/src/main.rs: reqwest::Client, eviction task spawn
- src/gateway.rs: path (preferred) with base64 data fallback
- docs/inbound-attachments.md
- docs/telegram.md
- docs/feishu.md
- docs/google-chat.md
Supersedes / Closes
Future Roadmap
~/.openab/media/outbound/ for agent → user file sends
Test Plan
- cargo check: gateway + core pass
- cargo test: 170 gateway tests pass (store, media, adapters)
Thread: 1506327876427845678