Skip to content

feat(wecom): support inbound and outbound file attachments#190

Open
XF-FS wants to merge 2 commits intoAgentFlocks:mainfrom
XF-FS:feat/wecom-inbound-file-support-clean
Open

feat(wecom): support inbound and outbound file attachments#190
XF-FS wants to merge 2 commits intoAgentFlocks:mainfrom
XF-FS:feat/wecom-inbound-file-support-clean

Conversation

@XF-FS
Copy link
Copy Markdown

@XF-FS XF-FS commented Apr 25, 2026

Summary

  • add inbound and outbound file attachment support for the WeCom channel
  • download and decrypt inbound WeCom media using the SDK aeskey
  • upload and send outbound WeCom media with follow-up text handling
  • improve mixed message parsing and attachment-related test coverage

Details

This PR upgrades the WeCom channel from placeholder-only file handling to a more complete attachment flow covering both inbound and outbound scenarios.

Inbound changes

  • update flocks/channel/builtin/wecom/channel.py

    • preserve file metadata from inbound file and mixed messages
    • extract file media URLs from mixed message items
  • update flocks/channel/builtin/wecom/inbound_media.py

    • download inbound WeCom media with bounded size checks
    • decrypt encrypted media using the message aeskey
    • handle mixed message nested aeskey
    • improve filename extraction from Content-Disposition
    • classify failure logs for decrypt / too-large / download failures
  • update flocks/channel/inbound/dispatcher.py

    • publish SSE updates for both FilePart and the rewritten text part
    • replace placeholder file text with attached local file paths

Outbound changes

  • add flocks/channel/builtin/wecom/media.py

    • prepare local media payloads for WeCom upload
    • infer media type and filename from outbound media paths
  • update flocks/channel/builtin/wecom/channel.py

    • add send_media() support using WeCom upload + send/reply media APIs
    • handle media send followed by companion text when text is provided

Tests

  • expand tests/channel/test_wecom.py

    • outbound media upload / reply / missing media id / exception / text+media cases
    • mixed file parsing
    • content-disposition filename parsing
    • inbound decrypt failure / size limit / mixed nested aeskey
  • update tests/channel/test_channel.py

    • align channel-level expectations for WeCom media capability

Why

Previously, WeCom file messages were reduced to placeholder text and outbound file sending was not supported.

This change makes WeCom attachments usable in both directions:

  • inbound files become decrypted local attachments available to the session pipeline
  • outbound files can be uploaded and sent through the WeCom channel

@duguwanglong
Copy link
Copy Markdown
Contributor

Below is the code review for AgentFlocks/flocks#190 — feat(wecom): support inbound and outbound file attachments. Overall direction is good to merge, but the inline "placeholder text rewriting" inside dispatcher.py has several robustness and consistency issues that should be fixed first.

Summary

Aspect Verdict
Direction Good. Upgrades the WeCom [文件消息]/[图片消息] placeholders into actual local attachments and adds the missing send_media, mirroring the existing Feishu shape.
Compatibility Doesn't break the Feishu path; deployments without the SDK degrade gracefully (warning only, not fatal).
Risk Medium. file:// URL handling and the placeholder-rewrite rule in dispatcher.py are flawed; mixed-message coverage is incomplete; no obvious security issues.
Tests Reasonably comprehensive — decrypt failure, size limit, mixed nested aeskey, .. filename traversal are all covered.
Recommendation Request changes (fix the blocking items, then merge).

Must fix (Blocking)

1. media.url.replace("file://", "") is incorrect

flocks/channel/inbound/dispatcher.py (~L1011-L1016):

from pathlib import PurePosixPath
file_path_str = media.url.replace("file://", "")
try:
    display_path = str(PurePosixPath(file_path_str))
except Exception:
    display_path = file_path_str

download_inbound_media produces the URL via Path.resolve().as_uri(), which percent-encodes the path. For example:

  • A Chinese filename or one containing spaces → file:///Users/x/.flocks/.../wx_1_%E6%8A%A5%E5%91%8A.pdf
  • After replace("file://", "")/Users/x/.flocks/.../wx_1_%E6%8A%A5%E5%91%8A.pdf
  • Any agent that later open()s that path will fail.

Also the file:// scheme actually uses three slashes (file:///), so after replace we get a leading / that happens to look right on macOS/Linux but on Windows file:///C:/x becomes /C:/x, which is wrong.

Suggested fix:

from urllib.parse import urlparse, unquote
file_path_str = unquote(urlparse(media.url).path)
display_path = file_path_str

The PurePosixPath block adds nothing here (it just does str() on the input) and should be removed.

2. The placeholder rewrite never matches mixed messages

The matching condition in dispatcher.py:

p.text in ("[文件消息]", "[图片消息]")
or p.text.startswith("[文件消息:")

But _extract_mixed produces text such as:

  • "看附件 [文件: report.bin]"
  • "[图片] 看图"
  • "[图片]"

None of these match. This is exactly the mixed scenario the PR specifically extends (the commit message even highlights mixed nested aeskey), yet the [文件: …] / [图片] placeholders will be passed verbatim to the agent and the Attached files:\n- /... rewrite never fires — inconsistent with the pure file/image case, and inconsistent with what the PR description claims.

Suggested fix — match a known set of placeholder substrings rather than equality:

PLACEHOLDERS = ("[文件消息]", "[图片消息]", "[图片]", "[文件]")
def _has_placeholder(text: str) -> bool:
    return (
        any(p in text for p in PLACEHOLDERS)
        or text.startswith("[文件消息:")
        or "[文件: " in text
    )

And lift the rewrite logic out of the inline try block into a module-level helper (see #3).

3. The dispatcher diff swallows errors and should be refactored

The new ~70 lines contain three layers of try / except: pass:

try:
    from flocks.session.message import TextPart
    parts = await Message.parts(...)
    for p in parts:
        if ...:
            ...
            try:
                ...publish_event...
            except Exception:
                pass
            break
except Exception:
    pass

try:
    ...publish_event(file_part)...
except Exception:
    pass

Problems:

  • Errors are silently dropped. If Message.parts or publish_event fails, there is zero trace in production. At minimum each except should log.warning(...).
  • publish_event is imported twice in adjacent blocks.
  • Hot-path readability. _append_user_message is now ~90 lines. Extracting an _attach_inbound_media(...) helper would help.
  • Implicit cross-channel behavior. Previously this code was guarded by if msg.channel_id != "feishu". The new if not msg.media_url ... check means every channel runs the rewrite logic. Today it's harmless because Feishu doesn't produce these placeholder strings, but future channels could match by accident. Prefer an explicit if msg.channel_id == "wecom" guard around the rewrite, or have each channel return a list of placeholders to replace.

4. In send_media, a text failure marks the media send as failed

sent = await self._ws_client.send_media_message(...)

if ctx.text:
    await self.send_text(OutboundContext(**{**vars(ctx), "media_url": None}))

self.record_message()
return DeliveryResult(...)

Both calls are inside the same try. If the companion text raises (e.g. rate limited), send_media returns success=False even though the media was already delivered. Worse, _send_with_retry upstream will then retry, and the same media will be sent again.

Suggested fix:

sent = await ...media...
if ctx.text:
    try:
        await self.send_text(OutboundContext(**{**vars(ctx), "media_url": None}))
    except Exception as e:
        log.warning("wecom.send_media.companion_text_failed", {"error": str(e)})
return DeliveryResult(... success=True ...)

Side note: OutboundContext is a @dataclass, so {**vars(ctx), "media_url": None} works, but dataclasses.replace(ctx, media_url=None) is clearer and won't break if a frozen flag is added later.


Should fix (Strongly recommended)

5. _filename_from_content_disposition doesn't handle filename*=UTF-8''…

re.search(r'filename\*?=(?:UTF-8\'\')?"?([^";]+)"?', value, re.I)
  • It prefers plain filename and ignores RFC 5987 filename*= (the encoded UTF-8 form).
  • It can't handle quoted values containing semicolons such as filename="my; weird.bin".

WeCom itself probably won't emit filename*=, but proxies often rewrite this header. Suggested rewrite:

def _filename_from_content_disposition(value: str) -> Optional[str]:
    # RFC 5987 first
    m = re.search(r"filename\*\s*=\s*(?:UTF-8|utf-8)''([^;]+)", value)
    if m:
        return unquote(m.group(1).strip())
    m = re.search(r'filename\s*=\s*"([^"]+)"', value)
    if m:
        return m.group(1)
    m = re.search(r'filename\s*=\s*([^;]+)', value)
    if m:
        return m.group(1).strip()
    return None

6. _download_file_limited distinguishes ValueError by string match — fragile

try:
    if int(content_length) > max_bytes:
        raise _max_size_error(max_bytes)
except ValueError as e:
    if "invalid literal" not in str(e):
        raise

WeComInboundMediaTooLarge is itself a ValueError subclass; using a substring like "invalid literal" to disambiguate is an anti-pattern. Suggested fix:

try:
    advertised = int(content_length)
except (TypeError, ValueError):
    advertised = None
if advertised is not None and advertised > max_bytes:
    raise _max_size_error(max_bytes)

7. media.py _path_from_media_url mis-detects Windows paths

parsed = urlparse(media_url)
if parsed.scheme == "file": ...
if parsed.scheme: return None
return Path(media_url)

A Windows absolute path C:\foo\bar.pdf parses as scheme='c', so it gets rejected as "unsupported scheme."

Suggested fix: detect a local path first (e.g. os.path.isabs(media_url) or Path(media_url).drive), and only fall back to urlparse if that fails.

8. _fetch_http_file_limited doesn't guard a non-numeric Content-Length

content_length = resp.headers.get("content-length")
if content_length and int(content_length) > max_bytes:
    raise ValueError(...)

Misbehaving proxies sometimes send non-numeric values; wrap the int() call in try/except (TypeError, ValueError).

9. _sanitize_filename is too aggressive on Chinese filenames

re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())

This collapses entire Chinese filenames into underscores, which is bad UX for enterprise users — the on-disk filename loses all signal about its origin. Feishu has the same long-standing issue, but since this PR significantly expands the amount of media being persisted to disk for WeCom, consider relaxing to \w or preserving Unicode (only stripping path separators /\, control chars, and ..).

Doesn't have to ship in this PR, but worth a follow-up issue.

10. Event payload should include messageID

The new FilePart reuses the message.part.updated event, which currently semantically overlaps "create" and "update". That's a pre-existing convention I won't push back on here. However, the published payload only carries sessionID and part. Adding messageID makes life much easier for the front-end:

await publish_event("message.part.updated", {
    "part": part_event,
    "sessionID": session_id,
    "messageID": message.id,
})

Nice to have (Optional)

  • _extract_sent_message_id only inspects body.msgid / body.message_id. Some WeCom SDK variants use body.msg_id; consider adding it for resilience.
  • prepare_wecom_media does not constrain file:// URLs, so theoretically a constructed file:///etc/passwd could be uploaded. In the current architecture media_url comes from agent output, so this is not external input — still, an allow-list/prefix check would harden it.
  • download_inbound_media writes to ~/.flocks/data/channel_media/wecom/<acct>/<YYYY-MM-DD>/ with no retention policy. Feishu has the same gap. A global cleanup job would be nice eventually.
  • send_media's vars(ctx) reconstructiondataclasses.replace(ctx, media_url=None) is clearer and resists future field additions.
  • _close_api_client silently skips when the SDK doesn't expose _client.aclose; a one-time log.debug would help debugging.

Tests

  • Outbound covers upload / reply / missing media-id / exception / text+media — five branches, good.
  • Inbound covers streaming + Content-Disposition / Content-Length rejection / decrypt failure / mixed nested aeskey — four scenarios, good.
  • Missing: in tests/channel/test_channel.py the dispatcher rewrite is asserted for the pure [文件消息: report.pdf] case, but there is no assertion for the mixed scenario (either "should be rewritten" or "should not match"). Please add one after fixing issue Fix: Change license #2.
  • Missing: a regression test ensuring success=True when media is sent successfully but the companion text fails (after fixing issue perf(webui): fix streaming first-part display lag and optimize render… #4).

Bottom line

Once the four blocking items are addressed, this is good to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants