feat(wecom): support inbound and outbound file attachments by XF-FS · Pull Request #190 · AgentFlocks/flocks

XF-FS · 2026-04-25T04:02:03Z

Summary

add inbound and outbound file attachment support for the WeCom channel
download and decrypt inbound WeCom media using the SDK aeskey
upload and send outbound WeCom media with follow-up text handling
improve mixed message parsing and attachment-related test coverage

Details

This PR upgrades the WeCom channel from placeholder-only file handling to a more complete attachment flow covering both inbound and outbound scenarios.

Inbound changes

update flocks/channel/builtin/wecom/channel.py
- preserve file metadata from inbound file and mixed messages
- extract file media URLs from mixed message items
update flocks/channel/builtin/wecom/inbound_media.py
- download inbound WeCom media with bounded size checks
- decrypt encrypted media using the message aeskey
- handle mixed message nested aeskey
- improve filename extraction from Content-Disposition
- classify failure logs for decrypt / too-large / download failures
update flocks/channel/inbound/dispatcher.py
- publish SSE updates for both FilePart and the rewritten text part
- replace placeholder file text with attached local file paths

Outbound changes

add flocks/channel/builtin/wecom/media.py
- prepare local media payloads for WeCom upload
- infer media type and filename from outbound media paths
update flocks/channel/builtin/wecom/channel.py
- add send_media() support using WeCom upload + send/reply media APIs
- handle media send followed by companion text when text is provided

Tests

expand tests/channel/test_wecom.py
- outbound media upload / reply / missing media id / exception / text+media cases
- mixed file parsing
- content-disposition filename parsing
- inbound decrypt failure / size limit / mixed nested aeskey
update tests/channel/test_channel.py
- align channel-level expectations for WeCom media capability

Why

Previously, WeCom file messages were reduced to placeholder text and outbound file sending was not supported.

This change makes WeCom attachments usable in both directions:

inbound files become decrypted local attachments available to the session pipeline
outbound files can be uploaded and sent through the WeCom channel

duguwanglong · 2026-04-27T10:51:51Z

Below is the code review for AgentFlocks/flocks#190 — feat(wecom): support inbound and outbound file attachments. Overall direction is good to merge, but the inline "placeholder text rewriting" inside dispatcher.py has several robustness and consistency issues that should be fixed first.

Summary

Aspect	Verdict
Direction	Good. Upgrades the WeCom `[文件消息]/[图片消息]` placeholders into actual local attachments and adds the missing `send_media`, mirroring the existing Feishu shape.
Compatibility	Doesn't break the Feishu path; deployments without the SDK degrade gracefully (warning only, not fatal).
Risk	Medium. `file://` URL handling and the placeholder-rewrite rule in `dispatcher.py` are flawed; mixed-message coverage is incomplete; no obvious security issues.
Tests	Reasonably comprehensive — decrypt failure, size limit, mixed nested `aeskey`, `..` filename traversal are all covered.
Recommendation	Request changes (fix the blocking items, then merge).

Must fix (Blocking)

1. `media.url.replace("file://", "")` is incorrect

flocks/channel/inbound/dispatcher.py (~L1011-L1016):

from pathlib import PurePosixPath
file_path_str = media.url.replace("file://", "")
try:
    display_path = str(PurePosixPath(file_path_str))
except Exception:
    display_path = file_path_str

download_inbound_media produces the URL via Path.resolve().as_uri(), which percent-encodes the path. For example:

A Chinese filename or one containing spaces → file:///Users/x/.flocks/.../wx_1_%E6%8A%A5%E5%91%8A.pdf
After replace("file://", "") → /Users/x/.flocks/.../wx_1_%E6%8A%A5%E5%91%8A.pdf
Any agent that later open()s that path will fail.

Also the file:// scheme actually uses three slashes (file:///), so after replace we get a leading / that happens to look right on macOS/Linux but on Windows file:///C:/x becomes /C:/x, which is wrong.

Suggested fix:

from urllib.parse import urlparse, unquote
file_path_str = unquote(urlparse(media.url).path)
display_path = file_path_str

The PurePosixPath block adds nothing here (it just does str() on the input) and should be removed.

2. The placeholder rewrite never matches mixed messages

The matching condition in dispatcher.py:

p.text in ("[文件消息]", "[图片消息]")
or p.text.startswith("[文件消息:")

But _extract_mixed produces text such as:

"看附件 [文件: report.bin]"
"[图片] 看图"
"[图片]"

None of these match. This is exactly the mixed scenario the PR specifically extends (the commit message even highlights mixed nested aeskey), yet the [文件: …] / [图片] placeholders will be passed verbatim to the agent and the Attached files:\n- /... rewrite never fires — inconsistent with the pure file/image case, and inconsistent with what the PR description claims.

Suggested fix — match a known set of placeholder substrings rather than equality:

PLACEHOLDERS = ("[文件消息]", "[图片消息]", "[图片]", "[文件]")
def _has_placeholder(text: str) -> bool:
    return (
        any(p in text for p in PLACEHOLDERS)
        or text.startswith("[文件消息:")
        or "[文件: " in text
    )

And lift the rewrite logic out of the inline try block into a module-level helper (see #3).

3. The dispatcher diff swallows errors and should be refactored

The new ~70 lines contain three layers of try / except: pass:

try:
    from flocks.session.message import TextPart
    parts = await Message.parts(...)
    for p in parts:
        if ...:
            ...
            try:
                ...publish_event...
            except Exception:
                pass
            break
except Exception:
    pass

try:
    ...publish_event(file_part)...
except Exception:
    pass

Problems:

Errors are silently dropped. If Message.parts or publish_event fails, there is zero trace in production. At minimum each except should log.warning(...).
publish_event is imported twice in adjacent blocks.
Hot-path readability. _append_user_message is now ~90 lines. Extracting an _attach_inbound_media(...) helper would help.
Implicit cross-channel behavior. Previously this code was guarded by if msg.channel_id != "feishu". The new if not msg.media_url ... check means every channel runs the rewrite logic. Today it's harmless because Feishu doesn't produce these placeholder strings, but future channels could match by accident. Prefer an explicit if msg.channel_id == "wecom" guard around the rewrite, or have each channel return a list of placeholders to replace.

4. In `send_media`, a text failure marks the media send as failed

sent = await self._ws_client.send_media_message(...)

if ctx.text:
    await self.send_text(OutboundContext(**{**vars(ctx), "media_url": None}))

self.record_message()
return DeliveryResult(...)

Both calls are inside the same try. If the companion text raises (e.g. rate limited), send_media returns success=False even though the media was already delivered. Worse, _send_with_retry upstream will then retry, and the same media will be sent again.

Suggested fix:

sent = await ...media...
if ctx.text:
    try:
        await self.send_text(OutboundContext(**{**vars(ctx), "media_url": None}))
    except Exception as e:
        log.warning("wecom.send_media.companion_text_failed", {"error": str(e)})
return DeliveryResult(... success=True ...)

Side note: OutboundContext is a @dataclass, so {**vars(ctx), "media_url": None} works, but dataclasses.replace(ctx, media_url=None) is clearer and won't break if a frozen flag is added later.

Should fix (Strongly recommended)

5. `_filename_from_content_disposition` doesn't handle `filename*=UTF-8''…`

re.search(r'filename\*?=(?:UTF-8\'\')?"?([^";]+)"?', value, re.I)

It prefers plain filename and ignores RFC 5987 filename*= (the encoded UTF-8 form).
It can't handle quoted values containing semicolons such as filename="my; weird.bin".

WeCom itself probably won't emit filename*=, but proxies often rewrite this header. Suggested rewrite:

def _filename_from_content_disposition(value: str) -> Optional[str]:
    # RFC 5987 first
    m = re.search(r"filename\*\s*=\s*(?:UTF-8|utf-8)''([^;]+)", value)
    if m:
        return unquote(m.group(1).strip())
    m = re.search(r'filename\s*=\s*"([^"]+)"', value)
    if m:
        return m.group(1)
    m = re.search(r'filename\s*=\s*([^;]+)', value)
    if m:
        return m.group(1).strip()
    return None

6. `_download_file_limited` distinguishes `ValueError` by string match — fragile

try:
    if int(content_length) > max_bytes:
        raise _max_size_error(max_bytes)
except ValueError as e:
    if "invalid literal" not in str(e):
        raise

WeComInboundMediaTooLarge is itself a ValueError subclass; using a substring like "invalid literal" to disambiguate is an anti-pattern. Suggested fix:

try:
    advertised = int(content_length)
except (TypeError, ValueError):
    advertised = None
if advertised is not None and advertised > max_bytes:
    raise _max_size_error(max_bytes)

7. `media.py` `_path_from_media_url` mis-detects Windows paths

parsed = urlparse(media_url)
if parsed.scheme == "file": ...
if parsed.scheme: return None
return Path(media_url)

A Windows absolute path C:\foo\bar.pdf parses as scheme='c', so it gets rejected as "unsupported scheme."

Suggested fix: detect a local path first (e.g. os.path.isabs(media_url) or Path(media_url).drive), and only fall back to urlparse if that fails.

8. `_fetch_http_file_limited` doesn't guard a non-numeric `Content-Length`

content_length = resp.headers.get("content-length")
if content_length and int(content_length) > max_bytes:
    raise ValueError(...)

Misbehaving proxies sometimes send non-numeric values; wrap the int() call in try/except (TypeError, ValueError).

9. `_sanitize_filename` is too aggressive on Chinese filenames

re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())

This collapses entire Chinese filenames into underscores, which is bad UX for enterprise users — the on-disk filename loses all signal about its origin. Feishu has the same long-standing issue, but since this PR significantly expands the amount of media being persisted to disk for WeCom, consider relaxing to \w or preserving Unicode (only stripping path separators /\, control chars, and ..).

Doesn't have to ship in this PR, but worth a follow-up issue.

10. Event payload should include `messageID`

The new FilePart reuses the message.part.updated event, which currently semantically overlaps "create" and "update". That's a pre-existing convention I won't push back on here. However, the published payload only carries sessionID and part. Adding messageID makes life much easier for the front-end:

await publish_event("message.part.updated", {
    "part": part_event,
    "sessionID": session_id,
    "messageID": message.id,
})

Nice to have (Optional)

_extract_sent_message_id only inspects body.msgid / body.message_id. Some WeCom SDK variants use body.msg_id; consider adding it for resilience.
prepare_wecom_media does not constrain file:// URLs, so theoretically a constructed file:///etc/passwd could be uploaded. In the current architecture media_url comes from agent output, so this is not external input — still, an allow-list/prefix check would harden it.
download_inbound_media writes to ~/.flocks/data/channel_media/wecom/<acct>/<YYYY-MM-DD>/ with no retention policy. Feishu has the same gap. A global cleanup job would be nice eventually.
send_media's vars(ctx) reconstruction → dataclasses.replace(ctx, media_url=None) is clearer and resists future field additions.
_close_api_client silently skips when the SDK doesn't expose _client.aclose; a one-time log.debug would help debugging.

Tests

Outbound covers upload / reply / missing media-id / exception / text+media — five branches, good.
Inbound covers streaming + Content-Disposition / Content-Length rejection / decrypt failure / mixed nested aeskey — four scenarios, good.
Missing: in tests/channel/test_channel.py the dispatcher rewrite is asserted for the pure [文件消息: report.pdf] case, but there is no assertion for the mixed scenario (either "should be rewritten" or "should not match"). Please add one after fixing issue Fix: Change license #2.
Missing: a regression test ensuring success=True when media is sent successfully but the companion text fails (after fixing issue perf(webui): fix streaming first-part display lag and optimize render… #4).

Bottom line

Direction matches the existing Feishu module — good.
Must fix: file URI parsing (Update issue templates #1), mixed placeholder rewrite coverage (Fix: Change license #2), silent except: pass in dispatcher (feat update readme #3), media-vs-text failure semantics in send_media (perf(webui): fix streaming first-part display lag and optimize render… #4).
Should fix: Content-Disposition / Content-Length parsing robustness (modified readme #5, Docs/readme web preview community #6, fix(service): harden daemon lifecycle management #8), Windows path handling (fix(windows): suppress extra terminal windows when starting services #7), messageID in event payload (fix(dingtalk): resolve runner.ts crash loop and improve onboarding UX #10).
No blocking security concerns.
Good test coverage; please add the two cases listed above.

Once the four blocking items are addressed, this is good to merge.

xff added 2 commits April 25, 2026 10:55

feat(wecom): support inbound file attachments

74567d9

feat(wecom): support inbound and outbound file attachments

4a643dc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(wecom): support inbound and outbound file attachments#190

feat(wecom): support inbound and outbound file attachments#190
XF-FS wants to merge 2 commits intoAgentFlocks:mainfrom
XF-FS:feat/wecom-inbound-file-support-clean

XF-FS commented Apr 25, 2026

Uh oh!

duguwanglong commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

XF-FS commented Apr 25, 2026

Summary

Details

Inbound changes

Outbound changes

Tests

Why

Uh oh!

duguwanglong commented Apr 27, 2026

Summary

Must fix (Blocking)

1. media.url.replace("file://", "") is incorrect

2. The placeholder rewrite never matches mixed messages

3. The dispatcher diff swallows errors and should be refactored

4. In send_media, a text failure marks the media send as failed

Should fix (Strongly recommended)

5. _filename_from_content_disposition doesn't handle filename*=UTF-8''…

6. _download_file_limited distinguishes ValueError by string match — fragile

7. media.py _path_from_media_url mis-detects Windows paths

8. _fetch_http_file_limited doesn't guard a non-numeric Content-Length

9. _sanitize_filename is too aggressive on Chinese filenames

10. Event payload should include messageID

Nice to have (Optional)

Tests

Bottom line

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `media.url.replace("file://", "")` is incorrect

4. In `send_media`, a text failure marks the media send as failed

5. `_filename_from_content_disposition` doesn't handle `filename*=UTF-8''…`

6. `_download_file_limited` distinguishes `ValueError` by string match — fragile

7. `media.py` `_path_from_media_url` mis-detects Windows paths

8. `_fetch_http_file_limited` doesn't guard a non-numeric `Content-Length`

9. `_sanitize_filename` is too aggressive on Chinese filenames

10. Event payload should include `messageID`