feat: add deduplication for WeChat kefu text messages within 15 seconds by Soulter · Pull Request #7788 · AstrBotDevs/AstrBot

Soulter · 2026-04-25T08:25:55Z

Modifications / 改动点

This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能，已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试，并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
/ 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到 requirements.txt 和 pyproject.toml 文件相应位置。
😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。

Summary by Sourcery

New Features:

Ignore duplicate WeChat kefu text messages received within 15 seconds for the same session to prevent repeated handling.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The _is_duplicate_wechat_kf_text_message method does a full scan of _wechat_kf_seen_text_messages on every text message to evict expired keys; if this dict can grow large in production, consider a lighter-weight eviction strategy (e.g., periodic cleanup, capped size, or using an ordered structure).
Right now deduplication is based on session_id and text.strip() only; if messages differing only in trivial ways (e.g., whitespace changes or common punctuation variants) should also be treated as duplicates, consider centralizing a stronger normalization function for the text content before computing the dedup key.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `_is_duplicate_wechat_kf_text_message` method does a full scan of `_wechat_kf_seen_text_messages` on every text message to evict expired keys; if this dict can grow large in production, consider a lighter-weight eviction strategy (e.g., periodic cleanup, capped size, or using an ordered structure).
- Right now deduplication is based on `session_id` and `text.strip()` only; if messages differing only in trivial ways (e.g., whitespace changes or common punctuation variants) should also be treated as duplicates, consider centralizing a stronger normalization function for the text content before computing the dedup key.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/wecom/wecom_adapter.py" line_range="420-421" />
<code_context>
         if msgtype == "text":
             text = msg.get("text", {}).get("content", "").strip()
+            if self._is_duplicate_wechat_kf_text_message(abm.session_id, text):
+                logger.debug(
+                    "忽略 15 秒内重复微信客服文本消息 session_id=%s text=%s",
+                    abm.session_id,
+                    text,
</code_context>
<issue_to_address>
**suggestion:** Avoid hard-coding the TTL value in the log message.

The TTL is defined as `WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS = 15`, but the log message hard-codes `15 秒`. If the TTL changes, the log will be inaccurate.

Please format the message using the constant (e.g., `%d` or an f-string) so the log always reflects the configured TTL.

Suggested implementation:

```python
            if self._is_duplicate_wechat_kf_text_message(abm.session_id, text):
                logger.debug(
                    "忽略 %d 秒内重复微信客服文本消息 session_id=%s text=%s",
                    WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS,
                    abm.session_id,
                    text,
                )

```

If `WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS` is not defined in this module, make sure it is either:
1. Imported at the top of `wecom_adapter.py` from the module where it is defined, or
2. Defined in this module if that is where it logically belongs.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-04-25T08:26:58Z

+                logger.debug(
+                    "忽略 15 秒内重复微信客服文本消息 session_id=%s text=%s",


suggestion: Avoid hard-coding the TTL value in the log message.

The TTL is defined as WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS = 15, but the log message hard-codes 15 秒. If the TTL changes, the log will be inaccurate.

Please format the message using the constant (e.g., %d or an f-string) so the log always reflects the configured TTL.

Suggested implementation:

if self._is_duplicate_wechat_kf_text_message(abm.session_id, text): logger.debug( "忽略 %d 秒内重复微信客服文本消息 session_id=%s text=%s", WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS, abm.session_id, text, )

If WECHAT_KF_TEXT_CONTENT_DEDUP_TTL_SECONDS is not defined in this module, make sure it is either:

Imported at the top of wecom_adapter.py from the module where it is defined, or

Defined in this module if that is where it logically belongs.

gemini-code-assist

Code Review

This pull request implements a deduplication mechanism for WeChat KF text messages using a 15-second TTL to prevent redundant processing. Feedback includes suggestions to use a tuple for the deduplication key to avoid collisions, updating the type hint accordingly, and addressing potential performance issues in the cleanup logic which currently iterates over the entire cache for every message. Additionally, the reviewer requested unit tests to verify the new functionality.

gemini-code-assist · 2026-04-25T08:28:33Z


        self.server = WecomServer(self._event_queue, self.config)
        self.agent_id: str | None = None
+        self._wechat_kf_seen_text_messages: dict[str, float] = {}


Update the type hint to reflect the use of a tuple key for deduplication, as suggested in the logic below.

Suggested change

self._wechat_kf_seen_text_messages: dict[str, float] = {}

self._wechat_kf_seen_text_messages: dict[tuple[str, str], float] = {}

gemini-code-assist · 2026-04-25T08:28:33Z


        self.server.callback = callback

+    def _is_duplicate_wechat_kf_text_message(self, session_id: str, text: str) -> bool:


New functionality should be accompanied by corresponding unit tests. Please add tests to verify the deduplication logic, including TTL expiration and session-based isolation.

References

New functionality should be accompanied by corresponding unit tests.

gemini-code-assist · 2026-04-25T08:28:33Z

+        expired_keys = [
+            key
+            for key, expires_at in self._wechat_kf_seen_text_messages.items()
+            if expires_at <= now
+        ]
+        for key in expired_keys:
+            self._wechat_kf_seen_text_messages.pop(key, None)


The current cleanup logic iterates through the entire _wechat_kf_seen_text_messages dictionary on every incoming text message. This is an O(N) operation that runs on the main event loop. While the 15-second TTL likely keeps the dictionary small, a high volume of unique messages could lead to performance degradation and block the event loop. Consider cleaning up only periodically or limiting the maximum size of the deduplication cache.

gemini-code-assist · 2026-04-25T08:28:33Z

+        for key in expired_keys:
+            self._wechat_kf_seen_text_messages.pop(key, None)
+
+        dedup_key = f"{session_id}:{normalized_text}"


Using string concatenation with a colon as a key can lead to collisions if the session_id (which is the external_userid) contains a colon. Using a tuple (session_id, normalized_text) is a safer and more idiomatic way to create a composite key in Python.

Suggested change

dedup_key = f"{session_id}:{normalized_text}"

dedup_key = (session_id, normalized_text)

feat: add deduplication for WeChat kefu text messages within 15 seconds

ff67868

auto-assign Bot requested review from Fridemn and advent259141 April 25, 2026 08:25

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. labels Apr 25, 2026

Soulter merged commit 5d79c99 into master Apr 25, 2026
20 checks passed

sourcery-ai Bot reviewed Apr 25, 2026

View reviewed changes

gemini-code-assist Bot reviewed Apr 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add deduplication for WeChat kefu text messages within 15 seconds#7788

feat: add deduplication for WeChat kefu text messages within 15 seconds#7788
Soulter merged 1 commit intomasterfrom
fix/wecom-kf

Soulter commented Apr 25, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

sourcery-ai Bot Apr 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		logger.debug(
		"忽略 15 秒内重复微信客服文本消息 session_id=%s text=%s",

	self._wechat_kf_seen_text_messages: dict[str, float] = {}
	self._wechat_kf_seen_text_messages: dict[tuple[str, str], float] = {}


		self.server.callback = callback

		def _is_duplicate_wechat_kf_text_message(self, session_id: str, text: str) -> bool:

	dedup_key = f"{session_id}:{normalized_text}"
	dedup_key = (session_id, normalized_text)

Uh oh!

Conversation

Soulter commented Apr 25, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Modifications / 改动点

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

Summary by Sourcery

Uh oh!

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Soulter commented Apr 25, 2026 •

edited by sourcery-ai Bot

Loading