Improve Telegram guard visibility and reduce token bloat#18
Improve Telegram guard visibility and reduce token bloat#18Vardominator merged 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new transcript truncation plugin and expands the Telegram group allowlist guard to provide DM “awareness” of recent/known groups, while preserving fail-closed behavior for non-allowlisted group triggers.
Changes:
- Introduces
transcript-size-guardto truncate oversized persisted tool result content/details. - Extends
telegram-group-allowlist-guardwith abefore_prompt_buildhook that injects recent/known group awareness context into DMs. - Updates plugin schemas and adds/extends Bun tests for both plugins.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/transcript-size-guard.test.ts | Adds coverage for truncating toolResult content/details and tool allowlist behavior. |
| tests/telegram-group-allowlist-guard.test.ts | Adds coverage for DM tool execution + DM awareness context injection behavior. |
| instances/core-human/config/extensions/transcript-size-guard/openclaw.plugin.json | Defines plugin metadata and config schema for truncation settings. |
| instances/core-human/config/extensions/transcript-size-guard/index.ts | Implements truncation for persisted toolResult content/details. |
| instances/core-human/config/extensions/telegram-group-allowlist-guard/openclaw.plugin.json | Extends schema/UI hints for awareness-related config options. |
| instances/core-human/config/extensions/telegram-group-allowlist-guard/index.ts | Adds structured Telegram content text extraction, group awareness tracking, and DM prompt context injection. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const DEFAULT_MAX_SENDER_AGE_MS = 5 * 60 * 1000; | ||
| const DEFAULT_AWARENESS_SENDER_AGE_MS = 24 * 60 * 60 * 1000; | ||
| const DEFAULT_AWARENESS_MAX_GROUPS = 8; | ||
| const DEFAULT_STATE_AGENTS_DIR = "/var/lib/openclaw/state/agents"; | ||
| const lastInboundByConversation = new Map<string, LastInboundSender>(); | ||
| const groupAwarenessByConversation = new Map<string, GroupAwareness>(); | ||
|
|
There was a problem hiding this comment.
groupAwarenessByConversation retains entries for up to awarenessSenderAgeMs (default 24h) and is only pruned by age. In high-traffic deployments or with many distinct group targets, this map can grow without a hard cap and increase memory usage over time. Consider adding a maximum entry count / LRU-style eviction (in addition to TTL) to keep memory bounded.
| const label = group.groupSubject ?? group.conversationLabel ?? `id:${group.groupId}`; | ||
| lines.push(` - ${label} (id:${group.groupId}${seenStatus}${senderStatus})`); | ||
| if (group.lastMessagePreview) { | ||
| lines.push(` latest observed text: "${group.lastMessagePreview}"`); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| lines.push( | ||
| "- In DMs asking about group activity, never claim zero visibility if the relevant group is listed above.", | ||
| ); | ||
| lines.push( | ||
| "- If a group includes 'latest observed text', treat it as the freshest available snippet and answer from it directly without calling sessions_list first.", | ||
| ); | ||
| return lines.join("\n"); |
There was a problem hiding this comment.
The DM awareness context includes lastMessagePreview derived from untrusted group content and injects it into prependContext, which is typically high-priority prompt material. Even with truncation, this increases prompt-injection risk (group participants can place instructions that affect DM behavior). Consider adding stronger delimiting + explicit instruction to treat the snippet as quoted user content (non-executable), and/or place it in a lower-priority message section if supported by the host prompt builder.
| return { | ||
| value: | ||
| `${input.slice(0, settings.headChars)}\n` + | ||
| `...[${label} truncated ${removed} chars]...\n` + | ||
| `${input.slice(-settings.tailChars)}`, |
There was a problem hiding this comment.
truncateString can return a string longer than settings.maxChars because it appends a truncation marker in addition to headChars + tailChars. With configs where headChars + tailChars is close to maxChars, the persisted payload may still exceed the intended size cap. Consider enforcing the final output length to be <= maxChars (e.g., compute an overhead budget and shrink head/tail accordingly, or post-trim the assembled string).
| return { | |
| value: | |
| `${input.slice(0, settings.headChars)}\n` + | |
| `...[${label} truncated ${removed} chars]...\n` + | |
| `${input.slice(-settings.tailChars)}`, | |
| const head = input.slice(0, settings.headChars); | |
| const marker = `...[${label} truncated ${removed} chars]...\n`; | |
| const tail = input.slice(-settings.tailChars); | |
| let value = `${head}\n${marker}${tail}`; | |
| if (value.length > settings.maxChars) { | |
| value = value.slice(0, settings.maxChars); | |
| } | |
| return { | |
| value, |
| try { | ||
| const raw = readFileSync(resolveSessionIndexPath(stateAgentsDir, effectiveAgentId), "utf8"); | ||
| const parsed = JSON.parse(raw); | ||
| const sessions = asRecord(parsed); |
There was a problem hiding this comment.
readIndexedTelegramGroups performs a synchronous readFileSync + JSON.parse on every DM before_prompt_build call (via buildTelegramAwarenessContext). This can add noticeable latency and block the event loop, especially if sessions.json is large or prompts are frequent. Consider caching the parsed index (with a TTL / mtime check) or moving the file IO off the hot path.
Summary
Testing