feat: add temporary extra user content parts#7976
Conversation
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
dump_messages_with_checkpoints, consider delegating the persistence filtering to a method onContentPart(e.g.,should_persist()ormodel_dump_for_persistence()) instead of usinggetattr(part, "_no_save", False)inline, so the persistence logic stays encapsulated with the content part type. - Now that
_no_saveis a first-class concept, it may be worth auditing other places whereContentPartinstances are serialized to ensure they either usemodel_dump_for_context()or an equivalent API, so temporary parts are not accidentally persisted or sent to providers in unintended contexts.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `dump_messages_with_checkpoints`, consider delegating the persistence filtering to a method on `ContentPart` (e.g., `should_persist()` or `model_dump_for_persistence()`) instead of using `getattr(part, "_no_save", False)` inline, so the persistence logic stays encapsulated with the content part type.
- Now that `_no_save` is a first-class concept, it may be worth auditing other places where `ContentPart` instances are serialized to ensure they either use `model_dump_for_context()` or an equivalent API, so temporary parts are not accidentally persisted or sent to providers in unintended contexts.
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/message.py" line_range="53-57" />
<code_context>
raise ValueError(f"Cannot validate {value} as ContentPart")
target_class = cls.__content_part_registry[type_value]
- return target_class.model_validate(value)
+ part = target_class.model_validate(value)
+ if cast(dict[str, Any], value).get("_no_save"):
+ part._no_save = True
+ return part
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The `_no_save` flag is treated as truthy rather than explicitly boolean, which can lead to surprising behavior.
`cast(dict[str, Any], value).get("_no_save")` will treat any truthy value (e.g. `'false'`, `1`, non-empty strings) as enabling `_no_save`. If this field is meant to be strictly boolean, consider checking explicitly for `True` (or validating/coercing the type) to avoid accidental activation from loosely typed inputs.
```suggestion
target_class = cls.__content_part_registry[type_value]
part = target_class.model_validate(value)
if cast(dict[str, Any], value).get("_no_save") is True:
part._no_save = True
return part
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| target_class = cls.__content_part_registry[type_value] | ||
| return target_class.model_validate(value) | ||
| part = target_class.model_validate(value) | ||
| if cast(dict[str, Any], value).get("_no_save"): | ||
| part._no_save = True | ||
| return part |
There was a problem hiding this comment.
suggestion (bug_risk): The _no_save flag is treated as truthy rather than explicitly boolean, which can lead to surprising behavior.
cast(dict[str, Any], value).get("_no_save") will treat any truthy value (e.g. 'false', 1, non-empty strings) as enabling _no_save. If this field is meant to be strictly boolean, consider checking explicitly for True (or validating/coercing the type) to avoid accidental activation from loosely typed inputs.
| target_class = cls.__content_part_registry[type_value] | |
| return target_class.model_validate(value) | |
| part = target_class.model_validate(value) | |
| if cast(dict[str, Any], value).get("_no_save"): | |
| part._no_save = True | |
| return part | |
| target_class = cls.__content_part_registry[type_value] | |
| part = target_class.model_validate(value) | |
| if cast(dict[str, Any], value).get("_no_save") is True: | |
| part._no_save = True | |
| return part |
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to mark specific message content parts as temporary, ensuring they are included in the LLM context but excluded from persistent conversation history. Key changes include the addition of a _no_save attribute and a mark_as_temp method to the ContentPart class, along with logic in the message dumping process to filter these parts. Documentation and unit tests have been updated to reflect these changes. Feedback was provided to further refine the persistence logic by accounting for message-level temporary flags and preventing the storage of messages that result in empty content after filtering.
| message_data = message.model_dump() | ||
| if isinstance(message.content, list): | ||
| message_data["content"] = [ | ||
| part.model_dump() | ||
| for part in message.content | ||
| if not getattr(part, "_no_save", False) | ||
| ] | ||
| dumped.append(message_data) |
There was a problem hiding this comment.
在 dump_messages_with_checkpoints 函数中,目前的实现存在两个可以改进的地方:
- 未检查消息级别的 _no_save 属性:Message 类本身也定义了 _no_save 私有属性,且在 bind_checkpoint_messages 中会被还原。如果整个消息被标记为临时消息,此处应当跳过该消息的持久化。
- 空内容消息的处理:如果 message.content 是一个列表,且其中的所有 ContentPart 都被标记为 _no_save,过滤后的 content 将变成空列表 []。对于没有工具调用(tool_calls)的消息,保存一个空内容的消息通常没有意义。
建议在消息被标记为 _no_save 或过滤后内容为空且无工具调用时跳过该消息。此外,根据项目规则,新功能的实现应伴随相应的单元测试。
| message_data = message.model_dump() | |
| if isinstance(message.content, list): | |
| message_data["content"] = [ | |
| part.model_dump() | |
| for part in message.content | |
| if not getattr(part, "_no_save", False) | |
| ] | |
| dumped.append(message_data) | |
| if message._no_save: | |
| continue | |
| message_data = message.model_dump() | |
| if isinstance(message.content, list): | |
| filtered_content = [ | |
| part.model_dump() | |
| for part in message.content | |
| if not getattr(part, "_no_save", False) | |
| ] | |
| if not filtered_content and not message.tool_calls: | |
| continue | |
| message_data["content"] = filtered_content | |
| dumped.append(message_data) |
References
- New functionality should be accompanied by corresponding unit tests.
Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Support temporary user content parts that affect only the current LLM request without being persisted to conversation history.
New Features:
Enhancements:
Documentation:
.mark_as_temp()for extra user content parts in both English and Chinese developer guides.Tests: