fix(adk): preserve multimodal content fields in rewriteMessage#847

Merged
shentongmartin merged 3 commits into main from fix/rewrite_msg
Mar 12, 2026
@shentongmartin

Problem

When a ChatModelAgent inside a WorkflowAgent produces a response, its output message is rewritten as a User-role history entry for subsequent agents in the workflow. The previous implementation of rewriteMessage created a brand-new schema.UserMessage(text) from only the text content and tool calls, silently discarding all other message fields — including multimodal content (MultiContent, UserInputMultiContent, AssistantGenMultiContent).

Summary

| Problem | Solution |
| --- | --- |
| rewriteMessage dropped all fields except Content and ToolCalls | Copy MultiContent and UserInputMultiContent as new slices; convert AssistantGenMultiContent to UserInputMultiContent |
| AssistantGenMultiContent (output parts) incompatible with User-role message | Convert each output part (text/image/audio/video) to the corresponding MessageInputPart; drop Reasoning parts, which have no user-input equivalent |
| Copied slices shared backing array with original message | Use append([]T(nil), src...) to give each rewritten message its own independent slice |

Key Insight

Role boundary requires type conversion, not just copy.
AssistantGenMultiContent uses MessageOutputPart (with MessageOutputImage, MessageOutputAudio, etc.), while a User message expects MessageInputPart (with MessageInputImage, MessageInputAudio, etc.). Both share MessagePartCommon as their embedded base, so the conversion is lossless for all media types. Reasoning parts have no input-side equivalent and are intentionally dropped — they describe the model's internal thinking process, which is not meaningful as user context.
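
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the types are simplified, hypothetical stand-ins for the schema types (`PartCommon` for MessagePartCommon, `OutputPart` for MessageOutputPart, and so on), the `Type` strings are made up, and only text and image parts are shown. Audio and video would presumably follow the same pattern as image.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the schema types named in the PR.
// The real types carry more fields.
type PartCommon struct {
	URL      string
	MIMEType string
}

type OutputImage struct{ PartCommon }
type InputImage struct{ PartCommon }
type OutputReasoning struct{ Text string }

type OutputPart struct {
	Type      string
	Text      string
	Image     *OutputImage
	Reasoning *OutputReasoning
	Extra     map[string]any
}

type InputPart struct {
	Type  string
	Text  string
	Image *InputImage
	Extra map[string]any
}

// toInputParts converts assistant output parts into user input parts.
// Reasoning parts are skipped because InputPart has no field for them.
// It returns nil when there is nothing to convert.
func toInputParts(out []OutputPart) []InputPart {
	var in []InputPart
	for _, p := range out {
		switch {
		case p.Reasoning != nil:
			continue // no input-side equivalent
		case p.Image != nil:
			// The embedded common struct is copied by value into the input type.
			img := &InputImage{PartCommon: p.Image.PartCommon}
			in = append(in, InputPart{Type: p.Type, Image: img, Extra: p.Extra})
		default:
			in = append(in, InputPart{Type: p.Type, Text: p.Text, Extra: p.Extra})
		}
	}
	return in
}

func main() {
	out := []OutputPart{
		{Type: "text", Text: "here is the chart"},
		{Type: "reasoning", Reasoning: &OutputReasoning{Text: "internal thoughts"}},
		{Type: "image", Image: &OutputImage{PartCommon: PartCommon{URL: "https://example.com/c.png", MIMEType: "image/png"}}},
	}
	in := toInputParts(out)
	fmt.Println(len(in))          // 2
	fmt.Println(in[1].Image.URL)  // https://example.com/c.png
}
```

Because both image types embed the same common struct in this sketch, the media payload carries over field for field; only the wrapper type changes.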

Slice independence matters for history safety.
The rewritten message is inserted into the history that may be retained and replayed across workflow iterations. A direct slice assignment would alias the backing array of the original message, risking cross-message mutation. The fix copies slices with fresh backing arrays while sharing inner pointer fields (*MessageInputImage, etc.), which are treated as immutable once set.
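
The difference between aliasing and copying can be seen in a small standalone sketch. The `Part` and `Image` types here are hypothetical stand-ins rather than the real schema types; the `append([]T(nil), src...)` idiom is the one named in the summary table.

```go
package main

import "fmt"

// Hypothetical stand-ins for a message part and its media payload.
type Image struct{ URL string }

type Part struct {
	Text  string
	Image *Image
}

// copyParts returns a slice with its own backing array.
// Pointer fields inside each element are still shared with src.
// For a nil (or empty) src the result is nil.
func copyParts(src []Part) []Part {
	return append([]Part(nil), src...)
}

func main() {
	orig := []Part{{Text: "a"}, {Text: "b", Image: &Image{URL: "u"}}}

	alias := orig             // shares the backing array with orig
	copied := copyParts(orig) // fresh backing array

	alias[0].Text = "changed via alias"
	copied[1].Text = "changed via copy"

	fmt.Println(orig[0].Text)                     // changed via alias
	fmt.Println(orig[1].Text)                     // b
	fmt.Println(copied[1].Image == orig[1].Image) // true
	fmt.Println(copyParts(nil) == nil)            // true
}
```

Writing through `alias` is visible in `orig`, while writing through `copied` is not. The inner `*Image` pointer is shared in both cases, which is only safe under the stated assumption that those values are not mutated after being set.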

Test

Added TestRewriteMessage in adk/flow_test.go covering:

  • MultiContent copied independently (mutation of copy doesn't affect original)
  • Pre-existing UserInputMultiContent copied independently
  • All four AssistantGenMultiContent part types (text, image, audio, video) converted correctly with Extra and MessagePartCommon preserved
  • Reasoning parts dropped
  • AssistantGenMultiContent not set on rewritten message

rewriteMessage and genMsg both reach 100% statement coverage.


When rewriteMessage rewrites a ChatModelAgent's output as a User message
for history context, it now carries over MultiContent, UserInputMultiContent,
and converts AssistantGenMultiContent to UserInputMultiContent (text/image/
audio/video parts). Reasoning parts are dropped as they have no user input
equivalent.

Change-Id: If645454287b3fb5da0634b71b2e2eed7a3692c08
@codecov
codecov bot commented Mar 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.60%. Comparing base (a838e5c) to head (3094630).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #847      +/-   ##
==========================================
+ Coverage   80.53%   80.60%   +0.06%     
==========================================
  Files         129      129              
  Lines       13095    13129      +34     
==========================================
+ Hits        10546    10582      +36     
+ Misses       1749     1747       -2     
  Partials      800      800              

Change-Id: Ib433159ec54876d295e6c43544f4cae2f3b1789e
Change-Id: I90080a89c96ab34f0620c2cef011ab6f7e270973
@shentongmartin shentongmartin merged commit 1119cc0 into main Mar 12, 2026
19 checks passed
@shentongmartin shentongmartin deleted the fix/rewrite_msg branch March 12, 2026 05:34
meguminnnnnnnnn pushed a commit that referenced this pull request Mar 13, 2026
* fix(adk): preserve multimodal content fields in rewriteMessage

When rewriteMessage rewrites a ChatModelAgent's output as a User message
for history context, it now carries over MultiContent, UserInputMultiContent,
and converts AssistantGenMultiContent to UserInputMultiContent (text/image/
audio/video parts). Reasoning parts are dropped as they have no user input
equivalent.

Change-Id: If645454287b3fb5da0634b71b2e2eed7a3692c08

* fix(adk): use composite literal for slice copy in rewriteMessage

Change-Id: Ib433159ec54876d295e6c43544f4cae2f3b1789e

* fix(adk): preserve nil slice semantics in rewriteMessage

Change-Id: I90080a89c96ab34f0620c2cef011ab6f7e270973
shentongmartin added a commit that referenced this pull request Mar 17, 2026