Support Web multimodal image upload and fix image token budget estimation #705

Open
Yumiue wants to merge 4 commits into 1024XEngineer:main from Yumiue:codex/gateway-plan-approval-rpc

Yumiue (Collaborator) commented May 31, 2026

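Summary

  • Support uploading multimodal images from the Web client via session assets.
  • Images are passed to the backend via input_parts.media.asset_id; image data is no longer inlined into chat messages.
  • Session history keeps image attachment references, so the Web client can re-display thumbnails.
  • Fix provider budget estimation so that the base64 transport body of an image is no longer counted as prompt tokens.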

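Main changes

  • Add the /api/session-assets upload/read path and wire attachment references through Gateway, Runtime, and Session.
  • Extend the Web chat input, message rendering, session loading, and protocol types to support image attachments.
  • Add multimodal projection token estimation: text and tool schemas are estimated as usual, and each image is estimated at a fixed per-image budget.
  • Keep the real provider send path unchanged: session assets are read and encoded only temporarily at request-send time.
  • OpenAI-compatible, DeepSeek, Qwen, GLM, MiMo, MiniMax, Anthropic, and Gemini all use the image projection estimation path.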
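
The fixed per-image budget idea can be sketched roughly as follows. This is not the PR's implementation: the Part type, the 4-bytes-per-token text heuristic, and the budget value of 1024 are all assumptions chosen for illustration. The point it shows is that image payload bytes are never measured, so a large base64 body cannot inflate the estimate.

```go
package main

import "fmt"

// perImageTokenBudget is an assumed flat cost per image.
// The PR does not state the real value.
const perImageTokenBudget = 1024

// Part is a hypothetical projected message part.
type Part struct {
	Text    string // text content; empty for images
	IsImage bool
}

// estimateTextTokens is a rough heuristic: about 4 bytes per token, rounded up.
func estimateTextTokens(s string) int {
	return (len(s) + 3) / 4
}

// EstimateProjectedTokens sums text estimates plus a flat budget per image.
// Image payload bytes (for example base64) are deliberately never inspected.
func EstimateProjectedTokens(parts []Part) int {
	total := 0
	for _, p := range parts {
		if p.IsImage {
			total += perImageTokenBudget
			continue
		}
		total += estimateTextTokens(p.Text)
	}
	return total
}

func main() {
	parts := []Part{{Text: "Describe this image"}, {IsImage: true}}
	fmt.Println(EstimateProjectedTokens(parts))
}
```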

Testing

  • go test ./internal/provider/... ./internal/gateway/... ./internal/runtime -count=1

close #700

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

fennoai (Bot) left a comment

Review found two issues: one ACL bypass risk on the new asset HTTP endpoints, and one browser object URL lifecycle leak in the image send flow.

Comment thread internal/gateway/network_server.go
Comment thread web/src/components/chat/ChatInput.tsx
Yumiue (Collaborator, Author) commented May 31, 2026

@xgopilot Increase the test coverage, and push the changes to the branch this PR was opened from.

fennoai (Bot) commented May 31, 2026

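Done

Added test coverage and pushed it to the original PR branch Yumiue:codex/gateway-plan-approval-rpc

Added in this update

  • Cover the provider image projection estimation branches and estimation error boundaries.
  • Cover session asset save/read and error paths in gateway_runtime_bridge.
  • Cover session image MIME normalization, extension mismatch, and validation of saved asset references.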

Verification

  • go test ./internal/provider/... ./internal/gateway/... ./internal/runtime -count=1
  • go test ./internal/session ./internal/cli -count=1

Commit: 86f5a60f test: cover multimodal session asset paths

Generated with [codeagent](https://github.com/qbox/codeagent)

Co-authored-by: Yumiue <188874804+Yumiue@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Support uploading images in the Web conversation area and adding them to the conversation

2 participants