💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 53e4472948
content = _value(message, "content")
if "url" not in result and isinstance(content, str) and content.startswith("data:image/"):
    result["url"] = content
Parse image URLs from list-based chat content
_extract_chat_image_response only extracts image data from message.images or a string message.content, so the chat-completion fallback returns {} when providers send images in OpenAI-style content blocks (for example, message.content as a list containing an image_url block). In that case generate_image/agenerate_image can report a successful tool call with an empty payload even though the model returned an image, which breaks image generation on compatible providers that use block-based content.
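One possible shape for the fix, as a minimal sketch rather than the project's actual code: check list-based content for an image_url block in addition to the existing string check. The function name image_url_from_content is hypothetical, and the _value helper below is a stand-in whose real behavior in this repository is assumed (dict key or attribute lookup).

```python
from typing import Any, Optional


def _value(obj: Any, key: str) -> Any:
    """Stand-in for the project's helper: read a dict key or an attribute (assumed behavior)."""
    if isinstance(obj, dict):
        return obj.get(key)
    return getattr(obj, key, None)


def image_url_from_content(content: Any) -> Optional[str]:
    """Return the first image URL found in string or list-based chat content."""
    if isinstance(content, str):
        # Existing behavior: a bare data URL in message.content.
        return content if content.startswith("data:image/") else None
    if not isinstance(content, list):
        return None
    for part in content:
        if _value(part, "type") != "image_url":
            continue
        image_url = _value(part, "image_url")
        # Blocks may carry {"image_url": {"url": ...}} or a bare string.
        url = image_url if isinstance(image_url, str) else _value(image_url, "url")
        if isinstance(url, str) and url:
            return url
    return None
```

With something like this in place, the fallback could set result["url"] whenever the helper returns a non-empty value, so a list-shaped message.content no longer yields {}.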
    if isinstance(part, dict) and part.get("type") == "text"
]
Include input_text blocks when building context summaries
message_summary_text only keeps block text when part.get("type") == "text", but this commit also accepts multimodal input_text blocks in render_input; when context-window summarization runs, those messages are reduced to [multimodal message] and lose the user’s actual text instructions. That can materially degrade retry behavior after context-limit errors because summaries omit key prompt content.
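A rough sketch of the suggested change, assuming the summary helper receives the raw message content and that input_text blocks carry their text under a "text" key. The name summary_text, the TEXT_BLOCK_TYPES set, and the newline join are illustrative assumptions; only the "[multimodal message]" placeholder comes from the behavior described above.

```python
from typing import Any

# Block types whose text should survive summarization (assumed set).
TEXT_BLOCK_TYPES = {"text", "input_text"}


def summary_text(content: Any) -> str:
    """Collect text from string or block-based content for a context summary."""
    if isinstance(content, str):
        return content
    if not isinstance(content, list):
        return "[multimodal message]"
    texts = [
        part["text"]
        for part in content
        if isinstance(part, dict)
        and part.get("type") in TEXT_BLOCK_TYPES
        and isinstance(part.get("text"), str)
    ]
    # Fall back to the placeholder only when no text block is present.
    return "\n".join(texts) if texts else "[multimodal message]"
```

Under these assumptions, a message that mixes an input_text block with an image keeps the user's instruction in the summary instead of collapsing to the placeholder.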
Summary
This PR ships experimental/1.1.6 and includes the earlier multimodality foundation commit: 4d26a1daed951927f3d875f796d9f1bf427475aa (feat: add multimodal support and image generation tools).
On top of that base, this PR hardens streaming/tool parsing and multimodal/image payload behavior, and bumps package metadata to 1.1.6.
What changed
- message.tool_calls fallback when delta.tool_calls is empty.
- }{ / duplicate object append).
- stream=True for tool-enabled turns.
- structured_stream_mode="preview".
- {"type":"event"}) are JSON-serialized instead of misclassified as multimodal blocks.
- data:image/... URLs in ToolResult.content.
- b64_json in ToolResult.content.
- ToolResult.data.
- configuration, desk, worker, events, core-types) for multimodal + tool streaming + structured preview behavior.
- pyproject.toml: 1.1.6
- uv.lock updated via uv sync
Compatibility
This is intended as a non-breaking hardening/QoL release from 1.1.5:
- misclassification, oversized image tool content).
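To illustrate the misclassification fix mentioned here (dict payloads such as {"type":"event"} being JSON-serialized rather than treated as multimodal blocks), the following is a hedged sketch only. The names render_payload and is_multimodal_block and the MULTIMODAL_BLOCK_TYPES set are assumptions for illustration and are not taken from the PR.

```python
import json
from typing import Any

# Assumed set of recognized content-block types; the real list may differ.
MULTIMODAL_BLOCK_TYPES = {"text", "input_text", "image_url", "input_image"}


def is_multimodal_block(value: Any) -> bool:
    """True only for dicts whose "type" is a recognized content-block type."""
    return isinstance(value, dict) and value.get("type") in MULTIMODAL_BLOCK_TYPES


def render_payload(value: Any) -> Any:
    """Pass genuine content blocks through; JSON-serialize everything else."""
    if isinstance(value, str):
        return value
    if isinstance(value, list) and value and all(is_multimodal_block(v) for v in value):
        return value
    # Arbitrary dicts (for example {"type": "event"}) become plain JSON text.
    return json.dumps(value, sort_keys=True)
```

The design choice in this sketch is an allow-list: a dict counts as a multimodal block only when its type is explicitly recognized, so merely having a "type" key is not enough.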
Verification
- ruff, mypy, pytest coverage added for each regression path.
- 177 passed).
- deepseek/deepseek-chat
- openrouter/z-ai/glm-5
- openrouter/minimax/minimax-m2.5
- openrouter/google/gemini-3-pro-image-preview
- openrouter/google/gemini-3-flash-preview