Question
When building a plugin that delegates image analysis to a sub-agent (e.g., a "multimodal-looker"), we need the agent to access the uploaded image data. However, it's unclear whether the plugin protocol supports passing image bytes or local temporary file paths to an agent's context.
Specifically:
- Image bytes in tool results: If a tool returns binary image data (e.g., from a user upload stored server-side), can the agent/model actually consume those bytes for multimodal analysis? Or are tool results limited to text/JSON?
- Local temp file paths: If uploaded images are written to a temporary file on the host machine, can the agent read from that local path? The agent runs in the same process, but it's not clear whether the security model permits arbitrary local file reads from within a plugin context.
- Intended pattern: What is the recommended way for a plugin to give an agent access to user-uploaded images for visual analysis? Should we:
  - Embed base64-encoded image data in the tool result?
  - Pass a file URI and rely on the agent's built-in file-read capability?
  - Use a different mechanism entirely?
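To make the first two options concrete, here is a minimal sketch of what we mean by each. It is illustrative only: the helper names (`image_as_data_uri`, `image_as_file_uri`) and the payload field names (`type`, `data`, `uri`) are hypothetical and are not taken from the plugin protocol, since the actual tool-result shape is exactly what we are unsure about.

```python
import base64
import tempfile
from pathlib import Path


def image_as_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Option A: encode the raw bytes as a base64 data URI string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_as_file_uri(image_bytes: bytes, suffix: str = ".png") -> str:
    """Option B: write the bytes to a temp file and return a file:// URI."""
    # delete=False keeps the file on disk after the handle is closed,
    # so a separate reader could open it later.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(image_bytes)
    return Path(tmp.name).resolve().as_uri()


# Placeholder bytes (the 8-byte PNG signature), standing in for a real upload.
sample = b"\x89PNG\r\n\x1a\n"

# Hypothetical tool-result payloads; the field names are assumptions.
option_a = {"type": "image", "data": image_as_data_uri(sample)}
option_b = {"type": "file", "uri": image_as_file_uri(sample)}
```

Option A keeps the result self-contained but inflates the payload by roughly a third; option B keeps the payload small but only works if the agent is allowed to read that path.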
Context
We are building a plugin that uses multimodal-looker to analyze uploaded images. Currently the images are written to temp files and the path is passed to the agent, but the agent does not appear to read them. We have also tried forwarding the raw bytes through the plugin protocol, without success.
We would like to understand whether this is a current limitation, a misconfiguration on our side, or an intentionally unsupported path.