v0.25.0
Released 2026-05-09
Prompts can now carry images and documents. bind_media replaces ${media:name} placeholders with provider-native content blocks (base64-encoded inline data, URLs, or file references), depending on what the provider actually supports. The three spec types that were effectively unused in Python (PromptSpec, AgentSpec, PortableSpec) are gone from the public API.
Breaking changes
PromptSpec, AgentSpec, PortableSpec removed from Python exports
These three classes no longer appear in potato_head.__init__ or in the compiled extension module. If you're importing any of them directly, your code will fail at import time with ImportError.
```python
# These will break:
from potato_head import PromptSpec, AgentSpec, PortableSpec
```
There is no replacement. These types were internal scaffolding and are not part of the agent or workflow API.
What's new
Media binding in Prompt
Prompts now support image and document content via ${media:name} placeholders in user messages. Write the placeholder as literal text, then call bind_media to replace it with the actual content before sending.
```python
from potato_head import MediaRef, Prompt, Provider

prompt = Prompt(
    messages="Describe the chart: ${media:chart}",
    provider=Provider.Anthropic,
    model="claude-sonnet-4-5",
)

# Bind from bytes (png_bytes holds the raw PNG data, loaded elsewhere)
bound = prompt.bind_media("chart", MediaRef.image_bytes("image/png", png_bytes))

# Or from a file path
bound = prompt.bind_media("chart", MediaRef.image_path("/tmp/chart.png"))
```
bind_media returns a new Prompt; the original is unchanged. That means a single template can be reused across multiple bindings without making copies first.
Prompt.media_parameters lists the placeholder names found in the prompt, the same way Prompt.parameters lists text variable names. The two namespaces don't overlap: ${name} and ${media:name} are independent.
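To illustrate why the two namespaces stay independent, here is a minimal sketch of how placeholder names could be pulled out of a message with regular expressions. The patterns and function names are assumptions made for illustration, not potato_head's actual parser, and the sketch assumes placeholder names are plain identifiers.

```python
import re

# Assumed patterns for illustration; the library's real parsing rules may differ.
MEDIA_PLACEHOLDER = re.compile(r"\$\{media:([A-Za-z_]\w*)\}")
TEXT_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_]\w*)\}")


def find_media_names(message: str) -> list[str]:
    """Return names used in ${media:name} placeholders, in order, without duplicates."""
    return list(dict.fromkeys(MEDIA_PLACEHOLDER.findall(message)))


def find_text_names(message: str) -> list[str]:
    """Return names used in ${name} placeholders; the colon keeps media ones out."""
    return list(dict.fromkeys(TEXT_PLACEHOLDER.findall(message)))


message = "Compare ${media:chart} with ${baseline} for ${region}, then recheck ${media:chart}."
media_names = find_media_names(message)  # ['chart']
text_names = find_text_names(message)    # ['baseline', 'region']
```

Because the text pattern requires the closing brace right after the identifier, `${media:chart}` never matches it, so a text variable and a media placeholder can share a name without colliding.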
MediaRef constructors
| Constructor | Kind | Source |
|---|---|---|
| MediaRef.image_url(url, mime_type=None) | Image | URL forwarded to provider |
| MediaRef.image_bytes(mime_type, data) | Image | Base64-encoded immediately |
| MediaRef.image_path(path) | Image | File read and encoded at call time |
| MediaRef.document_url(url, mime_type=None) | Document | URL forwarded to provider |
| MediaRef.document_bytes(mime_type, data) | Document | Base64-encoded immediately |
| MediaRef.document_path(path) | Document | File read and encoded at call time |
File paths are read eagerly. MIME type is inferred from the extension. Symlinks, directories, and files over 20 MiB are rejected with a RuntimeError. If the path comes from user input, validate it before calling or pass bytes directly.
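If the path does come from user input, one way to follow the advice above is to run the same checks yourself and then pass bytes. The sketch below is a hypothetical helper, not part of potato_head: `load_media_bytes` and `MAX_MEDIA_BYTES` are names invented here, and the checks simply mirror the rules stated above (no symlinks, no directories, at most 20 MiB, MIME type inferred from the extension).

```python
import mimetypes
import tempfile
from pathlib import Path

MAX_MEDIA_BYTES = 20 * 1024 * 1024  # 20 MiB, the limit stated above


def load_media_bytes(path, max_bytes=MAX_MEDIA_BYTES):
    """Check a user-supplied path, then return (mime_type, data)."""
    candidate = Path(path)
    # Test for symlinks first, because is_file() would follow the link.
    if candidate.is_symlink():
        raise ValueError(f"refusing symlink: {candidate}")
    if not candidate.is_file():
        raise ValueError(f"not a regular file: {candidate}")
    size = candidate.stat().st_size
    if size > max_bytes:
        raise ValueError(f"file is {size} bytes, limit is {max_bytes}")
    mime_type, _ = mimetypes.guess_type(candidate.name)
    if mime_type is None:
        raise ValueError(f"cannot infer MIME type from extension: {candidate.name}")
    return mime_type, candidate.read_bytes()


# Demo with a throwaway file; only the extension matters for MIME inference.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "chart.png"
    sample.write_bytes(b"\x89PNG\r\n\x1a\n")
    mime_type, data = load_media_bytes(sample)
```

The returned pair has the shape that MediaRef.image_bytes(mime_type, data) or MediaRef.document_bytes(mime_type, data) takes, so validation failures surface as your own ValueError rather than as a RuntimeError from the library.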
Provider differences
Not all source types work with all providers. Attempting an unsupported combination raises RuntimeError at bind time, not at request time.
| | OpenAI | Anthropic | Gemini/Vertex |
|---|---|---|---|
| Image URL | β | β | gs:// only, mime_type required |
| Image bytes | β (data URL) | β (base64 block) | β (inline data) |
| Document URL | β | β | gs:// or File API URI |
| Document bytes | β (file content) | β (base64 block) | β (inline data) |
Gemini rejects HTTPS image URLs; it has no way to fetch remote content. Pass bytes instead. Gemini gs:// references require an explicit mime_type argument; without it, binding raises RuntimeError: requires explicit mime_type.
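The Gemini image URL rules can be checked before binding, for example when URLs arrive from configuration. This is a rough sketch under the assumption that only the two rules stated above apply (gs:// scheme, explicit mime_type); `check_gemini_image_url` is a hypothetical helper, not a potato_head function, and it does not cover document URLs or File API URIs.

```python
from typing import Optional
from urllib.parse import urlparse


def check_gemini_image_url(url: str, mime_type: Optional[str] = None) -> None:
    """Raise ValueError for image URLs that the rules above say Gemini binding rejects."""
    if urlparse(url).scheme != "gs":
        raise ValueError(f"Gemini accepts only gs:// image URLs, got: {url}")
    if not mime_type:
        raise ValueError("gs:// references require an explicit mime_type")


# Passes silently: gs:// scheme with an explicit MIME type.
check_gemini_image_url("gs://example-bucket/chart.png", mime_type="image/png")
```

Failing early with your own error keeps the provider-specific rule in one place; the library would still raise RuntimeError at bind time if the check were skipped.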
Constraints
Media placeholders are not allowed in system instructions. If a Prompt is constructed with ${media:name} in system_instructions, it raises RuntimeError: media placeholders are not allowed in system messages at construction time, before any bind call.
Each placeholder must be the sole content of its text block after splitting. A placeholder embedded mid-sentence ("here is ${media:x} the chart") is split into separate text and media parts automatically. A placeholder that fails to split cleanly raises RuntimeError: MediaPlaceholderNotIsolated.
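The automatic split can be pictured with a small sketch. This is an assumption about the general technique, not the library's implementation: `split_media_parts` is a hypothetical function, the placeholder pattern assumes identifier-style names, and whether potato_head trims whitespace or drops empty text parts is not stated in these notes.

```python
import re

# Assumed pattern for illustration; one capturing group for the placeholder name.
MEDIA_PLACEHOLDER = re.compile(r"\$\{media:([A-Za-z_]\w*)\}")


def split_media_parts(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", chunk) and ("media", name) parts, in order."""
    parts = []
    # With one capturing group, re.split alternates: text, name, text, name, ...
    for index, piece in enumerate(MEDIA_PLACEHOLDER.split(text)):
        if index % 2 == 1:
            parts.append(("media", piece))
        elif piece:  # skip empty text around or between placeholders
            parts.append(("text", piece))
    return parts


parts = split_media_parts("here is ${media:x} the chart")
# [('text', 'here is '), ('media', 'x'), ('text', ' the chart')]
```

After a split like this, each media part stands alone in its own block, which is the isolation the constraint above describes.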
Prompt.from_path now accepts str | Path
Previously, from_path required a pathlib.Path object. It now accepts either str or Path.
Workflow.add_task output_type now has a default
output_type was a required positional argument. It now defaults to None, so workflow.add_task(task) is valid.
Upgrading from v0.24.0
- Remove any imports of PromptSpec, AgentSpec, or PortableSpec from potato_head.
- No other action required. Media support is additive; existing prompts without ${media:name} placeholders are unaffected.
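To find leftover imports before upgrading, a small script can scan source files with the standard ast module. The sketch below is a hypothetical helper (`find_removed_imports` is not shipped with potato_head) and only detects `from potato_head import ...` statements; attribute access such as `potato_head.PromptSpec` or imports from submodules would need additional checks.

```python
import ast

REMOVED_NAMES = {"PromptSpec", "AgentSpec", "PortableSpec"}


def find_removed_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, name) for each removed class imported from potato_head."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == "potato_head":
            for alias in node.names:
                if alias.name in REMOVED_NAMES:
                    hits.append((node.lineno, alias.name))
    return sorted(hits)


example = (
    "from potato_head import Prompt, PromptSpec\n"
    "from potato_head import AgentSpec as Spec\n"
)
hits = find_removed_imports(example)
# [(1, 'PromptSpec'), (2, 'AgentSpec')]
```

In practice you would call it on the text of each .py file in your project, for instance via pathlib.Path.read_text, and fix every reported line.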
Full changelog: v0.24.0...v0.25.0