v0.25.0

@thorrester thorrester released this 09 May 14:59
· 2 commits to main since this release
e05f6a6

Released 2026-05-09

Prompts can now carry images and documents. bind_media replaces ${media:name} placeholders with provider-native content blocks (base64-encoded inline data, URLs, or file references), depending on what the provider actually supports. The three spec types that were effectively unused in Python (PromptSpec, AgentSpec, PortableSpec) are gone from the public API.


Breaking changes

PromptSpec, AgentSpec, PortableSpec removed from Python exports

These three classes no longer appear in potato_head.__init__ or in the compiled extension module. If you're importing any of them directly, your code will fail at import time with ImportError.

# These will break:
from potato_head import PromptSpec, AgentSpec, PortableSpec

There is no replacement. These types were internal scaffolding and are not part of the agent or workflow API.


What's new

Media binding in Prompt

Prompts now support image and document content via ${media:name} placeholders in user messages. Write the placeholder as literal text, then call bind_media to replace it with the actual content before sending.

from potato_head import MediaRef, Prompt, Provider

prompt = Prompt(
    messages="Describe the chart: ${media:chart}",
    provider=Provider.Anthropic,
    model="claude-sonnet-4-5",
)

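# Bind from bytes (png_bytes holds the raw PNG data, here read from disk)
with open("/tmp/chart.png", "rb") as f:
    png_bytes = f.read()
bound = prompt.bind_media("chart", MediaRef.image_bytes("image/png", png_bytes))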

# Or from a file path
bound = prompt.bind_media("chart", MediaRef.image_path("/tmp/chart.png"))

bind_media returns a new Prompt; the original is unchanged. That means a single template can be reused across multiple bindings without making copies first.

Prompt.media_parameters lists the placeholder names found in the prompt, the same way Prompt.parameters lists text variable names. The two namespaces don't overlap: ${name} and ${media:name} are independent.
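To make the two namespaces concrete, here is a minimal sketch in plain Python. The regular expressions and the find_placeholders helper are assumptions made for illustration; they are not the library's parser, which may accept a different set of name characters.

```python
import re

# Hypothetical patterns for illustration; the library's real parser may differ.
MEDIA_PATTERN = re.compile(r"\$\{media:([A-Za-z_][A-Za-z0-9_]*)\}")
TEXT_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_placeholders(template: str) -> tuple[list[str], list[str]]:
    """Return (text_names, media_names), each in order of first appearance."""
    media_names = list(dict.fromkeys(MEDIA_PATTERN.findall(template)))
    # "${media:chart}" never matches TEXT_PATTERN because ":" is not a name character.
    text_names = list(dict.fromkeys(TEXT_PATTERN.findall(template)))
    return text_names, media_names


text_names, media_names = find_placeholders(
    "Compare ${chart} with the image: ${media:chart}"
)
print(text_names)   # ['chart']
print(media_names)  # ['chart']
```

The same name appears once in each list, which is what "independent namespaces" means in practice: a text variable called chart and a media placeholder called chart do not collide.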

MediaRef constructors

| Constructor | Kind | Source |
| --- | --- | --- |
| MediaRef.image_url(url, mime_type=None) | Image | URL forwarded to provider |
| MediaRef.image_bytes(mime_type, data) | Image | Base64-encoded immediately |
| MediaRef.image_path(path) | Image | File read and encoded at call time |
| MediaRef.document_url(url, mime_type=None) | Document | URL forwarded to provider |
| MediaRef.document_bytes(mime_type, data) | Document | Base64-encoded immediately |
| MediaRef.document_path(path) | Document | File read and encoded at call time |
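For readers unfamiliar with what "Base64-encoded" inline data looks like, the following sketch builds an RFC 2397 data URL from raw bytes using only the standard library. The to_data_url helper is hypothetical; the release notes do not show the exact payload the library constructs for each provider.

```python
import base64


def to_data_url(mime_type: str, data: bytes) -> str:
    """Encode raw bytes as an RFC 2397 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first four bytes of any PNG file.
print(to_data_url("image/png", b"\x89PNG"))  # data:image/png;base64,iVBORw==
```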

File paths are read eagerly, when the MediaRef constructor is called rather than when the request is sent. MIME type is inferred from the file extension. Symlinks, directories, and files over 20 MiB are rejected with a RuntimeError. If the path comes from user input, validate it before calling or pass bytes directly.
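One way to validate a user-supplied path up front is sketched below with the standard library. The check_media_path helper, its allowed_dir argument, and the use of ValueError are all assumptions for illustration. The library performs its own checks and raises RuntimeError; this sketch only mirrors the rules listed above and adds a containment check.

```python
import mimetypes
import tempfile
from pathlib import Path

MAX_MEDIA_BYTES = 20 * 1024 * 1024  # 20 MiB, the limit stated in the release notes


def check_media_path(raw_path: str, allowed_dir: str) -> Path:
    """Validate a user-supplied path; raise ValueError if it should not be used."""
    path = Path(raw_path)
    if path.is_symlink():
        raise ValueError(f"symlinks are not accepted: {path}")
    if not path.is_file():
        raise ValueError(f"not a regular file: {path}")
    base = Path(allowed_dir).resolve()
    if not path.resolve().is_relative_to(base):  # Python 3.9+
        raise ValueError(f"path escapes {base}: {path}")
    if path.stat().st_size > MAX_MEDIA_BYTES:
        raise ValueError(f"file exceeds 20 MiB: {path}")
    if mimetypes.guess_type(path.name)[0] is None:
        raise ValueError(f"cannot infer a MIME type from the extension: {path}")
    return path


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "chart.png"
    sample.write_bytes(b"\x89PNG\r\n\x1a\n")
    accepted_name = check_media_path(str(sample), tmp).name
    try:
        check_media_path(tmp, tmp)  # a directory, so this is rejected
        directory_rejected = False
    except ValueError:
        directory_rejected = True

print(accepted_name, directory_rejected)
```

A path that passes this check can then be handed to MediaRef.image_path or MediaRef.document_path. Reading the bytes yourself and passing them to the *_bytes constructors avoids the path handling altogether.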

Provider differences

Not all source types work with all providers. Attempting an unsupported combination raises RuntimeError at bind time, not at request time.

| Source | OpenAI | Anthropic | Gemini/Vertex |
| --- | --- | --- | --- |
| Image URL | ✓ | ✓ | gs:// only, mime_type required |
| Image bytes | ✓ (data URL) | ✓ (base64 block) | ✓ (inline data) |
| Document URL | ✗ | ✓ | gs:// or File API URI |
| Document bytes | ✓ (file content) | ✓ (base64 block) | ✓ (inline data) |

Gemini rejects HTTPS image URLs because it has no way to fetch remote content. Pass bytes instead. Gemini gs:// references require an explicit mime_type argument; without it, binding raises RuntimeError: requires explicit mime_type.
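The support table can be mirrored in application code to catch bad combinations before calling bind_media. The sketch below is an assumption-laden illustration: the provider labels, the kind strings, and the unsupported_reason helper are hypothetical and not part of potato_head. It does not try to recognize Gemini File API URIs, whose format the release notes do not specify.

```python
from typing import Optional


def unsupported_reason(
    provider: str,
    kind: str,
    url: Optional[str] = None,
    mime_type: Optional[str] = None,
) -> Optional[str]:
    """Return why a binding would be rejected per the table, or None if allowed.

    provider: "openai", "anthropic", or "gemini" (hypothetical labels).
    kind: "image_url", "image_bytes", "document_url", or "document_bytes".
    """
    if kind.endswith("_bytes"):
        return None  # every provider in the table accepts bytes
    if provider == "openai" and kind == "document_url":
        return "OpenAI does not accept document URLs"
    if provider == "gemini":
        is_gcs = (url or "").startswith("gs://")
        if kind == "image_url" and not is_gcs:
            return "Gemini image URLs must use gs://"
        if is_gcs and mime_type is None:
            return "Gemini gs:// references require an explicit mime_type"
    return None


print(unsupported_reason("gemini", "image_url", "https://example.com/a.png"))
print(unsupported_reason("gemini", "image_url", "gs://bucket/a.png", "image/png"))
```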

Constraints

Media placeholders are not allowed in system instructions. If a Prompt is constructed with ${media:name} in system_instructions, it raises RuntimeError: media placeholders are not allowed in system messages at construction time, before any bind call.

Each placeholder must be the sole content of its text block after splitting. A placeholder embedded mid-sentence ("here is ${media:x} the chart") is split into separate text and media parts automatically. A placeholder that fails to split cleanly raises RuntimeError: MediaPlaceholderNotIsolated.
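The automatic splitting described above can be pictured with a small sketch. The split_media function and its ("text", ...) / ("media", ...) tuples are hypothetical stand-ins for the library's internal content parts. Whether the library trims whitespace around the split is not stated in the release notes, so this sketch leaves it untouched.

```python
import re

# Hypothetical pattern for illustration; the library's real parser may differ.
MEDIA_PATTERN = re.compile(r"\$\{media:(\w+)\}")


def split_media(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", chunk) and ("media", name) parts, in order."""
    parts: list[tuple[str, str]] = []
    cursor = 0
    for match in MEDIA_PATTERN.finditer(text):
        if match.start() > cursor:
            parts.append(("text", text[cursor:match.start()]))
        parts.append(("media", match.group(1)))
        cursor = match.end()
    if cursor < len(text):
        parts.append(("text", text[cursor:]))
    return parts


print(split_media("here is ${media:x} the chart"))
# [('text', 'here is '), ('media', 'x'), ('text', ' the chart')]
```

After the split, the media part stands alone in its own block, which is the isolation requirement the first sentence describes.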

Prompt.from_path now accepts str | Path

Previously, from_path required a pathlib.Path object. It now accepts either str or Path.

Workflow.add_task output_type now has a default

Previously, output_type was a required positional argument. It now defaults to None, so workflow.add_task(task) is valid.


Upgrading from v0.24.0

  1. Remove any imports of PromptSpec, AgentSpec, or PortableSpec from potato_head.
  2. No other action required. Media support is additive: existing prompts without ${media:name} placeholders are unaffected.
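To locate affected imports for step 1, a short standard-library script along these lines may help. The find_removed_imports helper is an illustrative sketch, not a tool shipped with the release. It only detects from potato_head import ... statements and will miss attribute access such as potato_head.PromptSpec.

```python
import ast

REMOVED = {"PromptSpec", "AgentSpec", "PortableSpec"}


def find_removed_imports(source: str) -> list[tuple[int, str]]:
    """Return (line_number, name) for each removed name imported from potato_head."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == "potato_head":
            for alias in node.names:
                if alias.name in REMOVED:
                    hits.append((node.lineno, alias.name))
    return sorted(hits)


sample = (
    "from potato_head import Prompt, PromptSpec\n"
    "from potato_head import AgentSpec as A\n"
)
print(find_removed_imports(sample))
# [(1, 'PromptSpec'), (2, 'AgentSpec')]
```

To scan a project, read each .py file and pass its text to the function.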

Contributors

@Thorrester

Full changelog: v0.24.0...v0.25.0