Skip to content

feat(providers/anthropic): add PDF document support for file content blocks#37

Merged
ethanndickson merged 1 commit into
coder_2_33from
cherry/upstream-pr-197-pdf-support
Jun 2, 2026
Merged

feat(providers/anthropic): add PDF document support for file content blocks#37
ethanndickson merged 1 commit into
coder_2_33from
cherry/upstream-pr-197-pdf-support

Conversation

@ethanndickson
Copy link
Copy Markdown
Member

@ethanndickson ethanndickson commented Jun 2, 2026

Cherry-picks upstream charmbracelet#197 (commit 95dcd6e, merged 2026-04-22) onto coder_2_33. No conflicts; the Anthropic adapter on this branch is identical to upstream at the patch sites, so git apply is clean.

Problem

fantasy.FilePart parts with MediaType: "application/pdf" were silently dropped from the Anthropic content list. The case fantasy.ContentTypeFile: switch in providers/anthropic/anthropic.go (around line 925) only handled image/* and text/*, with a // TODO: handle other file types comment. A user message whose only content was a PDF produced zero Anthropic messages and a CallWarningTypeOther "dropping empty user message" warning. The model never saw the document.

In coder/coder, this is reachable from any Coder Agents chat that routes through Anthropic or Bedrock (Bedrock is a thin wrapper around the Anthropic adapter). The PDF gets uploaded, allowlisted, stored in chat_files, resolved by chatprompt.go, and emitted as a fantasy.FilePart{MediaType: "application/pdf", Data: bytes} — then dropped here. Users see Claude reply that it cannot find their PDF and ask for a link or URL instead. Other providers (openai, openrouter, openaicompat, vercel, google, azure) already had working PDF branches.

Fix

Add an application/pdf case that emits anthropic.NewDocumentBlock(anthropic.Base64PDFSourceParam{Data: base64Encoded}), parallel to the existing image branch and honouring cacheControl. Wire cacheControl into the existing text/* document branch too (it was missing). Update hasVisibleUserContent so OfDocument counts as user-visible content; without this a PDF-only user message would still be culled as empty.

The upstream-forked anthropic-sdk-go already exposes Base64PDFSourceParam and NewDocumentBlock, so no SDK change is needed.

Tests

Two new subtests under TestToPrompt_DropsEmptyMessages:

  • should keep user messages with PDF content — sends a PDF-only user message and asserts one message is produced with no warnings.
  • should keep user messages with text document content — same shape for text/markdown.

The existing should drop user messages without visible content subtest is repointed to application/zip (still genuinely unsupported) so it keeps pinning the "no visible content" warning path without locking in the PDF drop.

$ go test ./providers/anthropic/...
ok  charm.land/fantasy/providers/anthropic 0.008s

Two pre-existing test failures on this branch (TestPrepareParams_PreviousResponseID_Validation/allows_tool_messages in providers/openai, and a Go 1.26 build failure in providertests/openai_computer_use_test.go) are unrelated and unchanged by this commit.

Why cherry-pick onto coder_2_33 instead of upstream sync

coder_2_33 is what coder/coder currently pins via its go.mod replace directive (github.com/coder/fantasy v0.0.0-20260514123132-cfca5fd82c5d). A full upstream sync would pull in 74 unrelated commits; this is the smallest viable change to unblock PDF attachments for Anthropic / Bedrock deployments.

Relates to CODAGT-540

…blocks

Cherry-picks charmbracelet#197 (commit 95dcd6e) onto coder_2_33.

Previously, fantasy.FilePart parts with MediaType "application/pdf" were
silently dropped from the Anthropic content list, so a user message whose
only content was a PDF attachment produced zero Anthropic messages and a
'dropping empty user message' warning. The model never saw the document.

Add an explicit application/pdf case that emits a NewDocumentBlock with a
Base64PDFSourceParam, mirroring the image branch's cache-control wiring,
and also wire cache-control into the existing text/* document branch.
Treat OfDocument as user-visible content so PDF-only user messages are
no longer culled as empty.

Co-authored-by: Nic-vdwalt <36562088+Nic-vdwalt@users.noreply.github.com>
@ethanndickson ethanndickson merged commit 7d46e64 into coder_2_33 Jun 2, 2026
2 of 17 checks passed
ethanndickson added a commit to coder/coder that referenced this pull request Jun 2, 2026
Previously, user-uploaded PDFs were silently dropped by fantasy's
Anthropic provider adapter, so Claude (direct or via Bedrock) only saw
the user's text and replied as if no document had been attached. Other
providers (OpenAI, Gemini, OpenRouter, Vercel) were unaffected.

Bumps `coder/fantasy` past
[coder/fantasy#37](coder/fantasy#37)
(cherry-pick of upstream
[charmbracelet/fantasy#197](charmbracelet/fantasy#197)),
which emits an Anthropic `document` content block with a base64 PDF
source for `fantasy.FilePart{MediaType: "application/pdf"}` and counts
`OfDocument` as user-visible so a PDF-only user message is no longer
culled as empty.

Adds a regression test
(`TestModelFromConfig_AnthropicPDFFilePartReachesProvider`) that drives
a `fantasy.FilePart` through the real Anthropic provider against a
`chattest.NewAnthropic` stub and asserts the outbound request contains a
base64 document block. The test was verified to fail on the previous
fantasy pin (the request leaves with zero messages and `Generate`
returns EOF) and pass on the new one.

Manually verified end-to-end with `./scripts/develop.sh`: uploading a
PDF to a Claude-backed Coder Agents chat now lets the model read it.

Closes CODAGT-540
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant