
Conversation

@noname22
Contributor

claude-code

Summary

This PR adds Anthropic Messages API compatibility to llama-server. The implementation converts Anthropic's format to the OpenAI-compatible internal format, reusing the existing inference pipeline.

Motivation

  • Enables llama.cpp to serve as a local/self-hosted alternative to Anthropic's Claude API
  • Allows Claude Code and other Anthropic-compatible clients to work with llama-server

Features Implemented

Endpoints:

  • POST /v1/messages - Chat completions with streaming support
  • POST /v1/messages/count_tokens - Token counting for prompts
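For reference, a minimal sketch of the request shapes these two endpoints accept, following Anthropic's published Messages API field names (the model name below is a placeholder, not something from this PR):

```python
import json

# POST /v1/messages - chat completion (Anthropic Messages API shape)
messages_request = {
    "model": "local-model",              # placeholder model name
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
}

# POST /v1/messages/count_tokens - token counting for the same prompt
count_tokens_request = {
    "model": "local-model",
    "system": "You are a helpful assistant.",
    "messages": messages_request["messages"],
}

print(json.dumps(messages_request, indent=2))
```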

Functionality:

  • Streaming with proper Anthropic SSE event types (message_start, content_block_delta, etc.)
  • Tool use (function calling) with tool_use/tool_result content blocks
  • Vision support with image content blocks (base64 and URL)
  • System prompts and multi-turn conversations
  • Extended thinking parameter support
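As the summary notes, the implementation works by translating Anthropic-format requests into the OpenAI-compatible format llama-server already handles. A rough sketch of that translation idea (not the actual code from this PR; only text content blocks are handled here):

```python
def anthropic_to_openai(req: dict) -> dict:
    """Translate an Anthropic Messages request into an OpenAI-style
    chat-completions request (simplified illustrative sketch)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    for msg in req.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            # Flatten Anthropic content blocks; this sketch keeps
            # only "text" blocks (tool_use/image blocks omitted).
            content = "".join(
                block["text"] for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": req.get("model", ""),
        "max_tokens": req.get("max_tokens"),
        "messages": messages,
        "stream": req.get("stream", False),
    }

converted = anthropic_to_openai({
    "model": "local-model",
    "max_tokens": 256,
    "system": "Be brief.",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Hi"}]},
    ],
})
print(converted["messages"])
```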

Testing

  • Tests in test_anthropic_api.py
  • Tests cover: basic messages, streaming, tools, vision, token counting, parameters, error handling, content block indices
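For context on the streaming tests: a successful streamed text response follows Anthropic's documented SSE event ordering. A small checker for that ordering (a hypothetical helper for illustration, not code from the PR):

```python
# Expected Anthropic SSE event ordering for a single text block,
# per Anthropic's streaming documentation.
EXPECTED_ORDER = [
    "message_start",
    "content_block_start",
    "content_block_delta",
    "content_block_stop",
    "message_delta",
    "message_stop",
]

def valid_stream(events: list[str]) -> bool:
    """Return True if the event names follow the expected order,
    allowing repeated consecutive content_block_delta events."""
    collapsed = [
        ev for i, ev in enumerate(events)
        if i == 0 or events[i - 1] != ev
    ]
    return collapsed == EXPECTED_ORDER

ok = valid_stream([
    "message_start", "content_block_start",
    "content_block_delta", "content_block_delta",
    "content_block_stop", "message_delta", "message_stop",
])
print(ok)
```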

@noname22
Contributor Author

New PR to allow maintainers to edit.
Old PR here: #17425

@github-actions github-actions bot added examples python python script changes server labels Nov 28, 2025
@noname22
Contributor Author

The RISCV test is getting

The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

I'm guessing it's not related to the PR? Any way to retry?

@ngxson
Collaborator

ngxson commented Nov 28, 2025

This PR can be merged once the server CI passes. The other CI jobs are not important.

@ngxson ngxson merged commit ddf9f94 into ggml-org:master Nov 28, 2025
65 of 69 checks passed
@ericcurtin
Collaborator

ericcurtin commented Nov 28, 2025

I stumbled across this as it hit conflicts with my PR. I am curious: what models does this work with? With sufficient hardware, is this capable of beating Claude's cloud models?

@noname22
Contributor Author

Technically it works with pretty much any model, but to get anywhere near Claude Sonnet you'd probably need a large, agentic model like MiniMax M2, Kimi K2, Qwen3 Coder 480B-A35B, etc.

That being said, I've had decent results for simple tasks with Qwen3 Coder 30B-A3B and gpt-oss-20b on a single 4090.

In my very subjective experience, the same models tend to perform a lot better with the Claude Code CLI app than with alternatives such as Open Code or gemini-cli and its clones, like Qwen3-Coder (the CLI app).

@ericcurtin
Collaborator

ericcurtin commented Nov 28, 2025

Interesting... If you want to take a quick peek, I fixed the conflicts here:

#17554

They weren't major conflicts; it was just moving code from one place to another.
