server : add Anthropic Messages API support #17570
Conversation
…se64_with_multimodal_model in test_anthropic_api.py
…response handler for /v1/chat/completions and use unordered_set instead of set in to_json_anthropic_stream()
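The streaming conversion mentioned in the commits (`to_json_anthropic_stream()`) maps the server's token stream onto Anthropic's event sequence. As a rough illustration of the event shapes involved, here is a minimal Python sketch; the function and event payloads below are simplified assumptions for illustration, not the PR's actual C++ implementation, which also emits usage counts, stop reasons, and tool-call blocks:

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Format one server-sent event in the style Anthropic clients expect."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def stream_anthropic(text_chunks):
    """Yield a minimal Anthropic Messages streaming sequence for a single
    text content block. Illustrative sketch only."""
    yield sse_event("message_start",
                    {"type": "message_start",
                     "message": {"role": "assistant", "content": []}})
    yield sse_event("content_block_start",
                    {"type": "content_block_start", "index": 0,
                     "content_block": {"type": "text", "text": ""}})
    for chunk in text_chunks:
        yield sse_event("content_block_delta",
                        {"type": "content_block_delta", "index": 0,
                         "delta": {"type": "text_delta", "text": chunk}})
    yield sse_event("content_block_stop",
                    {"type": "content_block_stop", "index": 0})
    yield sse_event("message_stop", {"type": "message_stop"})

for ev in stream_anthropic(["Hel", "lo"]):
    print(ev, end="")
```

The `message_start` / `content_block_delta` / `message_stop` framing is what distinguishes Anthropic's SSE protocol from OpenAI's single `chat.completion.chunk` event type, which is why a dedicated stream serializer is needed.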
New PR to allow maintainers to edit.
The RISCV test is getting an error. I'm guessing it's not related to the PR? Any way to retry?
This PR can be merged once the server CI passes. The other CI jobs are not important.
I stumbled across this because it hit conflicts with my PR. I'm curious: what models does this work with? With sufficient hardware, is this capable of beating Claude's cloud models?
Technically it works with pretty much any model, but to get anywhere near Claude Sonnet you'd probably need a large, agentic model like MiniMax M2, Kimi K2, Qwen3 Coder 480B-A35B, etc. That said, I've had decent results for simple tasks with Qwen3 Coder 30B-A3B and gpt-oss-20b on a single 4090. In my very subjective experience, the same models tend to perform a lot better with the Claude Code CLI than with alternatives such as Open Code or gemini-cli and its clones, like Qwen3-Coder (the CLI app).
Interesting... If you want to take a quick peek, I fixed the conflicts here: they weren't major; it was just a matter of moving code from one place to another.
Summary
This PR adds Anthropic Messages API compatibility to llama-server. The implementation converts Anthropic's request format into the OpenAI-compatible internal format, reusing the existing inference pipeline.
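To give a sense of what this conversion involves, here is a minimal Python sketch of mapping an Anthropic Messages request onto an OpenAI-style chat completions body. The function name and the narrow field coverage are hypothetical; the PR's actual C++ implementation handles far more (tools, images, streaming, etc.):

```python
import json

def anthropic_to_openai(req: dict) -> dict:
    """Sketch: convert an Anthropic Messages API request body into an
    OpenAI-style /v1/chat/completions body. Covers only text content."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    for msg in req.get("messages", []):
        content = msg["content"]
        # Anthropic content may be a list of typed blocks; flatten the
        # plain text blocks into a single string.
        if isinstance(content, list):
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    out = {"model": req.get("model"), "messages": messages}
    # Pass through a few sampling parameters that exist in both APIs.
    for key in ("max_tokens", "temperature", "top_p"):
        if key in req:
            out[key] = req[key]
    return out

anthropic_req = {
    "model": "qwen3-coder",
    "max_tokens": 128,
    "system": "You are concise.",
    "messages": [{"role": "user",
                  "content": [{"type": "text", "text": "Hello"}]}],
}
print(json.dumps(anthropic_to_openai(anthropic_req), indent=2))
```

The key structural differences are the top-level `system` field and the list-of-blocks content representation; once those are normalized, the request can flow through the existing OpenAI-compatible pipeline unchanged.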
Motivation
Features Implemented
Endpoints:
Functionality:
Testing