Description
Name and Version
```
root@llm:/app# /app/llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
load_backend: loaded CUDA backend from /app/libggml-cuda.so
load_backend: loaded CPU backend from /app/libggml-cpu-icelake.so
version: 6730 (e60f01d)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
```
Operating systems
Linux
GGML backends
CUDA
Hardware
Nvidia 5090, Nvidia 4060
Models
openai/gpt-oss-20b, mistralai/magistral-small-2509
Problem description & steps to reproduce
I originally created an issue at modelcontextprotocol/typescript-sdk, but @ochafik figured it's more of a llama.cpp thing.
Here's the original comment:
I was using zod 4 with "@modelcontextprotocol/sdk": "npm:@socotra/modelcontextprotocol-sdk", and everything was fine during development while I served my model with LMStudio.
But as soon as I tried serving the model with the llama.cpp server, integer parameters were truncated to a single digit. I tried different models with the same result.
```js
server.registerTool('test',
  {
    title: 'Get test record by ID',
    description: 'Get test record by ID',
    inputSchema: { testId: z.number().int().positive().describe('The ID of the test record') }
  },
  ({ testId }) => {
    return {
      content: [{ type: 'text', text: 'You requested test record ID: ' + testId }]
    }
  }
)
```
Leads to:
```
{
  type: 'reasoning',
  text: "The user wants to call the function test with testId: 654321. We'll need to use the function."
}
{
  type: 'tool-call',
  toolCallId: 'ADR3LRcFhzirsDqSvlfKPxlYmYLh3fKB',
  toolName: 'test',
  input: { testId: 6 },
  providerExecuted: undefined,
  providerMetadata: undefined,
  dynamic: true
}
```
Again: same code, same model, but served with LMStudio there are no issues; the tool gets called with the proper argument 654321.
If I declare 'testId' as a string, both LMStudio and llama.cpp work fine.
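To take the MCP SDK out of the equation, the same tool definition can also be sent straight to llama-server's OpenAI-compatible /v1/chat/completions endpoint. A minimal sketch follows; the inline "parameters" schema (in particular the exclusiveMinimum bound) is only my guess at what the zod 4 stack emits, so substitute whatever your setup actually produces:

```js
// Hypothetical direct reproduction against llama-server, bypassing the MCP SDK.
// The "parameters" schema below is an assumption of what the zod 4 + SDK path sends.
const res = await fetch('http://localhost:8080/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Get test record 654321' }],
    tools: [{
      type: 'function',
      function: {
        name: 'test',
        description: 'Get test record by ID',
        parameters: {
          type: 'object',
          properties: {
            testId: {
              type: 'integer',
              exclusiveMinimum: 0, // assumption: how .positive() might be serialized
              description: 'The ID of the test record'
            }
          },
          required: ['testId']
        }
      }
    }]
  })
});
const data = await res.json();
// If the bug is in llama.cpp's schema handling, the truncated testId should show up here too.
console.log(JSON.stringify(data.choices?.[0]?.message?.tool_calls, null, 2));
```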
When I switch to "zod": "3" and "@modelcontextprotocol/sdk": "^1.20.0", the llama.cpp server works fine.
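In case it's useful, here's a quick way to compare what JSON Schema each stack actually hands to llama.cpp for the same zod shape. This is just a sketch: z.toJSONSchema is zod 4's built-in converter, and zodToJsonSchema comes from the zod-to-json-schema package that I believe the zod 3 SDK path uses; run each half against its own zod install.

```js
// With zod@4: built-in JSON Schema conversion
import { z } from 'zod';
const shape = z.object({ testId: z.number().int().positive().describe('The ID of the test record') });
console.log(JSON.stringify(z.toJSONSchema(shape), null, 2));

// With zod@3 (separate install): conversion via the zod-to-json-schema package
// import { z } from 'zod';
// import { zodToJsonSchema } from 'zod-to-json-schema';
// const shape = z.object({ testId: z.number().int().positive().describe('The ID of the test record') });
// console.log(JSON.stringify(zodToJsonSchema(shape), null, 2));
```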
Here's the simplest code to reproduce the issue:
package.json
test_llm.js
test_mcp.js
Hope this helps!
First Bad Commit
No response
Relevant log output
```
{
  type: 'reasoning',
  text: "The user wants to call the function test with testId: 654321. We'll need to use the function."
}
{
  type: 'tool-call',
  toolCallId: 'ADR3LRcFhzirsDqSvlfKPxlYmYLh3fKB',
  toolName: 'test',
  input: { testId: 6 },
  providerExecuted: undefined,
  providerMetadata: undefined,
  dynamic: true
}
```