Issues with truncation #1365
I use … The problem I have is truncation and invalid JSON being returned. For example: …

Is there any way I can debug any of this to see if there are any settings I need to change to optimise my calling of the API with …?

My command is aliased so when I call …

#!/usr/bin/env bash
OLLAMA_BASE_URL=https://openrouter.ai/api/v1 \
OLLAMA_API_KEY="${OPENROUTER_API_KEY}" \
OLLAMA_MODEL="${GRAPHIFY_OR_MODEL:-deepseek/deepseek-v4-flash}" \
exec /Users/jorpo/.pyenv/shims/graphify "$@"

Any suggestions would be greatly appreciated :)
Replies: 1 comment 1 reply
Thanks for the detailed report - the logs made this easy to pin down.

What's happening

The truncation is on the output side, and graphify is actually recovering from it: when a chunk's JSON comes back truncated/unparseable, it bisects the chunk (splitting into halves of 2 and 2) and re-extracts the smaller halves. So those warnings are noisy but not data loss - the affected files get re-extracted on smaller inputs.

But there was a real bug underneath it. The OpenAI-compatible backends (ollama, openai, deepseek, kimi) define their output cap as max_tokens: 16384 in the backend config, but the request dispatch only read a max_completion_tokens key - which only the gemini config defines. So …

Fixed

Just pushed …

What you can do right now (before the release)

One caveat

Hope that unblocks you - let us know how it goes!
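
For anyone trying to picture the bisect-and-retry recovery described in the reply, here is a minimal sketch of the general idea. It is not graphify's actual code: `extract_with_bisect`, `fake_model`, and the four-file chunk are hypothetical names made up for illustration, and the "model" simply cuts its own output short to simulate an output-token cap.

```python
import json
from typing import Callable


def extract_with_bisect(
    files: list[str],
    call_model: Callable[[list[str]], str],
) -> list[dict]:
    """Extract one chunk; if the JSON is unparseable, halve the chunk and retry."""
    raw = call_model(files)
    try:
        return [json.loads(raw)]
    except json.JSONDecodeError:
        if len(files) <= 1:
            # A single file cannot be split further, so give up on it here.
            return []
        mid = len(files) // 2
        # Re-extract each half on its own; smaller inputs need less output.
        return (
            extract_with_bisect(files[:mid], call_model)
            + extract_with_bisect(files[mid:], call_model)
        )


def fake_model(files: list[str]) -> str:
    """Hypothetical stand-in for an LLM call with a tight output cap."""
    payload = json.dumps({"files": files})
    # Simulate truncation: chunks of more than 2 files get cut off mid-JSON.
    return payload if len(files) <= 2 else payload[: len(payload) // 2]


if __name__ == "__main__":
    chunk = ["a.py", "b.py", "c.py", "d.py"]
    for result in extract_with_bisect(chunk, fake_model):
        print(result)
```

With a chunk of four files, the first call fails to parse, the chunk is split into halves of 2 and 2, and both halves succeed. That is why the warnings in this kind of scheme indicate extra requests rather than lost files.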