fix(providers): parse self-hosted context metadata #1227
Merged
Aaronontheweb merged 2 commits on May 29, 2026
Original Problem
OpenAI-compatible self-hosted providers could successfully list models, but Netclaw did not surface the model context window during discovery. In the CLI this showed up as a missing context window, and the runtime could fall back to the default 32K context value even when the backend reported a much larger effective context.
The issue was caused by metadata shape differences between self-hosted backends:
- `max_model_len` on `/v1/models` entries.
- `meta.n_ctx` on `/v1/models` entries.
- `/props` responses can report runtime context as `default_generation_settings.n_ctx` instead of the older `default_generation_settings.params.n_ctx` shape.

Netclaw already had separate backend strategies for vLLM and llama.cpp, but the llama.cpp strategy only understood the older `n_ctx_train` and nested `/props` shape. `n_ctx_train` is the training context, not necessarily the runtime-effective context when the server splits capacity across slots.

What Changed
- `max_model_len`.
- `n_ctx` and fall back to `n_ctx_train` only when runtime context is unavailable.
- `/props` parsing supports both current top-level `default_generation_settings.n_ctx` and the older nested `params.n_ctx` shape.

Tests
- `dotnet test "src/Netclaw.Daemon.Tests/Netclaw.Daemon.Tests.csproj" --filter "FullyQualifiedName~OpenAiCompatibleCapabilityResolverTests|FullyQualifiedName~LlamaCppBackendStrategyTests"`
- `dotnet test "src/Netclaw.Configuration.Tests/Netclaw.Configuration.Tests.csproj" --filter "FullyQualifiedName~ProbeHelpersTests"`
- `pwsh "./scripts/Add-FileHeaders.ps1" -Verify`
- `git diff --check`
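
For reviewers who want the context-window precedence from this description in one place, here is a rough, language-neutral sketch in Python. It is not Netclaw's C# code. The function names, the single-function layout (Netclaw uses separate vLLM and llama.cpp strategies), the location of `n_ctx_train` under `meta`, the ordering of `meta.n_ctx` ahead of `/props`, and the exact value of the 32K default are all assumptions made for illustration.

```python
from typing import Any, Optional

# Assumed value for the "default 32K context" mentioned in the description.
DEFAULT_CONTEXT = 32 * 1024


def _positive_int(value: Any) -> Optional[int]:
    """Return value if it is a positive int (bools rejected), else None."""
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return None


def context_from_props(props: Optional[dict]) -> Optional[int]:
    """Read runtime context from a /props payload, current shape first."""
    settings = (props or {}).get("default_generation_settings")
    if not isinstance(settings, dict):
        return None
    # Current shape: default_generation_settings.n_ctx
    top_level = _positive_int(settings.get("n_ctx"))
    if top_level is not None:
        return top_level
    # Older shape: default_generation_settings.params.n_ctx
    params = settings.get("params")
    if isinstance(params, dict):
        return _positive_int(params.get("n_ctx"))
    return None


def context_from_model_entry(entry: dict, props: Optional[dict] = None) -> int:
    """Resolve a context window for one /v1/models entry (hypothetical helper)."""
    # vLLM-style entries carry max_model_len directly.
    max_model_len = _positive_int(entry.get("max_model_len"))
    if max_model_len is not None:
        return max_model_len

    meta = entry.get("meta")
    if not isinstance(meta, dict):
        meta = {}

    # Prefer runtime context; the meta-before-/props order is an assumption.
    runtime = _positive_int(meta.get("n_ctx")) or context_from_props(props)
    if runtime is not None:
        return runtime

    # Training context is only a fallback when no runtime value is available.
    trained = _positive_int(meta.get("n_ctx_train"))
    return trained if trained is not None else DEFAULT_CONTEXT
```

With this sketch, an entry that reports both `meta.n_ctx` and `n_ctx_train` resolves to the runtime value, and an entry with no usable metadata resolves to the default rather than failing.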