Add extra_body to custom providers + multi-batch label_communities for 16k-context models #1197
Merged
safishamsi merged 1 commit on Jun 8, 2026
Two related fixes that together unlock self-hosted reasoning models
(Qwen3, Llama 3.1 8B-Instruct, etc.) for the `graphify label` workflow:
1. `cfg.get("extra_body")` from `providers.json` is now propagated to
the OpenAI-compatible client at both the extraction code path
(`_call_openai_compat` via `extract_files_direct`) and the labeling
code path (`_call_llm`). Without this, a custom provider pointing at
a vLLM endpoint serving Qwen3 has no way to set
`chat_template_kwargs.enable_thinking=false`, so the model emits a
chain-of-thought preamble instead of the JSON the parser expects and
the whole call fails. An explicit `extra_body` also bypasses the
ollama `num_ctx` auto-derive, since a provider that opts in knows its
own request shape.
2. `label_communities` now batches communities in chunks of 100
(configurable via `batch_size=`) instead of a single call hard-capped
at 200. The 200-cap × 12 sampled node labels routinely overflowed
the 16k context window of self-hosted reasoning models, dropping the
entire pass to placeholders even on small graphs. Default
`max_communities` is now `None` (label every community); explicit
integer caps still work for back-compat. Partial batch failures no
longer kill the whole pass — successful batches still contribute
real labels, only the failed batch's cids stay as placeholders.
Tested against 5 local repos (200–525 communities each) on vLLM
serving Qwen3.6-27B INT4-AutoRound on a 24 GB RTX 3090. Coverage went
from 0–44% (when the call returned at all) to 99.8% after these
changes.
Summary
Two related fixes that together enable self-hosted reasoning models (Qwen3, Llama 3.1 8B-Instruct, etc.) to drive `graphify label`.

1. `cfg.get("extra_body")` from `providers.json` reaches the OpenAI-compatible client at both the extraction (`_call_openai_compat` via `extract_files_direct`) and the labeling (`_call_llm`) code paths. Without this, a custom provider pointing at a vLLM endpoint serving Qwen3 had no way to set `chat_template_kwargs.enable_thinking=false`. The model then emits a chain-of-thought preamble instead of the JSON the parser expects, and the call fails. An explicit `extra_body` also bypasses the ollama `num_ctx` auto-derive: a custom provider that opts in is declaring it owns the request shape.
2. `label_communities` batches communities in chunks of 100 (configurable via `batch_size=`) instead of a single call hard-capped at 200. The old 200-cap × `_LABEL_TOP_K=12` sampled node labels routinely overflowed the 16k context window of self-hosted reasoning models, dropping the entire pass to placeholders even on graphs with only a couple hundred communities. The default `max_communities` is now `None` (label every community); explicit integer caps still work for back-compat. Partial batch failures no longer kill the whole pass: successful batches still contribute real labels, and only the failed batch's cids stay as `Community N`.

Why one PR for two changes: they're separable but neither alone unlocks the self-hosted-Qwen3 workflow that motivated them, and a reviewer who pulls this down to verify will want both in one shot. Squash-merging the two as a single feature also keeps the changelog readable.
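The `extra_body` precedence described in the summary can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_extra_body`, the `auto_derived` argument, and the shape of the derived default are hypothetical names introduced here; only the `extra_body` key and the `chat_template_kwargs.enable_thinking` setting come from the PR text.

```python
from typing import Any, Optional


def resolve_extra_body(
    cfg: dict[str, Any],
    auto_derived: Optional[dict[str, Any]] = None,
) -> Optional[dict[str, Any]]:
    """Pick the extra_body for a request: explicit config wins over any auto-derived default.

    Hypothetical helper; the real code paths are _call_openai_compat and _call_llm.
    """
    explicit = cfg.get("extra_body")
    if explicit is not None:
        # Copy so that later mutation of the request does not leak into the config.
        return dict(explicit)
    # No explicit opt-in: fall back to whatever the backend derived on its own.
    return auto_derived


# Hypothetical provider entry; only the "extra_body" key is taken from the PR text.
qwen3_cfg = {
    "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
}

# Hypothetical stand-in for an auto-derived default (for example an ollama num_ctx).
derived = {"options": {"num_ctx": 16384}}

explicit_result = resolve_extra_body(qwen3_cfg, derived)
fallback_result = resolve_extra_body({}, derived)
```

Under these assumptions, `explicit_result` carries only the configured `chat_template_kwargs`, and `fallback_result` is the derived default. A dict like this could then be handed to an OpenAI-compatible client through its `extra_body=` request argument.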
Tested
- Locally against 5 repos (200–525 communities each) using vLLM serving Qwen3.6-27B INT4-AutoRound on a 24 GB RTX 3090. Label coverage went from 0–44% (when the call returned at all) to 99.8% after both changes.
- CI suite (`uv run pytest`): added 7 new tests.

`tests/test_labeling.py`:

- `test_label_communities_batches_when_over_batch_size`: 250 communities ÷ batch_size 100 → 3 calls of (100, 100, 50), all real names
- `test_label_communities_partial_batch_failure_keeps_successful_batches`: middle batch raises; surviving batches still produce real names
- `test_label_communities_all_batches_fail_raises`: propagates so `generate_community_labels` can degrade
- `test_label_communities_max_communities_caps_total`: explicit `max_communities=40` still caps total cids sent

`tests/test_llm_backends.py`:

- `test_call_openai_compat_uses_explicit_extra_body`
- `test_call_openai_compat_extra_body_wins_over_moonshot_default`
- `test_call_openai_compat_explicit_extra_body_skips_ollama_auto_derive`

Existing labeling and backend tests are untouched and continue to pass, so back-compat is verified.
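The batching behavior these tests exercise can be sketched as below. This is an illustrative stand-in, assuming a backend callable that maps a batch of community ids to labels; `label_in_batches`, `label_batch`, and `flaky_backend` are hypothetical names, and the real `label_communities` signature may differ.

```python
from typing import Callable, Optional


def label_in_batches(
    cids: list[int],
    label_batch: Callable[[list[int]], dict[int, str]],
    batch_size: int = 100,
    max_communities: Optional[int] = None,
) -> dict[int, str]:
    """Label communities batch by batch; failed batches keep placeholder names."""
    # None means "label every community"; an integer still caps the total sent.
    selected = cids if max_communities is None else cids[:max_communities]
    # Start from placeholders so a failed batch degrades gracefully.
    labels = {cid: f"Community {cid}" for cid in selected}
    batches = [selected[i:i + batch_size] for i in range(0, len(selected), batch_size)]

    failed = 0
    last_error: Optional[Exception] = None
    for batch in batches:
        try:
            labels.update(label_batch(batch))
        except Exception as exc:
            # Remember the failure and keep going with the remaining batches.
            failed += 1
            last_error = exc

    # Re-raise only when no batch succeeded, so callers can still detect a dead backend.
    if last_error is not None and failed == len(batches):
        raise last_error
    return labels


calls: list[int] = []


def flaky_backend(batch: list[int]) -> dict[int, str]:
    """Fake backend (assumption for the demo): the second call fails."""
    calls.append(len(batch))
    if len(calls) == 2:
        raise RuntimeError("simulated backend failure")
    return {cid: f"label-{cid}" for cid in batch}


result = label_in_batches(list(range(250)), flaky_backend)
```

With 250 ids and the default batch size, the fake backend is called with batches of 100, 100 and 50; ids 100–199 keep their `Community N` placeholders because the middle batch failed, while the other two batches contribute real labels.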
Notes for reviewers
- `_LABEL_MAX_COMMUNITIES = 200` is kept as a module-level constant (now reframed as a legacy soft-cap for callers that pin it explicitly). The default behavior change is to label every community across N batches rather than truncating at 200 in a single call.
- `max_communities` flipped from `200` → `None`. Callers passing nothing previously got the first 200 communities labeled in one shot; they now get all communities labeled across however many batches it takes. The old behavior is recoverable by passing `max_communities=200` explicitly.
- A failed batch emits a warning (in the `generate_community_labels` warning style) and the loop continues. Only if every batch fails do we re-raise, which keeps the "raises on backend failure" contract intact for the no-labels-written case.

Test plan
- `uv run pytest tests/test_labeling.py tests/test_llm_backends.py`: 48 pass
- `uv run ruff check` on touched files: clean
- `uv run pytest`: all green except a pre-existing `test_cpp_preprocess_passes_absolute_path` failure on Windows (the test asserts the path starts with `/`, fails for `C:\...`; unrelated to this PR, same failure on stock `upstream/v8`)