Add extra_body to custom providers + multi-batch label_communities for 16k-context models #1197

Merged
safishamsi merged 1 commit into Graphify-Labs:v8 from EirikWolf:feat/extra-body-and-multi-batch-label on Jun 8, 2026

Conversation

@EirikWolf

Contributor

Summary

Two related fixes that together enable self-hosted reasoning models (Qwen3, Llama 3.1 8B-Instruct, etc.) to drive graphify label.

  1. cfg.get("extra_body") from providers.json now reaches the OpenAI-compatible client at both the extraction (_call_openai_compat via extract_files_direct) and the labeling (_call_llm) code paths. Without this, a custom provider pointing at a vLLM endpoint serving Qwen3 had no way to set chat_template_kwargs.enable_thinking=false. The model then emits a chain-of-thought preamble instead of the JSON the parser expects, and the call fails. An explicit extra_body also bypasses the ollama num_ctx auto-derive: a custom provider that opts in is declaring it owns the request shape.

  2. label_communities batches communities in chunks of 100 (configurable via batch_size=) instead of a single call hard-capped at 200. The old 200-cap × _LABEL_TOP_K=12 sampled node labels routinely overflowed the 16k context window of self-hosted reasoning models, dropping the entire pass to placeholders even on graphs with only a couple hundred communities. The default max_communities is now None (label every community); explicit integer caps still work for back-compat. Partial batch failures no longer kill the whole pass — successful batches still contribute real labels, only the failed batch's cids stay as Community N.
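
The precedence rule in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolve_extra_body`, its parameters, and the `options.num_ctx` shape used for the auto-derived value are all assumptions made for the example.

```python
from typing import Any, Optional


def resolve_extra_body(
    cfg: dict[str, Any],
    provider_default: Optional[dict[str, Any]] = None,
    derived_num_ctx: Optional[int] = None,
) -> Optional[dict[str, Any]]:
    """Pick the extra_body for an OpenAI-compatible request.

    Hypothetical helper: names and shapes are illustrative only.
    """
    explicit = cfg.get("extra_body")
    if explicit is not None:
        # The provider opted in, so it owns the request shape:
        # neither the provider default nor a derived num_ctx is merged in.
        return dict(explicit)

    body = dict(provider_default or {})
    if derived_num_ctx is not None:
        # Assumed shape for the auto-derived context size.
        body["options"] = {**body.get("options", {}), "num_ctx": derived_num_ctx}
    return body or None


# A providers.json entry for a vLLM endpoint serving Qwen3 might carry this:
qwen3_cfg = {"extra_body": {"chat_template_kwargs": {"enable_thinking": False}}}

explicit_result = resolve_extra_body(qwen3_cfg, {"some_default": True}, 8192)
derived_result = resolve_extra_body({}, None, 8192)
```

With the OpenAI Python client, the resolved dict would typically be passed as the `extra_body=` argument of `client.chat.completions.create(...)`, which forwards extra JSON fields to the server unchanged.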

Why one PR for two changes: they're separable but neither alone unlocks the self-hosted-Qwen3 workflow that motivated them, and a reviewer who pulls this down to verify will want both in one shot. Squash-merging the two as a single feature also keeps the changelog readable.

Tested

Locally against 5 repos (200–525 communities each) using vLLM serving Qwen3.6-27B INT4-AutoRound on a 24 GB RTX 3090. Label coverage went from 0–44% (when the call returned at all) to 99.8% after both changes.

CI suite (uv run pytest) — added 7 new tests:

tests/test_labeling.py:

  • test_label_communities_batches_when_over_batch_size — 250 communities ÷ batch_size 100 → 3 calls of (100, 100, 50), all real names
  • test_label_communities_partial_batch_failure_keeps_successful_batches — middle batch raises; surviving batches still produce real names
  • test_label_communities_all_batches_fail_raises — propagates so generate_community_labels can degrade
  • test_label_communities_max_communities_caps_total — explicit max_communities=40 still caps total cids sent
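
For readers who want the batching arithmetic spelled out, here is a minimal sketch of the chunking these tests exercise. `plan_batches` is a hypothetical name; the real `label_communities` may structure this differently, and truncating to the first `max_communities` ids is an assumption based on the old "first 200" behavior.

```python
from typing import Optional


def plan_batches(
    cids: list[int],
    batch_size: int = 100,
    max_communities: Optional[int] = None,
) -> list[list[int]]:
    """Split community ids into consecutive chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    if max_communities is not None:
        # Explicit cap: keep only the first N community ids.
        cids = cids[:max_communities]
    return [cids[i:i + batch_size] for i in range(0, len(cids), batch_size)]


sizes = [len(batch) for batch in plan_batches(list(range(250)))]
print(sizes)  # [100, 100, 50]

capped = plan_batches(list(range(250)), max_communities=40)
print(sum(len(batch) for batch in capped))  # 40
```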

tests/test_llm_backends.py:

  • test_call_openai_compat_uses_explicit_extra_body
  • test_call_openai_compat_extra_body_wins_over_moonshot_default
  • test_call_openai_compat_explicit_extra_body_skips_ollama_auto_derive

Existing labeling and backend tests are untouched and continue to pass — back-compat verified.

Notes for reviewers

  • _LABEL_MAX_COMMUNITIES = 200 is kept as a module-level constant (now reframed as a legacy soft-cap for callers that pin it explicitly). The default behavior change is to label every community across N batches rather than truncating at 200 in a single call.
  • Default max_communities flipped from 200 to None. Callers passing nothing previously got the first 200 communities labeled in one shot; they now get all communities labeled across however many batches it takes. The old behavior is recoverable by passing max_communities=200 explicitly.
  • Partial-batch error policy: errors print to stderr (matches existing generate_community_labels warning style) and the loop continues. Only if every batch fails do we re-raise — keeps the "raises on backend failure" contract intact for the no-labels-written case.
  • No new dependencies, no schema changes, no skill regeneration needed.
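
The partial-batch error policy described above could look something like this. It is a sketch under assumptions, not the merged implementation: `label_in_batches` and the `label_batch` callback are hypothetical names, and the warning text is invented for the example.

```python
import sys
from typing import Callable, Optional


def label_in_batches(
    batches: list[list[int]],
    label_batch: Callable[[list[int]], dict[int, str]],
) -> dict[int, str]:
    """Label each batch independently; re-raise only if every batch fails."""
    # Every cid starts out with its placeholder name.
    labels = {cid: f"Community {cid}" for batch in batches for cid in batch}
    failures = 0
    last_error: Optional[Exception] = None

    for index, batch in enumerate(batches):
        try:
            labels.update(label_batch(batch))
        except Exception as exc:
            # Warn and continue so other batches can still contribute.
            failures += 1
            last_error = exc
            print(f"warning: label batch {index} failed: {exc}", file=sys.stderr)

    if last_error is not None and failures == len(batches):
        # Nothing was labeled: keep the "raises on backend failure" contract.
        raise last_error
    return labels


def flaky_backend(batch: list[int]) -> dict[int, str]:
    """Stand-in for the LLM call: fails for the middle batch only."""
    if batch[0] == 2:
        raise RuntimeError("backend timeout")
    return {cid: f"topic-{cid}" for cid in batch}


result = label_in_batches([[0, 1], [2, 3], [4, 5]], flaky_backend)
```

Here the middle batch keeps `Community 2` and `Community 3` while the other four ids get real names, which mirrors what test_label_communities_partial_batch_failure_keeps_successful_batches checks.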

Test plan

  • uv run pytest tests/test_labeling.py tests/test_llm_backends.py — 48 pass
  • uv run ruff check on touched files — clean
  • Full uv run pytest — all green except a pre-existing test_cpp_preprocess_passes_absolute_path failure on Windows (the test asserts the path starts with /, fails for C:\...; unrelated to this PR — same failure on stock upstream/v8)
  • End-to-end against vLLM Qwen3.6-27B: 1 694 / 1 697 = 99.8% real LLM-generated names across 5 repos

Two related fixes that together unlock self-hosted reasoning models
(Qwen3, Llama 3.1 8B-Instruct, etc.) for the `graphify label` workflow:

1. `cfg.get("extra_body")` from `providers.json` is now propagated to
   the OpenAI-compatible client at both the extraction code path
   (`_call_openai_compat` via `extract_files_direct`) and the labeling
   code path (`_call_llm`). Without this, a custom provider pointing at
   a vLLM endpoint serving Qwen3 has no way to set
   `chat_template_kwargs.enable_thinking=false`, so the model emits a
   chain-of-thought preamble instead of the JSON the parser expects and
   the whole call rejects. An explicit `extra_body` also bypasses the
   ollama `num_ctx` auto-derive — a provider that opts in knows its own
   request shape.

2. `label_communities` now batches communities in chunks of 100
   (configurable via `batch_size=`) instead of a single call hard-capped
   at 200. The 200-cap × 12 sampled node labels routinely overflowed
   the 16k context window of self-hosted reasoning models, dropping the
   entire pass to placeholders even on small graphs. Default
   `max_communities` is now `None` (label every community); explicit
   integer caps still work for back-compat. Partial batch failures no
   longer kill the whole pass — successful batches still contribute
   real labels, only the failed batch's cids stay as placeholders.

Tested against 5 local repos (200–525 communities each) on vLLM
serving Qwen3.6-27B INT4-AutoRound on a 24 GB RTX 3090. Coverage went
from 0–44% (when the call returned at all) to 99.8% after these
changes.

safishamsi merged commit 7477b46 into Graphify-Labs:v8 on Jun 8, 2026