refactor(qwen3.5): hard-code enable_thinking default per model by hallerite · Pull Request #71 · PrimeIntellect-ai/renderers

hallerite · 2026-05-27T17:30:44Z

Why

Qwen35Renderer resolved its enable_thinking default by probing the tokenizer's chat template at construction:

if cfg.enable_thinking is None:
    cfg = cfg.model_copy(update={"enable_thinking": _detect_enable_thinking_default(tokenizer)})
# _detect_enable_thinking_default → tokenizer.apply_chat_template(...)

Since Qwen35RendererConfig.enable_thinking defaults to None, this fired on a normal Qwen35Renderer(tok), calling apply_chat_template on the hot path. That pulls transformers into construction, which is exactly the dependency we're trying to shed (issue #31), and it breaks bring-your-own-tokenizer use (a raw tokenizers.Tokenizer has no apply_chat_template).

What changed

Replaced the probe with a hard-coded table keyed by model name, enumerating every checkpoint routed to the qwen3.5 / qwen3.6 renderer:

_ENABLE_THINKING_DEFAULTS = {
    "Qwen/Qwen3.5-0.8B": False,   # small sizes flip polarity → thinking off
    "Qwen/Qwen3.5-2B":   False,
    "Qwen/Qwen3.5-4B":   True,    # big sizes default thinking on
    "Qwen/Qwen3.5-9B":   True,
    "Qwen/Qwen3.5-35B-A3B":   True,
    "Qwen/Qwen3.5-122B-A10B": True,
    "Qwen/Qwen3.5-397B-A17B": True,
    "Qwen/Qwen3.6-35B-A3B":   True,
}

Unknown / fine-tuned checkpoints fall back to True (the big-model default, matching the old probe's failure fallback); pass an explicit enable_thinking= for a small-size fine-tune that needs False.
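As a rough illustration of the resolution order described above (explicit setting first, then the table, then the True fallback), a minimal sketch might look like the following. The function name `resolve_enable_thinking` and its signature are hypothetical and are not taken from the PR's code; only a subset of the table is reproduced.

```python
from typing import Optional

# Subset of the table from this PR; values copied from the description above.
_ENABLE_THINKING_DEFAULTS = {
    "Qwen/Qwen3.5-0.8B": False,
    "Qwen/Qwen3.5-2B": False,
    "Qwen/Qwen3.5-4B": True,
    "Qwen/Qwen3.6-35B-A3B": True,
}


def resolve_enable_thinking(name_or_path: str, explicit: Optional[bool] = None) -> bool:
    """Hypothetical helper: an explicit setting wins, then the table, then True."""
    if explicit is not None:
        return explicit
    # Unknown or fine-tuned checkpoints are absent from the table and get True.
    return _ENABLE_THINKING_DEFAULTS.get(name_or_path, True)
```

Under this sketch, a fine-tune of a small checkpoint saved under a different name would resolve to True unless the caller passes `explicit=False`, which is the case the fallback note above warns about.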

Validation

The values are exactly what the probe returned — already pinned by tests/test_qwen35_size_coverage.py::test_qwen35_enable_thinking_polarity_default and the byte-parity barrage against each size's own apply_chat_template. Full size-coverage suite: 37 passed (all 7 sizes, with/without gen prompt).
New guard test test_construction_does_not_call_apply_chat_template: builds a Qwen35Renderer with a stub tokenizer whose apply_chat_template raises, and asserts construction succeeds + resolves the right default.
ruff + ty clean.
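The guard test described above could be sketched roughly as follows. This is not the test from the PR: `SketchRenderer` is a hypothetical stand-in for Qwen35Renderer so the example runs on its own, and `StubTokenizer` is an assumed shape for the stub.

```python
class StubTokenizer:
    """Stand-in tokenizer: carries a name but refuses chat-template calls."""

    name_or_path = "Qwen/Qwen3.5-2B"

    def apply_chat_template(self, *args, **kwargs):
        raise AssertionError("apply_chat_template must not be called at construction")


class SketchRenderer:
    """Hypothetical stand-in for Qwen35Renderer; it only resolves the default."""

    _DEFAULTS = {"Qwen/Qwen3.5-2B": False, "Qwen/Qwen3.5-9B": True}

    def __init__(self, tokenizer, enable_thinking=None):
        if enable_thinking is None:
            # Look the default up by name instead of probing the chat template.
            name = getattr(tokenizer, "name_or_path", "")
            enable_thinking = self._DEFAULTS.get(name, True)
        self.enable_thinking = enable_thinking


def test_construction_does_not_call_apply_chat_template():
    # Construction would raise if the renderer touched apply_chat_template.
    renderer = SketchRenderer(StubTokenizer())
    assert renderer.enable_thinking is False
```

A stub that raises, rather than one that simply lacks the method, makes the failure message point directly at the unwanted call.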

🤖 Generated with Claude Code

Note

Low Risk
Behavior is unchanged for mapped models per the existing parity tests; risk is limited to unknown fine-tunes that relied on the probe's result rather than the table fallback (the fallback is still True).

Overview
Qwen35Renderer no longer calls apply_chat_template at construction to infer enable_thinking. When config leaves it None, defaults now come from _ENABLE_THINKING_DEFAULTS keyed by tokenizer.name_or_path (0.8B/2B → False, larger Qwen3.5 sizes and Qwen/Qwen3.6-35B-A3B → True, unknown checkpoints → True).

Docs and tests/test_qwen35_size_coverage.py were updated to describe hard-coded polarity instead of auto-detection, and test_construction_does_not_call_apply_chat_template asserts a stub tokenizer without chat-template support can still be constructed.

^{Reviewed by Cursor Bugbot for commit 45f02d6.}

Note

Replace dynamic tokenizer probing with a static lookup for `Qwen35Renderer` `enable_thinking` defaults

Removes _detect_enable_thinking_default, which called tokenizer.apply_chat_template to infer the enable_thinking default at construction time.
Adds _ENABLE_THINKING_DEFAULTS, a module-level dict in qwen35.py mapping known Qwen3.5/3.6 model names to their correct enable_thinking polarity.
The new _default_enable_thinking helper looks up tokenizer.name_or_path in this table, falling back to True for unknown or fine-tuned checkpoints.
A new test verifies that Qwen35Renderer construction no longer calls apply_chat_template at all.

^{Macroscope summarized 45f02d6.}

`Qwen35Renderer` previously probed the tokenizer's chat template at construction (`apply_chat_template`) to learn each checkpoint's `enable_thinking` polarity. Because the config default is `None`, that probe ran on a plain `Qwen35Renderer(tok)`, pulling `transformers` onto the hot path and breaking bring-your-own-tokenizer use (a raw `tokenizers.Tokenizer` has no `apply_chat_template`).

Replace it with a hard-coded `_ENABLE_THINKING_DEFAULTS` table keyed by model name, covering every checkpoint routed to the `qwen3.5` / `qwen3.6` renderer (small 0.8B/2B → False, the rest → True). Unknown / fine-tuned checkpoints fall back to `True` (the big-model default); pass an explicit `enable_thinking=` to override.

Values are the same ones the probe returned, pinned by the existing polarity + byte-parity tests in `tests/test_qwen35_size_coverage.py`. Adds a guard test asserting construction never calls `apply_chat_template`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

macroscopeapp · 2026-05-27T17:37:22Z

Approvability

Verdict: Approved

This refactor replaces dynamic auto-detection of enable_thinking defaults with a hard-coded lookup table, preserving identical behavior for all known Qwen3.5 models while avoiding a transformers dependency at construction time. The change is well-tested and maintains byte-parity with the previous implementation.


…e from Tokenizer

Brings in #68 (examples), #69 (harmony floor), #71 (qwen3.5 hard-coded enable_thinking). The only qwen35.py conflict is resolved by keeping #71's hard-coded `_ENABLE_THINKING_DEFAULTS` table (no `apply_chat_template` probe) on top of #31's `Tokenizer`/`Processor` type hints.

Now that #71 removed the last hand-coded-renderer call to `apply_chat_template`, drop it from the `Tokenizer` protocol so a plain `tokenizers.Tokenizer` wrapper satisfies it. `apply_chat_template` moves to a new `ChatTemplateTokenizer(Tokenizer, Protocol)` subtype, required only by `DefaultRenderer` (the generic chat-template fallback).
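The protocol split described in this commit message can be pictured with `typing.Protocol`. The sketch below is only an assumption about its shape: the real `Tokenizer` protocol's members are not shown on this page, so `encode` stands in for them, and `PlainTokenizer` / `TemplatedTokenizer` are hypothetical example classes.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Tokenizer(Protocol):
    # Assumed minimal surface; the real protocol's members are not shown here.
    def encode(self, text: str) -> Any: ...


@runtime_checkable
class ChatTemplateTokenizer(Tokenizer, Protocol):
    # Only the generic chat-template fallback would need this extra method.
    def apply_chat_template(self, conversation: Any, **kwargs: Any) -> Any: ...


class PlainTokenizer:
    """Hypothetical wrapper with no chat-template support."""

    def encode(self, text: str) -> list:
        return [ord(ch) for ch in text]


class TemplatedTokenizer(PlainTokenizer):
    """Hypothetical tokenizer that also offers a chat template."""

    def apply_chat_template(self, conversation, **kwargs):
        return "".join(message["content"] for message in conversation)
```

With this shape, `PlainTokenizer` satisfies `Tokenizer` but not `ChatTemplateTokenizer`, so a type checker would accept it for hand-coded renderers and reject it where the chat-template fallback is required.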

macroscopeapp Bot approved these changes May 27, 2026


hallerite merged commit e729baa into main May 27, 2026
11 checks passed

hallerite deleted the qwen35-hardcode-thinking branch May 27, 2026 17:55
