
Move enable_thinking and reasoning_effort under [sampling] #558

Merged
JannikSt merged 1 commit into main from improvement/rl-reasoning-under-sampling on Apr 25, 2026



JannikSt commented Apr 25, 2026

Follow-up to #526. enable_thinking and reasoning_effort belong under [sampling] alongside the other sampling controls, not at the top level.

  • RLConfig.enable_thinking / reasoning_effort → SamplingConfig
  • Mutual-exclusion validator moves with them
  • Run summary shows them in the Sampling section
  • Template + tests updated
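
As a rough illustration of the first two bullets, the sketch below shows what a sampling config carrying both fields and their mutual-exclusion check could look like. It is not the repository's code: it uses a plain dataclass rather than whatever model base class the project uses, the `temperature` field and the field types are assumptions, and the error message is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingConfig:
    # Hypothetical stand-in for an existing sampling control.
    temperature: float = 1.0
    # Reasoning controls that this PR moves here from the top-level config.
    # The types are assumptions; the real fields may be stricter.
    enable_thinking: Optional[bool] = None
    reasoning_effort: Optional[str] = None

    def __post_init__(self) -> None:
        # Mutual exclusion: at most one of the two controls may be set.
        if self.enable_thinking is not None and self.reasoning_effort is not None:
            raise ValueError(
                "enable_thinking and reasoning_effort are mutually exclusive; "
                "set only one of them under [sampling]"
            )
```

Keeping the check on the sampling class itself means it travels with the fields, which is what "moves with them" suggests.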

Note

Medium Risk
Medium risk because it changes the RL TOML config schema: existing configs using top-level enable_thinking/reasoning_effort will now fail validation and must be moved under [sampling]. Runtime behavior is otherwise unchanged aside from where values are read and displayed.
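
For anyone with existing configs, the required change is mechanical. The helper below is a hypothetical migration sketch, not part of this PR: it assumes the TOML file has already been parsed into a plain dict and moves the two top-level keys under the `sampling` table.

```python
import copy

# The two keys this PR relocates under [sampling].
REASONING_KEYS = ("enable_thinking", "reasoning_effort")


def move_reasoning_under_sampling(config: dict) -> dict:
    """Return a copy of a parsed config with top-level reasoning keys moved under 'sampling'."""
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    sampling = migrated.setdefault("sampling", {})
    for key in REASONING_KEYS:
        if key in migrated:
            if key in sampling:
                # Ambiguous input: refuse to guess which value should win.
                raise ValueError(f"{key} is set both at top level and under [sampling]")
            sampling[key] = migrated.pop(key)
    return migrated
```

A config that previously had `reasoning_effort = "high"` at the top level would end up with that key inside its `sampling` table, and all other keys are left as they were.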

Overview
Moves hosted RL reasoning controls (enable_thinking, reasoning_effort) from top-level RLConfig into SamplingConfig under [sampling], including relocating the mutual-exclusion validator.

Updates the generated config template, CLI run summary output (now shown in the Sampling section), and the create_run API payload mapping to read these values from cfg.sampling. Tests are updated to require [sampling] usage and add coverage that top-level reasoning_effort is rejected.
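
The rejection behavior that the new test covers can be pictured with a small check like the one below. This is only a sketch under assumptions: the function name and error text are invented, and the real project most likely gets this behavior from its config model rejecting unknown top-level fields rather than from an explicit check.

```python
# Keys that are no longer accepted at the top level of the RL config.
MOVED_KEYS = {"enable_thinking", "reasoning_effort"}


def reject_top_level_reasoning_keys(config: dict) -> None:
    """Raise if a parsed config still sets a moved key at the top level."""
    misplaced = sorted(MOVED_KEYS.intersection(config))
    if misplaced:
        raise ValueError(
            f"{', '.join(misplaced)} must be set under [sampling], not at top level"
        )
```

A config with the keys nested under `sampling` passes silently, while one with `reasoning_effort` at the top level raises.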

Reviewed by Cursor Bugbot for commit 4cc1cda.

@JannikSt JannikSt merged commit a1500b6 into main Apr 25, 2026
11 of 12 checks passed
@JannikSt JannikSt deleted the improvement/rl-reasoning-under-sampling branch April 25, 2026 03:13