Add top-level enable_thinking and reasoning_effort to RL config by JannikSt · Pull Request #526 · PrimeIntellect-ai/prime

JannikSt · 2026-04-16T22:23:48Z

Exposes the two hosted RL reasoning controls introduced in the backend by platform#1390 as top-level fields on RLConfig, so users don't have to guess the nested sampling.extra_body.chat_template_kwargs path.

- enable_thinking: bool (Qwen3.5 / Nemotron)
- reasoning_effort: "low" | "medium" | "high" (GPT-OSS)
- Mutual exclusion is validated client-side, mirrored from the backend schema
- Both fields are forwarded 1:1 in create_run; the server resolves the model-family + trainer-image-version path
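
The mutual-exclusion rule in the list above can be sketched as follows. This is a minimal illustration built on a standard-library dataclass, not the actual RLConfig from this PR: the class name `ReasoningControls`, the error messages, and the choice to treat any non-None value (including `enable_thinking=False`) as "set" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Values listed in the PR description for reasoning_effort.
ALLOWED_EFFORTS = ("low", "medium", "high")


@dataclass
class ReasoningControls:
    """Hypothetical stand-in for the two new top-level RLConfig fields."""

    enable_thinking: Optional[bool] = None
    reasoning_effort: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented set.
        if (
            self.reasoning_effort is not None
            and self.reasoning_effort not in ALLOWED_EFFORTS
        ):
            raise ValueError(
                f"reasoning_effort must be one of {ALLOWED_EFFORTS}, "
                f"got {self.reasoning_effort!r}"
            )
        # Assumption: "set" means "not None", so the check only fires
        # when both controls are provided at the same time.
        if self.enable_thinking is not None and self.reasoning_effort is not None:
            raise ValueError(
                "enable_thinking and reasoning_effort are mutually exclusive; "
                "set at most one"
            )
```

Setting either field alone passes validation, while `ReasoningControls(enable_thinking=True, reasoning_effort="high")` raises `ValueError`.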

Note

Low Risk
Low risk: adds optional config fields and forwards them to the existing run-creation API, with a small new validation rule that only errors when both options are set.

Overview
Adds two optional, top-level hosted RL reasoning controls to RLConfig—enable_thinking and reasoning_effort—including client-side mutual-exclusion validation and updated rl init template/docs.

Updates RLClient.create_run and the prime rl run flow to display these settings and forward them 1:1 in the /rft/runs payload, and adds tests covering config loading and rejection when both controls are provided.
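
As a rough sketch of what "forward them 1:1" could look like, the helper below copies the two controls into a run payload under the same key names. The function name `build_run_payload`, the `model` key, the model string in the usage note, and the decision to omit unset fields are assumptions for illustration; the real RLClient.create_run may build its /rft/runs payload differently.

```python
from typing import Any, Dict, Optional


def build_run_payload(
    base: Dict[str, Any],
    enable_thinking: Optional[bool] = None,
    reasoning_effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: copy the reasoning controls into the payload as-is."""
    payload = dict(base)  # shallow copy so the caller's dict is not mutated
    # Assumption: unset (None) controls are left out of the payload entirely.
    if enable_thinking is not None:
        payload["enable_thinking"] = enable_thinking
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    return payload
```

For example, `build_run_payload({"model": "some-model"}, reasoning_effort="high")` returns a dict that carries `reasoning_effort` under the same name and contains no `enable_thinking` key.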

Reviewed by Cursor Bugbot for commit 2dec5c6. Bugbot is set up for automated code reviews on this repo.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 43cbb1a.

Add top-level enable_thinking and reasoning_effort to RL config

43cbb1a

cursor Bot reviewed Apr 16, 2026

Comment thread packages/prime/src/prime_cli/commands/rl.py

Show enable_thinking and reasoning_effort in run config summary

2dec5c6

JannikSt merged commit ed3bc8f into main Apr 23, 2026
19 of 20 checks passed

JannikSt deleted the feature/rl-reasoning-controls branch April 23, 2026 18:00

JannikSt mentioned this pull request Apr 25, 2026

Move enable_thinking and reasoning_effort under [sampling] #558

Merged

JannikSt merged 2 commits into main from feature/rl-reasoning-controls
