Fix: Data utils and Training CLI by hahuyhoang411 · Pull Request #54 · PrimeIntellect-ai/verifiers

hahuyhoang411 · 2025-04-03T21:55:46Z

When running training with mmlu, it throws this error because double quotes are nested inside a double-quoted f-string:

    question = f"Question: {x["question"]}\n"
                               ^^^^^^^^
SyntaxError: f-string: unmatched '['
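
For context, here is a minimal sketch of the usual fix for this class of error, not necessarily the exact change made in this PR. Before Python 3.12, an f-string delimited by double quotes cannot reuse double quotes inside a replacement field, so the inner subscript switches to single quotes. The sample row `x` is a hypothetical stand-in for one dataset example.

```python
# Hypothetical stand-in for one dataset row; only the "question" key is used.
x = {"question": "What is 2 + 2?"}

# Single quotes inside the braces avoid clashing with the outer double quotes,
# which keeps the line valid on Python versions older than 3.12.
question = f"Question: {x['question']}\n"

print(repr(question))
```

Pulling the value into a local variable first (`q = x["question"]`) works equally well and sidesteps the quoting question entirely.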

CLAassistant · 2025-04-03T21:55:53Z

All committers have signed the CLA.

hahuyhoang411 · 2025-04-03T22:25:43Z

Updated the example training CLI command to pass --num_processes, which avoids this error:

    [rank4]: RuntimeError: CUDA error: invalid device ordinal
    [rank4]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    [rank4]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
    [rank4]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
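
"invalid device ordinal" generally means a process asked for a GPU index that is not visible to it. The sketch below illustrates why capping the process count helps, assuming the common convention of one training process per GPU with local rank i bound to device i. The helper `check_num_processes` and its arguments are hypothetical and not part of this repository or of any launcher.

```python
def check_num_processes(num_processes: int, visible_gpus: int) -> None:
    """Fail early if any local rank would map to a GPU index that does not exist.

    Assumes one process per GPU, with local rank i using device index i.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    if num_processes > visible_gpus:
        # Ranks at or beyond visible_gpus have no matching device index.
        bad_ranks = list(range(visible_gpus, num_processes))
        raise ValueError(
            f"{num_processes} processes requested but only {visible_gpus} "
            f"GPUs are visible; ranks {bad_ranks} would hit an invalid device ordinal"
        )


# Four processes on four visible GPUs passes silently.
check_num_processes(num_processes=4, visible_gpus=4)
```

With five processes and four visible GPUs, the same call raises a `ValueError` naming rank 4, which mirrors the `[rank4]` prefix in the log above.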

willccbb · 2025-05-15T19:31:46Z

ah sorry, missed this, but has been fixed in recent pushes

* rlm_harness: add summarize_at_tokens, drop rlm_max_turns_in_context

  Plumbs the new rlm auto-compaction knob from kwarg -> env var. Accepts an int for a fixed threshold or a (lo, hi) pair to draw a uniform threshold per rollout/compaction; None leaves RLM_SUMMARIZE_AT_TOKENS unset so rlm disables auto-compaction. Invalid shapes fail at harness-build time instead of deep inside the sandbox.

  rlm_max_turns_in_context and RLM_MAX_TURNS_IN_CONTEXT are gone upstream (rlm now uses token-based compaction instead of turn caps). Default rlm_tools drops "summarize" to match rlm's new tool set.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ruff

* rlm_harness: summarize_at_tokens is int-only, drop (lo, hi) form

  rlm/#54 dropped the randomized-threshold form from _parse_summarize_at_tokens on the engine side, so the engine now rejects "lo,hi" strings in RLM_SUMMARIZE_AT_TOKENS. Mirror that here: the harness kwarg is int | None, the formatter emits only "N", and the docstring no longer promises the tuple form. Fails at harness-build time instead of inside the sandbox if a caller still passes a tuple.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Sami Jaghouar <sami.jaghouar@hotmail.fr>
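
The int-only behavior described in the last commit can be sketched roughly as follows. Only `summarize_at_tokens` and `RLM_SUMMARIZE_AT_TOKENS` come from the commit message; the function names `format_summarize_at_tokens` and `build_env` are hypothetical, and rejecting `bool` is an extra assumption made here, not something the commit states.

```python
from typing import Dict, Optional

ENV_VAR = "RLM_SUMMARIZE_AT_TOKENS"


def format_summarize_at_tokens(value: Optional[int]) -> Optional[str]:
    """Return the env-var string "N" for an int, or None to leave it unset."""
    if value is None:
        return None
    # bool is a subclass of int in Python; rejecting it is an assumption.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            f"summarize_at_tokens must be int or None, got {type(value).__name__}"
        )
    return str(value)


def build_env(summarize_at_tokens: Optional[int] = None) -> Dict[str, str]:
    """Build the sandbox environment; validation happens here, at build time."""
    env: Dict[str, str] = {}
    formatted = format_summarize_at_tokens(summarize_at_tokens)
    if formatted is not None:
        env[ENV_VAR] = formatted
    return env


print(build_env(4096))
print(build_env(None))
```

Passing a `(lo, hi)` tuple raises `TypeError` when the environment is built, which matches the stated goal of failing before anything reaches the sandbox.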

fix double quotes

3b6543f

hahuyhoang411 changed the title from "Fix double quotes" to "Fix: Data utils and Training CLI" on Apr 3, 2025

update cli

8c90784

willccbb closed this May 15, 2025

S1ro1 mentioned this pull request May 21, 2026

Clean routed experts response path #1433

Draft

hahuyhoang411 wants to merge 2 commits into PrimeIntellect-ai:main from hahuyhoang411:fix
