Fix: Data utils and Training CLI #54

Closed
hahuyhoang411 wants to merge 2 commits into PrimeIntellect-ai:main from hahuyhoang411:fix

Conversation

@hahuyhoang411

Running training with MMLU throws this error because double quotes are reused inside a double-quoted f-string, which Python versions before 3.12 reject:

    question = f"Question: {x["question"]}\n"
                               ^^^^^^^^
SyntaxError: f-string: unmatched '['
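
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the usual fix, assuming `x` is a dict-like sample with a `"question"` key. `format_question` is a hypothetical name used for illustration. Switching the inner quotes to single quotes keeps the f-string valid on Python versions before 3.12.

```python
def format_question(x: dict) -> str:
    # Single quotes around the key do not clash with the f-string's
    # outer double quotes, so this parses on Python < 3.12 as well.
    return f"Question: {x['question']}\n"


if __name__ == "__main__":
    # Assumed sample shape; the real dataset rows may carry more fields.
    sample = {"question": "What is 2 + 2?"}
    print(format_question(sample), end="")
```

An alternative that avoids nested quotes altogether is to read `x["question"]` into a local variable first and interpolate that variable.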

@CLAassistant

CLAassistant commented Apr 3, 2025

CLA assistant check
All committers have signed the CLA.

@hahuyhoang411 changed the title from "Fix double quotes" to "Fix: Data utils and Training CLI" on Apr 3, 2025
@hahuyhoang411
Author

Updated the example training CLI to pass --num_processes, which avoids this error:

[rank4]: RuntimeError: CUDA error: invalid device ordinal
[rank4]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank4]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[rank4]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
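
An "invalid device ordinal" error generally means a process asked for a GPU index that does not exist. One plausible reading of the `[rank4]` prefix is that the launcher started more processes than there were GPUs available for training, which an explicit --num_processes would prevent. The sketch below only illustrates that arithmetic: `ranks_without_gpu` is a hypothetical helper, and the counts in the example are assumed, not taken from this PR.

```python
def ranks_without_gpu(num_processes: int, num_gpus: int) -> list[int]:
    """Return the process ranks that would have no matching GPU index.

    Assumes the common one-process-per-GPU mapping where rank i uses
    device i; other launch setups may map ranks differently.
    """
    if num_processes < 0 or num_gpus < 0:
        raise ValueError("counts must be non-negative")
    # Valid device ordinals are 0 .. num_gpus - 1; anything above is invalid.
    return [rank for rank in range(num_processes) if rank >= num_gpus]


if __name__ == "__main__":
    # Assumed numbers for illustration: 8 processes launched, 4 GPUs usable.
    print(ranks_without_gpu(num_processes=8, num_gpus=4))
    # Matching the process count to the GPU count leaves no invalid rank.
    print(ranks_without_gpu(num_processes=4, num_gpus=4))
```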

@willccbb
Member

ah sorry, missed this, but has been fixed in recent pushes

@willccbb closed this on May 15, 2025
samsja added a commit that referenced this pull request Apr 23, 2026
rlm/#54 dropped the randomized-threshold form from
_parse_summarize_at_tokens on the engine side, so the engine now
rejects "lo,hi" strings in RLM_SUMMARIZE_AT_TOKENS. Mirror that here:
the harness kwarg is int | None, the formatter emits only "N", and the
docstring no longer promises the tuple form.

Fails at harness-build time instead of inside the sandbox if a caller
still passes a tuple.
samsja added a commit that referenced this pull request Apr 23, 2026
* rlm_harness: add summarize_at_tokens, drop rlm_max_turns_in_context

Plumbs the new rlm auto-compaction knob from kwarg -> env var. Accepts
an int for a fixed threshold or a (lo, hi) pair to draw a uniform
threshold per rollout/compaction; None leaves RLM_SUMMARIZE_AT_TOKENS
unset so rlm disables auto-compaction. Invalid shapes fail at
harness-build time instead of deep inside the sandbox.

rlm_max_turns_in_context and RLM_MAX_TURNS_IN_CONTEXT are gone
upstream (rlm now uses token-based compaction instead of turn caps).
Default rlm_tools drops "summarize" to match rlm's new tool set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ruff

* rlm_harness: summarize_at_tokens is int-only, drop (lo, hi) form

rlm/#54 dropped the randomized-threshold form from
_parse_summarize_at_tokens on the engine side, so the engine now
rejects "lo,hi" strings in RLM_SUMMARIZE_AT_TOKENS. Mirror that here:
the harness kwarg is int | None, the formatter emits only "N", and the
docstring no longer promises the tuple form.

Fails at harness-build time instead of inside the sandbox if a caller
still passes a tuple.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Sami Jaghouar <sami.jaghouar@hotmail.fr>
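
The int-only behaviour described in the referenced commit messages can be sketched roughly as follows. This is not the harness code: `format_summarize_at_tokens` and `build_env` are hypothetical names, and rejecting `bool` is an added assumption. Only the `RLM_SUMMARIZE_AT_TOKENS` variable, the `int | None` kwarg type, the "N" output form, and the early failure on tuples come from the commit text.

```python
from __future__ import annotations


def format_summarize_at_tokens(value: int | None) -> str | None:
    """Format the threshold as "N", or return None to leave the env var unset."""
    if value is None:
        return None
    # bool is a subclass of int in Python; rejecting it is an assumption here.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            f"summarize_at_tokens must be int or None, got {type(value).__name__}"
        )
    return str(value)


def build_env(summarize_at_tokens: int | None = None) -> dict[str, str]:
    """Build the env mapping; invalid shapes raise here, before any sandbox runs."""
    env: dict[str, str] = {}
    formatted = format_summarize_at_tokens(summarize_at_tokens)
    if formatted is not None:
        env["RLM_SUMMARIZE_AT_TOKENS"] = formatted
    return env


if __name__ == "__main__":
    print(build_env(8000))
    print(build_env(None))
    try:
        build_env((4000, 8000))  # the dropped (lo, hi) form
    except TypeError as exc:
        print(f"rejected: {exc}")
```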

3 participants