[pull] main from inclusionAI:main by pull[bot] · Pull Request #42 · axistore80-coder/AReaL

pull · 2026-04-20T07:21:28Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…1183)

* feat(service): add external model API support for inference service

* fix(service): address review feedback for external model API

  Key changes:
  - Default model field to "default" and validate non-empty on init
  - Remove redundant if-else in chat_completion; always set model/api_key
  - Revert cosmetic variable renames in workflow _run_online
  - Replace register_external_model with lazy get_or_create_session
  - Remove is_external_api; set needs_online_callback=True always
  - Add test for empty model validation

* refactor(service): unify session resolution to bearer token auth

  Remove model-name-based session lookup in set_reward and chat/completions endpoints. All session resolution now uses bearer token, and external model interactions are recorded on the token-resolved session instead of auto-created per-model sessions.

  Key changes:
  - Simplify set_reward to use only bearer token auth
  - Move token extraction before external model dispatch
  - Remove unused SessionStore.get_or_create_session method
  - Add docstring to SessionData.export_interactions
  - Update tests to pass bearer token for external model flows

* fix(service): remove unused imports and variable in data proxy

  Clean up lint issues flagged by Ruff: unused imports (orjson, Body, JSONResponse) and unused local variable in register_model.

* fix(examples): fix external model routing and result printing in HITL demo

  Zeroclaw was not sending the correct model name because _patch_zeroclaw_config did not set default_model, causing requests to miss the registered external model in the data proxy and fall through to the non-existent internal path. The result printing also crashed because concat_string_interactions returns {"interactions": [...]}, not InteractionWithTokenLogpReward objects.

  Key changes:
  - Set default_model in zeroclaw config when --model is provided
  - Rename CLI args from --external-* to --api-url/--provider-api-key/--model
  - Fix result printing to use traj.get("interactions") dict access
  - Remove model param from _set_reward and _do_round (unused)
  - Restore original comments and step numbering
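The bearer-token session resolution described in the refactor commit above can be illustrated with a small sketch. It is not the AReaL implementation: the `SessionStore` and `SessionData` classes here are simplified stand-ins that only borrow the names from the commit message, and `register`, `resolve`, `extract_bearer_token`, and `record_external_interaction` are hypothetical helpers. The point it shows is the ordering: the token is extracted and resolved first, and the external-model interaction is then recorded on the token-resolved session rather than on a session created per model name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionData:
    """Simplified stand-in for a session record (not the real AReaL class)."""
    session_id: str
    interactions: list = field(default_factory=list)
    reward: Optional[float] = None


class SessionStore:
    """Hypothetical in-memory store keyed by bearer token."""

    def __init__(self) -> None:
        self._by_token = {}

    def register(self, token: str, session_id: str) -> SessionData:
        session = SessionData(session_id=session_id)
        self._by_token[token] = session
        return session

    def resolve(self, token: Optional[str]) -> Optional[SessionData]:
        # There is deliberately no fallback that creates a session by model name.
        return self._by_token.get(token) if token else None


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, else None."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def _require_session(store: SessionStore, authorization: Optional[str]) -> SessionData:
    session = store.resolve(extract_bearer_token(authorization))
    if session is None:
        raise PermissionError("missing or unknown bearer token")
    return session


def record_external_interaction(store, authorization, model, messages, reply):
    """Resolve the session from the token first, then record the external call on it."""
    session = _require_session(store, authorization)
    session.interactions.append({"model": model, "messages": messages, "reply": reply})
    return session


def set_reward(store, authorization, reward: float) -> SessionData:
    """Attach a reward using only the bearer token to find the session."""
    session = _require_session(store, authorization)
    session.reward = reward
    return session
```

In this sketch two requests that carry the same token but different model names land on the same session, which is the behavior the commit message describes for external model flows.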

* feat: add scaffolding rollout workflow

  Key design: #818

  Co-Authored-By: narutolhy
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(examples): fix mdformat line wrapping in scaffolding README

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: 博惟 <bowei.fw@antgroup.com>

) Replace behave_imp_weight_cap/behave_imp_weight_mode with unified RejectionSamplingConfig supporting multiple metrics (ratio, kl_k1, kl_k2, kl_k3), levels (token/sequence), and actions (mask/clamp).

Key changes:
- Add RejectionSamplingConfig dataclass with comprehensive validation
- Implement apply_rejection_sampling for 1D packed and 2D padded formats
- Fix loss denominator scaling bug in mask mode (save count before filtering)
- Use geometric mean for sequence-level ratio aggregation (matching GSPO)
- Broadcast sequence-level geometric mean as uniform behave_imp_weight
- Warn when use_decoupled_loss=True but rejection_sampling is None
- Update ppo_actor_loss_fn and grpo_loss_fn to use new config
- Migrate 40 example configs to new rejection_sampling field
- Add 43 unit tests covering all modes, metrics, and edge cases

Refs: #1052
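To make the metric, level, and action combinations above concrete, here is a minimal pure-Python sketch. It is not the AReaL code: `RejectionSamplingSketch`, its field names, the single `upper` threshold, and `apply_rejection_sampling_sketch` are all hypothetical. It assumes the log-ratio is log π_prox − log π_behave per token, that kl_k1/kl_k2/kl_k3 refer to the commonly used k1, k2, k3 KL estimators (−log r, ½(log r)², (r − 1) − log r), and that sequence-level KL metrics are token averages. Clamping is restricted to the ratio metric only to keep the sketch small. Sequences are assumed non-empty.

```python
import math
from dataclasses import dataclass

METRICS = {"ratio", "kl_k1", "kl_k2", "kl_k3"}


@dataclass
class RejectionSamplingSketch:
    """Hypothetical config; field names are illustrative, not the real dataclass."""
    metric: str = "ratio"   # one of METRICS
    level: str = "token"    # "token" or "sequence"
    action: str = "mask"    # "mask" or "clamp"
    upper: float = 2.0      # assumed threshold on the chosen metric

    def __post_init__(self) -> None:
        if self.metric not in METRICS:
            raise ValueError(f"unknown metric: {self.metric}")
        if self.level not in {"token", "sequence"}:
            raise ValueError(f"unknown level: {self.level}")
        if self.action not in {"mask", "clamp"}:
            raise ValueError(f"unknown action: {self.action}")
        if self.upper <= 0:
            raise ValueError("upper must be positive")
        if self.action == "clamp" and self.metric != "ratio":
            raise ValueError("this sketch only clamps the ratio metric")


def token_metric(log_ratio: float, metric: str) -> float:
    """Per-token metric from log_ratio = log pi_prox - log pi_behave (assumed)."""
    ratio = math.exp(log_ratio)
    if metric == "ratio":
        return ratio
    if metric == "kl_k1":
        return -log_ratio
    if metric == "kl_k2":
        return 0.5 * log_ratio ** 2
    return (ratio - 1.0) - log_ratio  # kl_k3


def apply_rejection_sampling_sketch(log_ratios, cfg):
    """log_ratios: list of non-empty per-sequence lists of token log-ratios.

    Returns (weights, keep, total_tokens), all aligned with the input layout.
    """
    # Save the token count BEFORE filtering so masking does not rescale the loss.
    total_tokens = sum(len(seq) for seq in log_ratios)
    weights, keep = [], []
    for seq in log_ratios:
        if cfg.level == "sequence":
            # Geometric mean of token ratios = exp(mean log-ratio), broadcast uniformly.
            geo_mean = math.exp(sum(seq) / len(seq))
            if cfg.metric == "ratio":
                value = geo_mean
            else:
                value = sum(token_metric(x, cfg.metric) for x in seq) / len(seq)
            values = [value] * len(seq)
            base = [geo_mean] * len(seq)
        else:
            values = [token_metric(x, cfg.metric) for x in seq]
            base = [math.exp(x) for x in seq]
        if cfg.action == "mask":
            ok = [v <= cfg.upper for v in values]
            w = [b if k else 0.0 for b, k in zip(base, ok)]
        else:  # clamp: keep every token but cap its importance weight
            ok = [True] * len(seq)
            w = [min(b, cfg.upper) for b in base]
        keep.append(ok)
        weights.append(w)
    return weights, keep, total_tokens


def weighted_loss(token_losses, weights, total_tokens: int) -> float:
    """Importance-weighted mean loss over the pre-filter token count."""
    total = sum(l * w for ls, ws in zip(token_losses, weights) for l, w in zip(ls, ws))
    return total / total_tokens
```

Dividing by `total_tokens` rather than by the number of surviving tokens is the sketch's reading of "save count before filtering": masked tokens contribute zero to the numerator without shrinking the denominator.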

nuzant and others added 3 commits April 20, 2026 13:59

pull bot locked and limited conversation to collaborators Apr 20, 2026

pull bot added the ⤵️ pull label Apr 20, 2026

pull bot merged commit bc9f009 into axistore80-coder:main Apr 20, 2026
1 check passed

pull[bot] merged 3 commits into axistore80-coder:main from inclusionAI:main
