[pull] main from inclusionAI:main #42
Merged
pull[bot] merged 3 commits into axistore80-coder:main from inclusionAI:main on Apr 20, 2026
…1183)

* feat(service): add external model API support for inference service

* fix(service): address review feedback for external model API

Key changes:
- Default model field to "default" and validate non-empty on init
- Remove redundant if-else in chat_completion; always set model/api_key
- Revert cosmetic variable renames in workflow _run_online
- Replace register_external_model with lazy get_or_create_session
- Remove is_external_api; set needs_online_callback=True always
- Add test for empty model validation

* refactor(service): unify session resolution to bearer token auth

Remove model-name-based session lookup in set_reward and chat/completions endpoints. All session resolution now uses bearer token, and external model interactions are recorded on the token-resolved session instead of auto-created per-model sessions.

Key changes:
- Simplify set_reward to use only bearer token auth
- Move token extraction before external model dispatch
- Remove unused SessionStore.get_or_create_session method
- Add docstring to SessionData.export_interactions
- Update tests to pass bearer token for external model flows

* fix(service): remove unused imports and variable in data proxy

Clean up lint issues flagged by Ruff: unused imports (orjson, Body, JSONResponse) and unused local variable in register_model.

* fix(examples): fix external model routing and result printing in HITL demo

Zeroclaw was not sending the correct model name because _patch_zeroclaw_config did not set default_model, causing requests to miss the registered external model in the data proxy and fall through to the non-existent internal path. The result printing also crashed because concat_string_interactions returns {"interactions": [...]}, not InteractionWithTokenLogpReward objects.

Key changes:
- Set default_model in zeroclaw config when --model is provided
- Rename CLI args from --external-* to --api-url/--provider-api-key/--model
- Fix result printing to use traj.get("interactions") dict access
- Remove model param from _set_reward and _do_round (unused)
- Restore original comments and step numbering
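
For readers skimming the refactor above, here is a minimal sketch of what "resolve the session from the bearer token, not from the model name" can look like. It is not the code from this PR: `SketchSession`, `SketchSessionStore`, `extract_bearer_token`, and the standalone `chat_completion`/`set_reward` helpers are hypothetical names used only for illustration, and the real service's request handling is not shown in this conversation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchSession:
    """Hypothetical stand-in for a per-token session record."""
    interactions: List[dict] = field(default_factory=list)
    reward: Optional[float] = None


def extract_bearer_token(headers: Dict[str, str]) -> str:
    """Pull the token out of an 'Authorization: Bearer <token>' header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise PermissionError("missing or malformed bearer token")
    return token.strip()


class SketchSessionStore:
    """Hypothetical store keyed by token only; there is no per-model lookup."""

    def __init__(self) -> None:
        self._by_token: Dict[str, SketchSession] = {}

    def create(self, token: str) -> SketchSession:
        return self._by_token.setdefault(token, SketchSession())

    def resolve(self, headers: Dict[str, str]) -> SketchSession:
        token = extract_bearer_token(headers)
        try:
            return self._by_token[token]
        except KeyError:
            raise PermissionError("unknown bearer token") from None


def chat_completion(store: SketchSessionStore, headers: Dict[str, str], body: dict) -> None:
    # Token extraction happens first, before any dispatch on body["model"],
    # so external-model traffic is recorded on the token-resolved session.
    session = store.resolve(headers)
    session.interactions.append({"model": body.get("model", "default"), "request": body})


def set_reward(store: SketchSessionStore, headers: Dict[str, str], reward: float) -> None:
    # Only the bearer token identifies the session; no model parameter is needed.
    store.resolve(headers).reward = reward
```

In this sketch an unknown or missing token fails loudly instead of lazily creating a session, which mirrors the stated removal of `get_or_create_session`; whether the real service raises or returns an HTTP error is not stated here.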
* feat: add scaffolding rollout workflow

Key design: #818

Co-Authored-By: narutolhy
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(examples): fix mdformat line wrapping in scaffolding README

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: 博惟 <bowei.fw@antgroup.com>
)

Replace behave_imp_weight_cap/behave_imp_weight_mode with unified RejectionSamplingConfig supporting multiple metrics (ratio, kl_k1, kl_k2, kl_k3), levels (token/sequence), and actions (mask/clamp).

Key changes:
- Add RejectionSamplingConfig dataclass with comprehensive validation
- Implement apply_rejection_sampling for 1D packed and 2D padded formats
- Fix loss denominator scaling bug in mask mode (save count before filtering)
- Use geometric mean for sequence-level ratio aggregation (matching GSPO)
- Broadcast sequence-level geometric mean as uniform behave_imp_weight
- Warn when use_decoupled_loss=True but rejection_sampling is None
- Update ppo_actor_loss_fn and grpo_loss_fn to use new config
- Migrate 40 example configs to new rejection_sampling field
- Add 43 unit tests covering all modes, metrics, and edge cases

Refs: #1052
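
To make the metric/level/action combinations above concrete, here is a rough pure-Python sketch. It is not the `RejectionSamplingConfig` or `apply_rejection_sampling` from this PR: the class and function names, the field names (`metric`, `level`, `action`, `upper`), the single upper threshold, and the restriction of `clamp` to the `ratio` metric are all assumptions made for illustration. The `kl_k1`/`kl_k2`/`kl_k3` formulas follow the commonly used k1/k2/k3 KL estimators, which may differ in sign or direction from what the PR implements.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SketchRejectionConfig:
    metric: str = "ratio"   # "ratio", "kl_k1", "kl_k2", or "kl_k3"
    level: str = "token"    # "token" or "sequence"
    action: str = "mask"    # "mask" or "clamp"
    upper: float = 5.0      # hypothetical threshold on the chosen metric

    def __post_init__(self) -> None:
        if self.metric not in {"ratio", "kl_k1", "kl_k2", "kl_k3"}:
            raise ValueError(f"unknown metric: {self.metric}")
        if self.level not in {"token", "sequence"}:
            raise ValueError(f"unknown level: {self.level}")
        if self.action not in {"mask", "clamp"}:
            raise ValueError(f"unknown action: {self.action}")
        if self.upper <= 0:
            raise ValueError("upper must be positive")
        # Assumption of this sketch only: clamping is defined on the ratio itself.
        if self.action == "clamp" and self.metric != "ratio":
            raise ValueError("this sketch supports clamp only with metric='ratio'")


def token_metric(log_ratio: float, metric: str) -> float:
    """Per-token score from log_ratio = log(pi_prox / pi_behave)."""
    ratio = math.exp(log_ratio)
    if metric == "ratio":
        return ratio
    if metric == "kl_k1":
        return -log_ratio
    if metric == "kl_k2":
        return 0.5 * log_ratio ** 2
    return (ratio - 1.0) - log_ratio  # kl_k3


def sketch_rejection_sampling(
    log_ratios: List[List[float]], cfg: SketchRejectionConfig
) -> Tuple[List[List[float]], List[List[bool]], int]:
    """Return (importance weights, keep mask, token count before filtering)."""
    # Save the denominator before masking so the loss scale does not shift.
    n_tokens_before = sum(len(seq) for seq in log_ratios)
    weights: List[List[float]] = []
    keep: List[List[bool]] = []
    for seq in log_ratios:
        if not seq:
            weights.append([])
            keep.append([])
            continue
        if cfg.level == "sequence":
            # Geometric mean of token ratios == exp(mean of token log-ratios).
            geo_mean = math.exp(sum(seq) / len(seq))
            if cfg.metric == "ratio":
                score = geo_mean
            else:
                score = sum(token_metric(lr, cfg.metric) for lr in seq) / len(seq)
            scores = [score] * len(seq)
            seq_weights = [geo_mean] * len(seq)  # broadcast one weight per sequence
        else:
            scores = [token_metric(lr, cfg.metric) for lr in seq]
            seq_weights = [math.exp(lr) for lr in seq]
        if cfg.action == "mask":
            keep.append([s <= cfg.upper for s in scores])
            weights.append(seq_weights)
        else:  # clamp: keep every token but cap its weight
            keep.append([True] * len(seq))
            weights.append([min(w, cfg.upper) for w in seq_weights])
    return weights, keep, n_tokens_before
```

A caller would divide the summed masked loss by `n_tokens_before` rather than by the number of surviving tokens, which is the idea behind the "save count before filtering" fix listed above.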
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )