Introduce `JudgerConfig` with `judger_type` and unify judger configs by YanhuiDua · Pull Request #1552 · InternLM/xtuner

YanhuiDua · 2026-03-09T11:12:56Z

No description provided.

YanhuiDua · 2026-03-10T04:14:34Z

claude · 2026-03-10T04:16:53Z

+        if self.judger_type == "native":
+            return NativeJudger(
                judger_name=self.judger_name,
                reward_handler=self.reward_handler,
                request_timeout=self.request_timeout,
                extra_info=self.extra_info,
            )
-            workers_list.append(worker)
-        return workers_list

-    def build(self, pg: PlacementGroup | None = None, start_bundle_idx: int = 0) -> RouterJudger:
+        if self.judger_type == "ray.actor":
+            if self.num_ray_actors > 1:
+                raise ValueError("num_ray_actors must be 1 when judger_type is 'ray.actor'.")


Claude: Warning: The num_ray_actors > 1 validation for ray.actor mode happens late — only inside build(). If a user constructs a config with judger_type="ray.actor", num_ray_actors=4, they won't see the error until build() is called. Consider using a Pydantic model_validator to fail fast at config construction time:

@model_validator(mode="after")
def _validate_ray_actor_config(self) -> "JudgerConfig":
    if self.judger_type == "ray.actor" and self.num_ray_actors > 1:
        raise ValueError("num_ray_actors must be 1 when judger_type is 'ray.actor'.")
    return self

Also, build() silently falls through to building a RouterJudger if judger_type doesn't match "native" or "ray.actor". An explicit check (or an else raising ValueError for unknown types) would be safer, even though the Literal type constrains the values.
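
A minimal sketch of both points in this comment, assuming Pydantic v2. `JudgerConfigSketch` and the strings returned by `build()` are hypothetical stand-ins, not xtuner's actual classes; only the field names `judger_type` and `num_ray_actors` and the error message are taken from the review.

```python
from typing import Literal

from pydantic import BaseModel, model_validator


class JudgerConfigSketch(BaseModel):
    """Hypothetical stand-in for JudgerConfig, reduced to two fields."""

    judger_type: Literal["native", "ray.actor", "router"] = "native"
    num_ray_actors: int = 1

    @model_validator(mode="after")
    def _validate_ray_actor_config(self) -> "JudgerConfigSketch":
        # Runs when the config is constructed, so a bad combination
        # is rejected before build() is ever called.
        if self.judger_type == "ray.actor" and self.num_ray_actors > 1:
            raise ValueError("num_ray_actors must be 1 when judger_type is 'ray.actor'.")
        return self

    def build(self) -> str:
        # Placeholder return values; the real method builds judger objects.
        if self.judger_type == "native":
            return "native judger"
        if self.judger_type == "ray.actor":
            return "ray.actor judger"
        if self.judger_type == "router":
            return "router judger"
        # Explicit guard: a variant added to the Literal later cannot
        # silently fall through to the router branch.
        raise ValueError(f"Unknown judger_type: {self.judger_type!r}")


def fails_at_construction() -> bool:
    """Return True if the invalid combination is rejected on construction."""
    try:
        JudgerConfigSketch(judger_type="ray.actor", num_ray_actors=4)
    except ValueError:  # pydantic's ValidationError subclasses ValueError
        return True
    return False
```

With this shape, the invalid `ray.actor` configuration never reaches `build()`, and every branch of `build()` names the type it handles.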

claude · 2026-03-10T04:16:55Z

+class DapoMathJudgerConfig(_DapoMathJudgerDefaults, JudgerConfig):
+    """Configuration for the DapoMath judger."""

    num_ray_actors: int = 1


Claude: Nit: num_ray_actors and num_cpus_per_actor are already defined in the parent JudgerConfig with the same defaults (1 and 1). These redeclarations are redundant and can be removed.

Suggested change

-class DapoMathJudgerConfig(_DapoMathJudgerDefaults, JudgerConfig):
-    """Configuration for the DapoMath judger."""
-
-    num_ray_actors: int = 1
+class DapoMathJudgerConfig(_DapoMathJudgerDefaults, JudgerConfig):
+    """Configuration for the DapoMath judger."""
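
To illustrate why the redeclaration is redundant, here is a small sketch assuming Pydantic v2. `ParentConfigSketch` and `ChildConfigSketch` are hypothetical names, and the `int` annotation on `num_cpus_per_actor` is an assumption; the defaults of 1 follow the review comment.

```python
from pydantic import BaseModel


class ParentConfigSketch(BaseModel):
    """Hypothetical stand-in for the parent config."""

    num_ray_actors: int = 1
    num_cpus_per_actor: int = 1  # annotation assumed for illustration


class ChildConfigSketch(ParentConfigSketch):
    """Declares only what differs from the parent."""

    judger_name: str = "dapo_math"


child = ChildConfigSketch()
# Field names visible on the subclass, including the inherited ones.
inherited = sorted(ChildConfigSketch.model_fields)
```

The subclass exposes `num_ray_actors` and `num_cpus_per_actor` with the parent's defaults without declaring them again, which is why dropping the duplicate lines leaves behavior unchanged.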

claude · 2026-03-10T04:16:56Z

+        config = DapoMathJudgerConfig(judger_type="router", 
            judger_name="dapo_math",


Claude: Nit: Awkward formatting — judger_type="router" is on the opening line while remaining kwargs are on subsequent lines. For consistency, either put all args on separate lines or keep it on one line:

Suggested change

-        config = DapoMathJudgerConfig(judger_type="router",
-            judger_name="dapo_math",
+        config = DapoMathJudgerConfig(
+            judger_type="router",

claude · 2026-03-10T04:17:10Z

Claude: ## Summary

Good refactor — unifying NativeJudgerConfig and RouterJudgerConfig into a single JudgerConfig with a judger_type discriminator reduces duplication across the judger config hierarchy. The new ray.actor type is a clean addition.

Issues

Warning

native.py:136-138: Field descriptions as trailing inline comments with unnecessary line breaks. Use Field(description=...) or a class docstring instead.
native.py:215-225: num_ray_actors validation for ray.actor mode is deferred to build() — should be a model_validator for fail-fast behavior. Also, build() silently falls through to RouterJudger for the remaining case; an explicit guard or else branch would be safer.

Nit

dapo_math.py:345-348: num_ray_actors and num_cpus_per_actor are redundantly redeclared (already in parent JudgerConfig with same defaults).
test_judger.py:118-119: Minor formatting issue — judger_type="router" on the opening paren line while other kwargs are on separate lines.

Verdict

COMMENT — no critical issues, but the model_validator suggestion would improve usability.

YanhuiDua · 2026-03-10T04:31:14Z

@claude I have added the model_validator in native.py; please review again. If there are no issues, approve this PR.

claude

Summary

Unifies NativeJudgerConfig and RouterJudgerConfig into a single JudgerConfig with a judger_type discriminator. The model_validator addition correctly validates constraints at construction time. Clean refactoring.

Issues

Nit

[xtuner/v1/rl/judger/native.py:150] Warning message is redundant — "num_ray_actors will be set to 1 when judger_type is 'ray.actor'. num_ray_actors will be set to 1" repeats itself.
[xtuner/v1/rl/judger/gsm8k.py:85-87, geo3k.py:41-43] num_ray_actors, num_cpus_per_actor, cpu_memory_per_actor re-declared with identical defaults as JudgerConfig. Unnecessary and inconsistent with DapoMathJudgerConfig which correctly relies on inheritance. Consider removing for consistency.

Verdict

APPROVE

YanhuiDua · 2026-03-10T04:35:13Z

@claude fix the nit you mentioned above

YanhuiDua · 2026-03-10T04:38:40Z

[xtuner/v1/rl/judger/native.py:150] Warning message is redundant — "num_ray_actors will be set to 1 when judger_type is 'ray.actor'. num_ray_actors will be set to 1" repeats itself.

[xtuner/v1/rl/judger/gsm8k.py:85-87, geo3k.py:41-43] num_ray_actors, num_cpus_per_actor, cpu_memory_per_actor re-declared with identical defaults as JudgerConfig. Unnecessary and inconsistent with DapoMathJudgerConfig which correctly relies on inheritance. Consider removing for consistency.

@claude fix these two nits

YanhuiDua · 2026-03-10T04:41:48Z

@claude please approve this PR

claude

Summary

Unifies NativeJudgerConfig and RouterJudgerConfig into a single JudgerConfig class with a judger_type discriminator field (Literal["native", "ray.actor", "router"]). Removes redundant per-judger Native/Router config class pairs (e.g. GSM8KNativeJudgerConfig + GSM8KRouterJudgerConfig → GSM8KJudgerConfig). Adds validation for ray.actor/native type constraints on Ray-specific fields.

Issues

Nit

[xtuner/v1/rl/judger/native.py:245] The build() method falls through to router behavior without an explicit if self.judger_type == "router" guard. If a new judger_type variant is added to the Literal in the future, it would silently build a RouterJudger. Consider adding an explicit check or a final else: raise ValueError(...).

Verdict

APPROVE

YanhuiDua · 2026-03-10T04:45:07Z

@claude Your commit made the lint check fail. Can you see the lint / lint (pull_request) error message? If you can, fix the code according to the error. If you can't, tell me and I will provide the error message.

YanhuiDua · 2026-03-10T04:49:42Z

@claude approve

claude

LGTM — clean unification of judger configs with proper validation.

…1552)

* Introduce JudgerConfig with judger_type and Unify judger configs
* fix tool config and add comments for JudgerConfig
* add check num_ray_actors, num_cpus_per_actor and cpu_memory_per_actor
* fix trailing whitespace in test_judger.py
* [Fix] Remove redundant warning message and duplicated field defaults in judger configs
* [Fix] Fix ruff-format lint: collapse multi-line logger.warning to single line

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>

Introduce JudgerConfig with judger_type and Unify judger configs

721c042

YanhuiDua force-pushed the fix_judger branch from f17a622 to 721c042 Compare March 10, 2026 04:04

YanhuiDua changed the title ~~Unify judger configs and refactor judger build API (introduce JudgerConfig with judger_type)~~ Introduce JudgerConfig with judger_type and unify judger configs Mar 10, 2026

fix tool config and add comments for JudgerConfig

56269bf

claude Bot reviewed Mar 10, 2026

Comment thread xtuner/v1/rl/judger/native.py Outdated

claude Bot reviewed Mar 10, 2026

add check num_ray_actors, num_cpus_per_actor and cpu_memory_per_actor

76499b6

claude Bot approved these changes Mar 10, 2026

fix trailing whitespace in test_judger.py

cb1ce30

[Fix] Remove redundant warning message and duplicated field defaults in judger configs

06f0983

claude Bot approved these changes Mar 10, 2026

[Fix] Fix ruff-format lint: collapse multi-line logger.warning to single line

00345c3

claude Bot approved these changes Mar 10, 2026

hhaAndroid approved these changes Mar 10, 2026

YanhuiDua merged commit 23adbb2 into InternLM:rl_design Mar 10, 2026
3 of 6 checks passed

YanhuiDua deleted the fix_judger branch March 17, 2026 09:41
