
feat: bake sampling_rate into Judge at construction; simplify Evaluator to List[Judge] #159

Merged
jsonbailey merged 3 commits into main from jb/aic-2388/judge-sample-rate
May 1, 2026

@jsonbailey jsonbailey commented May 1, 2026

Summary

  • Judge.__init__ gains sample_rate: float = 1.0; stored as self.sample_rate
  • Judge.evaluate(sampling_rate=None) and evaluate_messages(sampling_ratio=None) fall back to self.sample_rate when the arg is omitted; an explicit 0.0 still overrides correctly
  • Evaluator constructor simplified to List[Judge] only — JudgeConfiguration removed from the constructor and evaluate path
  • Evaluator.noop() returns cls([])
  • _initialize_judges deleted; _build_evaluator now builds judges inline through a private _create_judge_instance helper shared with create_judge, so the public create_judge surface is unchanged and internal evaluator construction emits no usage-tracking event
  • 4 new tests covering sample_rate default, explicit value, instance-rate fallback, and per-call override
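
The fallback rule in the summary above can be illustrated with a minimal sketch. This is not the SDK's code: `SketchJudge`, `effective_rate`, `should_evaluate`, and the injectable `rng` are hypothetical names used only to show why the omitted-argument check has to compare against `None` rather than rely on truthiness.

```python
import random
from typing import Callable, Optional


class SketchJudge:
    """Hypothetical stand-in for Judge; only the sampling decision is modelled."""

    def __init__(self, sample_rate: float = 1.0,
                 rng: Callable[[], float] = random.random) -> None:
        self.sample_rate = sample_rate
        self._rng = rng  # injectable (an assumption of this sketch) for deterministic tests

    def effective_rate(self, sampling_rate: Optional[float] = None) -> float:
        # Compare against None instead of writing `sampling_rate or self.sample_rate`:
        # an explicit 0.0 is falsy but must still override the instance rate.
        return self.sample_rate if sampling_rate is None else sampling_rate

    def should_evaluate(self, sampling_rate: Optional[float] = None) -> bool:
        # Evaluate when a uniform draw in [0, 1) falls below the effective rate,
        # so 0.0 never runs and 1.0 always runs.
        return self._rng() < self.effective_rate(sampling_rate)
```

Under these assumptions, `SketchJudge(sample_rate=0.25).effective_rate()` falls back to the instance rate, while `effective_rate(0.0)` returns the per-call value and disables evaluation for that call.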

Test plan

  • 146 server-ai tests pass
  • 81 langchain tests pass, 40 openai tests pass (Evaluator.noop() callers unaffected)
  • mypy, isort, pycodestyle clean

🤖 Generated with Claude Code


Note

Medium Risk
Changes how judge sampling rates are applied and refactors Evaluator/judge construction, which can alter when evaluations run and could affect tracking or performance if misconfigured. Scope is limited to judge/evaluator plumbing and is covered by added unit tests.

Overview
Bakes per-judge sampling into Judge instances by adding a sample_rate constructor arg and making Judge.evaluate()/evaluate_messages() default to the instance rate when no per-call rate is provided.

Simplifies Evaluator to hold a List[Judge] (removing JudgeConfiguration from the execution path) and updates LDAIClient._build_evaluator() to materialize judges inline via a new _create_judge_instance() helper that avoids emitting the public create-judge usage event during internal evaluator construction.

Adds tests covering sample_rate defaulting, constructor override, instance-rate fallback, and per-call override behavior.
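
The construction path described in this overview might look roughly like the sketch below. Every class here is a hypothetical stand-in, and the `usage_events` list, the `"create_judge"` event name, and the `judge_rates` mapping are assumptions made for illustration; the real `LDAIClient` signatures are not shown in this PR description.

```python
from typing import Dict, List


class SketchJudge:
    """Hypothetical judge that carries its own sampling rate."""

    def __init__(self, key: str, sample_rate: float = 1.0) -> None:
        self.key = key
        self.sample_rate = sample_rate


class SketchEvaluator:
    """Hypothetical evaluator holding ready-made judges, not configurations."""

    def __init__(self, judges: List[SketchJudge]) -> None:
        self.judges = judges

    @classmethod
    def noop(cls) -> "SketchEvaluator":
        # An evaluator with no judges evaluates nothing.
        return cls([])


class SketchClient:
    def __init__(self) -> None:
        self.usage_events: List[str] = []  # stand-in for usage tracking

    def _create_judge_instance(self, key: str, sample_rate: float = 1.0) -> SketchJudge:
        # Private helper: builds a judge without recording any usage event.
        return SketchJudge(key, sample_rate)

    def create_judge(self, key: str, sample_rate: float = 1.0) -> SketchJudge:
        # Public entry point: records usage, then delegates to the shared helper.
        self.usage_events.append("create_judge")
        return self._create_judge_instance(key, sample_rate)

    def _build_evaluator(self, judge_rates: Dict[str, float]) -> SketchEvaluator:
        # Internal path: materializes judges inline through the private helper,
        # so no public create-judge event is emitted.
        judges = [self._create_judge_instance(k, r) for k, r in judge_rates.items()]
        return SketchEvaluator(judges)
```

The point of the split is that both paths share one construction routine while only the public call is counted as usage.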

Reviewed by Cursor Bugbot for commit 13b38fd. Bugbot is set up for automated code reviews on this repo.

@jsonbailey jsonbailey marked this pull request as ready for review May 1, 2026 13:37
@jsonbailey jsonbailey requested a review from a team as a code owner May 1, 2026 13:37
Comment thread packages/sdk/server-ai/src/ldai/client.py Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 2d80afe.

Comment thread packages/sdk/server-ai/src/ldai/judge/__init__.py Outdated
@jsonbailey jsonbailey merged commit 86c79e6 into main May 1, 2026
40 of 45 checks passed
@jsonbailey jsonbailey deleted the jb/aic-2388/judge-sample-rate branch May 1, 2026 14:58
@github-actions github-actions Bot mentioned this pull request May 1, 2026