fix(evaluator): register plugin job result artifacts #47

Merged
SandyChapman merged 1 commit into main from fix-evaluator-plugin-result-registration/schapman on May 28, 2026

Conversation

@SandyChapman
Contributor

Summary

  • register evaluator plugin aggregate-scores, row-scores, artifacts, and full evaluation-results through the job result sink
  • write aggregate and row score artifacts alongside the full SDK result payload
  • cover the result registration behavior in evaluator job tests
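
A minimal sketch of the registration flow the summary describes, not the plugin's actual code. The `InMemoryResultSink` class, its `register(name, path)` method, the `write_and_register_results` helper, and the `aggregate_scores` / `row_scores` keys are all hypothetical names chosen for illustration; the real sink interface and SDK result shape are not shown in this PR description.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


class InMemoryResultSink:
    """Hypothetical stand-in for the job result sink: records name -> path."""

    def __init__(self) -> None:
        self.registered: dict[str, Path] = {}

    def register(self, name: str, path: Path) -> None:
        self.registered[name] = path


def write_and_register_results(
    result: dict[str, Any], output_dir: Path, sink: InMemoryResultSink
) -> None:
    """Write score artifacts next to the full payload and register each one."""
    output_dir.mkdir(parents=True, exist_ok=True)
    artifacts_dir = output_dir / "artifacts"
    artifacts_dir.mkdir(exist_ok=True)

    # Assumed payload keys; the real SDK result may be structured differently.
    payloads = {
        "aggregate-scores": result.get("aggregate_scores", {}),
        "row-scores": result.get("row_scores", []),
        "evaluation-results": result,  # full result payload
    }
    for name, payload in payloads.items():
        path = output_dir / f"{name}.json"
        path.write_text(json.dumps(payload, indent=2))
        sink.register(name, path)

    # The artifacts directory is registered as a result in its own right.
    sink.register("artifacts", artifacts_dir)


example_result = {
    "aggregate_scores": {"accuracy": 0.5},
    "row_scores": [{"row": 0, "accuracy": 1.0}, {"row": 1, "accuracy": 0.0}],
}

with tempfile.TemporaryDirectory() as tmp:
    sink = InMemoryResultSink()
    write_and_register_results(example_result, Path(tmp), sink)
    registered_names = sorted(sink.registered)
```

A job test along the lines of the third bullet could then assert that `registered_names` contains all four result names and that each registered path exists on disk.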

Validation

  • uv run --frozen ruff check plugins/nemo-evaluator/src/nemo_evaluator/jobs/evaluate.py plugins/nemo-evaluator/tests/test_evaluate_job.py
  • uv run --frozen --extra cpu ty check plugins/nemo-evaluator/src/nemo_evaluator/jobs/evaluate.py plugins/nemo-evaluator/tests/test_evaluate_job.py
  • uv run --frozen pytest plugins/nemo-evaluator/tests/test_evaluate_job.py -q
  • uv run --frozen pytest plugins/nemo-evaluator/tests/test_sdk_job_resources.py -q
  • tools/lint/lint-all.sh

Signed-off-by: Sandy Chapman <schapman@nvidia.com>
@SandyChapman SandyChapman marked this pull request as ready for review May 26, 2026 14:23
@SandyChapman SandyChapman requested review from a team as code owners May 26, 2026 14:23
@github-actions
Contributor

| Suite | Lines Covered | Line Rate | Branch Rate |
| --- | --- | --- | --- |
| Unit Tests | 18243/24193 | 75.4% | 61.8% |
| Integration Tests | 11666/22975 | 50.8% | 25.9% |
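
The reported line rates appear to be covered lines divided by total lines, rounded to one decimal place. A quick check of that assumption against the counts in the report (the `line_rate` helper is illustrative and not part of the coverage tooling):

```python
def line_rate(covered: int, total: int) -> float:
    """Percentage of lines covered, rounded to one decimal place."""
    return round(100 * covered / total, 1)


unit_rate = line_rate(18243, 24193)         # 75.4
integration_rate = line_rate(11666, 22975)  # 50.8
```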

@SandyChapman SandyChapman added this pull request to the merge queue May 28, 2026
Merged via the queue into main with commit 5cddbeb May 28, 2026
11 checks passed
aray12 pushed a commit that referenced this pull request May 28, 2026
Signed-off-by: Sandy Chapman <schapman@nvidia.com>
Signed-off-by: Alex Ray <alray@nvidia.com>
