
Keep harness metrics merge inside experimental composable env#1201

Merged
snimu merged 1 commit into main from sebastian/undo-rubrics-changes-2026-04-20 on Apr 20, 2026
Conversation


@snimu snimu commented Apr 20, 2026

Description

The previous PR touched core verifiers logic (in non-harmful ways, but still). This PR undoes those changes and solves the problem they were meant to address entirely inside experimental.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Medium Risk
Changes how rewards are aggregated in RubricGroup by no longer coercing None rewards to 0.0, which could surface type errors if any rubric returns None. The new harness-metrics merge runs during rubric cleanup in the experimental env and should be low impact outside that path.

Overview
Moves harness metrics merging fully into the experimental ComposableEnv by replacing the standalone HarnessMetricsRubric (and its @cleanup decorator) with a HarnessMetricsRubricGroup that runs child rubric cleanup then folds _harness_metrics into state["metrics"].

When harness.metrics_path is set, ComposableEnv now wraps the existing rubric (or existing RubricGroup) inside this new group so metrics are merged without touching core scoring flow.
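The wrapping described above can be sketched as follows. This is a hypothetical illustration based only on the PR description; the class shape, the `cleanup` hook signature, and the `_harness_metrics` / `metrics` state keys are assumptions, not the actual verifiers implementation.

```python
# Hypothetical sketch: a rubric group that runs child rubric cleanup,
# then folds harness metrics into state["metrics"]. All names below are
# assumptions drawn from the PR summary, not the real library API.
import asyncio
from typing import Any


class Rubric:
    """Minimal stand-in for the library's rubric base class."""

    async def cleanup(self, state: dict[str, Any]) -> None:
        pass


class HarnessMetricsRubricGroup(Rubric):
    """Wraps existing rubrics; after their cleanup runs, merges
    harness metrics into state["metrics"] without touching the
    core scoring flow."""

    def __init__(self, rubrics: list[Rubric]) -> None:
        self.rubrics = rubrics

    async def cleanup(self, state: dict[str, Any]) -> None:
        # Run each child rubric's cleanup first, preserving order.
        for rubric in self.rubrics:
            await rubric.cleanup(state)
        # Then fold harness metrics collected during the rollout
        # into the final metrics dict.
        harness_metrics = state.pop("_harness_metrics", {})
        state.setdefault("metrics", {}).update(harness_metrics)


state = {"_harness_metrics": {"latency_s": 1.2}, "metrics": {"reward": 1.0}}
group = HarnessMetricsRubricGroup([Rubric()])
asyncio.run(group.cleanup(state))
# state["metrics"] now contains both reward and latency_s
```

Because the merge happens during cleanup of a wrapper group, the existing rubric (or RubricGroup) scores exactly as before; only the post-scoring metrics dict is extended.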

Separately, RubricGroup no longer normalizes None rewards to 0.0 during score_rollout/score_group, relying on rubrics to always set numeric rewards.
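The behavioral difference can be illustrated with a minimal before/after sketch. The function names are illustrative only; the point is that a `None` reward now surfaces as a `TypeError` during summation instead of being silently zeroed.

```python
# Illustrative sketch of the aggregation change described above.
# Function names are hypothetical; only the None-handling differs.

def aggregate_rewards_old(rewards: list) -> float:
    # Old behavior: a rubric returning None was coerced to 0.0.
    return sum(0.0 if r is None else r for r in rewards)


def aggregate_rewards_new(rewards: list) -> float:
    # New behavior: rubrics must always set numeric rewards;
    # a None in the list raises TypeError at sum().
    return sum(rewards)
```

This is what the "Medium Risk" note above refers to: any rubric that still returns `None` will now fail loudly rather than contribute 0.0 to the aggregate.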

Reviewed by Cursor Bugbot for commit 1ae31b4.

@snimu snimu merged commit 0c1b133 into main Apr 20, 2026
8 of 9 checks passed