feat: port core SDLC graders to hud.native #381

Merged
lorenss-m merged 5 commits into main from nancy/sdlc-grader on Mar 27, 2026

@nancyjlau nancyjlau commented Mar 24, 2026

ported the core generic grader primitives from hud-sdlc-lib into hud.native so env authors can use first-party evaluators in the SDK without depending on SDLC-specific code

added hud/native/graders.py with: Grade.from_subscores, Grader, BashGrader, Grader.any / Grader.all

instead of copying SDLC-local types, this reuses hud-python's existing SubScore / EvaluationResult types. Grade.from_subscores normalizes positive weights to sum to 1.0 and preserves negative weights as penalties, so it fits the current hud-python evaluation semantics. no SDLC-specific graders are ported here
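
for readers unfamiliar with the weighting rule, here is a minimal sketch of what "normalize positive weights, keep negative weights as penalties" could look like. it is not the code from hud/native/graders.py: the `SubScore` dataclass below is a hypothetical stand-in for hud-python's type, and both the final clamp to [0, 1] and the error on a missing positive weight are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubScore:
    """Hypothetical stand-in for hud-python's SubScore (fields assumed)."""
    name: str
    value: float
    weight: float


def normalize_weights(subscores: List[SubScore]) -> List[SubScore]:
    """Scale positive weights so they sum to 1.0; leave negative weights as-is."""
    positive_total = sum(s.weight for s in subscores if s.weight > 0)
    if positive_total <= 0:
        # Assumption: with no positive weight there is nothing to normalize.
        raise ValueError("at least one positive weight is required")
    return [
        SubScore(
            s.name,
            s.value,
            s.weight / positive_total if s.weight > 0 else s.weight,
        )
        for s in subscores
    ]


def combined_score(subscores: List[SubScore]) -> float:
    """Weighted sum after normalization, clamped to [0, 1] (clamp is assumed)."""
    normalized = normalize_weights(subscores)
    raw = sum(s.value * s.weight for s in normalized)
    return max(0.0, min(1.0, raw))


example = [
    SubScore("tests_pass", 1.0, 3.0),      # positive weight, becomes 0.75
    SubScore("lint_clean", 0.5, 1.0),      # positive weight, becomes 0.25
    SubScore("leaked_secret", 1.0, -0.2),  # penalty, weight left at -0.2
]
print(round(combined_score(example), 3))
```

in this example the positive weights 3.0 and 1.0 become 0.75 and 0.25, so the combined score is 0.75 * 1.0 + 0.25 * 0.5 - 0.2 * 1.0 = 0.675.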

there are tests in hud/native/tests/test_graders.py


Note

Medium Risk
Introduces new grading utilities that execute arbitrary shell commands via BashGrader, which can impact safety/behavior depending on how scenarios use it. Also changes telemetry argument/result serialization, which could affect what data is emitted in traces.

Overview
Adds first-party grading primitives under hud.native (Grade.from_subscores, Grader with any/all, and BashGrader) to build EvaluationResult objects from SubScores, including weight normalization, penalty handling, name de-duping, and metadata propagation.

Introduces hud.utils.serialization (json_safe_value/json_safe_dict) and switches telemetry instrumentation to use it for more robust JSON-safe span serialization; adds test coverage and new docs page wired into the docs nav.

Written by Cursor Bugbot for commit f93def9. This will update automatically on new commits.
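
To make the risk note about shell execution concrete, the sketch below shows the general pattern a command-based grader tends to follow: run a command and map its exit status to a score. It is an illustration only and does not reproduce BashGrader's actual interface. The function name `shell_score`, the 1.0/0.0 mapping, and the timeout default are all assumptions.

```python
import subprocess


def shell_score(command: str, timeout: float = 30.0) -> float:
    """Hypothetical helper: 1.0 if the command exits with status 0, else 0.0."""
    try:
        completed = subprocess.run(
            command,
            shell=True,           # the string is executed by the system shell
            capture_output=True,  # keep stdout/stderr out of the caller's terminal
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Assumption: a command that hangs counts as a failed check.
        return 0.0
    return 1.0 if completed.returncode == 0 else 0.0
```

Because the command string is handed to a shell unmodified, a helper like this should only ever receive commands written by the scenario author, never text derived from agent output.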

@nancyjlau nancyjlau requested a review from lorenss-m March 24, 2026 18:38
Comment thread hud/native/graders.py Outdated
            result[key] = value
        except (TypeError, ValueError):
            result[key] = f"<{type(value).__name__}: not serializable>"
    return result

Surely we have something like this in the rest of the codebase we can reuse
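
The fragment under review looks like the tail of a "make this dict JSON-safe" helper, which is the kind of logic the PR overview says ended up in hud.utils.serialization. A minimal sketch of that pattern follows. The function names `to_json_safe_value` and `to_json_safe_dict` are hypothetical, and the bodies are a guess reconstructed from the four visible lines, not the merged implementation.

```python
import json
from typing import Any


def to_json_safe_value(value: Any) -> Any:
    """Return value unchanged if json.dumps accepts it, else a placeholder string."""
    try:
        json.dumps(value)
        return value
    except (TypeError, ValueError):
        # TypeError: unsupported type; ValueError: e.g. a circular reference.
        return f"<{type(value).__name__}: not serializable>"


def to_json_safe_dict(data: dict) -> dict:
    """Apply to_json_safe_value to every value of a mapping."""
    result = {}
    for key, value in data.items():
        result[key] = to_json_safe_value(value)
    return result


print(to_json_safe_dict({"count": 3, "tags": {"a", "b"}}))
```

Keeping the per-value check in one shared function is what lets both the graders and the telemetry code reuse it, which is the point of the review comment.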

@lorenss-m lorenss-m left a comment

Small comment in the file; otherwise, it would also be nice to add this to the docs! Make an SDK reference file for this.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread hud/native/graders.py
@lorenss-m lorenss-m merged commit dff3d0c into main Mar 27, 2026
10 checks passed
@nancyjlau nancyjlau deleted the nancy/sdlc-grader branch March 27, 2026 19:10