fix(ai): normalize boolean scores in onlineEval scoresSummary #263
Merged
lukasmalkmus merged 1 commit into main on Feb 25, 2026
When a scorer returned `{ score: true }` or `{ score: false }`, the
parent eval span's `eval.case.scores` attribute contained raw booleans
instead of normalized numeric values with `eval.score.is_boolean`
metadata. This was inconsistent with individual scorer child spans
which already called `normalizeBooleanScore()` via `executor.ts`.
Apply the same normalization when building `scoresSummary` so both
the parent eval span and child scorer spans produce consistent
numeric scores.
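For readers without the diff at hand, here is a minimal sketch of the idea described above. It is not the code from this PR: the types `RawScore` and `NormalizedScore` and the functions `normalizeBooleanScoreSketch` and `buildScoresSummarySketch` are hypothetical stand-ins, and the real `normalizeBooleanScore()` signature may differ. The only details taken from the description are the `true`/`false` to `1`/`0` mapping and the `eval.score.is_boolean` metadata key.

```typescript
// Hypothetical shapes; the real types in the package may differ.
type Metadata = Record<string, unknown>;
type RawScore = { score: number | boolean | null; metadata?: Metadata };
type NormalizedScore = { score: number | null; metadata?: Metadata };

// Assumed behavior of a normalizeBooleanScore-style helper:
// booleans become 1/0 and are flagged via `eval.score.is_boolean`.
function normalizeBooleanScoreSketch(raw: RawScore): NormalizedScore {
  if (typeof raw.score === "boolean") {
    return {
      score: raw.score ? 1 : 0,
      metadata: { ...(raw.metadata ?? {}), "eval.score.is_boolean": true },
    };
  }
  // Numeric (or null) scores pass through unchanged.
  const out: NormalizedScore = { score: raw.score };
  if (raw.metadata) out.metadata = raw.metadata;
  return out;
}

// Build the parent-span summary with the same normalization the
// child scorer spans apply, so both report numeric scores.
function buildScoresSummarySketch(
  results: Record<string, RawScore>,
): Record<string, NormalizedScore> {
  const summary: Record<string, NormalizedScore> = {};
  for (const [name, raw] of Object.entries(results)) {
    const normalized = normalizeBooleanScoreSketch(raw);
    const entry: NormalizedScore = { score: normalized.score };
    // Only attach metadata when there is something to report.
    if (normalized.metadata && Object.keys(normalized.metadata).length > 0) {
      entry.metadata = normalized.metadata;
    }
    summary[name] = entry;
  }
  return summary;
}
```

Under these assumptions, a scorer result of `{ score: true }` would appear in the summary as a numeric `1` carrying the `eval.score.is_boolean` flag, while a numeric score such as `0.5` would pass through with no metadata attached.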
Cursor Bugbot has reviewed your changes and found 1 potential issue.
thesollyz approved these changes on Feb 25, 2026
lukasmalkmus pushed a commit that referenced this pull request on Feb 25, 2026
🤖 I have created a release *beep* *boop*

---

## [0.46.1](axiom-v0.46.0...axiom-v0.46.1) (2026-02-25)

### Bug Fixes

* **ai:** move online eval scorer counters to eval.* namespace ([#264](#264)) ([bef94db](bef94db))
* **ai:** normalize boolean scores in onlineEval scoresSummary ([#263](#263)) ([ff75842](ff75842))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

> [!NOTE]
> **Low Risk**
> Release metadata/changelog-only changes with no functional code modifications in this PR.
>
> **Overview**
> Publishes `packages/ai` version `0.46.1` by updating the release manifest, `package.json` version, and `CHANGELOG.md`.
>
> The changelog for `0.46.1` notes two bug fixes: moving online eval scorer counters into the `eval.*` namespace and normalizing boolean scores in `onlineEval` `scoresSummary`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 90b0bd1. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
Overview
* `onlineEval()` was writing raw boolean scores (`true`/`false`) into the parent eval span's `eval.case.scores` attribute, while child scorer spans correctly normalized them to `1`/`0` with `eval.score.is_boolean` metadata via `normalizeBooleanScore()`
* Add a `normalizeBooleanScore()` call when building `scoresSummary` so both parent and child spans produce consistent numeric scores

Note
Low Risk
Small telemetry-only change that affects how scores are serialized into span attributes; low risk aside from potential downstream expectations of boolean values.
Overview
Ensures `onlineEval()` writes consistent numeric scores into the parent eval span's `eval.case.scores` summary by normalizing boolean `score` values (`true`/`false` → `1`/`0`) and propagating the corresponding `eval.score.is_boolean` metadata.
This updates `onlineEval.ts` to call `normalizeBooleanScore()` while building `scoresSummary`, and only emits normalized metadata when non-empty.
Written by Cursor Bugbot for commit bfa6ce7. This will update automatically on new commits.