feat: simplify evaluation schema to flat score/reasoning shape by jsonbailey · Pull Request #1286 · launchdarkly/js-core

jsonbailey · 2026-04-16T18:22:05Z

Summary

Removed the metric key from the structured output schema. EvaluationSchemaBuilder.build() no longer takes an evaluationMetricKey parameter. Since there is only ever a single evaluation metric key per judge config, it does not need to be embedded in the schema sent to the LLM.
Flattened the schema to a top-level {score, reasoning} shape. The old nested structure ({evaluations: {metricKey: {score, reasoning}}}) is replaced with a simple {score: number, reasoning: string} object. This is easier for LLMs to produce correctly and matches the Python SDK change in python-server-sdk-ai#105 ("fix: Remove evaluation metric key from schema which failed on some LLMs").
Updated parsing in Judge.ts. _parseEvaluationResponse now reads score and reasoning directly from the top-level response data. The metric key is still sourced from the judge config's evaluationMetricKey and used to key the result — it just no longer appears in the schema or LLM response.
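
The flattened contract described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in Judge.ts: the names FlatEvaluation, EvalResults, and parseFlatEvaluation are hypothetical, and the real SDK types may carry more fields.

```typescript
// Hypothetical types for illustration; the real SDK types may differ.
interface FlatEvaluation {
  score: number;
  reasoning: string;
}

// Results stay keyed by metric key, but the key comes from the judge config,
// not from the LLM response.
type EvalResults = Record<string, FlatEvaluation>;

function parseFlatEvaluation(
  data: unknown,
  evaluationMetricKey: string,
): EvalResults | undefined {
  if (typeof data !== 'object' || data === null) {
    return undefined;
  }
  // Read score and reasoning directly from the top level of the response.
  const { score, reasoning } = data as Record<string, unknown>;
  if (typeof score !== 'number' || typeof reasoning !== 'string') {
    return undefined;
  }
  return { [evaluationMetricKey]: { score, reasoning } };
}

// The new flat shape parses.
const flat = parseFlatEvaluation(
  { score: 0.9, reasoning: 'Relevant answer' },
  'relevance',
);

// The old nested shape has no top-level score, so it is rejected.
const nested = parseFlatEvaluation(
  { evaluations: { relevance: { score: 0.9, reasoning: 'Relevant answer' } } },
  'relevance',
);
```

In this sketch the metric key never reaches the model: it is only used after parsing, to key the result.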

Test plan

All 144 existing tests pass (yarn workspace @launchdarkly/server-sdk-ai test)
Lint passes (yarn workspace @launchdarkly/server-sdk-ai lint)
Test mocks updated to use new flat response shape
_parseEvaluationResponse unit tests updated for simplified signature and data shape

🤖 Generated with Claude Code

Note

Medium Risk
Changes the structured response contract and parsing for judge evaluations; any callers/providers still emitting the old nested evaluations shape will now fail evaluation parsing.

Overview
Simplifies judge structured-output handling by switching the expected/provider schema from nested evaluations[metricKey]{score,reasoning} to a flat top-level {score, reasoning} object, and removes the dynamic EvaluationSchemaBuilder entirely.

Judge.evaluate now always invokes the provider with the static schema and parses score/reasoning directly; failures log a more specific "Could not parse evaluation response" warning. Tests are updated to use the new response shape and to assert the new warning behavior for missing/malformed responses.
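
The flow in this overview (static schema, pure parse, one warning on failure) might look like the sketch below. The SketchLogger and SketchProvider interfaces, the invokeStructured method, the exact schema body, and the trackData parameter are all assumptions made for illustration; only the warning text "Could not parse evaluation response" is taken from the description above.

```typescript
// Hypothetical minimal interfaces; the SDK's provider and logger types are richer.
interface SketchLogger {
  warn(message: string): void;
}

interface SketchProvider {
  invokeStructured(prompt: string, schema: object): Promise<{ data: unknown }>;
}

// Assumed JSON Schema for the flat shape; the actual EVALUATION_SCHEMA may differ.
const SKETCH_EVALUATION_SCHEMA = {
  type: 'object',
  properties: {
    score: { type: 'number' },
    reasoning: { type: 'string' },
  },
  required: ['score', 'reasoning'],
  additionalProperties: false,
} as const;

interface ScoreAndReasoning {
  score: number;
  reasoning: string;
}

// Pure parser: returns undefined on any mismatch and never logs.
function parseScoreAndReasoning(data: unknown): ScoreAndReasoning | undefined {
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return undefined;
  }
  const { score, reasoning } = data as Record<string, unknown>;
  if (typeof score !== 'number' || typeof reasoning !== 'string') {
    return undefined;
  }
  return { score, reasoning };
}

// The caller owns the single warning, so the parser stays side-effect free.
async function evaluateOnce(
  provider: SketchProvider,
  logger: SketchLogger,
  evaluationMetricKey: string,
  prompt: string,
  trackData: Record<string, unknown> = {},
): Promise<Record<string, ScoreAndReasoning> | undefined> {
  const response = await provider.invokeStructured(prompt, SKETCH_EVALUATION_SCHEMA);
  const parsed = parseScoreAndReasoning(response.data);
  if (parsed === undefined) {
    const context = JSON.stringify({ trackData, data: response.data });
    logger.warn(`Could not parse evaluation response: ${context}`);
    return undefined;
  }
  return { [evaluationMetricKey]: parsed };
}
```

Keeping the warning in the caller means a malformed response produces one log line with the surrounding context, rather than one line per missing field.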

Reviewed by Cursor Bugbot for commit 013a80d. Bugbot is set up for automated code reviews on this repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-04-16T18:23:44Z

@launchdarkly/js-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 25623 bytes
Compressed size limit: 29000 bytes
Uncompressed size: 125843 bytes

github-actions · 2026-04-16T18:23:56Z

@launchdarkly/js-client-sdk size report
This is the brotli compressed size of the ESM build.
Compressed size: 31655 bytes
Compressed size limit: 34000 bytes
Uncompressed size: 112792 bytes

github-actions · 2026-04-16T18:24:01Z

@launchdarkly/browser size report
This is the brotli compressed size of the ESM build.
Compressed size: 179375 bytes
Compressed size limit: 200000 bytes
Uncompressed size: 829982 bytes

github-actions · 2026-04-16T18:24:04Z

@launchdarkly/js-client-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 37169 bytes
Compressed size limit: 38000 bytes
Uncompressed size: 204305 bytes


Delete EvaluationSchemaBuilder.ts and define EVALUATION_SCHEMA as a module-level const in Judge.ts. Remove per-field warnings from _parseEvaluationResponse (keep it pure) and emit a single warning in evaluate() that includes the judge key and raw response data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit d81b202.


joker23

only nits


feat: simplify evaluation schema to flat score/reasoning shape

ea9f2d6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jsonbailey and others added 2 commits April 16, 2026 16:07

chore: remove unnecessary comment from EvaluationSchemaBuilder

f3a8bf3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jsonbailey marked this pull request as ready for review April 16, 2026 21:55

jsonbailey requested a review from a team as a code owner April 16, 2026 21:55

cursor bot reviewed Apr 16, 2026


Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated

jsonbailey and others added 2 commits April 17, 2026 09:47

fix: include tracker data in parse-failure warning

e115ec8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: remove redundant judge key from parse-failure warning

b626773

configKey is already present in tracker.getTrackData(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joker23 approved these changes Apr 17, 2026


Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated

Comment thread packages/sdk/server-ai/src/api/judge/Judge.ts Outdated

chore: use as const for schema, add array guard in parse

013a80d

Address review nits: narrow EVALUATION_SCHEMA type with as const instead of Record<string, unknown>, and add Array.isArray check in _parseEvaluationResponse. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
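
The two nits addressed in this commit can be illustrated with a small sketch. The schema bodies and the isPlainRecord helper below are hypothetical stand-ins, not the contents of Judge.ts. They show why `as const` gives a narrower type than Record<string, unknown>, and why an Array.isArray check is needed on top of a typeof check.

```typescript
// Widened: every property is typed `unknown`, so the compiler cannot check field access.
const looseSchema: Record<string, unknown> = {
  type: 'object',
  required: ['score', 'reasoning'],
};

// Narrowed: `as const` keeps literal types and marks the object as readonly.
const strictSchema = {
  type: 'object',
  required: ['score', 'reasoning'],
} as const;

// With the narrowed schema, the required field names are available as a union type.
type RequiredField = (typeof strictSchema.required)[number];
const exampleField: RequiredField = 'score';

// `typeof []` is 'object', so an object check alone lets arrays through.
function isPlainRecord(data: unknown): data is Record<string, unknown> {
  return typeof data === 'object' && data !== null && !Array.isArray(data);
}

const looseType = typeof looseSchema['type']; // runtime type only; statically `unknown`
const arrayLooksLikeObject = typeof [] === 'object';
```

Without the array guard, a response such as [0.9, "good"] would pass the object check and only fail later when score and reasoning turn out to be missing.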

jsonbailey merged commit 524c99e into feat/ai-sdk-next-release Apr 17, 2026
44 checks passed

jsonbailey deleted the jb/aic-2253/simplify-eval-schema branch April 17, 2026 16:43

jsonbailey merged 6 commits into feat/ai-sdk-next-release from jb/aic-2253/simplify-eval-schema
