Fix Eval trace structure inconsistencies by delner · Pull Request #119 · braintrustdata/braintrust-sdk-ruby

David Elner (delner) · 2026-03-17T01:13:23Z

The Eval trace structure has numerous inconsistencies when compared with traces from the Java SDK (the other OTel-based implementation), Python, and TypeScript.

Changes

| Issue | Before | After |
| --- | --- | --- |
| Single shared score span | All scorers ran inside one `"score"` span with all scores aggregated on it | Each scorer gets its own `"score"` span as a direct child of the eval span, matching Java/Python/TS |
| Missing `purpose: "scorer"` on score spans | `span_attributes` only had `{type: "score"}` | `span_attributes` includes `{type: "score", name: scorer_name, purpose: "scorer"}`, which the platform uses to filter scorer spans from cost/latency calculations |
| Missing scorer input/output on score spans | Score span had no `input_json` or `output_json` | Each score span logs `input_json` (input, expected, output, metadata) and `output_json` (scores hash), matching Python/TS expected output |
| Eval span `input_json` not wrapped | Raw value (e.g., `"hello"`) | Wrapped as `{input: "hello"}`, matching Java SDK |
| Eval span `output_json` not wrapped | Raw value (e.g., `"HELLO"`) | Wrapped as `{output: "HELLO"}`, matching Java SDK |
| Missing `metadata` on eval span | Case metadata not logged on the eval span | Case metadata set as `braintrust.metadata` on the eval span, matching Java SDK |
| Missing `output_json` on eval span when task errors | No `output_json` attribute set at all | Sets `{output: null}`, matching Java SDK |
| Eval span attributes missing on task error | `span_attributes`, `input_json`, `expected`, `metadata`, `origin` were set after task+scorers, so they were skipped on task error | All known attributes set before task execution so they're present regardless of task outcome |
| Eval spans not isolated from ambient trace context | Used `tracer.in_span("eval")` which inherits any active parent span (e.g., a Sidekiq job span) | Uses `tracer.start_root_span("eval")` so each eval case starts its own independent trace, matching Java's `setNoParent()` |
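
To make the "After" column concrete, here is a minimal plain-Ruby sketch of the resulting span shapes. It does not use the SDK or OpenTelemetry: `FakeSpan` and `trace_eval_case` are hypothetical names invented for illustration, the attribute keys are the short names used in the table, and two details are assumptions rather than confirmed behavior (that scorers are skipped when the task fails, and that the scores hash maps scorer name to score).

```ruby
require "json"

# Hypothetical stand-in for an OTel span: a name, a parent, and attributes.
FakeSpan = Struct.new(:name, :parent, :attributes)

# Builds the spans for one eval case. `scorers` maps a scorer name to a
# callable taking (input, expected, output) and returning a numeric score.
def trace_eval_case(input:, expected:, metadata:, task:, scorers:)
  # Root span: no parent, so ambient trace context is not inherited.
  eval_span = FakeSpan.new("eval", nil, {})
  spans = [eval_span]

  # Set everything known up front so it survives a task error.
  eval_span.attributes["input_json"] = JSON.generate({input: input})
  eval_span.attributes["expected"] = JSON.generate(expected)
  eval_span.attributes["braintrust.metadata"] = JSON.generate(metadata)

  begin
    output = task.call(input)
  rescue StandardError
    # Task failed: still record a wrapped null output.
    eval_span.attributes["output_json"] = JSON.generate({output: nil})
    return spans # assumption: scorers are skipped when the task fails
  end
  eval_span.attributes["output_json"] = JSON.generate({output: output})

  # One "score" span per scorer, each a direct child of the eval span.
  scorers.each do |scorer_name, scorer|
    score = scorer.call(input, expected, output)
    spans << FakeSpan.new("score", eval_span, {
      "span_attributes" => JSON.generate({type: "score", name: scorer_name, purpose: "scorer"}),
      "input_json" => JSON.generate({input: input, expected: expected, output: output, metadata: metadata}),
      # Assumption: the scores hash maps scorer name to score.
      "output_json" => JSON.generate({scorer_name => score})
    })
  end
  spans
end

# Usage: one case, one scorer, printing each span's attributes.
upcase = ->(input) { input.upcase }
exact = ->(_input, expected, output) { output == expected ? 1.0 : 0.0 }
spans = trace_eval_case(input: "hello", expected: "HELLO", metadata: {"lang" => "en"},
                        task: upcase, scorers: {"exact_match" => exact})
spans.each { |span| puts "#{span.name}: #{span.attributes}" }
```

Because the eval span's attributes are written before `task.call`, a raising task still leaves `input_json`, `expected`, and `braintrust.metadata` on the span, which is the ordering fix described in the table.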

Fixed: Eval trace format inconsistent with other SDKs

74b588b

David Elner (delner) requested review from Abhijeet Prasad (AbhiPrasad), Matt Perpick (clutchski) and Andrew Kent (realark) March 17, 2026 01:13

David Elner (delner) self-assigned this Mar 17, 2026

David Elner (delner) added the bug Something isn't working label Mar 17, 2026

Abhijeet Prasad (AbhiPrasad) approved these changes Mar 17, 2026

David Elner (delner) mentioned this pull request Mar 17, 2026

Add purpose tag to Eval score spans braintrustdata/braintrust-sdk-java#48

Merged

David Elner (delner) merged commit 2d9c2a7 into main Mar 17, 2026
7 checks passed

David Elner (delner) deleted the fix/eval_trace_structure branch March 17, 2026 20:03

This was referenced Mar 17, 2026

Scorer spans use the scorer name (not "score") #121

Merged

Align trace structure for Evals with other SDKs braintrustdata/braintrust-sdk-go#43

Open

David Elner (delner) merged 1 commit into main from fix/eval_trace_structure
