fix: Preserve custom dataset fields in workflow output #1628
Fixes NVIDIA#1385

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>
Validation

Reproduced and validated locally before pushing:
- Bug reproduction (before fix)
- Fix validation (after fix)
- Full test suite: 10/10 dataset handler tests + 23/23 evaluate tests pass.
No actionable comments were generated in the recent review. 🎉

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks: ✅ 5 passed
/ok to test cc958a8

/merge
Summary
Fixes #1385
When using `--skip_workflow` to re-run evaluators on existing output, custom dataset fields like `difficulty` and `category` were getting dropped from `full_dataset_entry`. This happened because `publish_eval_input()` only wrote the 6 structured keys to `workflow_output.json`, discarding everything else.

The fix merges `full_dataset_entry` as the base dict before overlaying the structured keys, so custom fields are preserved while structured fields still take precedence.

Validation
Wrote a reproduction script that confirms:
- Before the fix, `workflow_output.json` only contains `[id, question, answer, generated_answer, intermediate_steps, expected_intermediate_steps]`; custom fields like `difficulty`, `category`, `tags` are missing
- After the fix, custom fields are present in `workflow_output.json` and survive the `--skip_workflow` reload
- Structured fields (e.g. `generated_answer`) still override any stale values from `full_dataset_entry`

All existing tests pass (10 dataset handler tests + 23 evaluate tests).
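The merge order described in the Summary can be illustrated with a minimal sketch. This is not the code from the PR: `build_output_entry` and the sample data are hypothetical, and only the six key names are taken from the description above. It shows the dataset entry used as the base dict, the structured keys overlaid on top, and a JSON round trip standing in for writing and reloading the output file.

```python
import json

# The six structured keys listed in the PR description.
STRUCTURED_KEYS = (
    "id",
    "question",
    "answer",
    "generated_answer",
    "intermediate_steps",
    "expected_intermediate_steps",
)


def build_output_entry(full_dataset_entry: dict, structured: dict) -> dict:
    """Hypothetical helper: dataset entry as the base, structured keys overlaid."""
    entry = dict(full_dataset_entry)  # copy so the caller's dict is not mutated
    # Structured values win over anything with the same key in the base dict.
    entry.update({key: structured[key] for key in STRUCTURED_KEYS if key in structured})
    return entry


# Illustrative data only: a dataset row with custom fields and a stale answer.
full_dataset_entry = {
    "id": 1,
    "question": "What is 2 + 2?",
    "answer": "4",
    "generated_answer": "stale",
    "difficulty": "easy",
    "category": "math",
}
structured = {
    "id": 1,
    "question": "What is 2 + 2?",
    "answer": "4",
    "generated_answer": "4",
    "intermediate_steps": [],
    "expected_intermediate_steps": [],
}

entry = build_output_entry(full_dataset_entry, structured)

# Round trip through JSON to mimic writing the output file and reloading it.
reloaded = json.loads(json.dumps([entry]))[0]
```

Copying the base dict first and calling `update` second is what gives structured fields precedence; reversing the order would let a stale `generated_answer` from the dataset entry overwrite the fresh one.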
Test plan
- … `full_dataset_entry` when using `--skip_workflow`
- … `publish_eval_input()`

Summary by CodeRabbit
Release Notes