fix: Preserve custom dataset fields in workflow output by bledden · Pull Request #1628 · NVIDIA/NeMo-Agent-Toolkit

bledden · 2026-02-21T09:02:04Z

Summary

When using --skip_workflow to re-run evaluators on existing output, custom dataset fields like difficulty and category were getting dropped from full_dataset_entry. This happened because publish_eval_input() only wrote the 6 structured keys to workflow_output.json, discarding everything else.

The fix merges full_dataset_entry as the base dict before overlaying the structured keys, so custom fields are preserved while structured fields still take precedence.

Validation

Wrote a reproduction script that confirms:

Before fix: workflow_output.json only contains [id, question, answer, generated_answer, intermediate_steps, expected_intermediate_steps] -- custom fields like difficulty, category, tags are missing
After fix: all custom fields are preserved in workflow_output.json and survive the --skip_workflow reload
Structured fields (like generated_answer) still override any stale values from full_dataset_entry

All existing tests pass (10 dataset handler tests + 23 evaluate tests).

Test plan

Reproduce bug: custom fields missing from full_dataset_entry when using --skip_workflow
Validate fix: custom fields preserved after round-trip through publish_eval_input()
Existing test suite passes
ruff + yapf clean

Summary by CodeRabbit

Release Notes

Bug Fixes
- Fixed data preservation in structured input processing to ensure additional dataset fields are now retained during workflow operations and round-trip processing.

Fixes NVIDIA#1385 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>

copy-pr-bot · 2026-02-21T09:02:09Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

bledden · 2026-02-21T09:02:14Z

Validation

Reproduced and validated locally before pushing:

Bug reproduction (before fix):

Published workflow_output.json fields:
  Keys: ['id', 'question', 'answer', 'generated_answer', 'intermediate_steps', 'expected_intermediate_steps']
BUG CONFIRMED: Custom fields 'difficulty' and 'category' are MISSING

Fix validation (after fix):

Published workflow_output.json fields:
  Keys: ['id', 'question', 'answer', 'difficulty', 'category', 'tags', 'generated_answer', 'intermediate_steps', 'expected_intermediate_steps']
FIX WORKS: Custom fields are preserved
  Structured fields also correct (output_obj wins over full_dataset_entry)

Full test suite: 10/10 dataset handler tests + 23/23 evaluate tests pass.

coderabbitai · 2026-02-21T09:02:21Z

No actionable comments were generated in the recent review. 🎉

Walkthrough

Modified the parse_if_json_string function to spread item.full_dataset_entry into the output dictionary before setting standard keys, ensuring additional fields from the original dataset are preserved during structured input processing and --skip_workflow operations.

Changes

Cohort / File(s)	Summary
Dataset Handler Structured Input `packages/nvidia_nat_eval/src/nat/plugins/eval/dataset_handler/dataset_handler.py`	Changed `parse_if_json_string` to spread `item.full_dataset_entry` dictionary into the output item before assigning standard keys (id, question, answer, generated_answer, trajectory, expected_trajectory), preserving additional custom fields from the original dataset.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: preserving custom dataset fields in workflow output, uses imperative mood with the 'fix:' prefix, and is well within the 72-character limit at 54 characters.
Linked Issues check	✅ Passed	The PR fully addresses `#1385` by modifying parse_if_json_string to merge full_dataset_entry as the base dictionary before overlaying structured fields, ensuring custom fields from the original dataset are preserved during --skip_workflow operations.
Out of Scope Changes check	✅ Passed	All changes in dataset_handler.py are directly scoped to fixing the issue: only the parse_if_json_string function logic was modified to preserve custom fields, with no unrelated alterations introduced.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

willkill07 · 2026-02-21T13:48:09Z

/ok to test cc958a8

willkill07 · 2026-02-22T03:01:05Z

/merge

fix: Preserve custom dataset fields in workflow_output.json

cc958a8

Fixes NVIDIA#1385 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>

bledden requested a review from a team as a code owner February 21, 2026 09:02

willkill07 added bug Something isn't working non-breaking Non-breaking change labels Feb 21, 2026

willkill07 approved these changes Feb 21, 2026

View reviewed changes

willkill07 self-assigned this Feb 21, 2026

rapids-bot bot merged commit 1461190 into NVIDIA:develop Feb 22, 2026
22 of 23 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Preserve custom dataset fields in workflow output#1628

fix: Preserve custom dataset fields in workflow output#1628
rapids-bot[bot] merged 1 commit intoNVIDIA:developfrom
bledden:fix/skip-workflow-preserve-custom-fields

bledden commented Feb 21, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

copy-pr-bot bot commented Feb 21, 2026

Uh oh!

bledden commented Feb 21, 2026

Uh oh!

coderabbitai bot commented Feb 21, 2026 •

edited

Loading

Uh oh!

willkill07 commented Feb 21, 2026

Uh oh!

willkill07 commented Feb 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bledden commented Feb 21, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

copy-pr-bot bot commented Feb 21, 2026

Uh oh!

bledden commented Feb 21, 2026

Validation

Uh oh!

coderabbitai bot commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

willkill07 commented Feb 21, 2026

Uh oh!

willkill07 commented Feb 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bledden commented Feb 21, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 21, 2026 •

edited

Loading