
fix: Skip output directory cleanup when --skip_workflow is set #1627

Merged
rapids-bot[bot] merged 2 commits into NVIDIA:develop from bledden:fix/eval-skip-workflow-cleanup on Feb 22, 2026
Conversation


@bledden bledden commented Feb 21, 2026

Summary

Fixes #1587

I ran into this while testing eval workflows: running nat eval --skip_workflow without --dataset was silently deleting the workflow_output.json from a previous run. The cleanup step in run_and_evaluate() runs before the dataset is loaded, so by the time the code tries to load the previous output, it's already been wiped by shutil.rmtree().

Since --skip_workflow exists specifically to reuse existing workflow output, it doesn't make sense to clean up the output directory when that flag is set. This change skips cleanup when --skip_workflow is active and logs an info message explaining why.

Normal eval behavior (without --skip_workflow) is unchanged.
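
The guard described above can be sketched as follows. This is a minimal, simplified stand-in for the real `run_and_evaluate()` in `evaluate.py` (class shape and attribute layout are hypothetical here; the actual code reads `self.eval_config.general.output` and `self.config.skip_workflow`):

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)


class EvaluationRun:
    """Simplified stand-in for the evaluator touched by this PR."""

    def __init__(self, output_dir: Path, skip_workflow: bool):
        self.output_dir = output_dir
        self.skip_workflow = skip_workflow

    def cleanup_output_directory(self) -> None:
        # Deletes any previous run's artifacts, including workflow_output.json.
        if self.output_dir.exists():
            shutil.rmtree(self.output_dir)

    def run_and_evaluate(self) -> None:
        # Only clean up when the workflow output will be regenerated;
        # with --skip_workflow, the previous output is the input we need.
        if self.output_dir and not self.skip_workflow:
            self.cleanup_output_directory()
        elif self.skip_workflow:
            logger.info("Skipping output directory cleanup because --skip_workflow is set")
```

With `skip_workflow=True` the cleanup branch is never reached, so a `workflow_output.json` left by an earlier run survives for the dataset handler to read.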

Test plan

  • Run nat eval normally, confirm output directory cleanup still works
  • Run nat eval --skip_workflow after a previous run, confirm workflow_output.json is preserved
  • Existing eval tests pass

Summary by CodeRabbit

  • Bug Fixes
    • Corrected output directory cleanup so it is skipped when the workflow is intentionally bypassed, preserving generated files.
    • Added user-facing logs that indicate when cleanup operations are being skipped to improve transparency.

When running `nat eval --skip_workflow`, the output directory was being
cleaned up before the dataset was loaded, destroying the workflow_output.json
that the user intended to evaluate. The --skip_workflow flag exists to
reuse previous output, so cleaning it up is contradictory.

Skip cleanup when --skip_workflow is set and log an info message so the
user knows why cleanup was skipped.

Closes NVIDIA#1587

Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>
@bledden bledden requested a review from a team as a code owner February 21, 2026 07:57

copy-pr-bot bot commented Feb 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Feb 21, 2026

No actionable comments were generated in the recent review. 🎉


Walkthrough

Output directory cleanup in run_and_evaluate() is now conditional: cleanup_output_directory() is invoked only when self.eval_config.general.output is truthy and self.config.skip_workflow is False; if skip_workflow is True, cleanup is skipped and a log message is emitted.

Changes

Cohort / File(s) Summary
Output cleanup logic
packages/nvidia_nat_eval/src/nat/plugins/eval/runtime/evaluate.py
Changed run_and_evaluate() to call cleanup_output_directory() only when self.eval_config.general.output is truthy and self.config.skip_workflow is False. Added logging to indicate when cleanup is skipped due to skip_workflow.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks — ✅ 5 passed
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title concisely describes the fix in the imperative mood and, at 62 characters, is well within the 72-character limit.
  • Linked Issues Check — ✅ Passed: the PR addresses issue #1587 by preventing output directory cleanup when --skip_workflow is set, preserving the existing workflow_output.json as required.
  • Out of Scope Changes Check — ✅ Passed: all changes are scoped to the --skip_workflow cleanup fix; no unrelated functionality was modified.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, above the required 80.00% threshold.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
packages/nvidia_nat_eval/src/nat/plugins/eval/runtime/evaluate.py (1)

588-592: Misleading log when output config is absent.

The elif self.config.skip_workflow: branch fires whenever skip_workflow is True, regardless of whether self.eval_config.general.output is set. When output is falsy, cleanup would have been skipped anyway, so the log message "Skipping output directory cleanup because --skip_workflow is set" is misleading — it implies cleanup was about to happen.

Tighten the guard so the log only emits when cleanup would have actually been performed:

♻️ Proposed fix
-        # Cleanup the output directory (skip when reusing existing workflow output)
-        if self.eval_config.general.output and not self.config.skip_workflow:
-            self.cleanup_output_directory()
-        elif self.config.skip_workflow:
-            logger.info("Skipping output directory cleanup because --skip_workflow is set")
+        # Cleanup the output directory (skip when reusing existing workflow output)
+        if self.eval_config.general.output:
+            if self.config.skip_workflow:
+                logger.info("Skipping output directory cleanup because --skip_workflow is set")
+            else:
+                self.cleanup_output_directory()


bledden commented Feb 21, 2026

Validation

I wrote a targeted test to both reproduce the bug and validate the fix:

Bug reproduction (reverted fix):

  • With the original code, cleanup_output_directory() gets called even when skip_workflow=True
  • This deletes workflow_output.json before the dataset handler can read it

Fix validation (with the change):

  • skip_workflow=True: cleanup_output_directory is correctly NOT called, workflow_output.json survives
  • skip_workflow=False: cleanup_output_directory IS called as expected (normal behavior preserved)

Also ran the full eval test suite (test_evaluate.py); all 23 tests pass, including both parametrized cases test_run_and_evaluate[True] and test_run_and_evaluate[False].
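
A targeted test along these lines could be sketched as below. The `EvaluationRun` class here is a minimal, hypothetical stand-in mirroring the guard this PR adds (the real suite in test_evaluate.py exercises the actual evaluator); the mock just checks whether cleanup would have run:

```python
import logging
import shutil
from pathlib import Path
from unittest import mock

logger = logging.getLogger(__name__)


class EvaluationRun:
    # Hypothetical stand-in mirroring the conditional added in this PR.
    def __init__(self, output_dir: Path, skip_workflow: bool):
        self.output_dir = output_dir
        self.skip_workflow = skip_workflow

    def cleanup_output_directory(self) -> None:
        shutil.rmtree(self.output_dir, ignore_errors=True)

    def run_and_evaluate(self) -> None:
        if self.output_dir and not self.skip_workflow:
            self.cleanup_output_directory()
        elif self.skip_workflow:
            logger.info("Skipping output directory cleanup because --skip_workflow is set")


def test_cleanup_skipped_when_skip_workflow():
    # With --skip_workflow, cleanup must never fire, so prior output survives.
    run = EvaluationRun(Path("out"), skip_workflow=True)
    with mock.patch.object(run, "cleanup_output_directory") as cleanup:
        run.run_and_evaluate()
    cleanup.assert_not_called()


def test_cleanup_runs_normally():
    # Without the flag, cleanup behaves as before.
    run = EvaluationRun(Path("out"), skip_workflow=False)
    with mock.patch.object(run, "cleanup_output_directory") as cleanup:
        run.run_and_evaluate()
    cleanup.assert_called_once()
```

Reverting the guard (unconditional cleanup) makes the first test fail, which reproduces the original bug.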

@willkill07 willkill07 added bug Something isn't working non-breaking Non-breaking change labels Feb 21, 2026
@willkill07 willkill07 self-assigned this Feb 21, 2026
Use nested conditional for clearer logic flow.

Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>
@willkill07 (Member):

/ok to test 7e86fde

@willkill07 (Member):

/merge

@rapids-bot rapids-bot bot merged commit 351a943 into NVIDIA:develop Feb 22, 2026
17 checks passed

Labels

bug (Something isn't working), non-breaking (Non-breaking change)


Development

Successfully merging this pull request may close these issues.

nat eval --skip_workflow can delete existing trajectories

2 participants