Allow disabling DAG retry policy #2018

Merged
yottahmd merged 4 commits into main from issue-2016-allow-disabling-all-kinds-of-retries-for
Apr 20, 2026
Conversation

@yottahmd (Collaborator) commented Apr 20, 2026

Summary

  • allow root retry_policy.limit: 0 to disable scheduler-issued DAG retries
  • replace inherited base-config DAG retry policies wholesale so child DAGs can override inherited retries with limit: 0
  • update DAG JSON schema wording, schema tests, loader tests, retry scanner tests, integration coverage, and the bundled DAG skill schema reference
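The wholesale-replace point in the second bullet can be illustrated with a small sketch. It assumes a generic field-by-field merge skips zero-valued child fields, which is why a child `limit: 0` would otherwise be lost. `RetryPolicy`, `mergeFieldwise`, and `mergeWholesale` are hypothetical names for illustration only; this is plain Go and not the loader's actual merge code.

```go
package main

import "fmt"

// RetryPolicy is a hypothetical stand-in for the DAG-level retry policy.
// The real type in the repository may have different fields.
type RetryPolicy struct {
	Limit       int
	IntervalSec int
}

// mergeFieldwise mimics a generic struct merge that copies only non-zero
// child fields, so a child Limit of 0 cannot be told apart from "unset".
func mergeFieldwise(base, child RetryPolicy) RetryPolicy {
	out := base
	if child.Limit != 0 {
		out.Limit = child.Limit
	}
	if child.IntervalSec != 0 {
		out.IntervalSec = child.IntervalSec
	}
	return out
}

// mergeWholesale replaces the inherited policy entirely whenever the child
// declares one, so Limit: 0 survives the merge.
func mergeWholesale(base, child *RetryPolicy) *RetryPolicy {
	if child == nil {
		return base
	}
	replaced := *child
	return &replaced
}

func main() {
	base := RetryPolicy{Limit: 3, IntervalSec: 60}
	child := RetryPolicy{Limit: 0, IntervalSec: 60}

	// Prints 3: the inherited limit leaks through the field-wise merge.
	fmt.Println(mergeFieldwise(base, child).Limit)
	// Prints 0: the child's disabled policy replaces the base policy.
	fmt.Println(mergeWholesale(&base, &child).Limit)
}
```

Using a pointer to distinguish "no policy declared" from "policy declared with limit 0" is an assumption of this sketch, not something the PR description states.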

Closes #2016

User-facing docs

  • Updated separately in dagucloud/docs on main, in commit d03c0b6 (docs: clarify disabled DAG retries)

Testing

  • go test ./internal/core/spec ./internal/cmn/schema ./internal/core/exec ./internal/service/scheduler -count=1
  • go test ./internal/cmn/schema -count=1
  • go test ./internal/intg/queue -run 'TestSchedulerRetryScanner/DisabledByChildSkipsInheritedBaseRetryPolicy' -count=1
  • go test ./internal/intg/queue -run TestSchedulerRetryScanner -count=1
  • pnpm build in ./docs
  • git diff --check in the main repo and ./docs

Summary by CodeRabbit

Release Notes

  • New Features

    • DAG-level automatic retries can now be disabled by setting limit: 0 in the retry policy configuration.
  • Improvements

    • Enhanced retry policy validation to properly support zero limits for disabling retries.
  • Documentation

    • Added schema documentation for DAG-level retry policy configuration, including behavior when limit: 0.
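As a rough illustration of the "limit: 0 disables DAG-level retries" behavior in these notes, the sketch below shows one way a scanner-style check could treat a zero limit. `dagRetryPolicy` and `shouldEnqueueRetry` are hypothetical names, and treating a missing policy as "no retries" is an assumption; the scheduler's real retry scanner is not shown in this PR page.

```go
package main

import "fmt"

// dagRetryPolicy is a hypothetical, reduced view of a DAG-level retry policy.
type dagRetryPolicy struct {
	Limit int
}

// shouldEnqueueRetry reports whether another scheduler-issued retry may be
// enqueued. A nil policy (assumed to mean "not configured") and Limit 0
// (explicitly disabled) both skip enqueuing.
func shouldEnqueueRetry(p *dagRetryPolicy, attemptsSoFar int) bool {
	if p == nil || p.Limit == 0 {
		return false
	}
	return attemptsSoFar < p.Limit
}

func main() {
	disabled := &dagRetryPolicy{Limit: 0}
	twice := &dagRetryPolicy{Limit: 2}

	fmt.Println(shouldEnqueueRetry(disabled, 0)) // false
	fmt.Println(shouldEnqueueRetry(twice, 1))    // true
	fmt.Println(shouldEnqueueRetry(twice, 2))    // false
}
```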

@coderabbitai (bot) commented Apr 20, 2026

Walkthrough

This PR enables DAG-level automatic retries to be disabled by allowing retry_policy.limit: 0. Changes include updating the JSON schema constraints to permit zero values, modifying parsing logic to conditionally accept zero, adjusting merge behavior to overwrite retry policies wholesale, and adding comprehensive test coverage across schema validation, spec loading, execution, and scheduling layers.
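The schema constraint described here (an integer with minimum 0, or a string matching `^[0-9]+$`) can be approximated in Go as follows. This is a sketch of the rule only; `limitMatchesSchema` is a hypothetical helper and the repository validates against the JSON schema itself rather than code like this.

```go
package main

import (
	"fmt"
	"regexp"
)

// digitsOnly mirrors the string pattern quoted in the walkthrough.
var digitsOnly = regexp.MustCompile(`^[0-9]+$`)

// limitMatchesSchema approximates the two accepted variants for
// retry_policy.limit: a non-negative integer or a digits-only string.
func limitMatchesSchema(v any) bool {
	switch x := v.(type) {
	case int:
		return x >= 0
	case string:
		return digitsOnly.MatchString(x)
	default:
		return false
	}
}

func main() {
	for _, v := range []any{0, "0", 3, -1, "-1", "abc"} {
		fmt.Printf("%v (%T): %v\n", v, v, limitMatchesSchema(v))
	}
}
```

Under this rule, `0` and `"0"` are accepted, while `-1`, `"-1"`, and `"abc"` are rejected, which matches the cases the schema tests are described as covering.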

Changes

| Cohort / File(s) | Summary |
|---|---|
| Schema Definition: internal/cmn/schema/dag.schema.json | Updated dagRetryPolicy.limit to allow minimum: 0 for integer variant and stricter ^[0-9]+$ pattern for string variant. Updated field description to clarify 0 disables automatic DAG-level retries. |
| Schema Validation Tests: internal/cmn/schema/dag_schema_test.go | Added 66 lines of test cases validating schema constraints for retry_policy.limit, including zero values (numeric and string), negative values, and non-numeric strings; also validates interval_sec and max_interval_sec constraints. |
| Spec Parsing Logic: internal/core/spec/dag.go | Added allowZero parameter to parseConcreteDAGRetryInt to conditionally allow zero. Updated call sites: retry_policy.limit now allows zero (allowZero=true), while interval_sec and max_interval_sec reject zero (allowZero=false). |
| DAG Loading & Merging: internal/core/spec/loader.go, internal/core/spec/loader_test.go | Added custom mergo transformer for core.DAGRetryPolicy to overwrite entire retry policy object instead of merging fields individually, enabling child DAGs to disable inherited retry policies via limit: 0. Added 126 lines of test cases for inheritance and normalization scenarios. |
| Execution Status: internal/core/exec/runstatus_test.go | Added test case TestInitialStatusSnapshotsDisabledDAGRetryPolicy verifying that disabled retry policy (Limit: 0) is correctly snapshotted into execution status. |
| Retry Scheduling: internal/service/scheduler/retry_scanner_test.go | Added test coverage for zero-limit retry policy behavior, ensuring disabled retries (limit: 0) skip enqueuing and maintain zero retry counts. |
| Documentation: skills/dagu/references/schema.md | Added documentation for top-level retry_policy field, including description of the limit: 0 disable behavior. |
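The allowZero behavior summarized for the spec parsing change might look roughly like the sketch below. `parseRetryInt` is a hypothetical analogue written for illustration; the real parseConcreteDAGRetryInt signature, input type, and error messages are not shown on this page and may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseRetryInt parses a retry-policy integer field. When allowZero is true
// (as described for retry_policy.limit) zero is accepted; when false (as
// described for interval_sec and max_interval_sec) the value must be positive.
func parseRetryInt(field, raw string, allowZero bool) (int, error) {
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s: %q is not an integer", field, raw)
	}
	lowest := 1
	if allowZero {
		lowest = 0
	}
	if n < lowest {
		return 0, fmt.Errorf("%s: must be at least %d, got %d", field, lowest, n)
	}
	return n, nil
}

func main() {
	// limit: 0 is accepted because zero is allowed for this field.
	fmt.Println(parseRetryInt("retry_policy.limit", "0", true))
	// interval_sec: 0 is rejected because zero is not allowed here.
	fmt.Println(parseRetryInt("retry_policy.interval_sec", "0", false))
}
```

Passing the flag per call site keeps a single parsing helper while letting only the limit field carry the "0 means disabled" meaning.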

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Allow disabling DAG retry policy' accurately summarizes the main change: enabling retry_policy.limit: 0 to disable scheduler-issued DAG retries, which is the primary objective of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@coderabbitai (bot) left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/cmn/schema/dag.schema.json (1)

890-895: ⚠️ Potential issue | 🟡 Minor

Keep step retry-policy wording step-specific.

stepRetryPolicy is used for defaults.retry_policy and per-step retry_policy, so this description currently tells users that a step retry limit controls DAG-level scheduler retries. That can mislead schema-driven docs/editors.

📝 Proposed wording fix
         "limit": {
           "oneOf": [
             { "type": "integer" },
             { "type": "string" }
           ],
-          "description": "Maximum number of scheduler-issued DAG retry attempts. Use 0 to disable DAG-level automatic retries."
+          "description": "Maximum number of retry attempts for a failed step."
         },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/cmn/schema/dag.schema.json` around lines 890 - 895, The description
for the "limit" property currently implies DAG-level scheduler retry behavior;
update it to be step-specific by clarifying that this is the maximum number of
retry attempts for an individual step's retry policy (used by
defaults.retry_policy and per-step retry_policy), and note that a value of 0
disables automatic retries for that step—refer to the "limit" property inside
stepRetryPolicy to locate the text to change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0758face-aeec-49d8-a42c-2e62e59a0a8c

📥 Commits

Reviewing files that changed from the base of the PR and between 2ba579a and 8a92813.

📒 Files selected for processing (8)
  • internal/cmn/schema/dag.schema.json
  • internal/cmn/schema/dag_schema_test.go
  • internal/core/exec/runstatus_test.go
  • internal/core/spec/dag.go
  • internal/core/spec/loader.go
  • internal/core/spec/loader_test.go
  • internal/service/scheduler/retry_scanner_test.go
  • skills/dagu/references/schema.md

@yottahmd yottahmd merged commit eefbef9 into main Apr 20, 2026
16 of 17 checks passed
@yottahmd yottahmd deleted the issue-2016-allow-disabling-all-kinds-of-retries-for branch April 20, 2026 08:59
Development

Successfully merging this pull request may close these issues.

Allow disabling all kinds of retries for a DAG

1 participant