@divyanshu-tiwari
Contributor

Description

Added documentation for the DAG feature.

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation and I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • I have checked downstream dependencies (e.g. ExternalTaskSensors) by searching for DAG name elsewhere in the repo

hladush and others added 2 commits November 27, 2025 11:37
* Initial implementation

* Improvements

* add test cases and validation (#19)

* add test cases and validation

* update validation and test cases

* fix test file path

* add check for leading arrows and fix test cases

* correct arrow count comparison

* remove duplicate test cases

* better group validation function

* split functions and add index to errors

* refactor test cases

* refactor test cases

* make whitespaces vary

* better validation function

* abstract running tasks concurrently

* do not set output for the leaf tasks

* check bounds and check leading arrow first

* fix typos

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* address PR suggestions

---------

Co-authored-by: Divyanshu Tiwari <33171967+divyanshu-tiwari@users.noreply.github.com>
Co-authored-by: Divyanshu Tiwari <divyanshu20.tiwari@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings November 27, 2025 06:11
@divyanshu-tiwari divyanshu-tiwari merged commit e0d5112 into main Nov 27, 2025
6 checks passed
@divyanshu-tiwari divyanshu-tiwari deleted the dag-documentation branch November 27, 2025 06:14
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces comprehensive documentation for the experimental DAG (Directed Acyclic Graph) feature, which enables complex pipeline architectures with parallel processing, branching, and merging capabilities.

Key changes:

  • Added a dedicated DAG_README.md file with detailed documentation covering syntax, patterns, validation rules, best practices, and troubleshooting
  • Updated the main README.md with a brief introduction to DAG functionality and example usage
  • Documented the DAG syntax operators (>> for sequential, [] for parallel) and common execution patterns

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| README.md | Added new section introducing the experimental DAG feature with basic syntax examples and a reference to the detailed documentation |
| DAG_README.md | Created comprehensive documentation covering DAG syntax, patterns, configuration examples, validation rules, migration guide, troubleshooting, and best practices |


1. **Syntax Errors**
```
Error: invalid DAG groups: error at index X, unmatched closing brace ']' found
```

Copilot AI Nov 27, 2025

Terminology error: "brace" should be "bracket". The error message refers to `]`, which is a bracket, not a brace; braces are `{}`.

Suggested change
Error: invalid DAG groups: error at index X, unmatched closing brace ']' found
Error: invalid DAG groups: error at index X, unmatched closing bracket ']' found
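
The index reported in a message like the one above can come from a single left-to-right scan. The following is a minimal sketch, assuming a stack-based check; `find_unmatched_bracket` is a hypothetical helper, not the project's validator, and it tolerates nested groups, which the real validation rules may reject.

```python
from typing import Optional


def find_unmatched_bracket(expr: str) -> Optional[int]:
    """Return the index of the first unmatched '[' or ']', or None if balanced.

    Hypothetical helper for illustration only; the project's own validator
    may apply stricter rules (for example, rejecting nested groups).
    """
    open_positions = []  # indices of '[' characters that are still open
    for index, char in enumerate(expr):
        if char == "[":
            open_positions.append(index)
        elif char == "]":
            if not open_positions:
                # A closing bracket with nothing open: report it immediately.
                return index
            open_positions.pop()
    # Any '[' left on the stack was never closed; report the earliest one.
    return open_positions[0] if open_positions else None
```

A caller could format the returned index into the error text, for example `f"error at index {i}, unmatched closing bracket ']' found"`.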

Copilot uses AI. Check for mistakes.
only_data: true

# DAG definition
dag: read_csv_file >> [split_to_lines, echo] >> convert_from_csv >> echo2

Copilot AI Nov 27, 2025

The example DAG `read_csv_file >> [split_to_lines, echo] >> convert_from_csv >> echo2` appears to have a logical issue. The `split_to_lines` task produces multiple records (it splits input into lines), but then both it and `echo` feed into `convert_from_csv`. This creates a fan-in pattern where `convert_from_csv` would receive data from two sources, which may not be the intended behavior. Consider clarifying this example or using a more straightforward pattern that better demonstrates the DAG feature without potential confusion about data flow.

Suggested change
dag: read_csv_file >> [split_to_lines, echo] >> convert_from_csv >> echo2
dag: read_csv_file >> split_to_lines >> convert_from_csv >> echo2
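
To make the fan-in concern concrete, the sketch below expands a DAG expression into its edges. It assumes that every task in one stage feeds every task in the next stage, which matches the reading in the comment above but is not confirmed by the source; `dag_edges` is a hypothetical helper and performs no validation.

```python
def dag_edges(expr: str) -> list[tuple[str, str]]:
    """Expand 'a >> [b, c] >> d' into (upstream, downstream) edges.

    Assumption for illustration: every task in a stage is connected to
    every task in the following stage. Input is assumed to be valid.
    """
    stages = []
    for part in expr.split(">>"):
        part = part.strip()
        if part.startswith("[") and part.endswith("]"):
            # A bracketed group: tasks that run in parallel.
            names = [name.strip() for name in part[1:-1].split(",")]
        else:
            names = [part]
        stages.append(names)

    edges = []
    for upstream, downstream in zip(stages, stages[1:]):
        for source in upstream:
            for target in downstream:
                edges.append((source, target))
    return edges


original = "read_csv_file >> [split_to_lines, echo] >> convert_from_csv >> echo2"
inputs_to_convert = [src for src, dst in dag_edges(original) if dst == "convert_from_csv"]
print(inputs_to_convert)  # the two upstream tasks that fan in
```

Under this assumption, `convert_from_csv` has two upstream tasks in the original expression and only one in the suggested replacement.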

Comment on lines +99 to +103
- Sequential: `task1 >> task2 >> task3`
- Parallel: `[task1, task2, task3]`
- Fan-out: `task1 >> [task2, task3]`
- Fan-in: `[task1, task2] >> task3`
- Diamond: `task1 >> [task2, task3] >> task4`

Copilot AI Nov 27, 2025

[nitpick] Inconsistent list formatting. The other bullet points in this section use `-` with a space after it, but some patterns like "Sequential", "Parallel", etc. are formatted as `- Sequential:` (with a colon). For consistency with the rest of the document's style, consider whether these should follow the same formatting pattern as other list items in the README.

Suggested change
- Sequential: `task1 >> task2 >> task3`
- Parallel: `[task1, task2, task3]`
- Fan-out: `task1 >> [task2, task3]`
- Fan-in: `[task1, task2] >> task3`
- Diamond: `task1 >> [task2, task3] >> task4`
- **Sequential** `task1 >> task2 >> task3`
- **Parallel** `[task1, task2, task3]`
- **Fan-out** `task1 >> [task2, task3]`
- **Fan-in** `[task1, task2] >> task3`
- **Diamond** `task1 >> [task2, task3] >> task4`
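
One way to picture the patterns listed above is as an ordered list of stages, where the tasks inside a stage run concurrently and each stage waits for the previous one. The sketch below models the diamond pattern that way using a thread pool. This stage-by-stage model, `run_stages`, and the `run_task` callback are all assumptions for illustration; the project's actual scheduler may work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_stages(stages: list[list[str]], run_task: Callable[[str], str]) -> list[list[str]]:
    """Run stages in order; tasks within one stage run concurrently.

    Hypothetical model: a stage starts only after every task in the
    previous stage has finished.
    """
    results = []
    with ThreadPoolExecutor() as pool:
        for stage in stages:
            # list(pool.map(...)) blocks until every task in the stage is done
            # and returns results in the same order as the task names.
            results.append(list(pool.map(run_task, stage)))
    return results


# Diamond: task1 >> [task2, task3] >> task4
diamond = [["task1"], ["task2", "task3"], ["task4"]]
finished = run_stages(diamond, lambda name: f"{name} done")
```

Here `task2` and `task3` are submitted together, and `task4` is not started until both have returned.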

- **No empty groups**: `[]` is invalid
- **No single-item groups**: `[task1]` is invalid (use `task1` directly)
- **Valid characters only**: Letters, numbers, `_`, `-`, `[`, `]`, `,`, `>`, whitespace
- **Proper arrow usage**: Only `>>` allowed, no single `>` or `>>>+`

Copilot AI Nov 27, 2025

Unclear notation: the rule says "no `>>>+`". Read as a regex, `>>>+` matches three or more consecutive `>` characters, which is presumably the intent, but the document never says the notation is a regex, and a reader could take the `+` literally. Consider spelling it out as "no `>>>` (three or more `>` characters)" or "only `>>` allowed (not `>`, `>>>`, etc.)".

Suggested change
- **Proper arrow usage**: Only `>>` allowed, no single `>` or `>>>+`
- **Proper arrow usage**: Only `>>` allowed; no single `>` or three or more `>` characters (e.g., `>>>`)
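
The arrow rule can be checked by finding every run of consecutive `>` characters and flagging any run whose length is not exactly two. This is a minimal sketch using Python's `re` module; `bad_arrow_indices` is a hypothetical helper, not the project's validation function.

```python
import re


def bad_arrow_indices(expr: str) -> list[int]:
    """Return the start index of every run of '>' that is not exactly '>>'.

    Hypothetical helper for illustration: a single '>' and runs of three
    or more '>' characters are both reported.
    """
    return [
        match.start()
        for match in re.finditer(r">+", expr)
        if len(match.group()) != 2
    ]
```

An empty list means every arrow in the expression is a well-formed `>>`.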

tasks:
  - name: task1
    type: file
  - name: task2

Copilot AI Nov 27, 2025

Trailing whitespace detected after "task2". This should be removed for consistency with code style standards.

Suggested change
- name: task2
- name: task2

tasks:
  - name: task1
    type: file
  - name: task2

Copilot AI Nov 27, 2025

Trailing whitespace detected after "task2". This should be removed for consistency with code style standards.

Suggested change
- name: task2
- name: task2

4 participants