Add Git Sparse Checkout to Git Dag Bundle #67047
Merged
potiuk merged 3 commits on May 17, 2026
Pull request overview
Adds optional Git sparse-checkout support to GitDagBundle so monorepo users can materialize only selected directories in the working tree (cone mode). When sparse_dirs is provided, the local clone from the bare repo is performed with --sparse --no-checkout, followed by git sparse-checkout init --cone / set <dirs> and a checkout of the tracking ref. The initial bare clone is still full; the PR description acknowledges this as a future improvement.
Changes:
- New sparse_dirs: list[str] | None kwarg on GitDagBundle, threaded into log context and the clone logic.
- _clone_repo_if_required now conditionally adds --sparse --no-checkout clone options and configures cone-mode sparse checkout.
- Docs example updated to mention sparse_dirs; new unit test verifying that only files under the configured sparse dir are present.
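The command sequence described in the overview can be sketched roughly as follows. This is an illustration using the plain git CLI through subprocess, not the code in GitDagBundle itself; the function names sparse_clone_commands and run_sparse_clone, and the paths in the usage note, are hypothetical.

```python
from __future__ import annotations

import subprocess


def sparse_clone_commands(
    bare_repo: str,
    dest: str,
    sparse_dirs: list[str] | None,
    tracking_ref: str,
) -> list[list[str]]:
    """Build the git commands for cloning from a local bare repo.

    Hypothetical helper: without sparse_dirs it is a plain clone; with
    sparse_dirs it follows the cone-mode sequence from the PR overview.
    """
    clone = ["git", "clone"]
    if sparse_dirs:
        # Defer populating the working tree until the sparse patterns are set.
        clone += ["--sparse", "--no-checkout"]
    clone += [bare_repo, dest]
    commands = [clone]

    if sparse_dirs:
        in_dest = ["git", "-C", dest]
        commands.append(in_dest + ["sparse-checkout", "init", "--cone"])
        commands.append(in_dest + ["sparse-checkout", "set", *sparse_dirs])
        # Materialize only the selected directories at the tracking ref.
        commands.append(in_dest + ["checkout", tracking_ref])
    return commands


def run_sparse_clone(
    bare_repo: str,
    dest: str,
    sparse_dirs: list[str] | None,
    tracking_ref: str,
) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in sparse_clone_commands(bare_repo, dest, sparse_dirs, tracking_ref):
        subprocess.run(cmd, check=True)
```

For example, run_sparse_clone("/tmp/bare.git", "/tmp/work", ["dags/team_a"], "main") would leave only dags/team_a (plus top-level files, as cone mode always includes them) in /tmp/work. The bare repo itself is still a full clone, as noted in the overview.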
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| providers/git/src/airflow/providers/git/bundles/git.py | Adds sparse_dirs parameter and sparse-checkout setup after cloning from the bare repo. |
| providers/git/docs/bundles/index.rst | Documents the new sparse_dirs kwarg in the JSON config example. |
| providers/git/tests/unit/git/bundles/test_git.py | Adds test_sparse_checkout and type annotations to the git_repo fixture. |
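As a rough idea of what a bundle configuration entry with the new kwarg might look like, the sketch below builds the JSON in Python. The bundle name, the tracking_ref value, and the directory list are made-up placeholders; the classpath is inferred from the file path in the table above, and the exact example in the docs may differ.

```python
import json

# Hypothetical bundle entry; only sparse_dirs is new in this PR.
bundle_config = [
    {
        "name": "my-monorepo",  # placeholder bundle name
        "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
        "kwargs": {
            "tracking_ref": "main",  # placeholder ref
            # Only these directories are materialized in the working tree.
            "sparse_dirs": ["dags/team_a"],
        },
    }
]

# JSON string suitable for pasting into the Dag bundle configuration.
config_json = json.dumps(bundle_config, indent=2)
print(config_json)
```

Leaving sparse_dirs out (or setting it to null) should keep the previous behaviour of a full working-tree checkout, since the kwarg is optional.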
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
f991e18 to 834db3d
potiuk approved these changes May 17, 2026
Git Dag Bundle currently always performs a full clone of the Git repo. That is a problem when you run Airflow on a big monorepo.
This PR adds support for Git Sparse Checkout.
I experimented a bit: the initial clone is still large. If the first bare clone is made with --filter=blob:none, which would be optimal, the local clones of the bare repo are not able to resolve the object SHAs and report missing references. Because of the clone-from-bare-clone structure, I have no idea how to slim the initial clone down. This might be a future improvement for a Git expert, or the clone-from-bare-clone strategy needs to be revised.