[v3-2-test] Isolate non-provider mypy hooks per distribution with ded…#65549
Merged
potiuk merged 1 commit into apache:v3-2-test on Apr 20, 2026
…icated .build/ venvs (apache#65492)

* Isolate mypy prek hooks, cover all non-provider dirs, and clean up type errors

Each non-provider mypy prek hook now builds and caches its own virtualenv at .build/mypy-venvs/<hook>/ and its own mypy cache at .build/mypy-caches/<hook>/. UV_PROJECT_ENVIRONMENT redirects uv away from the project's .venv, so running the hook never mutates a contributor's regular development environment while still matching CI's frozen dependency set. Mypy runs with --follow-imports=silent so each hook only reports errors for files it owns; transitive code is covered by its own hook and different venvs no longer produce divergent results on shared code.

Adds mypy hooks for the non-provider directories that were previously uncovered: airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, kubernetes-tests, and shared. The mypy-shared hook iterates every shared/<dist> workspace distribution and builds a separate venv + cache per distribution so each shared library is type-checked against its own dependency set.

breeze down --cleanup-mypy-cache additionally removes .build/mypy-venvs/ and .build/mypy-caches/ so all per-hook state is wiped alongside the existing .mypy_cache and mypy-cache-volume.

Also fixes pre-existing type errors surfaced by the newly added and cleaned-up checks: platform-specific ignores for Linux-only os.posix_fadvise in the shared logging helper, narrower types and type: ignore where appropriate in shared configuration/observability/timezones/secrets_backend/secrets_masker, Liskov override markers on the AirflowConfigParser subclass methods, and small correctness fixes in dev/breeze and the docker-tests / kubernetes-tests helpers so the full non-provider mypy suite runs clean on macOS and in CI.
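As a rough illustration of the per-hook isolation described above, the sketch below derives the venv and cache locations for one hook and assembles the environment and mypy arguments. The function names (`hook_state_dirs`, `hook_environment`, `mypy_args`) are hypothetical and are not taken from the actual script. Only the `.build/` layout, `UV_PROJECT_ENVIRONMENT`, and `--follow-imports=silent` come from the description; using mypy's standard `--cache-dir` option for the per-hook cache is an assumption.

```python
from __future__ import annotations

import os
from pathlib import Path


def hook_state_dirs(repo_root: Path, hook: str) -> tuple[Path, Path]:
    """Return (venv_dir, cache_dir) for one hook, both under .build/."""
    build = repo_root / ".build"
    return build / "mypy-venvs" / hook, build / "mypy-caches" / hook


def hook_environment(venv_dir: Path, base_env: dict[str, str] | None = None) -> dict[str, str]:
    """Copy the environment and point uv at the dedicated venv instead of .venv."""
    env = dict(os.environ if base_env is None else base_env)
    env["UV_PROJECT_ENVIRONMENT"] = str(venv_dir)
    return env


def mypy_args(cache_dir: Path, files: list[str]) -> list[str]:
    """Build a mypy argument list (assumed shape, not the real script's)."""
    # --follow-imports=silent: imported modules are still analysed, but errors
    # are reported only for the files passed explicitly.
    return ["mypy", "--follow-imports=silent", f"--cache-dir={cache_dir}", *files]
```

Because the environment is copied rather than mutated, the caller's own `UV_PROJECT_ENVIRONMENT` (if any) and the project's `.venv` are left untouched outside the hook run.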
* Move mypy prek hooks to their respective distribution configs

The new mypy hooks for airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, and kubernetes-tests now live in each distribution's own .pre-commit-config.yaml, matching the pattern already used by airflow-core, task-sdk, and airflow-ctl. New .pre-commit-config.yaml files are added to distributions that didn't have one. prek auto-discovers nested configs, so the hooks remain part of the default check set.

mypy-dev (covers dev + scripts), mypy-devel-common, and mypy-shared stay at the repo root: dev/scripts/devel-common don't have their own configs, and mypy-shared iterates every shared/<dist> distribution so has no single home.

* Split mypy-dev and mypy-scripts, each with its own pyproject.toml config

Previously the mypy-dev prek hook ran mypy against dev/ and scripts/ in a single invocation under the dev project's virtualenv. The two now get independent hooks (mypy-dev in dev/.pre-commit-config.yaml and mypy-scripts in scripts/.pre-commit-config.yaml) so each can evolve its own dependency set and check its own folder.

Copy the full [tool.mypy] section from the root pyproject.toml into both dev/pyproject.toml and scripts/pyproject.toml so each sub-project owns its mypy configuration. Paths inside mypy_path are rewritten from $MYPY_CONFIG_FILE_DIR/ to $MYPY_CONFIG_FILE_DIR/../ so they still resolve to the repo-root siblings from the sub-project location. The decorator/outputs plugins are scoped to dev only (scripts does not author DAG code).

mypy_local_folder.py now passes --config-file <project>/pyproject.toml when the folder maps to one of these sub-project configs, so mypy uses the sub-project's configuration rather than the root one.
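The config-file selection described above could look roughly like the following. This is a minimal sketch, not the code in mypy_local_folder.py: the constant `FOLDERS_WITH_OWN_MYPY_CONFIG` and the function `config_file_args` are hypothetical names, and the assumption is that only `dev` and `scripts` carry their own `[tool.mypy]` section at this point in the history.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: only these folders ship their own [tool.mypy] section.
FOLDERS_WITH_OWN_MYPY_CONFIG = ("dev", "scripts")


def config_file_args(repo_root: Path, folder: str) -> list[str]:
    """Return extra mypy arguments selecting the sub-project configuration.

    An empty list means mypy falls back to its normal config discovery,
    which picks up the root pyproject.toml when run from the repo root.
    """
    if folder in FOLDERS_WITH_OWN_MYPY_CONFIG:
        return ["--config-file", str(repo_root / folder / "pyproject.toml")]
    return []
```

Returning an argument list instead of a single path keeps the call site simple: the result can be spliced into the mypy command whether or not an override applies.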
* Teach selective-checks about the new non-provider mypy hooks

Add FileGroupForCi entries and regex patterns for helm-tests, airflow-e2e-tests, docker-tests, kubernetes-tests, scripts, and shared Python files, then wire them into skip_prek_hooks so the corresponding mypy-* prek hook is only kept when its folder changed:

- mypy-scripts (split off from the old combined mypy-dev)
- mypy-airflow-ctl-tests, mypy-helm-tests, mypy-airflow-e2e-tests, mypy-task-sdk-integration-tests, mypy-docker-tests, mypy-kubernetes-tests
- mypy-shared

Update test_selective_checks.py skip-list constants and per-case inline skip lists to include the new hooks. Targeted test cases for files under the new-hook directories override skip-prek-hooks to leave the matching hook out of the skip set, confirming it will run when its folder changes.

* Trim dev/scripts pyproject mypy_path to just relevant distributions

Drop the 200+ provider path entries that were blindly copied from the root pyproject.toml. dev and scripts only import from other non-provider workspace members, so listing every provider src/tests directory under mypy_path just adds noise. The remaining non-provider entries cover everything dev or scripts plausibly import from.

* Install mypy into per-hook venvs from uv.lock via a `mypy` dep group

Each non-provider distribution with a mypy prek hook now declares a `mypy` dependency group in its pyproject.toml resolving to `apache-airflow-devel-common[mypy]`. mypy_local_folder.py syncs each dedicated virtualenv with `uv sync --frozen --project <X> --group mypy` and runs mypy with `uv run --frozen --project <X> --group mypy`, so mypy and its type stubs come from the workspace uv.lock, not from an ephemeral `--with` overlay whose resolution is independent of the main lockfile. uv.lock is refreshed to include the new group.

Covers airflow-core, task-sdk, airflow-ctl, devel-common, dev, scripts, airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, kubernetes-tests, and every shared/<dist> workspace member.

* Drop mypy_path from dev/scripts pyprojects: venv site-packages is enough

After the switch to installing mypy (and every transitive workspace dependency) directly into each hook's virtualenv via the `mypy` dep group, workspace packages like airflow, airflow.sdk, airflowctl, airflow_breeze, tests_common are all available via the venv's site-packages. mypy resolves them without needing mypy_path entries, so drop the copied list and leave a short comment explaining why.

* Split mypy-shared into per-distribution hooks and enforce the pattern

Each shared/<dist> workspace member now owns a mypy-shared-<dist> prek hook backed by its own shared/<dist>/.pre-commit-config.yaml. The single mypy-shared iterator is gone: mypy_local_folder.py accepts shared/<dist> as a first-class folder and the per-hook virtualenv now lives at .build/mypy-venvs/shared-<dist>/ (slash in the folder name is replaced with a dash in the venv/cache path).

Adds a new check-shared-mypy-hooks prek hook that fails when a shared/<dist> workspace member is missing its dedicated .pre-commit-config.yaml, printing the exact YAML to add. Selective-checks emits one skip entry per dist, enumerated from shared/ at run time. Contributing docs cover the two-step process for adding a new shared library.

* Pin minimum_prek_version to 0.3.4 consistently across all configs

All .pre-commit-config.yaml files now require prek >= 0.3.4 (the version already declared by the root config). Previously the nested configs pinned a mix of 0.2.0, 0.3.2, and 0.3.4, so a contributor could pass the root's version check and still trip on stale subproject pins as they moved between directories.
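For the per-distribution split of mypy-shared described above, the enumeration and the missing-config check might be sketched as follows. All function names here are hypothetical and this is not the actual check-shared-mypy-hooks implementation. It assumes a shared distribution is any directory under shared/ that contains a pyproject.toml, and it only checks that the config file exists, not what it contains.

```python
from __future__ import annotations

from pathlib import Path


def shared_dists(shared_dir: Path) -> list[str]:
    """List shared/<dist> workspace members, identified by their pyproject.toml."""
    return sorted(p.parent.name for p in shared_dir.glob("*/pyproject.toml"))


def missing_hook_configs(shared_dir: Path) -> list[str]:
    """Return the dists that lack a dedicated .pre-commit-config.yaml."""
    return [
        dist
        for dist in shared_dists(shared_dir)
        if not (shared_dir / dist / ".pre-commit-config.yaml").is_file()
    ]


def venv_name(folder: str) -> str:
    """Map a folder such as 'shared/logging' to its venv/cache directory name."""
    return folder.replace("/", "-")
```

A hook built on this would exit non-zero when `missing_hook_configs` returns a non-empty list. The real hook additionally prints the exact YAML to add, which is omitted here.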
* Refresh uv.lock after rebase to reflect the `mypy` dep groups

The rebase onto main resolved the uv.lock conflict by taking main's version, so `uv sync --group mypy` would fail against uv.lock until the groups added to the per-distribution pyprojects were re-resolved. Regenerates the lockfile to include them.

* Add explicit selective-checks test for per-shared-dist mypy hook skipping

Verifies that when a file under shared/logging/ changes, only mypy-shared-logging is kept among the thirteen mypy-shared-* hooks; all other shared distributions' hooks land in the skip list. Pins the contract that the runtime enumeration over shared/*/pyproject.toml works as intended.

* Refresh mypy docs to match the per-hook venv + --group mypy workflow

Fills in the docs that still referenced the pre-split workflow:

- AGENTS.md: mentions `mypy-shared-<dist>` per shared workspace member and the `uv sync --group mypy` install path for mypy itself.
- scripts/ci/prek/AGENTS.md: clarifies that non-provider mypy hooks run locally through mypy_local_folder.py (Breeze image only needed for the providers hook).
- dev/breeze/doc/03_developer_tasks.rst: renames stale `mypy-airflow` to `mypy-airflow-core`, and expands the cache note to cover the per-hook virtualenvs and caches under .build/.
- dev/breeze/doc/ci/04_selective_checks.md: expands the file-group and skip-reason lists so every new mypy hook (scripts, task-sdk, airflow-ctl, the six test-dir hooks, and mypy-shared-<dist> enumerated at runtime) is documented.

* Rename mypy_local_folder.py to run_mypy_full_dist_local_venv_or_breeze_in_ci.py

Updates every .pre-commit-config.yaml entry and prose references so they point at the new script name. Two shared configs use YAML folded-scalar entries to stay under the 110-char yamllint limit; updates the validation script's expected template to match.

(cherry picked from commit 4f3b228)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
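The per-shared-distribution skipping contract that the commit message pins with a test can be illustrated with a small sketch. The function name `shared_mypy_hooks_to_skip` is hypothetical and this is not the selective-checks code. It assumes that a hook is kept only when a changed `.py` file lives under `shared/<dist>/`, whereas the real file-group regexes may match differently.

```python
from __future__ import annotations


def shared_mypy_hooks_to_skip(changed_files: list[str], dists: list[str]) -> list[str]:
    """Return the mypy-shared-<dist> hooks whose folder saw no Python change."""
    skipped: list[str] = []
    for dist in sorted(dists):
        prefix = f"shared/{dist}/"
        touched = any(f.startswith(prefix) and f.endswith(".py") for f in changed_files)
        if not touched:
            skipped.append(f"mypy-shared-{dist}")
    return skipped
```

With a change under `shared/logging/`, every other distribution's hook lands in the returned skip list and `mypy-shared-logging` does not, which mirrors the behaviour the new selective-checks test verifies.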
532f07a to f5e91e9