[v3-2-test] Isolate non-provider mypy hooks per distribution with ded…#65549

Merged
potiuk merged 1 commit into apache:v3-2-test from potiuk:backport-4f3b228-v3-2-test on Apr 20, 2026

potiuk commented Apr 20, 2026

…icated .build/ venvs (#65492)

  • Isolate mypy prek hooks, cover all non-provider dirs, and clean up type errors

Each non-provider mypy prek hook now builds and caches its own virtualenv at .build/mypy-venvs/<hook>/ and its own mypy cache at .build/mypy-caches/<hook>/. UV_PROJECT_ENVIRONMENT redirects uv away from the project's .venv, so running the hook never mutates a contributor's regular development environment while still matching CI's frozen dependency set. Mypy runs with --follow-imports=silent so each hook only reports errors for files it owns; transitive code is covered by its own hook and different venvs no longer produce divergent results on shared code.
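
A minimal sketch of the isolation described above, not the actual script. The helper names `hook_environment` and `mypy_command` are hypothetical, and the real logic in mypy_local_folder.py may differ. `UV_PROJECT_ENVIRONMENT` is an existing uv setting, and `--cache-dir` and `--follow-imports=silent` are existing mypy flags.

```python
import os
from pathlib import Path
from typing import Optional

BUILD_DIR = Path(".build")  # repo-relative layout taken from the description


def hook_environment(hook: str, base_env: Optional[dict] = None) -> dict:
    """Return a copy of the environment that points uv at the hook's own venv."""
    env = dict(os.environ if base_env is None else base_env)
    # uv uses this directory instead of the project's .venv
    env["UV_PROJECT_ENVIRONMENT"] = str(BUILD_DIR / "mypy-venvs" / hook)
    return env


def mypy_command(hook: str, folder: str) -> list:
    """Build a mypy invocation that reports errors only for files under `folder`."""
    return [
        "mypy",
        f"--cache-dir={BUILD_DIR / 'mypy-caches' / hook}",
        "--follow-imports=silent",  # imported modules are analysed but not reported
        folder,
    ]
```

Because the environment is copied rather than mutated, the caller's own shell and .venv stay untouched.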

Adds mypy hooks for the non-provider directories that were previously uncovered: airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, kubernetes-tests, and shared. The mypy-shared hook iterates every shared/<dist> workspace distribution and builds a separate venv + cache per distribution so each shared library is type-checked against its own dependency set.

breeze down --cleanup-mypy-cache additionally removes .build/mypy-venvs/ and .build/mypy-caches/ so all per-hook state is wiped alongside the existing .mypy_cache and mypy-cache-volume.
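
As a rough illustration of the extra cleanup step (the function name `cleanup_mypy_state` is hypothetical and this is not Breeze's implementation), removing the per-hook state could look like:

```python
import shutil
from pathlib import Path

# Directory names under .build/ as given in the description.
PER_HOOK_DIRS = ("mypy-venvs", "mypy-caches")


def cleanup_mypy_state(repo_root: Path) -> list:
    """Delete per-hook venvs and caches under .build/; return the removed paths."""
    removed = []
    for name in PER_HOOK_DIRS:
        target = repo_root / ".build" / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```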

Also fixes pre-existing type errors surfaced by the newly added and cleaned-up checks: platform-specific ignores for Linux-only os.posix_fadvise in the shared logging helper, narrower types and type: ignore where appropriate in shared configuration/observability/timezones/secrets_backend/secrets_masker, Liskov override markers on the AirflowConfigParser subclass methods, and small correctness fixes in dev/breeze and the docker-tests / kubernetes-tests helpers so the full non-provider mypy suite runs clean on macOS and in CI.

  • Move mypy prek hooks to their respective distribution configs

The new mypy hooks for airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, and kubernetes-tests now live in each distribution's own .pre-commit-config.yaml, matching the pattern already used by airflow-core, task-sdk, and airflow-ctl. New .pre-commit-config.yaml files are added to distributions that didn't have one. prek auto-discovers nested configs, so the hooks remain part of the default check set.

mypy-dev (covers dev + scripts), mypy-devel-common, and mypy-shared stay at the repo root: dev/scripts/devel-common don't have their own configs, and mypy-shared iterates every shared/<dist> distribution, so it has no single home.

  • Split mypy-dev and mypy-scripts, each with its own pyproject.toml config

Previously the mypy-dev prek hook ran mypy against dev/ and scripts/ in a single invocation under the dev project's virtualenv. The two now get independent hooks — mypy-dev in dev/.pre-commit-config.yaml and mypy-scripts in scripts/.pre-commit-config.yaml — so each can evolve its own dependency set and check its own folder.

Copy the full [tool.mypy] section from the root pyproject.toml into both dev/pyproject.toml and scripts/pyproject.toml so each sub-project owns its mypy configuration. Paths inside mypy_path are rewritten from $MYPY_CONFIG_FILE_DIR/ to $MYPY_CONFIG_FILE_DIR/../ so they still resolve to the repo-root siblings from the sub-project location. The decorator/outputs plugins are scoped to dev only (scripts does not author DAG code).
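
The path rewrite can be pictured with a small sketch. The function name `rebase_mypy_path` and the example entry `airflow-core/src` are illustrative assumptions; the commit edits the TOML directly rather than running such a helper.

```python
PREFIX = "$MYPY_CONFIG_FILE_DIR/"


def rebase_mypy_path(entries: list) -> list:
    """Point root-relative mypy_path entries one directory up."""
    rebased = []
    for entry in entries:
        # Leave entries alone if they are unrelated or already rebased.
        if entry.startswith(PREFIX) and not entry.startswith(PREFIX + "../"):
            entry = PREFIX + "../" + entry[len(PREFIX):]
        rebased.append(entry)
    return rebased
```

The extra `../` is needed because mypy expands `$MYPY_CONFIG_FILE_DIR` to the directory of the config file in use, which is now dev/ or scripts/ rather than the repo root.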

mypy_local_folder.py now passes --config-file <project>/pyproject.toml when the folder maps to one of these sub-project configs, so mypy uses the sub-project's configuration rather than the root one.
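
A hedged sketch of that selection, assuming a hypothetical helper `config_file_args` and that only dev and scripts carry their own [tool.mypy] section at this point in the series:

```python
from pathlib import Path

# Folders that, per the description, own a copied [tool.mypy] section.
SUBPROJECT_CONFIG_FOLDERS = {"dev", "scripts"}


def config_file_args(folder: str, repo_root: Path) -> list:
    """Return extra mypy arguments selecting the sub-project config, if any."""
    if folder in SUBPROJECT_CONFIG_FOLDERS:
        return ["--config-file", str(repo_root / folder / "pyproject.toml")]
    # Otherwise mypy falls back to its normal config discovery.
    return []
```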

  • Teach selective-checks about the new non-provider mypy hooks

Add FileGroupForCi entries and regex patterns for helm-tests, airflow-e2e-tests, docker-tests, kubernetes-tests, scripts, and shared Python files, then wire them into skip_prek_hooks so the corresponding mypy-* prek hook is only kept when its folder changed:

  • mypy-scripts (split off from the old combined mypy-dev)
  • mypy-airflow-ctl-tests, mypy-helm-tests, mypy-airflow-e2e-tests, mypy-task-sdk-integration-tests, mypy-docker-tests, mypy-kubernetes-tests
  • mypy-shared

Update test_selective_checks.py skip-list constants and per-case inline skip lists to include the new hooks. Targeted test cases for files under the new-hook directories override skip-prek-hooks to leave the matching hook out of the skip set, confirming it will run when its folder changes.
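
The keep-or-skip decision can be sketched as follows. The regexes and the `hooks_to_skip` helper are assumptions for illustration; the real FileGroupForCi patterns are not quoted in this description.

```python
import re

# Hypothetical patterns, one per mypy hook; only a few hooks are shown.
MYPY_HOOK_PATTERNS = {
    "mypy-scripts": re.compile(r"^scripts/.*\.py$"),
    "mypy-helm-tests": re.compile(r"^helm-tests/.*\.py$"),
    "mypy-docker-tests": re.compile(r"^docker-tests/.*\.py$"),
    "mypy-kubernetes-tests": re.compile(r"^kubernetes-tests/.*\.py$"),
}


def hooks_to_skip(changed_files: list) -> list:
    """Skip every mypy hook whose folder has no changed Python file."""
    return sorted(
        hook
        for hook, pattern in MYPY_HOOK_PATTERNS.items()
        if not any(pattern.match(path) for path in changed_files)
    )
```

A change under helm-tests/ therefore leaves mypy-helm-tests out of the skip list while the other hooks stay skipped.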

  • Trim dev/scripts pyproject mypy_path to just relevant distributions

Drop the 200+ provider path entries that were blindly copied from the root pyproject.toml. dev and scripts only import from other non-provider workspace members, so listing every provider src/tests directory under mypy_path just adds noise. The remaining non-provider entries cover everything dev or scripts plausibly import from.

  • Install mypy into per-hook venvs from uv.lock via a mypy dep group

Each non-provider distribution with a mypy prek hook now declares a mypy dependency group in its pyproject.toml resolving to apache-airflow-devel-common[mypy]. mypy_local_folder.py syncs each dedicated virtualenv with uv sync --frozen --project <X> --group mypy and runs mypy with uv run --frozen --project <X> --group mypy — so mypy and its type stubs come from the workspace uv.lock, not from an ephemeral --with overlay whose resolution is independent of the main lockfile. uv.lock is refreshed to include the new group.

Covers airflow-core, task-sdk, airflow-ctl, devel-common, dev, scripts, airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, kubernetes-tests, and every shared/<dist> workspace member.
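
The two uv invocations quoted above can be assembled like this. The function names are hypothetical; the flags are exactly the ones named in the description.

```python
def uv_sync_command(project: str) -> list:
    """Command that installs the locked `mypy` group into the hook's venv."""
    return ["uv", "sync", "--frozen", "--project", project, "--group", "mypy"]


def uv_run_mypy_command(project: str, mypy_args: list) -> list:
    """Command that runs mypy from that venv without re-resolving dependencies."""
    return [
        "uv", "run", "--frozen", "--project", project, "--group", "mypy",
        "mypy", *mypy_args,
    ]
```

`--frozen` makes uv use uv.lock as-is, so both steps see the same pinned mypy and stub versions.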

  • Drop mypy_path from dev/scripts pyprojects — venv site-packages is enough

After the switch to installing mypy (and every transitive workspace dependency) directly into each hook's virtualenv via the mypy dep group, workspace packages like airflow, airflow.sdk, airflowctl, airflow_breeze, tests_common are all available via the venv's site-packages. mypy resolves them without needing mypy_path entries, so drop the copied list and leave a short comment explaining why.

  • Split mypy-shared into per-distribution hooks and enforce the pattern

Each shared/<dist> workspace member now owns a mypy-shared-<dist> prek hook backed by its own shared/<dist>/.pre-commit-config.yaml. The single mypy-shared iterator is gone: mypy_local_folder.py accepts shared/<dist> as a first-class folder, and the per-hook virtualenv now lives at .build/mypy-venvs/shared-<dist>/ (the slash in the folder name is replaced with a dash in the venv/cache path).
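
The folder-to-name mapping might be sketched like this; `hook_slug`, `hook_id`, and `venv_dir` are hypothetical names, not functions confirmed to exist in the script.

```python
from pathlib import Path


def hook_slug(folder: str) -> str:
    """Turn a folder such as 'shared/logging' into 'shared-logging'."""
    return folder.strip("/").replace("/", "-")


def hook_id(folder: str) -> str:
    """Name of the prek hook that checks `folder`."""
    return f"mypy-{hook_slug(folder)}"


def venv_dir(folder: str) -> Path:
    """Location of the dedicated virtualenv for `folder`."""
    return Path(".build") / "mypy-venvs" / hook_slug(folder)
```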

Adds a new check-shared-mypy-hooks prek hook that fails when a shared/<dist> workspace member is missing its dedicated .pre-commit-config.yaml, printing the exact YAML to add. Selective-checks emits one skip entry per dist, enumerated from shared/ at run time. Contributing docs cover the two-step process for adding a new shared library.
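
A simplified sketch of such a check, assuming a shared distribution is any directory under shared/ with a pyproject.toml. The real hook also validates the config content and prints the YAML to add, which this version omits.

```python
from pathlib import Path


def shared_dists_missing_config(shared_root: Path) -> list:
    """List shared distributions that lack their own .pre-commit-config.yaml."""
    missing = []
    for pyproject in sorted(shared_root.glob("*/pyproject.toml")):
        dist_dir = pyproject.parent
        if not (dist_dir / ".pre-commit-config.yaml").is_file():
            missing.append(dist_dir.name)
    return missing
```

A hook built on this would exit non-zero whenever the returned list is non-empty.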

  • Pin minimum_prek_version to 0.3.4 consistently across all configs

All .pre-commit-config.yaml files now require prek >= 0.3.4 (the version already declared by the root config). Previously the nested configs pinned a mix of 0.2.0, 0.3.2, and 0.3.4, so a contributor could pass the root's version check and still trip on stale subproject pins as they moved between directories.
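
One way to guard against the pins drifting apart again is sketched below. This is not part of the PR: the helper `inconsistent_pins` is hypothetical, and it assumes minimum_prek_version appears as a plain top-level scalar line.

```python
import re

# Matches e.g.  minimum_prek_version: '0.3.4'  (quotes optional)
MIN_VERSION_RE = re.compile(
    r"^minimum_prek_version:\s*['\"]?([0-9][0-9.]*)['\"]?\s*$", re.MULTILINE
)


def inconsistent_pins(configs: dict, required: str = "0.3.4") -> dict:
    """Map config name -> pinned version (or None) where it differs from `required`."""
    problems = {}
    for name, text in configs.items():
        match = MIN_VERSION_RE.search(text)
        pinned = match.group(1) if match else None
        if pinned != required:
            problems[name] = pinned
    return problems
```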

  • Refresh uv.lock after rebase to reflect the mypy dep groups

The rebase onto main resolved the uv.lock conflict by taking main's version, so uv sync --group mypy would fail against uv.lock until the groups added to the per-distribution pyprojects were re-resolved. Regenerates the lockfile to include them.

  • Add explicit selective-checks test for per-shared-dist mypy hook skipping

Verifies that when a file under shared/logging/ changes, only mypy-shared-logging is kept among the thirteen mypy-shared-* hooks; all other shared distributions' hooks land in the skip list. Pins the contract that the runtime enumeration over shared/*/pyproject.toml works as intended.
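
The contract being pinned can be approximated as follows, with a hypothetical helper `shared_mypy_hooks_to_skip`; the actual selective-checks code is organised differently.

```python
from pathlib import Path


def shared_mypy_hooks_to_skip(shared_root: Path, changed_files: list) -> list:
    """Return mypy-shared-<dist> hooks for every dist with no changed file."""
    # Enumerate distributions at run time from shared/*/pyproject.toml.
    dists = sorted(p.parent.name for p in shared_root.glob("*/pyproject.toml"))
    changed = set()
    for changed_file in changed_files:
        parts = Path(changed_file).parts
        if len(parts) > 2 and parts[0] == "shared":
            changed.add(parts[1])
    return [f"mypy-shared-{dist}" for dist in dists if dist not in changed]
```

With only a file under shared/logging/ changed, every hook except mypy-shared-logging ends up in the returned skip list.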

  • Refresh mypy docs to match the per-hook venv + --group mypy workflow

Fills in the docs that still referenced the pre-split workflow:

  • AGENTS.md: mentions mypy-shared-<dist> per shared workspace member and the uv sync --group mypy install path for mypy itself.
  • scripts/ci/prek/AGENTS.md: clarifies that non-provider mypy hooks run locally through mypy_local_folder.py (Breeze image only needed for the providers hook).
  • dev/breeze/doc/03_developer_tasks.rst: renames stale mypy-airflow to mypy-airflow-core, and expands the cache note to cover the per-hook virtualenvs and caches under .build/.
  • dev/breeze/doc/ci/04_selective_checks.md: expands the file-group and skip-reason lists so every new mypy hook (scripts, task-sdk, airflow-ctl, the six test-dir hooks, and mypy-shared-<dist> enumerated at runtime) is documented.

  • Rename mypy_local_folder.py to run_mypy_full_dist_local_venv_or_breeze_in_ci.py

Updates every .pre-commit-config.yaml entry and prose references so they point at the new script name. Two shared configs use YAML folded-scalar entries to stay under the 110-char yamllint limit; updates the validation script's expected template to match.
(cherry picked from commit 4f3b228)


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)


Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
potiuk force-pushed the backport-4f3b228-v3-2-test branch from 532f07a to f5e91e9 on April 20, 2026 22:58
potiuk merged commit 59fbbc6 into apache:v3-2-test on Apr 20, 2026
105 checks passed
potiuk deleted the backport-4f3b228-v3-2-test branch on April 20, 2026 23:50