Isolate non-provider mypy hooks per distribution with dedicated .build/ venvs #65492

Merged
potiuk merged 13 commits into apache:main from potiuk:mypy-dedicated-venvs
Apr 20, 2026

@potiuk
Member

@potiuk potiuk commented Apr 19, 2026

Summary

Every non-provider mypy prek hook now lives in its own distribution's .pre-commit-config.yaml, runs in a dedicated virtualenv at .build/mypy-venvs/<hook>/ with a dedicated mypy cache at .build/mypy-caches/<hook>/, and installs everything (including mypy itself) from the workspace uv.lock via a per-distribution mypy dep group — no more shared .venv, no more ephemeral --with overlays.

What's new

  • Coverage — adds mypy hooks for the non-provider directories that were previously uncovered: airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-tests, docker-tests, kubernetes-tests, plus one mypy-shared-<dist> hook per shared/<dist> workspace member (13 total). Each hook lives in its distribution's own .pre-commit-config.yaml — prek auto-discovers nested configs.
  • Split mypy-dev and mypy-scripts — each has its own independent hook, its own dedicated venv, and its own pyproject.toml-level [tool.mypy] configuration. The script passes --config-file <project>/pyproject.toml for these two so their config takes precedence over the root one.
  • Mypy installed from the frozen lock — every distribution with a mypy hook declares mypy = ["apache-airflow-devel-common[mypy]"] in its [dependency-groups]. The script syncs and runs mypy with --frozen --group mypy, so mypy itself, its plugins, and its stubs all resolve against the main workspace uv.lock.
  • --follow-imports=silent — each hook only reports errors for files it owns; transitive code is covered by its own hook, avoiding divergent results across venvs on shared code.
  • breeze down --cleanup-mypy-cache — also wipes .build/mypy-venvs/ and .build/mypy-caches/ alongside the existing .mypy_cache and mypy-cache-volume.
  • Selective-checks — emits the right skip-prek-hooks set for every new hook, enumerating shared/<dist> dists at runtime so no hand-maintenance is needed.
  • Enforcement — a new check-shared-mypy-hooks prek hook fails when a shared/<dist> workspace member is missing its dedicated .pre-commit-config.yaml, printing the exact YAML to paste. Contributing docs describe the two-step process for adding a new shared library.
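
The enforcement idea in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actual check-shared-mypy-hooks implementation: the function name is hypothetical, and it assumes a shared/<dist> directory counts as a workspace member when it contains a pyproject.toml. The real hook also prints the exact YAML to paste, which is not reproduced here.

```python
from pathlib import Path


def find_shared_dists_missing_config(repo_root: Path) -> list[str]:
    """Return names of shared/<dist> dirs that lack a .pre-commit-config.yaml.

    Assumption: a directory is a workspace member if it has a pyproject.toml.
    """
    shared = repo_root / "shared"
    if not shared.is_dir():
        return []
    missing = []
    for pyproject in sorted(shared.glob("*/pyproject.toml")):
        dist_dir = pyproject.parent
        if not (dist_dir / ".pre-commit-config.yaml").is_file():
            missing.append(dist_dir.name)
    return missing
```

A hook built on this would exit non-zero whenever the returned list is non-empty.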

Type-error fixes

Pre-existing type errors surfaced by the newly added and cleaned-up checks are addressed: platform-specific ignores for Linux-only os.posix_fadvise, narrower types and # type: ignore markers in shared configuration / observability / timezones / secrets_backend / secrets_masker, Liskov override markers on the AirflowConfigParser subclass methods, and small correctness fixes in dev/breeze and the docker-tests / kubernetes-tests helpers. The full non-provider mypy suite now runs clean.
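
For context on the Linux-only os.posix_fadvise item: the PR describes platform-specific ignores. A related pattern, shown here only as an alternative sketch and not as what the PR does, is to guard the call with a sys.platform check, which mypy uses to skip the branch on other platforms. The helper name is hypothetical.

```python
import os
import sys


def advise_sequential_read(fd: int) -> bool:
    """Hint sequential access on Linux; return True if the hint was issued."""
    if sys.platform == "linux":
        # mypy evaluates sys.platform checks, so this call is only
        # type-checked when the target platform is Linux.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        return True
    return False
```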


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (claude-opus-4-7)

Generated-by: Claude Code (claude-opus-4-7) following the guidelines

potiuk added 10 commits April 19, 2026 15:51
Isolate mypy prek hooks, cover all non-provider dirs, and clean up type errors

Each non-provider mypy prek hook now builds and caches its own virtualenv
at .build/mypy-venvs/<hook>/ and its own mypy cache at
.build/mypy-caches/<hook>/. UV_PROJECT_ENVIRONMENT redirects uv away from
the project's .venv, so running the hook never mutates a contributor's
regular development environment while still matching CI's frozen
dependency set. Mypy runs with --follow-imports=silent so each hook only
reports errors for files it owns; transitive code is covered by its own
hook and different venvs no longer produce divergent results on shared
code.

Adds mypy hooks for the non-provider directories that were previously
uncovered: airflow-ctl-tests, helm-tests, airflow-e2e-tests,
task-sdk-integration-tests, docker-tests, kubernetes-tests, and shared.
The mypy-shared hook iterates every shared/<dist> workspace distribution
and builds a separate venv + cache per distribution so each shared
library is type-checked against its own dependency set.

breeze down --cleanup-mypy-cache additionally removes
.build/mypy-venvs/ and .build/mypy-caches/ so all per-hook state is
wiped alongside the existing .mypy_cache and mypy-cache-volume.
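
The per-hook part of that cleanup amounts to removing two directory trees. A minimal sketch, assuming only the two .build/ paths named above (the function name is hypothetical, and the existing .mypy_cache and mypy-cache-volume handling is deliberately left out):

```python
import shutil
from pathlib import Path

# Directories named in the commit message, relative to the repo root.
MYPY_STATE_DIRS = (".build/mypy-venvs", ".build/mypy-caches")


def cleanup_mypy_state(repo_root: Path) -> list[Path]:
    """Remove per-hook mypy venvs and caches; return the directories removed."""
    removed = []
    for rel in MYPY_STATE_DIRS:
        target = repo_root / rel
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```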

Also fixes pre-existing type errors surfaced by the newly added and
cleaned-up checks: platform-specific ignores for Linux-only
os.posix_fadvise in the shared logging helper, narrower types and
type: ignore where appropriate in shared configuration/observability/
timezones/secrets_backend/secrets_masker, Liskov override markers on
the AirflowConfigParser subclass methods, and small correctness fixes
in dev/breeze and the docker-tests / kubernetes-tests helpers so the
full non-provider mypy suite runs clean on macOS and in CI.

Move mypy prek hooks to their respective distribution configs

The new mypy hooks for airflow-ctl-tests, helm-tests, airflow-e2e-tests,
task-sdk-integration-tests, docker-tests, and kubernetes-tests now live
in each distribution's own .pre-commit-config.yaml, matching the pattern
already used by airflow-core, task-sdk, and airflow-ctl. New .pre-commit-
config.yaml files are added to distributions that didn't have one. prek
auto-discovers nested configs, so the hooks remain part of the default
check set.

mypy-dev (covers dev + scripts), mypy-devel-common, and mypy-shared stay
at the repo root: dev/scripts/devel-common don't have their own configs,
and mypy-shared iterates every shared/<dist> distribution so has no
single home.

Split mypy-dev and mypy-scripts, each with its own pyproject.toml config

Previously the mypy-dev prek hook ran mypy against dev/ and scripts/ in
a single invocation under the dev project's virtualenv. The two now get
independent hooks — mypy-dev in dev/.pre-commit-config.yaml and
mypy-scripts in scripts/.pre-commit-config.yaml — so each can evolve its
own dependency set and check its own folder.

Copy the full [tool.mypy] section from the root pyproject.toml into both
dev/pyproject.toml and scripts/pyproject.toml so each sub-project owns its
mypy configuration. Paths inside mypy_path are rewritten from
$MYPY_CONFIG_FILE_DIR/ to $MYPY_CONFIG_FILE_DIR/../ so they still resolve
to the repo-root siblings from the sub-project location. The decorator/
outputs plugins are scoped to dev only (scripts does not author DAG code).

mypy_local_folder.py now passes --config-file <project>/pyproject.toml
when the folder maps to one of these sub-project configs, so mypy uses
the sub-project's configuration rather than the root one.

Teach selective-checks about the new non-provider mypy hooks

Add FileGroupForCi entries and regex patterns for helm-tests,
airflow-e2e-tests, docker-tests, kubernetes-tests, scripts, and shared
Python files, then wire them into skip_prek_hooks so the corresponding
mypy-* prek hook is only kept when its folder changed:

- mypy-scripts (split off from the old combined mypy-dev)
- mypy-airflow-ctl-tests, mypy-helm-tests, mypy-airflow-e2e-tests,
  mypy-task-sdk-integration-tests, mypy-docker-tests, mypy-kubernetes-tests
- mypy-shared

Update test_selective_checks.py skip-list constants and per-case inline
skip lists to include the new hooks. Targeted test cases for files under
the new-hook directories override skip-prek-hooks to leave the matching
hook out of the skip set, confirming it will run when its folder changes.

Trim dev/scripts pyproject mypy_path to just relevant distributions

Drop the 200+ provider path entries that were blindly copied from the root
pyproject.toml. dev and scripts only import from other non-provider
workspace members, so listing every provider src/tests directory under
mypy_path just adds noise. The remaining non-provider entries cover
everything dev or scripts plausibly import from.

Install mypy into per-hook venvs from uv.lock via a `mypy` dep group

Each non-provider distribution with a mypy prek hook now declares a
`mypy` dependency group in its pyproject.toml resolving to
`apache-airflow-devel-common[mypy]`. mypy_local_folder.py syncs each
dedicated virtualenv with `uv sync --frozen --project <X> --group mypy`
and runs mypy with `uv run --frozen --project <X> --group mypy` — so
mypy and its type stubs come from the workspace uv.lock, not from an
ephemeral `--with` overlay whose resolution is independent of the main
lockfile. uv.lock is refreshed to include the new group.
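
The two uv invocations quoted above can be assembled as plain argument lists. The sketch below is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, passing the cache location through mypy's `--cache-dir` flag is an assumption, and so is using the project folder as the mypy target. The `--frozen`, `--project`, `--group mypy`, `--follow-imports=silent`, `--config-file` and `UV_PROJECT_ENVIRONMENT` pieces come from the commit messages.

```python
from pathlib import Path
from typing import Optional


def hook_environment(venv_dir: Path, base_env: dict[str, str]) -> dict[str, str]:
    """Copy base_env and point uv at the hook's dedicated virtualenv."""
    env = dict(base_env)
    env["UV_PROJECT_ENVIRONMENT"] = str(venv_dir)
    return env


def build_uv_commands(
    project: str, cache_dir: Path, config_file: Optional[Path] = None
) -> tuple[list[str], list[str]]:
    """Return (sync_cmd, run_cmd) as argument lists for subprocess.run."""
    uv_args = ["--frozen", "--project", project, "--group", "mypy"]
    sync_cmd = ["uv", "sync", *uv_args]
    run_cmd = ["uv", "run", *uv_args, "mypy", "--follow-imports=silent"]
    # Assumption: the per-hook cache is selected with mypy's --cache-dir flag.
    run_cmd += ["--cache-dir", str(cache_dir)]
    if config_file is not None:
        # Only dev and scripts are described as getting their own config file.
        run_cmd += ["--config-file", str(config_file)]
    # Assumption: the folder being checked is the project folder itself.
    run_cmd.append(project)
    return sync_cmd, run_cmd
```

Both lists would be run with `env=hook_environment(...)` so that uv never touches the contributor's regular .venv.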

Covers airflow-core, task-sdk, airflow-ctl, devel-common, dev, scripts,
airflow-ctl-tests, helm-tests, airflow-e2e-tests, task-sdk-integration-
tests, docker-tests, kubernetes-tests, and every shared/<dist>
workspace member.

Drop mypy_path from dev/scripts pyprojects - venv site-packages is enough

After the switch to installing mypy (and every transitive workspace
dependency) directly into each hook's virtualenv via the `mypy` dep
group, workspace packages like airflow, airflow.sdk, airflowctl,
airflow_breeze, tests_common are all available via the venv's
site-packages. mypy resolves them without needing mypy_path entries,
so drop the copied list and leave a short comment explaining why.

Split mypy-shared into per-distribution hooks and enforce the pattern

Each shared/<dist> workspace member now owns a mypy-shared-<dist> prek hook
backed by its own shared/<dist>/.pre-commit-config.yaml. The single
mypy-shared iterator is gone — mypy_local_folder.py accepts shared/<dist>
as a first-class folder and the per-hook virtualenv now lives at
.build/mypy-venvs/shared-<dist>/ (slash in the folder name is replaced
with a dash in the venv/cache path).
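
The folder-to-path mapping described here is small enough to show directly. A minimal sketch, with a hypothetical function name, assuming the only transformation is the slash-to-dash replacement mentioned above:

```python
from pathlib import Path


def hook_state_dirs(repo_root: Path, folder: str) -> tuple[Path, Path]:
    """Map a folder such as 'shared/logging' to its (venv, cache) directories."""
    name = folder.strip("/").replace("/", "-")
    build = repo_root / ".build"
    return build / "mypy-venvs" / name, build / "mypy-caches" / name
```

For example, `hook_state_dirs(Path("/repo"), "shared/logging")` yields paths ending in `mypy-venvs/shared-logging` and `mypy-caches/shared-logging`.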

Adds a new check-shared-mypy-hooks prek hook that fails when a
shared/<dist> workspace member is missing its dedicated .pre-commit-
config.yaml, printing the exact YAML to add. Selective-checks emits one
skip entry per dist, enumerated from shared/ at run time. Contributing
docs cover the two-step process for adding a new shared library.

Pin minimum_prek_version to 0.3.4 consistently across all configs

All .pre-commit-config.yaml files now require prek >= 0.3.4 (the version
already declared by the root config). Previously the nested configs
pinned a mix of 0.2.0, 0.3.2, and 0.3.4, so a contributor could pass the
root's version check and still trip on stale subproject pins as they
moved between directories.

Refresh uv.lock after rebase to reflect the `mypy` dep groups

The rebase onto main resolved the uv.lock conflict by taking main's
version, so `uv sync --group mypy` would fail against uv.lock until
the groups added to the per-distribution pyprojects were re-resolved.
Regenerates the lockfile to include them.

Add explicit selective-checks test for per-shared-dist mypy hook skipping

Verifies that when a file under shared/logging/ changes, only
mypy-shared-logging is kept among the thirteen mypy-shared-* hooks;
all other shared distributions' hooks land in the skip list. Pins the
contract that the runtime enumeration over shared/*/pyproject.toml
works as intended.
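
The contract this test pins down can be illustrated with a simplified sketch. It is not the selective-checks code: the real logic uses FileGroupForCi entries and regex patterns, whereas this hypothetical function uses a plain path-prefix match and assumes the hook id is `mypy-shared-` followed by the directory name.

```python
from pathlib import Path


def shared_mypy_hooks_to_skip(repo_root: Path, changed_files: list[str]) -> set[str]:
    """Return mypy-shared-<dist> hook ids whose distribution has no changed file."""
    # Enumerate distributions at run time, as described for shared/*/pyproject.toml.
    dists = sorted(p.parent.name for p in (repo_root / "shared").glob("*/pyproject.toml"))
    skip = set()
    for dist in dists:
        prefix = f"shared/{dist}/"
        if not any(path.startswith(prefix) for path in changed_files):
            skip.add(f"mypy-shared-{dist}")
    return skip
```

With a change under shared/logging/, every hook except mypy-shared-logging ends up in the returned set.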
@potiuk potiuk added the skip common compat check label Apr 19, 2026
potiuk added 2 commits April 19, 2026 16:08
Refresh mypy docs to match the per-hook venv + --group mypy workflow

Fills in the docs that still referenced the pre-split workflow:

- AGENTS.md: mentions `mypy-shared-<dist>` per shared workspace member and
  the `uv sync --group mypy` install path for mypy itself.
- scripts/ci/prek/AGENTS.md: clarifies that non-provider mypy hooks run
  locally through mypy_local_folder.py (Breeze image only needed for the
  providers hook).
- dev/breeze/doc/03_developer_tasks.rst: renames stale `mypy-airflow` to
  `mypy-airflow-core`, and expands the cache note to cover the per-hook
  virtualenvs and caches under .build/.
- dev/breeze/doc/ci/04_selective_checks.md: expands the file-group and
  skip-reason lists so every new mypy hook (scripts, task-sdk, airflow-ctl,
  the six test-dir hooks, and mypy-shared-<dist> enumerated at runtime)
  is documented.

Rename mypy_local_folder.py to run_mypy_full_dist_local_venv_or_breeze_in_ci.py

Updates every .pre-commit-config.yaml entry and prose references so they
point at the new script name. Two shared configs use YAML folded-scalar
entries to stay under the 110-char yamllint limit; updates the validation
script's expected template to match.
jscheffl previously approved these changes Apr 19, 2026
Contributor

@jscheffl jscheffl left a comment

Cool! The mypy fixes also show that the rework was beneficial and found some nits to fix.

Looking forward to the same split for providers!

@jscheffl
Contributor

Note: my local re-run is still in progress. I have had problems with mypy on main, both previously and currently, and @AutomationDev85 had the same on his PC on Friday. I hope this fixes those mypy errors implicitly.

@jscheffl jscheffl dismissed their stale review April 19, 2026 16:37

Need to re-run local mypy... hold off on merging.

@jscheffl
Contributor

Wow, almost 1.4 GB temp disk space is needed for all mypy caches :-O

(apache-airflow-breeze) jscheffl@hp860g9:~/Workspace/airflow_review/.build$ du -sm mypy*
615     mypy-caches
781     mypy-venvs

@jscheffl
Contributor

Interesting, running on Linux with prek run -a I got two errors (Python 3.13):

Running hooks for `task-sdk-integration-tests`:
Run mypy for task-sdk-integration-tests...................................................................Failed
- hook id: mypy-task-sdk-integration-tests
- exit code: 2

  Running mypy for folders: ['task-sdk-integration-tests']
  Syncing dedicated mypy virtualenv (/home/jscheffl/Workspace/airflow_review/.build/mypy-venvs/task-sdk-integration-tests) for project task-sdk-integration-tests: uv sync --frozen --project task-sdk-integration-tests --group mypy
  warning: Failed to parse `pyproject.toml` during settings discovery:
    TOML parse error at line 1355, column 17
         |
    1355 | exclude-newer = "4 days"
         |                 ^^^^^^^^
    failed to parse year in date "4 days": failed to parse "4 da" as year (a four digit integer): invalid digit, expected 0-9 but got  
  
  warning: `VIRTUAL_ENV=/home/jscheffl/.cache/prek/hooks/python-NFPi8u10rFeZ5V6tnNNQ` does not match the project environment path `.build/mypy-venvs/task-sdk-integration-tests` and will be ignored; use `--active` to target the active environment instead
  error: Group `mypy` is not defined in the project's `dependency-groups` table
  `uv sync --frozen --project task-sdk-integration-tests` failed for the mypy virtualenv at /home/jscheffl/Workspace/airflow_review/.build/mypy-venvs/task-sdk-integration-tests. Fix the sync error before running mypy — otherwise the dedicated mypy virtualenv will not match uv.lock and results will diverge from CI. You can remove the cached virtualenv with:
    breeze down --cleanup-mypy-cache

  Mypy check failed. You can run mypy locally with:
    prek run mypy-task-sdk-integration-tests --all-files
  The hook uses dedicated virtualenv(s) and mypy cache(s) under .build/ so it does
  not touch your regular project .venv. You can clear both with:
    breeze down --cleanup-mypy-cache

...but running individually via prek run -a mypy-task-sdk-integration-tests is fine. Could reproduce.

Generate Datamodels for TaskSDK client....................................................................Failed
- hook id: generate-tasksdk-datamodels
- files were modified by this hook

  warning: Failed to parse `/home/jscheffl/Workspace/airflow_review/pyproject.toml` during settings discovery:
    TOML parse error at line 1355, column 17
         |
    1355 | exclude-newer = "4 days"
         |                 ^^^^^^^^
    failed to parse year in date "4 days": failed to parse "4 da" as year (a four digit integer): invalid digit, expected 0-9 but got  
  
  Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
  Removed virtual environment at: /home/jscheffl/.cache/prek/hooks/python-YDYfQRMBSDtQiUlNIGnr
  Creating virtual environment at: /home/jscheffl/.cache/prek/hooks/python-YDYfQRMBSDtQiUlNIGnr
  Ignoring existing lockfile due to removal of timestamp cutoff: `2026-04-15T17:09:07.955557547Z`
  Installed 220 packages in 289ms

Maybe the same cause, but unrelated to this change.

Will repeat testing with Python 3.10...

Contributor

@jscheffl jscheffl left a comment

Same with Python 3.10, but nothing that I see is caused by this PR, so it is an improvement here... LGTM!
(Will need another PR to fix the other strange things showing up on my side...)

@potiuk
Member Author

potiuk commented Apr 20, 2026

> Wow, almost 1.4 GB temp disk space is needed for all mypy caches :-O
>
> (apache-airflow-breeze) jscheffl@hp860g9:~/Workspace/airflow_review/.build$ du -sm mypy*
> 615     mypy-caches
> 781     mypy-venvs

Yeah .. that's a bit of a price to pay for speed.

@potiuk
Member Author

potiuk commented Apr 20, 2026

> Same with Python 3.10, but nothing that I see is caused by this PR, so it is an improvement here... LGTM! (Will need another PR to fix the other strange things showing up on my side...)

You need to update uv to a later version. The problem is that "min_uv_version" is there, but the 4-day cooldown causes an older uv to ... well ... fail even at parsing the main pyproject.toml 😱

However, I think I can improve that and use the uv that is already installed in the local .venv. That one should always be a good one, no matter which uv version you have on your path.
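
One way to picture the idea of preferring the uv from the local .venv is the sketch below. It is purely illustrative: the follow-up change is not shown in this thread, the function name is hypothetical, and the assumption is that uv sits in the virtualenv's standard scripts directory.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def find_uv(repo_root: Path) -> Optional[str]:
    """Prefer the uv inside the repo's .venv; fall back to the one on PATH."""
    is_windows = sys.platform == "win32"
    bin_dir = "Scripts" if is_windows else "bin"
    exe_name = "uv.exe" if is_windows else "uv"
    candidate = repo_root / ".venv" / bin_dir / exe_name
    if candidate.is_file():
        return str(candidate)
    # May return None when no uv is installed anywhere on PATH.
    return shutil.which("uv")
```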

@potiuk
Member Author

potiuk commented Apr 20, 2026

Better way of handling uv versions is coming.

@potiuk potiuk merged commit 4f3b228 into apache:main Apr 20, 2026
143 checks passed
@potiuk potiuk deleted the mypy-dedicated-venvs branch April 20, 2026 11:04
@github-actions
Contributor

Backport failed to create: v3-2-test. View the failure log Run details

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting PRs that are bug fixes (generally speaking) to the maintenance branches.

In case of doubt, please ask in the #release-management Slack channel.

Status Branch Result
v3-2-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 4f3b228 v3-2-test

This should apply the commit to the v3-2-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.

@potiuk
Member Author

potiuk commented Apr 20, 2026

> Same with Python 3.10, but nothing that I see is caused by this PR, so it is an improvement here... LGTM! (Will need another PR to fix the other strange things showing up on my side...)

#65531

potiuk added a commit to potiuk/airflow that referenced this pull request Apr 20, 2026
…icated .build/ venvs (apache#65492)

(cherry picked from commit 4f3b228)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
potiuk added a commit that referenced this pull request Apr 20, 2026
…icated .build/ venvs (#65492) (#65549)

* Isolate mypy prek hooks, cover all non-provider dirs, and clean up type errors

Each non-provider mypy prek hook now builds and caches its own virtualenv
at .build/mypy-venvs/<hook>/ and its own mypy cache at
.build/mypy-caches/<hook>/. UV_PROJECT_ENVIRONMENT redirects uv away from
the project's .venv, so running the hook never mutates a contributor's
regular development environment while still matching CI's frozen
dependency set. Mypy runs with --follow-imports=silent so each hook only
reports errors for files it owns; transitive code is covered by its own
hook and different venvs no longer produce divergent results on shared
code.

Adds mypy hooks for the non-provider directories that were previously
uncovered: airflow-ctl-tests, helm-tests, airflow-e2e-tests,
task-sdk-integration-tests, docker-tests, kubernetes-tests, and shared.
The mypy-shared hook iterates every shared/<dist> workspace distribution
and builds a separate venv + cache per distribution so each shared
library is type-checked against its own dependency set.

breeze down --cleanup-mypy-cache additionally removes
.build/mypy-venvs/ and .build/mypy-caches/ so all per-hook state is
wiped alongside the existing .mypy_cache and mypy-cache-volume.

Also fixes pre-existing type errors surfaced by the newly added and
cleaned-up checks: platform-specific ignores for Linux-only
os.posix_fadvise in the shared logging helper, narrower types and
type: ignore where appropriate in shared configuration/observability/
timezones/secrets_backend/secrets_masker, Liskov override markers on
the AirflowConfigParser subclass methods, and small correctness fixes
in dev/breeze and the docker-tests / kubernetes-tests helpers so the
full non-provider mypy suite runs clean on macOS and in CI.

* Move mypy prek hooks to their respective distribution configs

The new mypy hooks for airflow-ctl-tests, helm-tests, airflow-e2e-tests,
task-sdk-integration-tests, docker-tests, and kubernetes-tests now live
in each distribution's own .pre-commit-config.yaml, matching the pattern
already used by airflow-core, task-sdk, and airflow-ctl. New .pre-commit-
config.yaml files are added to distributions that didn't have one. prek
auto-discovers nested configs, so the hooks remain part of the default
check set.

mypy-dev (covers dev + scripts), mypy-devel-common, and mypy-shared stay
at the repo root: dev/scripts/devel-common don't have their own configs,
and mypy-shared iterates every shared/<dist> distribution so has no
single home.

* Split mypy-dev and mypy-scripts, each with its own pyproject.toml config

Previously the mypy-dev prek hook ran mypy against dev/ and scripts/ in
a single invocation under the dev project's virtualenv. The two now get
independent hooks — mypy-dev in dev/.pre-commit-config.yaml and
mypy-scripts in scripts/.pre-commit-config.yaml — so each can evolve its
own dependency set and check its own folder.

Copy the full [tool.mypy] section from the root pyproject.toml into both
dev/pyproject.toml and scripts/pyproject.toml so each sub-project owns its
mypy configuration. Paths inside mypy_path are rewritten from
$MYPY_CONFIG_FILE_DIR/ to $MYPY_CONFIG_FILE_DIR/../ so they still resolve
to the repo-root siblings from the sub-project location. The decorator/
outputs plugins are scoped to dev only (scripts does not author DAG code).

mypy_local_folder.py now passes --config-file <project>/pyproject.toml
when the folder maps to one of these sub-project configs, so mypy uses
the sub-project's configuration rather than the root one.

* Teach selective-checks about the new non-provider mypy hooks

Add FileGroupForCi entries and regex patterns for helm-tests,
airflow-e2e-tests, docker-tests, kubernetes-tests, scripts, and shared
Python files, then wire them into skip_prek_hooks so the corresponding
mypy-* prek hook is only kept when its folder changed:

- mypy-scripts (split off from the old combined mypy-dev)
- mypy-airflow-ctl-tests, mypy-helm-tests, mypy-airflow-e2e-tests,
  mypy-task-sdk-integration-tests, mypy-docker-tests, mypy-kubernetes-tests
- mypy-shared

Update test_selective_checks.py skip-list constants and per-case inline
skip lists to include the new hooks. Targeted test cases for files under
the new-hook directories override skip-prek-hooks to leave the matching
hook out of the skip set, confirming it will run when its folder changes.
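
The keep-or-skip rule can be sketched roughly as follows. The
`HOOK_PATTERNS` mapping and the `hooks_to_skip` function are hypothetical
stand-ins, not the actual FileGroupForCi entries or selective-checks code:

```python
import re

# Hypothetical mapping from hook name to a pattern matching the files it owns.
HOOK_PATTERNS = {
    "mypy-scripts": re.compile(r"^scripts/.*\.py$"),
    "mypy-helm-tests": re.compile(r"^helm-tests/.*\.py$"),
    "mypy-docker-tests": re.compile(r"^docker-tests/.*\.py$"),
}


def hooks_to_skip(changed_files: list[str]) -> set[str]:
    """Skip every mypy hook whose folder has no changed Python file."""
    return {
        hook
        for hook, pattern in HOOK_PATTERNS.items()
        if not any(pattern.match(path) for path in changed_files)
    }
```

With a change under helm-tests/ only, mypy-helm-tests is left out of the
returned set and the other two hooks are skipped.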

* Trim dev/scripts pyproject mypy_path to just relevant distributions

Drop the 200+ provider path entries that were blindly copied from the root
pyproject.toml. dev and scripts only import from other non-provider
workspace members, so listing every provider src/tests directory under
mypy_path just adds noise. The remaining non-provider entries cover
everything dev or scripts plausibly import from.

* Install mypy into per-hook venvs from uv.lock via a `mypy` dep group

Each non-provider distribution with a mypy prek hook now declares a
`mypy` dependency group in its pyproject.toml resolving to
`apache-airflow-devel-common[mypy]`. mypy_local_folder.py syncs each
dedicated virtualenv with `uv sync --frozen --project <X> --group mypy`
and runs mypy with `uv run --frozen --project <X> --group mypy` — so
mypy and its type stubs come from the workspace uv.lock, not from an
ephemeral `--with` overlay whose resolution is independent of the main
lockfile. uv.lock is refreshed to include the new group.
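
As a rough illustration, the two command lines could be assembled like
this. `uv_commands` is a hypothetical helper, and how the script selects
the dedicated virtualenv is deliberately left out because it is not
described here:

```python
from pathlib import Path


def uv_commands(project: Path, mypy_args: list[str]) -> tuple[list[str], list[str]]:
    """Build the sync and run command lines (illustrative only)."""
    # Flags shared by both commands, as quoted in the commit message.
    common = ["--frozen", "--project", str(project), "--group", "mypy"]
    sync_cmd = ["uv", "sync", *common]
    run_cmd = ["uv", "run", *common, "mypy", *mypy_args]
    return sync_cmd, run_cmd
```

Both commands share `--frozen`, so neither is allowed to update uv.lock
while installing or running.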

Covers airflow-core, task-sdk, airflow-ctl, devel-common, dev, scripts,
airflow-ctl-tests, helm-tests, airflow-e2e-tests,
task-sdk-integration-tests, docker-tests, kubernetes-tests, and every
shared/<dist> workspace member.

* Drop mypy_path from dev/scripts pyprojects — venv site-packages is enough

After the switch to installing mypy (and every transitive workspace
dependency) directly into each hook's virtualenv via the `mypy` dep
group, workspace packages like airflow, airflow.sdk, airflowctl,
airflow_breeze, tests_common are all available via the venv's
site-packages. mypy resolves them without needing mypy_path entries,
so drop the copied list and leave a short comment explaining why.

* Split mypy-shared into per-distribution hooks and enforce the pattern

Each shared/<dist> workspace member now owns a mypy-shared-<dist> prek hook
backed by its own shared/<dist>/.pre-commit-config.yaml. The single
mypy-shared iterator is gone — mypy_local_folder.py accepts shared/<dist>
as a first-class folder and the per-hook virtualenv now lives at
.build/mypy-venvs/shared-<dist>/ (slash in the folder name is replaced
with a dash in the venv/cache path).
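
The folder-to-path mapping amounts to something like the sketch below.
`hook_dirs` is a hypothetical name; only the `.build/mypy-venvs/` and
`.build/mypy-caches/` locations and the slash-to-dash rule come from this
change:

```python
from pathlib import Path


def hook_dirs(folder: str, build_dir: Path = Path(".build")) -> tuple[Path, Path]:
    """Return the (venv, cache) directories for a folder (illustrative only)."""
    # "shared/logging" becomes "shared-logging" so it is a single path component.
    name = folder.replace("/", "-")
    return build_dir / "mypy-venvs" / name, build_dir / "mypy-caches" / name
```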

Adds a new check-shared-mypy-hooks prek hook that fails when a
shared/<dist> workspace member is missing its dedicated
.pre-commit-config.yaml, printing the exact YAML to add. Selective-checks
emits one skip entry per dist, enumerated from shared/ at run time.
Contributing docs cover the two-step process for adding a new shared
library.
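
A simplified sketch of the kind of check such a hook performs, assuming a
hypothetical `missing_hook_configs` function. The real hook also prints
the YAML to add, which is omitted here:

```python
from pathlib import Path


def missing_hook_configs(shared_dir: Path) -> list[str]:
    """Return shared/<dist> members lacking a .pre-commit-config.yaml."""
    missing = []
    # Treat every directory with a pyproject.toml as a workspace member.
    for pyproject in sorted(shared_dir.glob("*/pyproject.toml")):
        dist_dir = pyproject.parent
        if not (dist_dir / ".pre-commit-config.yaml").is_file():
            missing.append(dist_dir.name)
    return missing
```

A non-empty result would be the failure case.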

* Pin minimum_prek_version to 0.3.4 consistently across all configs

All .pre-commit-config.yaml files now require prek >= 0.3.4 (the version
already declared by the root config). Previously the nested configs
pinned a mix of 0.2.0, 0.3.2, and 0.3.4, so a contributor could pass the
root's version check and still trip on stale subproject pins as they
moved between directories.
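
One way to spot drifting pins, sketched under the assumption that each
config declares `minimum_prek_version` as a top-level scalar.
`declared_min_version` and `inconsistent_pins` are hypothetical helpers
and are not part of this change:

```python
from __future__ import annotations

import re

# Matches a top-level `minimum_prek_version: X.Y.Z` line, quoted or not.
MIN_VERSION_RE = re.compile(
    r"""^minimum_prek_version:\s*["']?([\d.]+)["']?\s*$""", re.MULTILINE
)


def declared_min_version(config_text: str) -> str | None:
    """Extract the declared minimum prek version, if any."""
    match = MIN_VERSION_RE.search(config_text)
    return match.group(1) if match else None


def inconsistent_pins(configs: dict[str, str], expected: str = "0.3.4") -> list[str]:
    """Return the config paths whose pin differs from the expected version."""
    return sorted(
        path for path, text in configs.items() if declared_min_version(text) != expected
    )
```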

* Refresh uv.lock after rebase to reflect the `mypy` dep groups

The rebase onto main resolved the uv.lock conflict by taking main's
version, so `uv sync --group mypy` would fail against uv.lock until
the groups added to the per-distribution pyprojects were re-resolved.
Regenerates the lockfile to include them.

* Add explicit selective-checks test for per-shared-dist mypy hook skipping

Verifies that when a file under shared/logging/ changes, only
mypy-shared-logging is kept among the thirteen mypy-shared-* hooks;
all other shared distributions' hooks land in the skip list. Pins the
contract that the runtime enumeration over shared/*/pyproject.toml
works as intended.
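
The contract being pinned can be sketched as below. `shared_hooks_to_skip`
is a hypothetical function, and the list of dists is passed in rather
than enumerated from shared/*/pyproject.toml as the real code does:

```python
def shared_hooks_to_skip(dists: list[str], changed_files: list[str]) -> set[str]:
    """Skip every mypy-shared-<dist> hook whose dist has no changed file."""
    touched = {
        dist
        for dist in dists
        if any(path.startswith(f"shared/{dist}/") for path in changed_files)
    }
    return {f"mypy-shared-{dist}" for dist in dists if dist not in touched}
```

For a change under shared/logging/, every hook except mypy-shared-logging
ends up in the returned set.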

* Refresh mypy docs to match the per-hook venv + --group mypy workflow

Fills in the docs that still referenced the pre-split workflow:

- AGENTS.md: mentions `mypy-shared-<dist>` per shared workspace member and
  the `uv sync --group mypy` install path for mypy itself.
- scripts/ci/prek/AGENTS.md: clarifies that non-provider mypy hooks run
  locally through mypy_local_folder.py (Breeze image only needed for the
  providers hook).
- dev/breeze/doc/03_developer_tasks.rst: renames stale `mypy-airflow` to
  `mypy-airflow-core`, and expands the cache note to cover the per-hook
  virtualenvs and caches under .build/.
- dev/breeze/doc/ci/04_selective_checks.md: expands the file-group and
  skip-reason lists so every new mypy hook (scripts, task-sdk, airflow-ctl,
  the six test-dir hooks, and mypy-shared-<dist> enumerated at runtime)
  is documented.

* Rename mypy_local_folder.py to run_mypy_full_dist_local_venv_or_breeze_in_ci.py

Updates every .pre-commit-config.yaml entry and prose reference to point
at the new script name. Two shared configs use YAML folded-scalar entries
to stay under the 110-char yamllint limit, and the validation script's
expected template is updated to match.

(cherry picked from commit 4f3b228)