Add --build-constraint(s) flag support to constraints pipeline #65511

Open
nailo2c wants to merge 7 commits into apache:main from nailo2c:feat-54394-add_build_constraints

Conversation

@nailo2c
Contributor

@nailo2c nailo2c commented Apr 19, 2026

closes: #54394


Hi folks, I know this is a big PR. I'll do my best to explain it concisely and clearly.

Summary

Add build constraints support to Airflow's constraints pipeline.

Airflow now generates and publishes build-constraints-{python}.txt next to the existing runtime constraints, and install flows pass them to uv pip install --build-constraints or pip install --build-constraint when available.
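
As a rough illustration of the flag selection described above, the sketch below maps an installer to its flag spelling. The helper name `build_constraint_args` and its parameters are hypothetical and not taken from the PR; the only things assumed from the description are the two flag spellings and that no flag is emitted when no build-constraints file is available.

```python
from __future__ import annotations


def build_constraint_args(installer: str, location: str | None) -> list[str]:
    """Return extra CLI args for an install command (hypothetical helper)."""
    if not location:
        # No build constraints resolved (for example on a fallback path): add nothing.
        return []
    if installer == "uv":
        # `uv pip install` is called with the plural spelling in this PR.
        return ["--build-constraints", location]
    if installer == "pip":
        # `pip install` uses the singular spelling.
        return ["--build-constraint", location]
    raise ValueError(f"unsupported installer: {installer!r}")


cmd = ["uv", "pip", "install", "apache-airflow"]
cmd += build_constraint_args("uv", "build-constraints-3.12.txt")
print(cmd)
```

Returning an empty list rather than `None` keeps command assembly a plain list concatenation, so fallback paths need no special casing.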

How it works

flowchart LR
    A["Existing inputs:<br/>uv.lock + workspace pyproject.toml"]
    B["Existing runtime constraints"]
    C["Existing constraints branch"]
    D["Existing install commands (uv pip install / pip install)"]

    X["NEW build constraints generation"]
    Y["NEW build-constraints-3.x.txt"]
    Z["NEW --build-constraints flags"]

    A --> B --> C --> D
    A --> X --> Y --> C
    C --> Z --> D

    classDef existing fill:#ffffff,stroke:#777777,color:#111111;
    classDef added fill:#e7f5ff,stroke:#1971c2,stroke-width:2px,color:#111111;

    class A,B,C,D existing;
    class X,Y,Z added;

Key design decisions:

  • Build constraints are a single file per Python version, shared across all 3 constraint modes.

  • uv sync paths are excluded because uv sync does not support --build-constraints.

  • Fallback no-constraints paths never include build constraints, matching runtime constraints behavior.

  • Missing auto-inferred build constraints are skipped with a warning for backward compatibility with old constraint branches, while explicitly configured build constraints fail fast if the file or URL is not accessible.

  • Explicit AIRFLOW_BUILD_CONSTRAINTS_LOCATION values fail fast if the file or URL is missing.

  • The generator targets build requirements relevant to Airflow's default wheel-preferred install path, rather than forcing source-build coverage for every locked package.

    Source-scan scope validation

    A separate full-scan script scans all ~749 packages with sdist URLs (no wheel filtering, no cache). Running both against the same uv.lock:

    Production (cold):         24.4s  (110 sdists) → 24 unique build deps
    Production (warm cache):    0.4s  (0 downloads)
    Full scan:                 35.9s  (749 sdists) → 61 unique build deps
    
    Only in full scan (production skipped): 37
    poetry, poetry-core, pdm-backend, mypy, jupyterlab, twine, ...
    Only in production (should be 0): 0
    

    All 37 extra deps come from packages with universal wheels. Under the default wheel-preferred install path, those packages are not expected to enter build isolation, so the production scanner covers the build deps relevant to normal Airflow installs while keeping generation fast.
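
The scan scope described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the generator's code: `LockedPackage` is a hypothetical stand-in for a parsed uv.lock entry, the attribute names `sdist_url` and `has_universal_wheel` mirror the ones used in the verification commands, and treating any wheel ending in `-none-any.whl` as universal is an assumption about how that property is computed.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LockedPackage:
    """Minimal stand-in for one package entry in uv.lock (hypothetical shape)."""

    name: str
    sdist_url: str | None = None
    wheel_filenames: list[str] = field(default_factory=list)

    @property
    def has_universal_wheel(self) -> bool:
        # Assumption: a wheel tagged "-none-any" installs on every platform,
        # so the package is not expected to be built from its sdist.
        return any(w.endswith("-none-any.whl") for w in self.wheel_filenames)


def scan_targets(packages: list[LockedPackage]) -> list[LockedPackage]:
    """Keep packages that may be source-built: an sdist exists, no universal wheel."""
    return [p for p in packages if p.sdist_url and not p.has_universal_wheel]


packages = [
    LockedPackage("pure-python-pkg", "https://example.invalid/a.tar.gz", ["a-1.0-py3-none-any.whl"]),
    LockedPackage(
        "native-pkg",
        "https://example.invalid/b.tar.gz",
        ["b-1.0-cp312-cp312-manylinux_2_28_x86_64.whl"],
    ),
    LockedPackage("wheel-only-pkg", None, ["c-1.0-cp312-cp312-win_amd64.whl"]),
]
print([p.name for p in scan_targets(packages)])  # ['native-pkg']
```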

Review guide

Suggested review order follows the data flow:

| Area | Main files | What to review |
|---|---|---|
| Generator | scripts/in_container/run_generate_constraints.py | Collect workspace build requirements and upstream build requirements from uv.lock, then resolve pinned build dependency versions with uv pip compile. |
| Install helpers | scripts/docker/common.sh | Resolve/download build constraints and emit the correct uv/pip install flag. |
| Install consumers | scripts/docker/*.sh, scripts/in_container/install_airflow_and_providers.py, scripts/in_container/install_development_dependencies.py | Pass build constraints to relevant install commands and omit them from fallback paths. |
| Breeze plumbing | dev/breeze/src/airflow_breeze/... | Expose --airflow-build-constraints-location and pass it through env vars / Docker build args. |
| Publishing | .github/workflows/generate-constraints.yml, scripts/ci/constraints/*.sh | Upload, diff, and commit build-constraints-*.txt, including first-time untracked files. |
| Docs | installing-from-pypi.rst, MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md, 09_release_management_tasks.rst | Document generated files and user-facing uv / pip usage. |
| Tests | scripts/tests/in_container/test_build_constraints.py | Unit coverage for resolver behavior, helper behavior, command assembly, and fallback invariants. |

Generated/synced files:

  • Dockerfile and Dockerfile.ci were regenerated with prek update-inlined-dockerfile-scripts --all-files.
  • Breeze command help docs/images were regenerated after adding the new CLI option.

Verification

  • 72 focused unit tests cover generator, resolver, command assembly, fallback behavior, and regression coverage for transient workspace artifacts, nested sdist pyproject.toml files, and uv conflict diagnostic wording.
  • Manual Breeze generation produced build-constraints-*.txt for all three constraints modes.
Generation: workspace scan, upstream scan, and constraint resolution
$ uv run --project scripts python -c "
import sys; sys.path.insert(0, 'scripts/in_container')
from run_generate_constraints import _collect_workspace_build_reqs
from pathlib import Path
reqs = _collect_workspace_build_reqs(Path('.'))
print(f'{len(reqs)} deps: {sorted(reqs)}')
"
10 deps: ['flit-core', 'gitdb', 'gitpython', 'hatchling', 'packaging', 'pathspec', 'pluggy', 'smmap', 'tomli', 'trove-classifiers']

$ uv run --project scripts python -c "
import sys; sys.path.insert(0, 'scripts/in_container')
from run_generate_constraints import _parse_uv_lock
from pathlib import Path
pkgs = _parse_uv_lock(Path('uv.lock'))
targets = [p for p in pkgs if not p.has_universal_wheel and p.sdist_url]
print(f'{len(pkgs)} total, {len(targets)} targets')
"
893 total, 110 targets

$ uv run --project scripts python -c "
import sys; sys.path.insert(0, 'scripts/in_container')
from run_generate_constraints import _stream_build_reqs_from_sdist, _parse_uv_lock
from pathlib import Path
pkgs = _parse_uv_lock(Path('uv.lock'))
pkg = next(p for p in pkgs if p.name == 'pydantic-core')
reqs = _stream_build_reqs_from_sdist(pkg.sdist_url)
print(f'pydantic-core: {reqs}')
"
pydantic-core: ['maturin>=1.9.4,<2']
Integration: 3 constraint modes all produce build-constraints-*.txt
$ breeze ci-image build --python 3.12 --upgrade-to-newer-dependencies --answer yes
$ breeze release-management generate-constraints \
    --airflow-constraints-mode constraints --run-in-parallel --answer yes
$ breeze release-management generate-constraints \
    --airflow-constraints-mode constraints-source-providers --run-in-parallel --answer yes
$ breeze release-management generate-constraints \
    --airflow-constraints-mode constraints-no-providers --run-in-parallel --answer yes

$ ls files/constraints-3.12/
build-constraints-3.12.txt              constraints-3.12.txt                    diff-constraints-3.12.md
build-deps-cache.json                   constraints-no-providers-3.12.txt       diff-constraints-no-providers-3.12.md
                                        constraints-source-providers-3.12.txt   diff-constraints-source-providers-3.12.md
                                        original-constraints-3.12.txt           original-constraints-no-providers-3.12.txt
                                                                                original-constraints-source-providers-3.12.txt

$ cat files/constraints-3.12/build-constraints-3.12.txt
# (header omitted)
beniget==0.4.2.post1
flit-core==3.12.0
hatchling==1.29.0
maturin==1.13.1
meson-python==0.18.0
setuptools==82.0.1
wheel==0.46.3
... (31 pinned build deps total)
Negative & positive tests: build constraints actually block/allow installation
# Create a test package requiring setuptools>=70
$ mkdir -p /tmp/test-pkg && cat > /tmp/test-pkg/pyproject.toml << 'EOF'
[build-system]
requires = ["setuptools>=70"]
build-backend = "setuptools.build_meta"
[project]
name = "test-build-constraint"
version = "0.1.0"
EOF
$ echo "from setuptools import setup; setup()" > /tmp/test-pkg/setup.py
$ cd /tmp/test-pkg && uv build --sdist && cd -

# Negative: conflicting build constraints → correctly fails
$ echo "setuptools==60.0.0" > /tmp/bad-build-constraints.txt
$ uv pip install /tmp/test-pkg/dist/test_build_constraint-0.1.0.tar.gz \
    --build-constraints /tmp/bad-build-constraints.txt
  × Failed to build `test-build-constraint`
  ├─▶ Failed to resolve requirements from `build-system.requires`
  ╰─▶ Because you require setuptools>=70 and setuptools==60.0.0,
      we can conclude that your requirements are unsatisfiable.

# Positive: matching build constraints → succeeds
$ echo "setuptools==82.0.1" > /tmp/good-build-constraints.txt
$ uv pip install /tmp/test-pkg/dist/test_build_constraint-0.1.0.tar.gz \
    --build-constraints /tmp/good-build-constraints.txt
Installed 1 package in 59ms
 + test-build-constraint==0.1.0

CI Results

The build-constraints-related logs can be found in this PR's CI run:

  1. Generate constraints - Generate constraints for 3.10 on linux/amd64 - Source constraints
     [screenshot: source_constraints]

  2. Generate constraints - Generate constraints for 3.10 on linux/amd64 - Upload constraint artifacts
     [screenshot: upload_constraint_artifacts]

  3. Build PROD images - Build PROD Regular image 3.10 - Build PROD images w/ source providers 3.10
     [screenshot: build_prod_images]

Was generative AI tooling used to co-author this PR?
  • [v] Yes (please specify the tool below)
    Claude Opus 4.6 + Codex 5.4

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@boring-cyborg (bot) added the area:dev-tools, area:production-image (Production image improvements and fixes), backport-to-v3-2-test (Mark PR with this label to backport to v3-2-test branch), and kind:documentation labels Apr 19, 2026
@nailo2c force-pushed the feat-54394-add_build_constraints branch 4 times, most recently from 60c5099 to ff23dab, Apr 20, 2026 03:24
@nailo2c marked this pull request as ready for review Apr 20, 2026 05:06
@nailo2c force-pushed the feat-54394-add_build_constraints branch from ff23dab to 0cea6f0, Apr 22, 2026 21:38
@potiuk added the ready for maintainer review (Set after triaging when all criteria pass.) label Apr 23, 2026
@nailo2c force-pushed the feat-54394-add_build_constraints branch from 0cea6f0 to 9b28df3, Apr 28, 2026 05:28
@potiuk self-assigned this Apr 30, 2026
@nailo2c force-pushed the feat-54394-add_build_constraints branch from 9b28df3 to 80789d8, May 1, 2026 02:59
Member

@potiuk potiuk left a comment

Read through the build-constraints generator, the install-flow plumbing, and the Dockerfile/Dockerfile.ci changes. The feature is well-implemented and the test coverage on the generator (859 lines) is genuinely thorough — caching, PEP 503 normalisation, conflict-retry, sdist tar/zip handling, legacy setuptools+wheel fallback, all covered. The uv sync-exclusion path (Dockerfile.ci:1112-1114) is correctly scoped, and the explicit-vs-inferred fail-fast contract (install_airflow_and_providers.py:235-264) matches what the PR description promises.

A few observations before I'd want to sign off — none blocking, but the regex fragility in particular is worth a hardening pass.

Conflict-detection regex is brittle to future uv error-message changes (scripts/in_container/run_generate_constraints.py:809-817)

The retry loop derives the conflicting package name by regex-matching uv pip compile's stderr. If uv reformats its error wording (e.g. drops the <>=!~ operator from the rendered conflict line, switches to a structured emoji prefix, or wraps long names), this regex silently no-matches → the loop continues without retrying → the next iteration raises "uv pip compile failed" with no useful context. The tests (lines 644–772) cover the current phrasings well (markers, underscores, multiple wordings) but stay inside the assumption that the regex shape is stable.

Two cheap mitigations:

  1. If no conflict pkg is extracted from stderr after a failed compile, log the raw stderr at WARNING and raise rather than swallowing — so a uv upgrade that breaks the regex fails loudly with the actual stderr in the build log.
  2. Pin a minimum uv version in the script's docstring / hook config and bump it whenever you re-verify the regex against a new release.
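
The first mitigation might look roughly like the sketch below. It is not the code under review: the regex is a simplified, hypothetical pattern (a package name directly followed by a version operator), and `extract_conflict_package` is an invented name. The sample stderr is the uv message shown in the PR description's negative test.

```python
import logging
import re

log = logging.getLogger(__name__)

# Hypothetical pattern: a package name directly followed by a version operator.
_CONFLICT_RE = re.compile(r"\b([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:==|>=|<=|~=|!=|<|>)")


def extract_conflict_package(stderr: str) -> str:
    """Return the first constrained package named in stderr, or fail loudly."""
    match = _CONFLICT_RE.search(stderr)
    if match is None:
        # Surface the raw stderr instead of swallowing it, so a uv release that
        # changes the wording fails with the real message in the build log.
        log.warning("Could not parse uv conflict output:\n%s", stderr)
        raise RuntimeError(f"unrecognised uv conflict output: {stderr!r}")
    return match.group(1)


stderr = (
    "Because you require setuptools>=70 and setuptools==60.0.0, "
    "we can conclude that your requirements are unsatisfiable."
)
print(extract_conflict_package(stderr))  # setuptools
```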

Backward-compat claim in the PR body is slightly narrower than implemented

The PR body says "Auto-inferred missing build constraints are skipped for backward compatibility with old constraint branches." That's true for inferred URLs (install_airflow_and_providers.py:259-264 warns + returns None), but explicit AIRFLOW_BUILD_CONSTRAINTS_LOCATION values fail hard with sys.exit(1) (line 235-239). That's the right behaviour — but please make it explicit in the PR body / the user-facing doc that only the inferred path is forgiving; users opting in explicitly will see hard failures on a missing file/URL, which is fine but worth flagging.
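
For readers skimming the thread, the explicit-versus-inferred contract can be sketched as follows. This is a simplified illustration, not the code in install_airflow_and_providers.py: the function name is hypothetical and it only handles local paths, whereas the real helper also deals with URLs.

```python
from __future__ import annotations

import logging
import sys
from pathlib import Path

log = logging.getLogger(__name__)


def resolve_build_constraints(explicit: str | None, inferred: str) -> Path | None:
    """Pick the build-constraints file to use (hypothetical helper, local paths only)."""
    if explicit:
        path = Path(explicit)
        if not path.is_file():
            # Explicit opt-in: a missing file is a hard error.
            log.error("Build constraints not found at %s", path)
            sys.exit(1)
        return path
    path = Path(inferred)
    if not path.is_file():
        # Inferred location: older constraint branches may not have the file.
        log.warning("No build constraints at %s; continuing without them", path)
        return None
    return path


print(resolve_build_constraints(None, "/nonexistent/build-constraints-3.12.txt"))  # None
```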

Smaller observations

  • scripts/in_container/install_airflow_and_providers.py:229 - target = Path("/tmp/build-constraints.txt") is hardcoded, while Dockerfile/Dockerfile.ci download to ${HOME}/build-constraints.txt. The two paths don't actually collide (the Python script downloads its own copy independently), but the inconsistency is a hygiene smell. Consider either using ${HOME}/build-constraints.txt (matching the Dockerfile) or tempfile.NamedTemporaryFile().
  • scripts/in_container/run_generate_constraints.py:651-652 - except Exception: cache = {} discards the original parse error without logging. If the cache file is corrupted, the next run silently rescans everything; an observer has no way to know why the rescan happened. A single log.warning("Build-constraints cache at %s is corrupted (%s); rebuilding", cache_path, exc) before the reset would help post-mortem.
  • airflow-core/docs/installation/installing-from-pypi.rst — the new docs explain the feature well, but don't call out that the --build-constraint(s) flag requires the corresponding pip (≥ 23.x) / uv version. Worth a one-liner so users on older pip don't hit confusing errors.
  • Integration coverage — the unit tests are excellent for the generator's internal logic, but there's no end-to-end test that the generated build-constraints-{python}.txt is actually consumed correctly by pip install --build-constraint against a real package set. Could be a CI-only smoke test against a single-package fixture.
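
The cache-corruption point above could be handled along these lines. This is a minimal sketch assuming the cache is a JSON object on disk (the generated build-deps-cache.json suggests as much); `load_cache` is a hypothetical name, and narrowing the bare `except Exception` to `OSError`/`ValueError` is a suggestion rather than what the PR does.

```python
from __future__ import annotations

import json
import logging
import tempfile
from pathlib import Path

log = logging.getLogger(__name__)


def load_cache(cache_path: Path) -> dict:
    """Load the build-deps cache, falling back to an empty cache on a parse error."""
    if not cache_path.exists():
        return {}
    try:
        cache = json.loads(cache_path.read_text())
    except (OSError, ValueError) as exc:  # json.JSONDecodeError subclasses ValueError
        log.warning("Build-constraints cache at %s is corrupted (%s); rebuilding", cache_path, exc)
        return {}
    # Valid JSON that is not an object is treated as unusable as well.
    return cache if isinstance(cache, dict) else {}


with tempfile.TemporaryDirectory() as tmp:
    bad = Path(tmp) / "build-deps-cache.json"
    bad.write_text("{not valid json")
    print(load_cache(bad))  # {}
```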

Positive

  • Generator handles tar.gz (streaming) and zip (full download) sdists defensively, with a 10k-member cap on tar to bound memory (run_generate_constraints.py:581-630).
  • Cache invalidation correctly drops stale entries when uv.lock changes — verified by the dedicated test at test_build_constraints.py:375-403.
  • Dockerfile changes are pure ARG/ENV plumbing; no new install steps that affect images for users who don't opt in.
  • Breeze wiring is consistent across all four entry points (ci-image, prod-image, developer/shell, release-management) — no missing surface.

This review was drafted by an AI-assisted tool and confirmed by an Apache Airflow maintainer. The findings below are observations, not blockers; an Apache Airflow maintainer — a real person — will take the next look at the PR. If you think a finding is mis-applied, please reply on the PR and a maintainer will weigh in.

More on how Apache Airflow handles maintainer review: contributing-docs/05_pull_requests.rst.

@nailo2c
Contributor Author

nailo2c commented May 11, 2026

Conflict-detection regex is brittle to future uv error-message changes (scripts/in_container/run_generate_constraints.py:809-817)

Makes sense. I checked, and uv has changed its error message style many times (as has ruff), so there is a real risk of silent failure.

One idea: add a smoke test like the one below.

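import pytest

# Assumed import path; the helper is expected to live in
# scripts/in_container/run_generate_constraints.py.
from run_generate_constraints import _resolve_build_requirements


@pytest.mark.integration
def test_conflict_auto_resolution_with_real_uv(tmp_path):
    """Smoke test: real uv with a known conflict - proves the regex
    actually matches uv's current stderr format."""
    # Two packages whose cython build requirements cannot both be satisfied.
    build_reqs = {
        "pkg_a": {"cython>=3.0,<3.1"},
        "pkg_b": {"cython>=3.1.2"},
    }
    output_path = tmp_path / "build-constraints.txt"
    _resolve_build_requirements(build_reqs, output_path, config_params=...)
    # The conflict-retry loop should still produce a constraints file.
    assert output_path.exists()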

A canary run would then detect the failure as soon as uv changes its wording.


Backward-compat claim in the PR body is slightly narrower than implemented

Already fixed in docs and PR description.


Smaller observations - scripts/in_container/install_airflow_and_providers.py:229

I chose ${HOME}/build-constraints.txt.


Smaller observations - scripts/in_container/run_generate_constraints.py:651-652

Already added a warning log.


Smaller observations - airflow-core/docs/installation/installing-from-pypi.rst

Fixed.


Smaller observations - Integration coverage

Already added an E2E test.

@potiuk removed the ready for maintainer review (Set after triaging when all criteria pass.) label May 18, 2026
@potiuk marked this pull request as draft May 18, 2026 10:48
@potiuk
Member

potiuk commented May 18, 2026

@nailo2c — Removing the ready for maintainer review label and converting back to draft. The branch now has merge conflicts with main that surfaced after the label was added.

The label's contract is that the PR is ready for maintainer review — a regression like this means the PR temporarily isn't. Rebase your branch onto the latest main, resolve conflicts, then mark "Ready for review" again to re-enter the queue.

git fetch upstream main && git rebase upstream/main, resolve, git push --force-with-lease. See the working-with-git docs.

No rush.


Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you.

@nailo2c force-pushed the feat-54394-add_build_constraints branch from f8e6880 to 91dd704, May 18, 2026 17:01
@nailo2c marked this pull request as ready for review May 18, 2026 21:08
@nailo2c force-pushed the feat-54394-add_build_constraints branch from 91dd704 to de520b2, May 18, 2026 22:47
Labels

area:dev-tools, area:production-image (Production image improvements and fixes), backport-to-v3-2-test (Mark PR with this label to backport to v3-2-test branch), kind:documentation

Development

Successfully merging this pull request may close these issues.

Add support to --build-constraint(s) flag for our constraint preparation

2 participants