Download providers.json in registry-backfill publish-versions job #66029
Merged
kaxil merged 1 commit into apache:main on Apr 28, 2026
`breeze registry publish-versions` calls `dev/breeze/src/airflow_breeze/utils/publish_registry_versions.py`, which reads `providers.json` (lines 119-127) to know which provider IDs to iterate when writing per-provider `versions.json` to S3. The `publish-versions` job in `.github/workflows/registry-backfill.yml` runs in a fresh checkout where `providers.json` is gitignored, so the file lookup raises `FileNotFoundError` and the job fails before any `versions.json` update reaches S3. As a result, per-provider dropdowns stay stale even after backfill jobs successfully upload version pages.

Add a `Download providers.json from S3` step that fetches the cached file to `dev/registry/providers.json`. The function's existing inline path fallback resolves it from there.

Tests: new `test_inline_fallback_to_dev_registry_providers_json` in `dev/breeze/tests/test_publish_registry_versions.py` locks in the inline fallback contract by monkeypatching cwd, placing only `dev/registry/providers.json` (not the eleventy data dir copy), and asserting that `publish_versions()` resolves it without raising.
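The lookup order described above can be sketched roughly as follows. This is an illustration only, not the code in `publish_registry_versions.py`: the function name `resolve_providers_json`, the `CANDIDATES` tuple, and the eleventy data dir path `registry/src/_data/providers.json` are assumptions made for the example. Only `dev/registry/providers.json` and the `FileNotFoundError` behavior come from the description.

```python
from pathlib import Path

# Candidate locations, checked in order.
# The first entry is an ASSUMED stand-in for the eleventy data dir copy;
# the second is the fallback location named in this PR.
CANDIDATES = (
    Path("registry/src/_data/providers.json"),  # assumption, for illustration
    Path("dev/registry/providers.json"),
)


def resolve_providers_json(root: Path) -> Path:
    """Return the first existing providers.json under root, else raise."""
    for relative in CANDIDATES:
        candidate = root / relative
        if candidate.is_file():
            return candidate
    # Mirrors the failure mode in a fresh checkout where the file is gitignored.
    raise FileNotFoundError(f"providers.json not found under {root}")
```

In a fresh checkout neither candidate exists, so the sketch raises. Once the download step has written `dev/registry/providers.json`, the second candidate resolves.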
jscheffl
approved these changes
Apr 28, 2026
Backport failed to create: v3-2-test. View the failure log: Run details.
Note: As of Merging PRs targeted for Airflow 3.X. In matter of doubt, please ask in the #release-management Slack channel.
You can attempt to backport this manually by running: `cherry_picker 1eefb3c v3-2-test`
This should apply the commit to the v3-2-test branch and leave the commit in conflict state. After you have resolved the conflicts, you can continue the backport process by running: `cherry_picker --continue`
If you don't have cherry-picker installed, see the installation guide.
Summary
The `Registry Backfill` workflow's `publish-versions` job runs in a fresh checkout where `providers.json` is gitignored. `breeze registry publish-versions` calls `publish_registry_versions.py:119-127`, which reads `providers.json` to know which provider IDs to iterate. Without the file present, it raises `FileNotFoundError` before writing any per-provider `versions.json` updates to S3, leaving dropdowns stale even when backfill jobs successfully uploaded the version pages.

Fix
Add a `Download providers.json from S3` step in the `publish-versions` job, between `Configure AWS credentials` and `Publish version metadata`. It writes to `dev/registry/providers.json`; the function's existing inline path fallback (`publish_registry_versions.py:121-122`) resolves the file from there.
No `|| true`: a failure here is a real problem and should fail the job loudly.

Test
New `test_inline_fallback_to_dev_registry_providers_json` in `dev/breeze/tests/test_publish_registry_versions.py` locks in the contract this PR depends on: it calls `monkeypatch.chdir(tmp_path)`, places only `dev/registry/providers.json` (no eleventy data dir copy), and asserts that `publish_versions()` resolves it without raising. It mocks `boto3.client` and `invalidate_cloudfront`. Runs in 0.04s.

Companion PR
#66027 (Make registry-backfill workflow actually publish backfilled pages) handles the per-version page sync. PR-B (this one) handles the dropdown refresh. Both are needed for a fully-green smoke test (`amazon/9.24.0 google/21.0.0 common-compat/1.14.2`); they're independent and can land in either order.

Verification after both merge
Expect:
- `aws s3 cp s3://staging-docs-airflow-apache-org/registry/api/providers/amazon/versions.json -` lists `9.24.0`
- https://airflow.apache.org/registry/providers/amazon/9.24.0/ lists 9.24.0

Linked prior fixes: #65972, #65975, #65987, #65984, #65989, #66027.
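The `versions.json` check could be scripted along these lines. This is a minimal sketch, assuming the output of the `aws s3 cp ... -` command has already been captured as text. The schema of `versions.json` is not shown in this PR, so the hypothetical helper `lists_version` makes no assumption about it and simply walks the parsed JSON looking for the version string among its keys and string values.

```python
import json
from typing import Any


def lists_version(payload: str, version: str) -> bool:
    """Return True if `version` appears as a key or string value in the JSON payload."""

    def walk(node: Any) -> bool:
        if isinstance(node, str):
            return node == version
        if isinstance(node, dict):
            # Check both keys and nested values, since the schema is unknown.
            return any(key == version or walk(value) for key, value in node.items())
        if isinstance(node, list):
            return any(walk(item) for item in node)
        return False

    return walk(json.loads(payload))
```

For the smoke test above, the call would be `lists_version(captured_output, "9.24.0")`, where `captured_output` is a placeholder for the downloaded file contents.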