Download providers.json in registry-backfill publish-versions job #66029

Merged
kaxil merged 1 commit into apache:main from astronomer:registry-backfill-publish-versions-providers-json
Apr 28, 2026

@kaxil kaxil commented Apr 28, 2026

Summary

The Registry Backfill workflow's publish-versions job runs in a fresh checkout, where the gitignored providers.json is absent. breeze registry publish-versions calls publish_registry_versions.py:119-127, which reads providers.json to determine which provider IDs to iterate over. Without the file, it raises FileNotFoundError before writing any per-provider versions.json updates to S3, so the version dropdowns stay stale even when the backfill jobs have successfully uploaded the version pages.
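
To illustrate the failure mode, here is a minimal Python sketch. It is not the code in publish_registry_versions.py: the name publish_versions_sketch, the upload callback, and the {"providers": [{"id": ...}]} layout of providers.json are all assumptions made for illustration. The point is only that the file is read before the loop over provider IDs, so a missing file means no versions.json update is attempted at all.

```python
import json
from pathlib import Path
from typing import Callable


def publish_versions_sketch(providers_path: Path, upload: Callable[[str], None]) -> int:
    """Hypothetical stand-in for the publish step; not the real breeze function."""
    # Path.read_text raises FileNotFoundError when providers.json is absent,
    # so nothing below this line runs in a fresh checkout.
    data = json.loads(providers_path.read_text())
    # Assumed layout: {"providers": [{"id": "amazon"}, ...]}
    provider_ids = [p["id"] for p in data["providers"]]
    for provider_id in provider_ids:
        upload(provider_id)  # one versions.json update per provider
    return len(provider_ids)


uploaded: list[str] = []
try:
    publish_versions_sketch(Path("does-not-exist/providers.json"), uploaded.append)
except FileNotFoundError:
    print("providers.json missing; uploads attempted:", len(uploaded))
```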

Fix

Add a Download providers.json from S3 step to the publish-versions job, between Configure AWS credentials and Publish version metadata. The step writes the file to dev/registry/providers.json; the function's existing inline path fallback (publish_registry_versions.py:121-122) resolves it from there.
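
The inline path fallback can be pictured as a "first existing candidate wins" lookup. The sketch below is an assumption about its shape, not a copy of lines 121-122: resolve_providers_json is a hypothetical name, and the first candidate is a placeholder for the eleventy data dir copy, whose real location is not given here. Only dev/registry/providers.json comes from this PR.

```python
from pathlib import Path

# Candidate locations, relative to a root directory (the cwd by default).
CANDIDATES = (
    Path("hypothetical-eleventy-data-dir/providers.json"),  # placeholder, not the real path
    Path("dev/registry/providers.json"),  # location the new workflow step writes to
)


def resolve_providers_json(root: Path = Path(".")) -> Path:
    """Return the first candidate that exists under root, else raise."""
    for candidate in CANDIDATES:
        path = root / candidate
        if path.is_file():
            return path
    tried = ", ".join(str(c) for c in CANDIDATES)
    raise FileNotFoundError(f"providers.json not found in any of: {tried}")
```

With only the second candidate on disk, which is the situation the new download step creates, the lookup falls through to dev/registry/providers.json.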

The step deliberately omits || true: a failed download is a real problem and should fail the job loudly.

Test

New test_inline_fallback_to_dev_registry_providers_json in dev/breeze/tests/test_publish_registry_versions.py locks in the contract this PR depends on: it calls monkeypatch.chdir(tmp_path), places only dev/registry/providers.json (no eleventy data dir copy), and asserts that publish_versions() resolves it without raising. It mocks boto3.client and invalidate_cloudfront, and runs in 0.04s.
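
The shape of such a test can be sketched with the standard library alone. This is not the test added in the PR: publish_versions_standin and its client parameter are hypothetical, the providers.json layout is assumed, os.chdir stands in for monkeypatch.chdir, and a MagicMock plays the role of the mocked boto3.client.

```python
import json
import os
import tempfile
from pathlib import Path
from unittest.mock import MagicMock


def publish_versions_standin(client) -> list[str]:
    """Hypothetical stand-in: read providers.json relative to cwd, 'upload' per provider."""
    path = Path("dev/registry/providers.json")
    providers = json.loads(path.read_text())["providers"]  # assumed layout
    for provider in providers:
        # Mocked call, no network access.
        client.put_object(Key=f"registry/api/providers/{provider['id']}/versions.json")
    return [p["id"] for p in providers]


def run_fallback_check() -> list[str]:
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        try:
            os.chdir(tmp)  # plays the role of monkeypatch.chdir(tmp_path)
            target = Path("dev/registry/providers.json")
            target.parent.mkdir(parents=True)
            target.write_text(json.dumps({"providers": [{"id": "amazon"}]}))
            client = MagicMock()
            ids = publish_versions_standin(client)
            assert client.put_object.call_count == 1
            return ids
        finally:
            os.chdir(original_cwd)  # restore cwd before the temp dir is removed
```

Changing the working directory is what makes the check meaningful: the relative path only resolves if the file is where the workflow step is expected to put it.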

Companion PR

#66027 (Make registry-backfill workflow actually publish backfilled pages) handles the per-version page sync. PR-B (this one) handles the dropdown refresh. Both are needed for a fully-green smoke test (amazon/9.24.0 google/21.0.0 common-compat/1.14.2); they're independent and can land in either order.

Verification after both merge

gh workflow run registry-backfill.yml -R apache/airflow \
  -f destination=staging \
  -f provider-versions="amazon/9.24.0 google/21.0.0 common-compat/1.14.2"

Expect:

  • All backfill jobs + publish-versions job exit 0
  • aws s3 cp s3://staging-docs-airflow-apache-org/registry/api/providers/amazon/versions.json - lists 9.24.0
  • (After CloudFront cache clears) the dropdown on https://airflow.apache.org/registry/providers/amazon/9.24.0/ lists 9.24.0

Linked prior fixes: #65972, #65975, #65987, #65984, #65989, #66027.

`breeze registry publish-versions` calls
`dev/breeze/src/airflow_breeze/utils/publish_registry_versions.py`,
which reads `providers.json` (lines 119-127) to know which provider IDs
to iterate when writing per-provider `versions.json` to S3. The
`publish-versions` job in `.github/workflows/registry-backfill.yml`
runs in a fresh checkout where `providers.json` is gitignored, so the
file lookup raises `FileNotFoundError` and the job fails before any
`versions.json` update reaches S3 -- meaning per-provider dropdowns
stay stale even after backfill jobs successfully upload version pages.

Add a `Download providers.json from S3` step that fetches the cached
file to `dev/registry/providers.json`. The function's existing inline
path fallback resolves it from there.

Tests: new `test_inline_fallback_to_dev_registry_providers_json` in
`dev/breeze/tests/test_publish_registry_versions.py` locks in the
inline fallback contract by monkeypatching cwd, placing only the
`dev/registry/providers.json` (not the eleventy data dir copy), and
asserting `publish_versions()` resolves it without raising.
@kaxil kaxil merged commit 1eefb3c into apache:main Apr 28, 2026
141 checks passed
@kaxil kaxil deleted the registry-backfill-publish-versions-providers-json branch April 28, 2026 20:54
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Airflow Registry Apr 28, 2026
@github-actions
Contributor

Backport failed to create: v3-2-test. View the failure log in the run details.

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting the PRs that are bug fixes (generally speaking) to the maintenance branches.

In case of doubt, please ask in the #release-management Slack channel.

Status Branch Result
v3-2-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 1eefb3c v3-2-test

This should apply the commit to the v3-2-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.

Labels

area:dev-tools, area:registry, backport-to-v3-2-test

Projects

Status: Done
