Guard registry-build sync against partial extracts #66137
Merged
kaxil merged 1 commit into apache:main on Apr 30, 2026
`aws s3 sync` exits 0 silently when its source directory is missing or empty, so a partial `breeze registry extract-data` failure (one provider errors mid-run, Eleventy still produces a tree, sync uploads stale or empty JSON) would currently report green while leaving the live registry in an inconsistent state. This adds the same pre-sync content guards `registry-backfill.yml` got in PR apache#66027: for incremental builds, assert each target provider's HTML directory + API directory + `versions.json` exist and are non-empty; for full builds, assert the top-level `index.html`, `api/providers.json` listing, and at least one provider subtree under `api/providers/` are present. Any missing artifact aborts the sync with `::error::` before S3 is touched.
amoghrajesh approved these changes on Apr 30, 2026
Backport failed to create: v3-2-test. View the failure log: Run details.
Note: As of "Merging PRs targeted for Airflow 3.X". In matter of doubt, please ask in the #release-management Slack channel.
You can attempt to backport this manually by running: `cherry_picker 8d65635 v3-2-test`
This should apply the commit to the v3-2-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: `cherry_picker --continue`
If you don't have cherry-picker installed, see the installation guide.
Summary
After #66100 (#1305 fix), `registry-build.yml` now fires on every wave-release dispatch. `aws s3 sync` exits 0 silently when its source directory is missing or empty, so a partial `breeze registry extract-data` failure (one provider errors mid-run, Eleventy still produces a tree, sync uploads stale or empty JSON) would currently report green while leaving the live registry in an inconsistent state.
Fix
Mirror the per-provider guards that `registry-backfill.yml` got in #66027. The new "Verify build emitted expected content" step sits between artifact upload and the S3 sync, and:
- Incremental build (`PROVIDER` set): for each target provider ID, assert `registry/_site/providers/<id>/`, `registry/_site/api/providers/<id>/`, and `registry/_site/api/providers/<id>/versions.json` exist and are non-empty. `versions.json` is the canonical per-provider artifact (drives the version-dropdown).
- Full build (`PROVIDER` empty): assert top-level `registry/_site/index.html`, `registry/_site/api/providers.json` (the global listing), and at least one provider subtree under `registry/_site/api/providers/` are present.
Any missing artifact aborts with `::error::` before `aws s3 sync` is invoked.
Why this matters now
Before #66100 this path only fired on explicit per-provider doc dispatches, where a single failed provider mostly meant nothing got synced (loud failure). With wave dispatches now driving the incremental path automatically, a partial extract over ~22 providers would silently upload a hole-y registry to live S3 and the only signal would be users hitting 404s.
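As a rough illustration of the guard described under Fix, the checks could look something like the sketch below. This is not the workflow step from the PR: the function names (`fail`, `dir_non_empty`, `verify_incremental`, `verify_full`, `verify_site`) and the `SITE_DIR` variable are hypothetical, and it assumes `PROVIDER` holds a space-separated list of provider IDs, which the description does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a pre-sync content guard; all names here are illustrative.
SITE_DIR="${SITE_DIR:-registry/_site}"

# Print a GitHub Actions error annotation and report failure to the caller.
fail() {
  echo "::error::$1"
  return 1
}

# Succeeds only if the directory exists and holds at least one entry.
dir_non_empty() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Incremental build: each target provider needs a non-empty HTML directory,
# a non-empty API directory, and a non-empty versions.json.
verify_incremental() {
  local id
  for id in "$@"; do
    dir_non_empty "$SITE_DIR/providers/$id" \
      || { fail "missing or empty HTML directory for provider '$id'"; return 1; }
    dir_non_empty "$SITE_DIR/api/providers/$id" \
      || { fail "missing or empty API directory for provider '$id'"; return 1; }
    [ -s "$SITE_DIR/api/providers/$id/versions.json" ] \
      || { fail "missing or empty versions.json for provider '$id'"; return 1; }
  done
}

# Full build: top-level index, global listing, and at least one provider subtree.
verify_full() {
  local subtree
  [ -f "$SITE_DIR/index.html" ] || { fail "missing $SITE_DIR/index.html"; return 1; }
  [ -f "$SITE_DIR/api/providers.json" ] || { fail "missing $SITE_DIR/api/providers.json"; return 1; }
  for subtree in "$SITE_DIR"/api/providers/*/; do
    [ -d "$subtree" ] && return 0
  done
  fail "no provider subtree under $SITE_DIR/api/providers/"
}

# Dispatch on PROVIDER (assumed here to be a space-separated list of IDs).
verify_site() {
  if [ -n "${PROVIDER:-}" ]; then
    # Word splitting is intentional: one argument per provider ID.
    # shellcheck disable=SC2086
    verify_incremental $PROVIDER
  else
    verify_full
  fi
}
```

In a workflow step, something like `verify_site || exit 1` placed just before the `aws s3 sync` call would stop the job before S3 is touched. The sketch stops at the first missing artifact; reporting every gap before exiting would be an equally reasonable design.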
Verification
- `prek run yamllint`: pass
- `prek run zizmor` (workflow security): pass
- `actionlint`: zero net-new findings vs. `main`
- `main` (full).
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.7) following the guidelines