
Guard registry-build sync against partial extracts#66137

Merged
kaxil merged 1 commit into apache:main from astronomer:add-presync-content-guards-to-registry-build
Apr 30, 2026


@kaxil kaxil commented Apr 30, 2026

Summary

After #66100 (#1305 fix), registry-build.yml fires on every wave-release dispatch. aws s3 sync exits 0 silently when its source directory is missing or empty. A partial breeze registry extract-data failure (one provider errors mid-run, Eleventy still produces a tree, sync uploads stale or empty JSON) would therefore report green while leaving the live registry in an inconsistent state.

Fix

Mirror the per-provider guards that registry-backfill.yml got in #66027. The new "Verify build emitted expected content" step sits between artifact upload and the S3 sync, and:

  • Incremental (PROVIDER set): for each target provider ID, assert that registry/_site/providers/<id>/, registry/_site/api/providers/<id>/, and registry/_site/api/providers/<id>/versions.json exist and are non-empty. versions.json is the canonical per-provider artifact (it drives the version dropdown).
  • Full build (PROVIDER empty): assert top-level registry/_site/index.html, registry/_site/api/providers.json (the global listing), and at least one provider subtree under registry/_site/api/providers/ are present.

Any missing artifact aborts with ::error:: before aws s3 sync is invoked.
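
The two modes above can be sketched as a handful of shell functions. This is an illustration under stated assumptions, not the step that was merged: the function names, the SITE_DIR variable, and the space-separated format of PROVIDER are hypothetical, and the full-build check here is slightly stricter than described (it requires the top-level files to be non-empty, not just present).

```shell
#!/usr/bin/env bash
# Sketch only. Function names, SITE_DIR, and the space-separated PROVIDER
# format are assumptions made for illustration.

SITE_DIR="${SITE_DIR:-registry/_site}"

# Succeeds if $1 is a non-empty file or a directory with at least one entry.
is_nonempty() {
  local path="$1"
  if [ -d "$path" ]; then
    [ -n "$(ls -A "$path" 2>/dev/null)" ]
  else
    [ -s "$path" ]
  fi
}

# Emits a GitHub Actions error annotation and fails if $1 is missing or empty.
require_nonempty() {
  if ! is_nonempty "$1"; then
    echo "::error::Missing or empty build artifact: $1"
    return 1
  fi
}

# Incremental build: each target provider needs its HTML directory,
# API directory, and versions.json.
verify_incremental() {
  local id rc=0
  for id in "$@"; do
    require_nonempty "$SITE_DIR/providers/$id" || rc=1
    require_nonempty "$SITE_DIR/api/providers/$id" || rc=1
    require_nonempty "$SITE_DIR/api/providers/$id/versions.json" || rc=1
  done
  return "$rc"
}

# Full build: top-level index, global listing, and at least one provider subtree.
verify_full() {
  local rc=0 found=0 dir
  require_nonempty "$SITE_DIR/index.html" || rc=1
  require_nonempty "$SITE_DIR/api/providers.json" || rc=1
  # If nothing matches, the glob stays literal and the -d test fails.
  for dir in "$SITE_DIR"/api/providers/*/; do
    if [ -d "$dir" ]; then found=1; break; fi
  done
  if [ "$found" -eq 0 ]; then
    echo "::error::No provider subtree under $SITE_DIR/api/providers/"
    rc=1
  fi
  return "$rc"
}

# Dispatch on PROVIDER: set means incremental, empty means full build.
verify_build() {
  if [ -n "${PROVIDER:-}" ]; then
    # Intentional word splitting on the (assumed) space-separated ID list.
    verify_incremental $PROVIDER
  else
    verify_full
  fi
}
```

In a workflow step this would run as `verify_build || exit 1` ahead of the sync command, so a failed check stops the job before S3 is touched. Collecting every failure before returning, rather than stopping at the first, means one run reports all missing providers.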

Why this matters now

Before #66100 this path only fired on explicit per-provider doc dispatches, where a single failed provider mostly meant nothing got synced (a loud failure). With wave dispatches now driving the incremental path automatically, a partial extract over ~22 providers would silently upload an incomplete registry to live S3, and the only signal would be users hitting 404s.

Verification

  • prek run yamllint: pass
  • prek run zizmor (workflow security): pass
  • Standalone actionlint: zero net-new findings vs. main
  • Hand-walked the guard on three input shapes: 22-provider wave (incremental), single explicit dispatch (incremental), full rebuild against main (full).
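
A walk-through like the last item can also be reproduced locally against a throwaway tree. The sketch below is hypothetical (neither helper exists in the repository, and the file contents are placeholders). It builds a fake site for a set of provider IDs, then lists which expected per-provider paths are missing or empty.

```shell
#!/usr/bin/env bash
# Sketch only. Both helpers are hypothetical and exist purely for local experiments.

# Build a throwaway site tree under $1 containing the given provider IDs.
make_fixture_site() {
  local root="$1" id
  shift
  mkdir -p "$root/api/providers"
  echo '<html></html>' > "$root/index.html"
  echo '[]' > "$root/api/providers.json"
  for id in "$@"; do
    mkdir -p "$root/providers/$id" "$root/api/providers/$id"
    echo '<html></html>' > "$root/providers/$id/index.html"
    echo '[]' > "$root/api/providers/$id/versions.json"
  done
}

# Print each expected per-provider path (relative to $1) that is missing or empty.
list_missing() {
  local root="$1" id path
  shift
  for id in "$@"; do
    for path in "providers/$id" "api/providers/$id" "api/providers/$id/versions.json"; do
      if [ -d "$root/$path" ]; then
        [ -n "$(ls -A "$root/$path")" ] || echo "$path"
      elif [ ! -s "$root/$path" ]; then
        echo "$path"
      fi
    done
  done
}
```

To simulate a partial extract, build a fixture with `make_fixture_site "$tmp" amazon google`, truncate one provider's versions.json, and confirm that `list_missing "$tmp" amazon google` names exactly that file.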

Was generative AI tooling used to co-author this PR?
  • Yes -- Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following the guidelines

`aws s3 sync` exits 0 silently when its source directory is missing or
empty, so a partial `breeze registry extract-data` failure (one provider
errors mid-run, Eleventy still produces a tree, sync uploads stale or
empty JSON) would currently report green while leaving the live registry
in an inconsistent state.

This adds the same pre-sync content guards `registry-backfill.yml` got
in PR apache#66027: for incremental builds, assert each target provider's HTML
directory + API directory + `versions.json` exist and are non-empty;
for full builds, assert the top-level `index.html`, `api/providers.json`
listing, and at least one provider subtree under `api/providers/` are
present. Any missing artifact aborts the sync with `::error::` before
S3 is touched.
@kaxil kaxil merged commit 8d65635 into apache:main Apr 30, 2026
141 checks passed
@kaxil kaxil deleted the add-presync-content-guards-to-registry-build branch April 30, 2026 20:38
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Airflow Registry Apr 30, 2026
@github-actions

Backport failed to create: v3-2-test. View the failure log in the run details.

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting PRs that are bug fixes (generally speaking) to the maintenance branches.

In case of doubt, please ask in the #release-management Slack channel.

Status Branch Result
v3-2-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 8d65635 v3-2-test

This should apply the commit to the v3-2-test branch and leave the commit in a conflict state, marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.


Labels

area:dev-tools, area:registry, backport-to-v3-2-test
