feat(ci): add post-publish smoke test to validate PyPI install by kovtcharov-amd · Pull Request #1239 · amd/gaia

kovtcharov-amd · 2026-05-29T02:09:11Z

After publish.yml publishes to PyPI, there's no verification that the package actually installs and works — a broken wheel can reach users undetected. This adds a post-publish-smoke job that installs amd-gaia from PyPI on a fresh runner and verifies all CLI entry points and the Python import work with the correct version.

Test plan

Review the new post-publish-smoke job in .github/workflows/publish.yml
Confirm needs: [validate, publish-pypi] and if: needs.publish-pypi.result == 'success' are correct
Confirm retry loop handles PyPI propagation delay (5 attempts × 30s)
Confirm no continue-on-error: true — failure is loud
Verify the job runs in parallel with github-release (not blocking it)

Closes #1156

github-actions · 2026-05-29T02:11:59Z

The "Verify Python import" step will raise an AttributeError on every publish run: gaia/__init__.py doesn't export __version__, so the smoke test that's supposed to catch broken releases will itself be broken from day one. One blocking fix is needed; everything else looks solid.

Summary

The new post-publish-smoke job correctly chains off publish-pypi, retries for PyPI propagation delay, and fails loudly (no continue-on-error). The dependency graph is clean — github-release runs in parallel since it doesn't need post-publish-smoke. Only one step has a functional bug, but it's in the critical verification path.

Issues Found

🔴 Critical — `gaia.__version__` doesn't exist (`publish.yml:449–454`)

The "Verify Python import" step runs:

python -c "import gaia; print(gaia.__version__)"

gaia/__init__.py exports Agent, DatabaseAgent, tool, etc. — but not __version__. The version constant lives in gaia.version.__version__ (src/gaia/version.py:9). This step will always raise AttributeError: module 'gaia' has no attribute '__version__', causing the smoke test to fail on every release.

Two equally valid fixes — pick one:

Option A — import from the correct submodule (tests the internal constant):

          IMPORT_VERSION=$(python -c "from gaia.version import __version__; print(__version__)")

Option B — use importlib.metadata (tests what PyPI registered, arguably more appropriate for a post-publish check):

          IMPORT_VERSION=$(python -c "from importlib.metadata import version; print(version('amd-gaia'))")
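
Option B can also be written as a small standalone script instead of a one-liner. The following is a minimal sketch, not the code in publish.yml: the function name `check_installed_version` and the injectable `lookup` parameter are assumptions added for illustration. Only `importlib.metadata.version` and the distribution name `amd-gaia` come from the review above.

```python
from importlib.metadata import PackageNotFoundError, version


def check_installed_version(expected, dist_name="amd-gaia", lookup=version):
    """Return the installed version of dist_name, or raise if it differs.

    `lookup` defaults to importlib.metadata.version. It is a parameter only
    so the check can be exercised without the real package installed
    (an assumption of this sketch).
    """
    try:
        installed = lookup(dist_name)
    except PackageNotFoundError as exc:
        # Fail loudly: a missing distribution is a broken install.
        raise RuntimeError(f"{dist_name} is not installed") from exc
    if installed != expected:
        raise RuntimeError(
            f"{dist_name} version mismatch: "
            f"installed {installed!r}, expected {expected!r}"
        )
    return installed
```

Because the comparison is an exact string equality, a tag of 0.18.1 does not accept an installed 0.18.10, which a substring check would let through.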

🟢 Minor — `gaia-emr` not exercised (`publish.yml:440–444`)

setup.py registers five binary entry points: gaia, gaia-cli, gaia-mcp, gaia-code, and gaia-emr. The step tests three of them:

gaia --help
gaia-code --help
gaia-mcp --help

gaia-emr --help is missing. A broken gaia-emr install (e.g. a missing optional dependency imported at module load) would go undetected.

          gaia --help
          gaia-code --help
          gaia-mcp --help
          gaia-emr --help
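
For illustration, the same entry-point check can be expressed in Python so that every failing command is reported at once. This is a hedged sketch: `find_broken_commands` and its `runner` parameter are hypothetical names, and the workflow itself runs the commands directly in a shell step.

```python
import subprocess


def find_broken_commands(commands, runner=subprocess.run):
    """Run `<command> --help` for each name and return the ones that fail."""
    broken = []
    for command in commands:
        try:
            result = runner([command, "--help"], capture_output=True, timeout=60)
        except (FileNotFoundError, subprocess.TimeoutExpired):
            # Executable missing from PATH, or it hung: both count as failures.
            broken.append(command)
            continue
        if result.returncode != 0:
            broken.append(command)
    return broken
```

A caller would pass the entry-point names, for example `find_broken_commands(["gaia", "gaia-code", "gaia-mcp", "gaia-emr"])`, and exit non-zero if the returned list is not empty.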

Strengths

Dependency wiring is correct: needs: [validate, publish-pypi] with if: needs.publish-pypi.result == 'success' is the right pattern. github-release doesn't depend on this job, so they run in parallel — the release isn't blocked while the smoke test runs.
Retry loop is well-constructed: for attempt in 1 2 3 4 5 with a 30 s sleep handles typical PyPI CDN propagation delays (~2.5 min window) without over-engineering.
Fails loudly: No continue-on-error, no silent degradation — consistent with CLAUDE.md's "fail loudly" convention.

Verdict

Request changes — the blocking gaia.__version__ bug will cause the smoke test to fail on the first release that uses this workflow, which defeats its purpose. Fix the import path (either option works), then this is ready to merge.

github-actions · 2026-05-29T03:06:41Z

Solid, focused addition that closes a real gap in the release pipeline. Two minor nits before this is pristine.

Summary

The post-publish-smoke job correctly catches broken PyPI installs that would otherwise reach users silently. The retry loop, dual-verification approach (CLI + import), and loud-failure posture all follow GAIA conventions. Two small issues in the install retry and version-grep logic are worth fixing.

Issues Found

🟢 Minor — `grep -q` uses regex matching on the version string (`.github/workflows/publish.yml:52`)

grep -q "$TAG_VERSION" treats the version as a regex, so each dot (.) matches any character: 0.18.1 would match 0X18Y1. More practically, it's a substring match, so a tag version of 0.18.1 would also match a gaia --version output of 0.18.10, giving a false positive if a different patch release is somehow installed.

The import-version step already does a precise != equality check, so the grep step's false-positive window is narrow — but it's easy to close:

          if ! echo "$VERSION_OUTPUT" | grep -qF "$TAG_VERSION"; then

-F treats the pattern as a fixed string. If you want an exact line match (ruling out the 0.18.10 → 0.18.1 substring case), use grep -qxF instead and ensure the version occupies its own line in the output.
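
The three matching modes can be compared side by side. This is an illustrative sketch in Python rather than grep: the helper names are made up here, and Python's `re` syntax differs from grep's in general, although both treat an unescaped dot as "any character".

```python
import re


def matches_regex(pattern, text):
    """Like grep -q: the pattern is a regex, so '.' matches any character."""
    return re.search(pattern, text) is not None


def matches_fixed(pattern, text):
    """Like grep -qF: literal substring search."""
    return pattern in text


def matches_whole_line(pattern, text):
    """Like grep -qxF: some line must equal the pattern exactly."""
    return any(line == pattern for line in text.splitlines())


tag_version = "0.18.1"
print(matches_regex(tag_version, "0X18Y1"))        # dots act as wildcards
print(matches_fixed(tag_version, "0X18Y1"))        # literal dots required
print(matches_fixed(tag_version, "0.18.10"))       # still a substring match
print(matches_whole_line(tag_version, "0.18.10"))  # exact line comparison
```

Only the whole-line comparison rejects both the wildcard case and the longer 0.18.10 string.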

🟢 Minor — `sleep 30` fires on the 5th failed attempt, adding 30 s of dead time (`.github/workflows/publish.yml:38–42`)

On the final loop iteration, pip fails → the script prints "PyPI package not available yet, waiting 30s…" and sleeps before the loop exits. There is no 6th attempt; the sleep is wasted and the message is misleading.

      - name: Install from PyPI (with retry for propagation delay)
        env:
          TAG_VERSION: ${{ needs.validate.outputs.tag_version }}
        run: |
          for attempt in 1 2 3 4 5; do
            echo "Attempt $attempt: pip install amd-gaia==$TAG_VERSION"
            if pip install "amd-gaia==$TAG_VERSION"; then
              echo "Installed amd-gaia==$TAG_VERSION successfully"
              exit 0
            fi
            if [ "$attempt" -lt 5 ]; then
              echo "PyPI package not available yet, waiting 30s..."
              sleep 30
            fi
          done
          echo "ERROR: Failed to install amd-gaia==$TAG_VERSION after 5 attempts"
          exit 1
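
The control flow of the corrected loop, sleeping between attempts but never after the last one, can be sketched independently of pip. The function below is a hypothetical helper for illustration, with `action` standing in for the `pip install` call and `sleep` injectable so the behaviour can be checked without waiting.

```python
import time


def retry(action, attempts=5, delay=30.0, sleep=time.sleep):
    """Call action() up to `attempts` times; return True on the first success.

    Sleeps `delay` seconds between attempts only, never after the final one.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

With five attempts that all fail, this sleeps four times (120 s in total with the default delay) instead of five.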

Strengths

Fail-loud by design. No continue-on-error: true, and the loop's explicit exit 1 after all retries are exhausted is exactly what GAIA expects per the "No Silent Fallbacks" rule.
Two independent verification paths. CLI version via gaia --version and Python import via importlib.metadata catch different classes of failure (entry-point misconfiguration vs. metadata packaging bugs). The importlib.metadata approach also avoids the deprecated pkg_resources API.
Correct parallelism. The job runs after publish-pypi but does not block github-release, which is the right call — once the wheel is on PyPI it can't be rolled back, so blocking the release page serves no purpose. The design is intentional and sound.

Verdict

Approve with suggestions: the PR is safe to merge with both nits left as-is, but the grep -qF fix in particular takes 5 seconds and eliminates a latent false-positive class. Recommend applying before merge.

…bypass, gate release

grep -qF prevents dots in version strings from being treated as regex wildcards. --no-cache-dir ensures retries actually re-query PyPI instead of serving a cached 404. The final-attempt sleep guard avoids a wasted 30s wait when all retries are exhausted. Most critically, github-release now depends on post-publish-smoke instead of publish-pypi so a broken wheel actually blocks the release.

github-actions · 2026-05-29T06:48:33Z

Solid addition that closes a real gap: before this, a broken wheel could reach users with no automated catch between pypa/gh-action-pypi-publish and the GitHub Release. The implementation is clean and all the previously-reviewed edge cases (final-sleep skip, cache bypass, grep -qF, gaia-emr entry point, release gate) are already resolved in the current HEAD.

Issues Found

🟢 Minor — Test plan item is backwards (description, not code)

The test plan says "Verify the job runs in parallel with github-release (not blocking it)" — but the diff shows the opposite: github-release now needs post-publish-smoke, so the release is explicitly gated on the smoke test passing.

That's actually the correct design (catching a bad wheel before publishing the Release page is the whole point), but the test plan checkbox describes it wrong. Worth correcting in the PR description so reviewers tick the right mental box.

🟢 Minor — Only Python 3.11 tested

The smoke test runs on Python 3.11. If GAIA supports 3.10 (the version in docs/reference/dev.mdx setup instructions), a wheel that breaks on 3.10 would pass the smoke test.

For a smoke test this is acceptable, but noting it so the team can decide whether to add a 3.10 leg in a follow-up.

🟢 Nit — Pre-existing silent fallback in `version.py` (not introduced here)

src/gaia/version.py:24 returns "" on any importlib.metadata exception — a silent fallback that CLAUDE.md flags as tech debt. The smoke test would still catch it (empty version fails grep -qF "$TAG_VERSION"), but the root cause would be obscure. Out of scope for this PR; flagging so it gets tracked.
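
To make the contrast concrete, here is a hedged sketch of the two behaviours. It is not the contents of src/gaia/version.py, which are not shown in this thread; both function names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def version_or_empty(dist_name):
    """Silent fallback: hides every lookup failure behind an empty string."""
    try:
        return version(dist_name)
    except Exception:
        return ""


def version_or_raise(dist_name):
    """Fail-loud variant: the error names the distribution that is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError as exc:
        raise RuntimeError(f"distribution {dist_name!r} is not installed") from exc
```

With the silent variant, a later fixed-string check for the tag version still fails because the tag is not a substring of the empty string, but nothing in the output points at the failed metadata lookup.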

Strengths

Retry loop is correct: skips the final sleep 30 before the failure message — no pointless 30-second pause on attempt 5 — and the loop exits 0 immediately on first success.
--no-cache-dir + grep -qF: both are the right choices here. Cache bypass prevents a stale pip cache returning an old wheel; fixed-string grep avoids version numbers being mis-interpreted as regex metacharacters.
Dual verification (CLI output + importlib.metadata) catches both "wrong wheel installed" and "entry point broken" independently.
Gate wired correctly: replacing publish-pypi with post-publish-smoke in github-release's needs list means a bad PyPI upload now aborts the Release rather than just printing a warning. This is exactly the fail-loud pattern CLAUDE.md asks for.
No continue-on-error, timeout-minutes: 15, and no checkout step (correct — smoke test should prove PyPI install, not the repo checkout).

Verdict

Approve. One inaccurate test-plan bullet is the only thing worth fixing, and that's a description edit. The CI logic is correct and the smoke-test coverage is a genuine improvement over the prior state.

itomek

The earlier blocking bot finding (gaia.__version__ AttributeError) is stale; head verifies via importlib.metadata and exercises gaia-emr. Verified that the version logic lines up: validate.outputs.tag_version is ${TAG_NAME#v} (bare), gaia --version prints the bare importlib.metadata version, and the import step does an exact != check, so both the grep -qF and the equality check behave correctly. One description fix noted inline (the test-plan "runs in parallel" bullet is the opposite of the diff, which correctly gates the Release on the smoke test). Approving.
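
The tag handling described here can be mirrored in a few lines. This is a sketch for illustration only; `bare_version` is a hypothetical helper that imitates the shell expansion `${TAG_NAME#v}`, which removes a single leading "v" when one is present.

```python
def bare_version(tag_name):
    """Mimic ${TAG_NAME#v}: drop one leading "v" if present."""
    return tag_name[1:] if tag_name.startswith("v") else tag_name


def versions_agree(tag_name, installed_version):
    """Exact comparison between the bare tag and the installed version."""
    return bare_version(tag_name) == installed_version
```

For example, `versions_agree("v0.18.1", "0.18.1")` holds, while `versions_agree("v0.18.1", "0.18.10")` does not.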

feat(ci): add post-publish smoke test to validate PyPI install

2f29563

github-actions Bot added the devops DevOps/infrastructure changes label May 29, 2026

kovtcharov-amd requested a review from itomek-amd May 29, 2026 02:16

kovtcharov-amd self-assigned this May 29, 2026

kovtcharov-amd added this to the v0.19 — Test & CI Hardening [OSS] milestone May 29, 2026

fix(ci): use importlib.metadata for version check, add gaia-emr entry point

08df86c

kovtcharov-amd added the p1 medium priority label May 29, 2026

itomek approved these changes May 29, 2026

Comment thread .github/workflows/publish.yml

kovtcharov-amd enabled auto-merge May 29, 2026 16:21

kovtcharov-amd added this pull request to the merge queue May 29, 2026

Merged via the queue into main with commit f8ff144 May 29, 2026
27 checks passed

kovtcharov-amd deleted the feat/1156-post-publish-smoke branch May 29, 2026 16:23

kovtcharov-amd mentioned this pull request Jun 1, 2026

Release v0.20.0 #1334

kovtcharov-amd merged 3 commits into main from feat/1156-post-publish-smoke
