Skip to content

chore(skills): add gaia-release skill for end-to-end release flow#939

Open
itomek-amd wants to merge 3 commits intomainfrom
upbeat-bohr-44ac83
Open

chore(skills): add gaia-release skill for end-to-end release flow#939
itomek-amd wants to merge 3 commits intomainfrom
upbeat-bohr-44ac83

Conversation

@itomek-amd
Copy link
Copy Markdown
Collaborator

Summary

Adds a project-local Claude skill at .claude/skills/gaia-release/SKILL.md that walks through a GAIA release end-to-end as a phased checklist with hard gates before every irreversible step (open PR, push tag, rerun CI job, post announcement). Lives in the repo so other contributors get it via git.

Why

The v0.17.4 release uncovered two release-blocking bugs that merged-PR CI alone did not catch — a squash-merge silently reverting src/gaia/version.py, and the AppImage smoke test's circular pip install amd-gaia==<unpublished> dependency. Both were caught only because a human ran a manual pre-tag verification pass. This skill encodes that pass (Phase 3) so the next release doesn't rely on the same human remembering the same lesson.

Threads

  • Six phased gates — Draft notes → open PR → pre-tag verify → push tag → monitor → announce. Each phase produces a concrete artifact, then stops at a ?-terminated gate before doing anything destructive. Reviewer keeps final authority on every irreversible action.
  • Hard rules block up front — Mirrors CLAUDE.md: no Claude attribution anywhere, no silent fallbacks, factual house style for notes, scope-clean commits, manual approval gate is human-only.
  • Discord announcement template inline — Phase 6 carries the full template with the format rules derived from comparing the v0.17.3 and v0.17.4 announcements (no leading bullet markers, plain install lines, fixed boilerplate sentences, scope-matching opening framing).
  • Argument convention matches existing tags — Accepts 0.17.5, v0.17.5, or four-part hotfix form 0.15.4.1; auto-derives next-patch from version.py if omitted; refuses to roll backwards.

Test plan

  • Trigger phrases work: typing "cut a release", "release v0.17.5", "tag a release" surfaces the skill via the Skill tool
  • Skill stops at every gate (does not auto-proceed past Phase 1 → 2, Phase 3 → 4, etc.)
  • Phase 3 (pre-tag verification) cites the v0.17.4 lessons explicitly so it can't be silently skipped

Phased checklist (Phase 1-6) with hard gates before every irreversible step:
draft notes, open release PR, pre-tag verification, push tag, monitor publish
pipeline, draft Discord announcement. Lives in .claude/skills/ so contributors
get it via git.

Pre-tag verification phase exists because the v0.17.4 release uncovered two
release-blocking bugs that merged-PR CI alone did not catch (squash-merge
silently reverting version.py, and the AppImage smoke test's circular
pip-install dependency on the unpublished version). Skill cites both lessons
inline so future invocations don't talk past the gate.
@itomek-amd itomek-amd requested a review from kovtcharov-amd as a code owner May 1, 2026 19:26
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Review — chore(skills): add gaia-release skill for end-to-end release flow

Summary

This adds a single 341-line file at .claude/skills/gaia-release/SKILL.md codifying the GAIA release flow as a six-phase checklist with hard gates before every irreversible action. The skill is documentation-only, technically accurate (every referenced path and workflow exists), and explicitly encodes the v0.17.4 lessons (squash-merge version.py regression, AppImage pip install circularity) into Phase 3 so they cannot be silently skipped. Compliance with CLAUDE.md is strong: hard rules section calls out "no Claude attribution", "no silent fallbacks", and bulletproof commits up front.

The most important thing the author should know: a couple of small inconsistencies between phases (search syntax, shell-variable lifetime across gate pauses, missing gh run watch in Phase 3) — none blocking, all easy fixes that make the skill more robust when re-invoked across sessions.

Issues Found

🟢 Minor — Inconsistent gh pr list --search query (SKILL.md:53, SKILL.md:155)

Phase 1 step 1 uses the bare query "Release v" (matches anywhere in the PR), but Phase 2 step 2 uses "Release v in:title" (title-only, more precise). The bare form will fuzzy-match unrelated PRs that mention "release v…" in the body. Recommend unifying both to the title-qualified form.

   gh pr list --repo amd/gaia --state merged --search "Release v in:title" --limit 3

(Apply at line 53 in SKILL.md. Phase 2 already uses the correct form.)

🟢 Minor — $MERGED_SHA doesn't survive across gate pauses (SKILL.md:188-189, SKILL.md:232)

Phase 3 step 1 sets MERGED_SHA=$(git rev-parse HEAD) in a shell, then Phase 4 step 1 runs test "$(git rev-parse HEAD)" = "$MERGED_SHA". Between phases the user gets a ?-terminated gate and likely returns later — possibly in a fresh shell or after re-invoking the skill — at which point $MERGED_SHA is empty. The test will fail (good), but for the wrong reason (empty string, not "main moved").

Two cleaner options: (a) capture the SHA into the gate question text so the user/skill re-confirms it explicitly at Phase 4, or (b) re-derive it in Phase 4 from the merged release PR (gh pr view <release-pr> --json mergeCommitOid). The latter is more robust because it doesn't depend on shell state at all:

2. **Tag and push.**
   ```bash
   # Re-derive the merged SHA from the release PR (don't rely on shell state across gates)
   MERGED_SHA=$(gh pr view <release-pr-number> --repo amd/gaia --json mergeCommitOid -q .mergeCommitOid)
   test "$(git rev-parse HEAD)" = "$MERGED_SHA" || { echo "main moved — re-verify"; exit 1; }
   git tag v<version>
   git push origin v<version>

### 🟢 Minor — Phase 3 step 4 says "Watch it" without a concrete command (`SKILL.md:208`)

Phase 5 step 1 gives `gh run watch <run-id> --repo amd/gaia`. Phase 3's "watch it" should mirror that for consistency:

```suggestion
   ```bash
   gh workflow run "Build Installers" --repo amd/gaia --ref main
   sleep 5
   RUN_ID=$(gh run list --repo amd/gaia --workflow "Build Installers" --limit 1 --json databaseId -q '.[0].databaseId')
   gh run watch "$RUN_ID" --repo amd/gaia

AppImage smoke jobs (distro-matrix, userns-restricted) are the most flakyuserns-restricted has a 90s state: ready poll that can race a model download. On real failure, stop and fix root cause; on transient flake (timeout-only, no logic error), gh run rerun <run-id> --failed is acceptable.


### 🟢 Minor — Stale Lemonade version in the navbar example (`SKILL.md:109`)

The example reads `v0.17.4 · Lemonade 10.2.0`, but `docs/docs.json:464` currently has `v0.17.4 · Lemonade 10.0.0` while `src/gaia/version.py:12` has `LEMONADE_VERSION = "10.2.0"`. The skill correctly tells the runner to read `version.py`, so this won't cause incorrect behavior — but the example showing a state that doesn't actually exist on `main` is mildly confusing. Consider using `<previous>` placeholders instead of a concrete (and incorrect) value, e.g. `v<previous-version> · Lemonade <previous-lemonade>`.

## Strengths

- **Pre-tag verification (Phase 3) is the load-bearing change.** It re-runs `validate_release_notes.py` and Build Installers against the *merged* SHA before tagging — exactly the gap that let the v0.17.4 squash-merge regression slip through. Concrete, motivated, hard to skip.
- **Hard rules block up front (`SKILL.md:29-39`) mirrors `CLAUDE.md` precisely** — no Claude attribution, no silent fallbacks, factual house style for notes. Future contributors don't need to cross-reference; the operational guidance is co-located with its rationale.
- **Discord announcement format rules are derived empirically (`SKILL.md:318-323`)** — "no leading bullet markers", "fixed boilerplate sentences", "first-line framing reflects release character" are all observably true of v0.17.3/v0.17.4 announcements. This is the right way to document tribal knowledge.
- **Argument parsing refuses to roll backwards (`SKILL.md:27`)** — small detail, prevents a real foot-gun.
- **Workflow phase mapping (`SKILL.md:247`) matches `.github/workflows/publish.yml` jobs exactly** (`validate → build-* → approve-publish → publish-* → github-release`), so future CI maintainers can trace the skill back to the workflow.

## Verdict

**Approve with suggestions.** No blocking issues. The four 🟢 items are polish — particularly the `$MERGED_SHA` lifetime fix — and worth applying because the skill is explicitly designed to survive multi-session invocations across gates. Safe to merge once those (or the author's preferred equivalents) land.

itomek added 2 commits May 1, 2026 15:30
Test-driving the skill against a hypothetical v0.17.5 surfaced four gaps:

- Skill accepted the requested version without checking it against commit
  scope. 26 commits since v0.17.4 including 5 feat commits (Gemma 4 default,
  Chat Lite, Eval Toolchain v0.18.0 milestone) would have silently shipped
  as a patch. Phase 1 step 1 now runs a scope-vs-version rubric and pushes
  back when the requested version doesn't fit.

- Reinvocation mid-flow (a release spans days) would re-run Phase 1 and
  overwrite already-merged release notes. Added a "Resume detection" block
  at the top: detects PR/tag/release state and skips ahead to the right
  phase, never re-running Phase 1 against merged notes.

- Phase 2 referenced a "prepared body file" without saying how to prepare
  it. Added the awk pipeline to strip MDX frontmatter from the release
  notes and append the Release checklist section from the previous PR.

- Phase 1 validate step assumed bare `python` works. Added venv guidance
  including the worktree-uses-parent-venv case.
Four polish fixes from the review on PR #939:

- Unify gh pr search to title-only (in:title) in the hard-rules example so
  the bare "Release v" form doesn't fuzzy-match unrelated PRs.

- Stop relying on $MERGED_SHA across gate pauses (a release spans days; the
  user often returns in a fresh shell). Phase 3 step 1 now re-derives the
  SHA from the merged release PR via gh pr view --json mergeCommitOid.
  Gate 3's question text now carries the release PR number and SHA so
  Phase 4 can re-derive them with no shell-state dependency.

- Phase 3 step 4 said "watch it" without giving the command. Added the
  concrete gh run watch invocation matching Phase 5's style.

- Phase 1 step 4 had a hardcoded Lemonade version in the navbar example
  that drifts (and is currently inconsistent with version.py on main).
  Replaced with placeholders and a note that version.py is the source of
  truth.
@itomek itomek self-assigned this May 1, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Summary

This adds a single project-local skill (.claude/skills/gaia-release/SKILL.md, +410/-0) that codifies the GAIA release flow as six phased gates, with the v0.17.4 lesson (squash-merge version.py regression + AppImage smoke circular dep) baked in as Phase 3. The structure is good — phases produce concrete artifacts, gates terminate with ?-questions before destructive actions, and the resume table makes the skill safe to re-invoke mid-flow. The single thing worth fixing before merge: every python util/validate_release_notes.py invocation in the skill is missing the required file positional argument and will exit 2 the moment the user runs it.

Issues Found

🟡 Important

1. Validator command is missing required argument — will fail at first run (SKILL.md:166, SKILL.md:264)

util/validate_release_notes.py requires at least one file path (parser.add_argument("file", nargs="+", ...)). I confirmed locally:

$ python util/validate_release_notes.py
usage: validate_release_notes.py [-h] [--tag TAG] file [file ...]
validate_release_notes.py: error: the following arguments are required: file

The skill currently calls it with no args in Phase 1 step 7 and Phase 3 step 3, both of which gate the release. As written, the very first validator gate will fail and the user will be told (per the skill's own "no silent fallbacks" rule) to stop — even on a perfectly valid release.

Suggested fix in both spots:

   python util/validate_release_notes.py docs/releases/v<version>.mdx

(Or docs/releases/*.mdx if you want to validate the whole set; matches how the script is wired for nargs="+".)

🟢 Minor

2. --verbose hint references a flag the validator doesn't support (SKILL.md:168)

Re-running the validator with --verbose (if supported) helps localise the failure.

The "if supported" hedge is technically safe, but validate_release_notes.py has no such flag and never has — argparse will reject it. Either drop the sentence or replace with a concrete debug suggestion (e.g., "re-run with --tag v<version> to also verify the title matches").

   If it fails for reasons unrelated to your changes (missing dep, broken import), stop and surface that — do not silently bypass.

3. Stray space inside command substitution (SKILL.md:252)

test "$(git rev-parse HEAD)" = "$MERGED_SHA" || { echo "Local main ($( git rev-parse HEAD)) does not match ...

Cosmetic — $( git ...) works in bash but reads as a typo next to the matched $(git rev-parse HEAD) immediately above.

test "$(git rev-parse HEAD)" = "$MERGED_SHA" || { echo "Local main ($(git rev-parse HEAD)) does not match merged release PR SHA ($MERGED_SHA) — pull, or main has moved past the release commit"; exit 1; }

4. Phase 2 staged file set may miss docs/releases/index.mdx (SKILL.md:185)

The hardcoded git add docs/releases/v<version>.mdx docs/docs.json src/gaia/version.py src/gaia/apps/webui/package.json covers the common-case patch release, but docs/releases/index.mdx is occasionally updated alongside a release (e.g. commit c582d05c for v0.16.1). If a contributor edits the index card and forgets to stage it, the skill's "scope-clean" check at git status will correctly halt — but the skill could mention this proactively rather than letting users discover it the hard way. One-liner addition is sufficient; not blocking.

Strengths

  • Phase 3 is the load-bearing contribution. Naming the v0.17.4 root causes inline and gating the tag push behind a workflow_dispatch build of the merged SHA is exactly the missing safety net. The "do not skip it" framing in the description and again at the phase header makes it hard to bypass casually.
  • Resume table is thorough. The five-state breakdown (fresh / PR open / PR merged no tag / tag no release / release exists) covers every realistic re-entry point, and the "Half-finished prior attempt landed notes without a PR — stop and ask" row catches the exact failure mode that would otherwise get silently overwritten.
  • Carries $RELEASE_PR and $MERGED_SHA into the gate question text (Phase 3 → 4) rather than relying on shell variables surviving the user pause. Small detail, but the kind of thing that breaks skills in practice.
  • House-style guardrails are concrete, not vague. "No 'finally' / 'silently' / 'we're excited to announce'", "patch releases do not include a pip install block", "no leading bullet markers in Discord highlights" — these are the rules that get violated when a skill says "match house style" without saying what house style means.
  • Hard-rules block at the top maps 1:1 to CLAUDE.md (no Claude attribution, no silent fallbacks, scope-clean commits), which makes drift between the two documents easy to spot in future reviews.

Verdict

Approve with suggestions — Issue 1 (validator command) is a real bug that breaks the skill's first verification gate and should be fixed before merge; the rest are polish. None of the findings change the design or scope of the skill.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants