ci(tla): auto-request Claude + Codex review on anchored-file changes#857
Conversation
Per user request: when a PR touches a file the TLA+ safety spec has an
anchor on (per §3 of docs/design/2026_05_28_partial_tla_safety_spec.md),
AI review must always fire — never rely on a human remembering to ping.
New workflow .github/workflows/tla-spec-ai-review.yml:
- Fires on PR opened/synchronize/reopened/ready_for_review with the
same path filter as tla-check.yml (so the two stay paired).
- Computes which anchored files actually changed (via git diff against
the PR base) and skips the comment if none.
- Posts a fresh PR comment per push that contains both `@claude review`
and `@codex review` mentions plus a per-subsystem checklist of which
invariants to verify (HLC-4 preconditions, OCC lock encoding, etc.).
- Always createComment (not updateComment) — claude.yml listens on
`issue_comment: created` only; an edit would not re-trigger Claude.
Codex behaves the same way. Each push therefore guarantees both
bots fire on the new diff.
- Embeds a marker comment so downstream tooling can identify and
optionally collapse older requests.
Chain:
anchored file change
-> tla-check.yml runs TLC (catches *spec* bugs)
-> tla-spec-ai-review.yml posts the @claude / @codex ping
-> claude.yml + chatgpt-codex-connector bot fire focused
spec-divergence reviews
The two workflows are complementary — the model check verifies the
spec; the AI reviewers verify the impl matches the spec.
Docs: §7.3 of the design doc updated with the chain diagram, the
two-workflow split, and a reminder that the path filter must stay in
sync across the two YAML files and the doc § list.
actionlint clean.
|
You have reached your Codex usage limits. You can see your limits in the Codex usage dashboard. |
|
Warning Review limit reached
More reviews will be available in 39 minutes and 57 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR introduces spec-divergence AI review automation: a new GitHub Actions workflow that detects changes to TLA+ specification files and posts a PR comment triggering Claude and Codex to review the implementation against the spec. Documentation describes the workflow's integration with existing spec-check processes. ChangesSpec-divergence AI Review
Estimated Code Review Effort🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request updates the TLA safety specification design document to describe a new 'Spec-divergence AI review' companion workflow that automates AI reviews to prevent implementation drift from the specification. The reviewer suggested a minor improvement to explicitly reference Section 7.3 instead of using the generic term 'the doc § list' to improve clarity and maintainability.
| anchored file, so a divergence cannot land silently. The path filter | ||
| must stay in sync across all three places (the two workflow files and | ||
| the doc § list). |
There was a problem hiding this comment.
To improve clarity and maintainability, consider explicitly referencing Section 7.3 instead of using the generic term the doc § list. This makes it immediately clear to future maintainers which section of the document contains the path filter list that must be kept in sync.
| anchored file, so a divergence cannot land silently. The path filter | |
| must stay in sync across all three places (the two workflow files and | |
| the doc § list). | |
| anchored file, so a divergence cannot land silently. The path filter | |
| must stay in sync across all three places (the two workflow files and | |
| the list in §7.3). |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/tla-spec-ai-review.yml:
- Line 53: The workflow uses unpinned actions; replace the `uses:
actions/checkout@v6` and `uses: actions/github-script@v7` references with their
corresponding full commit SHAs (pin to immutable commit identifiers) and update
the checkout step (the block that calls actions/checkout) to include
`persist-credentials: false` so credentials are not left in the workspace;
ensure you update both `actions/checkout` and `actions/github-script`
occurrences to the chosen commit SHAs consistently.
- Around line 52-55: The checkout step named "Check out (needed for diff)" using
actions/checkout@v6 currently leaves credentials persisted; update that step to
include persist-credentials: false under its with block so the checkout does not
store GITHUB_TOKEN in the local git config (ensure the key is added alongside
fetch-depth: 0 in the same with section for the actions/checkout@v6 step).
In `@docs/design/2026_05_28_partial_tla_safety_spec.md`:
- Around line 579-586: The fenced code block that begins with "anchored-file
change in PR" is missing a language identifier and triggers MD040; update the
opening fence from ``` to ```text so the block becomes a labeled "text" code
fence (i.e., replace the triple backticks that start the block with ```text) to
satisfy the markdown linter while leaving the block content unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1525b64c-2218-4297-b4ef-dc8cc4d4efea
📒 Files selected for processing (2)
.github/workflows/tla-spec-ai-review.ymldocs/design/2026_05_28_partial_tla_safety_spec.md
| timeout-minutes: 5 | ||
| steps: | ||
| - name: Check out (needed for diff) | ||
| uses: actions/checkout@v6 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tla-spec-ai-review.yml"
test -f "$FILE" || { echo "Missing $FILE"; exit 1; }
echo "=== File: $FILE ==="
nl -ba "$FILE" | sed -n '1,140p'Repository: bootjp/elastickv
Length of output: 176
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tla-spec-ai-review.yml"
test -f "$FILE" || { echo "Missing $FILE"; exit 1; }
echo "=== File: $FILE (lines 1-140) ==="
awk 'NR>=1 && NR<=140 {printf "%5d %s\n", NR, $0}' "$FILE"
echo
echo "=== Grep uses: / persist-credentials ==="
rg -n "uses:|persist-credentials" "$FILE" || trueRepository: bootjp/elastickv
Length of output: 8211
Pin uses: actions to full commit SHAs and harden checkout credentials
Unpinned actions in .github/workflows/tla-spec-ai-review.yml (uses: actions/checkout@v6 at line 53 and uses: actions/github-script@v7 at line 77) weaken CI supply-chain integrity—pin both to immutable commit SHAs. Also set persist-credentials: false in the checkout step to avoid unnecessary credential persistence.
Suggested change
- - name: Check out (needed for diff)
- uses: actions/checkout@v6
+ - name: Check out (needed for diff)
+ uses: actions/checkout@<FULL_COMMIT_SHA>
@@
- - name: Post AI-review request comment
+ - name: Post AI-review request comment
if: steps.changed.outputs.count != '0'
- uses: actions/github-script@v7
+ uses: actions/github-script@<FULL_COMMIT_SHA>🧰 Tools
🪛 zizmor (1.25.2)
[error] 53-53: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/tla-spec-ai-review.yml at line 53, The workflow uses
unpinned actions; replace the `uses: actions/checkout@v6` and `uses:
actions/github-script@v7` references with their corresponding full commit SHAs
(pin to immutable commit identifiers) and update the checkout step (the block
that calls actions/checkout) to include `persist-credentials: false` so
credentials are not left in the workspace; ensure you update both
`actions/checkout` and `actions/github-script` occurrences to the chosen commit
SHAs consistently.
gemini medium on PR #857: "the doc § list" was an imprecise back- reference; the path-filter list lives in §7.3 specifically, so name that section so future maintainers know exactly where the third sync point is when updating the path filter for M2..M5.
|
Addressed gemini medium on @claude review |
|
Claude finished @bootjp's task in 3m 35s —— View job Review:
|
Five findings from coderabbit + claude bot on commit 96223e9. - persist-credentials: false on the checkout step (zizmor `artipacked`, coderabbit Major, claude #3). The job does not push anything and the subsequent github-script step uses secrets.GITHUB_TOKEN explicitly, so the persisted credential is dead weight that a compromised later step could otherwise reuse. - MD040 language tag on the §7.3 chain-diagram fence (coderabbit Minor, claude #1). Tagged as `text`. - Re-draw the chain diagram so it visually shows tla-check.yml and tla-spec-ai-review.yml fire in parallel rather than implying a sequential chain (claude #6). - Sync-point count corrected from "three" to "four": (1) tla-check.yml paths, (2) tla-spec-ai-review.yml paths, (3) the ANCHORS regex in tla-spec-ai-review.yml, (4) the §7.3 list (claude #5). Both the doc prose and the inline workflow comment now enumerate explicitly. - Inline comment in the workflow explaining why `ready_for_review` is in the trigger types here but not in tla-check.yml (claude #4): an AI review request is cheap on draft→ready transitions; a redundant TLC run is not. Skipped (consistent with repo-wide convention, separate hardening pass): pinning actions/checkout@v6 and actions/github-script@v7 to commit SHAs. tla-check.yml and every other workflow in this repo floats on major version tags; introducing SHA pins here alone would be inconsistent. actionlint clean.
|
Round 1 fixes in
Skipped (deferred to a repo-wide hardening pass): pinning
@claude review |
|
Claude finished @bootjp's task in 2m 23s —— View job Claude finished @bootjp's task — Review: Round 1 fixes on
|
| # | Fix | Status |
|---|---|---|
| 1 | persist-credentials: false with explanatory comment |
✅ Line 67, well-commented |
| 2 | MD040 text language tag on chain-diagram fence |
✅ Line 579 |
| 3 | Parallel branch diagram using ├── / └── |
✅ Visually clear |
| 4 | Sync-point count 3→4 in YAML comment (lines 76-82) and doc prose (line 594-600) | ✅ Consistent |
| 5 | ready_for_review asymmetry documented at lines 25-31 |
✅ Clear rationale |
Remaining nits (none block merge)
1. "copy of tla-check.yml" is imprecise (tla-spec-ai-review.yml:21-23)
The header comment says:
"The path filter is intentionally a copy of tla-check.yml + the design doc §7.3 list"
But the paths here are a subset of tla-check.yml's paths — tla/**, Makefile, and .github/workflows/tla-check.yml are intentionally omitted (correct: spec changes don't need AI impl review). The word "copy" misleads a future maintainer comparing the two files. The PR description's "exact copy" claim has the same inaccuracy.
Suggested fix:
-# The path filter is intentionally a copy of tla-check.yml + the design
-# doc §7.3 list, so a change to an anchored file always triggers both
-# the model check AND the AI review request.
+# The path filter is the implementation-anchored subset of tla-check.yml's
+# paths (tla/**, Makefile, and the workflow self-reference are omitted —
+# spec-only changes don't need an AI impl review, tla-check is sufficient).
+# A change to any anchored file triggers both the TLC run AND the AI review.2. "the list in this §7.3" is self-referential (doc line 600)
Item 4 of the sync enumeration reads "the list in this §7.3", but §7.3 contains only the four-item enumeration — not a standalone list of paths. A maintainer expanding for M2 looking for the canonical list won't find one. The Section 3 table ("Implementation anchors" column) is the closest thing, but it's not referenced here.
Either add a concrete path list to §7.3, or change item 4 to point to §3's implementation-anchors table:
-4. the list in this §7.3.
+4. the implementation anchors in the §3 table (and the narrative here in §7.3).3. Undocumented path filter delta (minor, deferred is fine)
The delta between the two workflows (tla/**, Makefile, .github/workflows/tla-check.yml absent from tla-spec-ai-review.yml) is explained in the ready_for_review comment block but isn't flagged at the paths: block itself. A one-line comment there would help future diffing:
paths:
# Implementation anchors only — tla/**, Makefile, and the workflow
# self-reference from tla-check.yml are omitted (spec-only changes
# don't need AI impl review; tla-check.yml's TLC run is sufficient).
- 'kv/hlc*.go'What looks good
persist-credentials: falseaddition is clean and the comment explains the zizmor rationale ✅process.env.PR_NUMBER/process.env.HEAD_SHAviaenv:(not expression interpolation) — injection hardening intact ✅- ANCHORS regex
kv/hlc[^/]*\.gocorrectly matches thekv/hlc*.goglob ✅ always createComment(not update) logic and rationale are unchanged ✅count != '0'guard correctly skips the comment when no anchored files changed ✅- Four-way sync enumeration is consistent between YAML comment and doc ✅
- Parallel diagram accurately represents that both workflows fire concurrently ✅
Bottom line: Merge-ready. Items 1-3 are documentation nits worth a single follow-up commit if desired, but none are correctness issues.
Three documentation nits from claude bot's round 1 review of dcbc302 (claude bot said "Merge-ready"; these are the polish items it flagged that don't block but clean up future-maintainer confusion). - "copy of tla-check.yml" was imprecise. The path list in tla-spec-ai-review.yml is the **implementation-anchored subset** of tla-check.yml's paths — tla/**, Makefile, and the workflow self-references are intentionally omitted (spec-only changes do not need an AI implementation-divergence review). The file-header comment now states this directly. - "the list in this §7.3" was self-referential — §7.3 contains the 4-item enumeration but no canonical path list. Item 4 now points at the §3 implementation-anchors column as the canonical source, with an explicit note that the three workflow path lists are operational projections of that table. - Added an explanatory comment at the `paths:` block itself so a future maintainer diffing the two workflows sees the rationale inline instead of having to scroll to the file header. - §7.3 now spells out the delta between the two workflows (`tla/**`, `Makefile` watched by tla-check only) and notes that the implementation-anchor portion stays identical. actionlint clean. No functional change.
|
Round 2 polish in
@claude review |
|
Claude finished @bootjp's task in 2m 16s —— View job Review: Round 2 polish on
|
| # | Fix | Verification |
|---|---|---|
| 1 | File-header comment: "implementation-anchored subset" replaces "copy of tla-check.yml" | ✅ Line 22–27. Accurately reflects the delta: tla/**, Makefile, .github/workflows/tla-check.yml are absent from tla-spec-ai-review.yml — confirmed by diffing both workflow paths: blocks |
| 2 | §7.3 item 4: "§3 Implementation anchors table" replaces self-referential "the list in this §7.3" | ✅ Lines 600–604 of the doc. §3 (line 97) does have the Implementation anchors column, making the reference concrete and accurate |
| 3 | paths: block: inline comment added explaining "implementation anchors only" with rationale pointer |
✅ Lines 36–38 of the workflow. New Note paragraph at doc line 606–610 completes the picture by spelling out the exact delta |
Remaining observations (none block merge)
1. Line 573 of doc: "fires on the same path filter" is now slightly imprecise
A companion workflow `.github/workflows/tla-spec-ai-review.yml` fires on the same path
filter…
After Round 2, the doc carefully explains they are not the same (it's a subset). This intro sentence predates Round 2 and wasn't touched. The Note at lines 606–610 corrects the record, but a reader skimming the intro paragraph gets the wrong impression before reaching the Note.
One-word fix if desired (not blocking):
-fires on the same path filter
+fires on a related path filter (an implementation-anchored subset — see the Note below)2. ANCHORS comment item 4 (set in Round 1) still says "the enumeration in §7.3"
Lines 79–83 of the workflow:
# 4. the enumeration in §7.3 of the design doc
After Round 2, §7.3's item 4 redirects to the §3 table as the canonical source. The reference chain is: ANCHORS comment → §7.3 → §3 table — one indirect hop, but functional. Not worth a Round 3 commit on its own.
What looks good
- The path-filter delta claim at line 23 is correct:
tla/**,Makefile, and.github/workflows/tla-check.ymlall appear intla-check.ymland are absent fromtla-spec-ai-review.yml✅ - §3 table actually contains the
Implementation anchorscolumn (line 97), so pointing there as the canonical source is accurate ✅ - The Note paragraph (lines 606–610) precisely names the delta files and is consistent with the file-header comment ✅
paths:inline comment correctly says "three other sync points" (you're already at one of the four) ✅- All prior round fixes (MD040,
persist-credentials: false, parallel diagram, four-way enumeration,ready_for_reviewasymmetry) remain intact ✅ - No functional changes — pure prose/comment polish ✅
Merge-ready. Observations 1–2 are sub-nit level and do not warrant a Round 3 commit.
…#859) ## Summary Implements the **M1 spec follow-up** per [`docs/design/2026_05_28_partial_tla_safety_spec.md`](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_05_28_partial_tla_safety_spec.md) §5.1 HLC-4 precondition **(ii)** — strategy (c) "scan the FSM for max committed HLC and `Observe` before serving any persistence `Next()`". Follows PR #856 (M1 TLA+ spec). ## What lands One line in `kv/fsm.go` `applyRequestErr` after a successful `handleRequest`: ```go if f.hlc != nil && commitTS > 0 { f.hlc.Observe(commitTS) } ``` `HLC.Observe` is the existing CAS-to-max method (`kv/hlc.go:129-140`), so this is a **pure addition** with no signature change and no caller audit required. ## Why "observe on every apply" vs the spec doc's "observe on election" The spec doc describes strategy (c) as "on election, scan the FSM and call `Observe`". The implementation refines this to "Observe on every apply". The two are semantically equivalent for HLC-4 because etcd/raft applies all uncommitted prior-term entries before a new leader serves a write — so by the time `HLC.Next()` runs in the new term, `hlc.last >= every prior commit`. Two advantages over the on-election form: 1. **No protocol change** to `ShardedCoordinator` or `RunHLCLeaseRenewal`. 2. **No race window** between `BecomeLeader` and the first `Observe` — a concurrent client request landing in that window cannot see a stale `hlc.last`. ## Tests `kv/fsm_hlc_observe_test.go`: - `TestApplyObservesCommitTSIntoHLC` — verifies Apply advances `hlc.last` via `Observe(commitTS)`, and preserves monotonicity on a stale-ts apply. - `TestApplyHLCObserveAfterRestart` — simulates the spec's new-leader-after-restart scenario: a fresh HLC (`last = 0`) re-plays a prior leader's write through the FSM and the resulting `Next()` is strictly greater than the prior `commit_ts`. Both tests pass with `-race`. ## Deferred to a follow-up PR The **ceiling-fence half** of the M1 spec contract (HLC-4 precondition **(iii)** — making `HLC.Next()` fail-closed when `wall_now >= physicalCeiling`) is intentionally NOT in this PR. That change alters `HLC.Next()`'s return signature from `uint64` to `(uint64, error)` and requires auditing **18 production call sites** across `kv/`, `adapter/`, and `store/`. Splitting it from the strategy-(c) PR keeps each diff individually reviewable per the loop-discipline reminder for semantic changes. ## Self-review (5-lens per CLAUDE.md) 1. **Data loss** — n/a; `Observe` only advances an in-memory clock counter, no persistence path touched. 2. **Concurrency** — `Observe` is the existing CAS-to-max loop; safe under concurrent `Apply` (which is serial per Raft FSM contract anyway) and concurrent `Next` (the CAS handles it). 3. **Performance** — adds one atomic load + CAS to the FSM apply hot path. The CAS only fires when `ts > current` so steady-state is one load. 4. **Data consistency** — `Observe` is monotone (only advances). No semantic change to any other code path. 5. **Test coverage** — two new tests in `kv/fsm_hlc_observe_test.go` covering the basic apply→observe path and the post-restart catchup case. Existing kv + store tests pass. ## Test plan - [ ] CI green (the `tla-check` workflow does NOT fire — this PR is Go-only) - [ ] The new `tla-spec-ai-review` workflow (PR #857) auto-pings Claude + Codex for spec-divergence review - [ ] Reviewer cross-checks the Observe call against HLC-4 (ii) strategy (c) in the design doc and `BecomeLeader_HLC` in `tla/hlc/HLC.tla` ## Out of scope - HLC-4 precondition (iii) ceiling fence (`HLC.Next()` fail-closed) — separate follow-up PR. - Operator-visible metrics for the fence behaviour — lands with (iii).
Summary
Auto-trigger AI review (Claude + Codex) whenever a PR touches a file the TLA+ safety spec has an anchor on, so a spec-divergence review always fires — never relies on a human remembering to ping.
Follows PR #856 (M1 spec +
tla-checkworkflow).How it works
New workflow:
.github/workflows/tla-spec-ai-review.ymlpull_request(opened / synchronize / reopened / ready_for_review).tla-check.yml's — the two workflows stay paired so every anchored-file change triggers both the TLC run AND the AI-review request.git diffagainst the PR base) and skips the comment if none.@claude reviewand@codex reviewmentions, plus a per-subsystem checklist of which spec invariants to verify (HLC-4 preconditions, OCC lock encoding, MVCC visibility, Routes catalog versioning).createComment— neverupdateComment. The upstreamclaude.ymllistens onissue_comment: createdonly; editing would not re-trigger Claude. Same for Codex.Chain
The two workflows are complementary:
tla-checkverifies the spec passes; the AI reviewers verify the implementation matches the spec.Docs
§7.3 of
docs/design/2026_05_28_partial_tla_safety_spec.mdis updated with the chain diagram, the two-workflow split, and a reminder that the path filter must stay in sync across the two YAML files and the doc § list.Verification
actionlintclean.make tla-checkstill passes locally onmain(3,594 distinct states on safe config, HLC-4 counterexample at depth 5 on gap config).kv/hlc.go.Test plan
tla-checkruns again because the workflow path filter touchestla/**? No — this PR only adds.github/workflows/, which is NOT in the path filter, sotla-checkwill NOT fire on this PR. That's correct.)kv/hlc.goto verify the comment+bot chain firesOut of scope
<!-- tla-spec-ai-review-marker -->) so a future PR can add a "minimize older" step if the comment noise becomes a problem.tla-checkis sufficient).Summary by CodeRabbit
Documentation
Chores