Skip to content

tla: M5 — Composed.tla (Composed-1 + Composed-3) wired into tla-check#865

Merged
bootjp merged 2 commits into
mainfrom
feat/tla-m5-composed-spec
May 29, 2026
Merged

tla: M5 — Composed.tla (Composed-1 + Composed-3) wired into tla-check#865
bootjp merged 2 commits into
mainfrom
feat/tla-m5-composed-spec

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented May 29, 2026

Summary

M5 of the TLA+ safety spec — the cross-module safety properties, capping the spec-only roadmap (M1..M5).

M1..M4 each modelled a single subsystem in isolation. M5 captures the seams between them — the two safety properties that no single-module spec can express.

What lands

  • tla/composed/Composed.tla — recreates the minimal cross-module state needed for Composed-1 (catalog history routes[v], txn observed catalog version txnObservedVer[t], committing group txnCommitGroup[t]) and Composed-3 (monotonic tsCounter advancing on every Commit). Does NOT INSTANCE M1..M4 — that would explode the state space well past M5's <10 min at default bounds target from §8.1.
  • tla/composed/MCComposed.tla — TLC model with key + group + txn symmetry.
  • tla/composed/MCComposed.cfg / MCComposed_gap.cfg — safe (PASS) + gap (FAIL on Composed-1).
  • scripts/tla-check.shTLA_MODULES gains "composed"; gap_invariant_for + mc_basename gain the corresponding cases.
  • tla/README.md — M5 status Not startedLanded; Composed module description + invariant table.

Invariants

# Statement Form
Composed-1 Every committed write key was owned by the committing group at the txn's observed catalog version INVARIANT
Composed-2 Vacuously TRUE in this abstraction — SplitRange is same-group only per CLAUDE.md INVARIANT
Composed-3 Distinct committed txns have distinct commit_ts INVARIANT
Composed_CatalogMonotonic catalogVersion weakly increases on every step PROPERTY
Composed_TsMonotonic tsCounter weakly increases on every step PROPERTY
Composed3_TsAction Every Commit strictly raises tsCounter PROPERTY

Why no INSTANCE?

Each M1..M4 module tightly bounds its own state for tractable TLC. Bringing all four INSTANCEs into one product spec would multiply the state spaces — even a 100x growth lands well past M5's <10 min at default bounds target. M5 instead recreates the minimal cross-module state needed to express the two seam invariants. The integration claim is that this projection preserves the invariants the full product would assert; reviewers can sanity-check each Composed action against the corresponding M1..M4 action (e.g., ProposeRouteChange mirrors Routes.tla, BeginTxn + Commit mirror OCC.tla with a cross-module observed-catalog-version pin).

Verification (local)

$ make tla-check
HLC      safe 3,594 distinct states + HLC-4 gap ce
OCC      safe   150 distinct states + OCC-1 gap ce
MVCC     safe    79 distinct states + MVCC-4 gap ce
Routes   safe    29 distinct states + Routes-4 gap ce
Composed safe 1,684 distinct states + Composed-1 gap ce
tla-check: all model-check outcomes match the design contract.

The Composed-1 counterexample is the canonical 4-step regression:

  1. BeginTxn(t)txn observes catalogVersion = 0, where routes[0][k1] = g1.
  2. WriteIntent(t, k1).
  3. ProposeRouteChange(k1, g2)catalogVersion → 1, routes[1][k1] = g2.
  4. Commit(t, g2) (gap mode) — committing group is g2, but routes[0][k1] = g1 ≠ g2. Composed-1 fails.

Self-review (5-lens per CLAUDE.md)

  1. Data loss — out of scope (cross-module invariants check group ownership, not durability).
  2. Concurrency — Composed-1 is exactly the concurrency seam between OCC and Routes; the gap counterexample is one such interleaving.
  3. Performancemake tla-check end-to-end runs in ~10 s on a dev laptop, well under the 10-min target.
  4. Data consistency — Composed-1 (cross-module write/route consistency) + Composed-3 (cross-txn ts uniqueness) + 3 PROPERTY transitions.
  5. Test coverage — no Go tests added (spec-only). make tla-check is the coverage layer.

Roadmap status (post-merge)

Milestone Status
M1 HLC Landed (PR #856)
M2 OCC Landed (PR #858)
M3 MVCC Landed (PR #861)
M4 Routes Landed (PR #861 via stacked #862)
M5 Composed This PR
M6 liveness (OPTIONAL) Not started — can follow as a separate PR
M1 Go follow-up (ceiling fence) Not started — separate Go PR with caller audit

Test plan

  • tla-check CI runs green (watches tla/**)
  • tla-spec-ai-review does NOT fire — spec-only PR
  • Reviewer cross-checks Composed-1 / Composed-3 against §5.5 of the design doc
  • Reviewer runs make tla-check and confirms 10 outcomes (5 safe pass + 5 gap fail)
  • Reviewer sanity-checks the INSTANCE-free projection claim — each Composed action mirrors the corresponding M1..M4 action

Out of scope

  • M6 liveness (OCC-L1, Routes-L1) — separate PR if needed
  • HLC-4 (iii) ceiling fence Go implementation — separate code PR
  • Cross-group SplitRange in the real implementation (currently same-group only)

Per §8.1 of docs/design/2026_05_28_partial_tla_safety_spec.md (M5
deliverable): the cross-module safety properties — the two seam
invariants that no single-module spec (M1..M4) can express.

What lands:

- tla/composed/Composed.tla — Recreates the minimal cross-module
  state needed for Composed-1 (catalog history routes[v], txn
  observed catalog version txnObservedVer[t], committing group
  txnCommitGroup[t]) and Composed-3 (monotonic tsCounter advancing
  on every Commit).  Does NOT INSTANCE M1..M4 — that would explode
  the state space well past the <10 min target from §8.1; the M5
  projection preserves the invariants the full product would
  assert.
- tla/composed/MCComposed.tla — TLC model with key + group + txn
  symmetry.
- tla/composed/MCComposed.cfg — safe config (PASS).
- tla/composed/MCComposed_gap.cfg — gap config (FAIL on
  Composed1_CommitToOwningGroup).

Invariants encoded:

- Composed-1  every committed write key was owned by the committing
  group at the txn's observed catalog version.  Load-bearing cross-
  module invariant — ties OCC's commit decision to the Routes
  catalog snapshot the txn read at BeginTxn.
- Composed-2  vacuously TRUE in this abstraction because the
  implementation's SplitRange is same-group only (CLAUDE.md).
  ProposeRouteChange in M5 explicitly DOES allow cross-group moves
  so Composed-1 has teeth; Composed-2's cross-group migration
  clause is a forward-looking guard rail.
- Composed-3  distinct commit_ts across committed txns.  Holds by
  construction (monotonic tsCounter advancing on every Commit);
  PROPERTY Composed3_TsAction strengthens this to "every Commit
  raises the clock".
- Composed_CatalogMonotonic (PROPERTY)  catalogVersion weakly
  increases on every step.
- Composed_TsMonotonic (PROPERTY)  tsCounter weakly increases.

Tooling:
- scripts/tla-check.sh: TLA_MODULES gains "composed"; gap_invariant_for
  + mc_basename gain the corresponding cases.
- tla/README.md: M5 status Not started -> Landed; Composed module
  description + invariant table + MCComposed subsection.

Verified locally with `make tla-check`:
- HLC safe (3,594) + gap (HLC-4 ce at depth 5).
- OCC safe (150) + gap (OCC-1 ce at depth 5).
- MVCC safe (79) + gap (MVCC-4 ce at depth 3).
- Routes safe (29) + gap (Routes-4 ce at depth 3).
- Composed safe (1,684 distinct states) + gap (Composed-1 ce at
  depth 4: BeginTxn(t), WriteIntent(t, k1), ProposeRouteChange(k1, g2),
  Commit(t, g2) where routes[0][k1] = g1 # g2).

End-to-end runtime well under M5's <10 min target.

This is the cap on the spec-only roadmap.  Remaining work:
- M6 (liveness, OPTIONAL): OCC-L1 + Routes-L1 under fairness.  Can
  follow as a separate PR.
- M1 Go follow-up: HLC-4 (iii) ceiling fence — change HLC.Next() to
  fail-closed when wallNow >= physicalCeiling.  Separate code PR
  with caller audit of every Next() site.
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

Warning

Review limit reached

@bootjp, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 20 minutes and 10 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 27cb26da-490f-41ec-a0d2-d3a32da449c5

📥 Commits

Reviewing files that changed from the base of the PR and between fd79546 and f8d97b3.

📒 Files selected for processing (8)
  • scripts/tla-check.sh
  • tla/README.md
  • tla/composed/Composed.tla
  • tla/composed/MCComposed.cfg
  • tla/composed/MCComposed.tla
  • tla/composed/MCComposed_gap.cfg
  • tla/mvcc/MCMVCC.tla
  • tla/occ/MCOCC.tla
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/tla-m5-composed-spec

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the M5 milestone by introducing the composed TLA+ specification (Composed.tla) and its corresponding model-checking configurations to verify cross-module safety properties, specifically Composed-1 (committing to the owning group) and Composed-3 (strict serializability bound). It also integrates the new module into the tla-check.sh test harness and updates the documentation. The feedback points out a critical issue in MCComposed.tla where the Symmetry set is defined as a union of permutation sets, which is mathematically incorrect for TLC's symmetry reduction and can lead to unsound model checking. A code suggestion is provided to correctly define the symmetry group as a direct product of the symmetric groups.

Comment thread tla/composed/MCComposed.tla Outdated
KeySymmetry == Permutations(Keys)
GroupSymmetry == Permutations(Groups)
TxnSymmetry == Permutations(TxnIds)
Symmetry == KeySymmetry \cup GroupSymmetry \cup TxnSymmetry
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Defining Symmetry as the union of permutation sets (KeySymmetry \cup GroupSymmetry \cup TxnSymmetry) is mathematically incorrect and can lead to unsoundness in TLC model checking. TLC's symmetry reduction algorithm assumes that the set of permutations specified in SYMMETRY forms a group under function composition. The union of symmetric groups on disjoint sets is not closed under composition (e.g., a permutation that permutes both keys and groups is not in the union), so it is not a group. This violates TLC's assumptions and can cause TLC to miss states or behave unpredictably. To fix this, define the symmetry group as the direct product of the symmetric groups using the function union operator @@ from the TLC module.

Symmetry      == { kSym @@ gSym @@ tSym : kSym \in KeySymmetry, gSym \in GroupSymmetry, tSym \in TxnSymmetry }

…i HIGH on PR #865)

Replaces `KeySymmetry \cup … \cup TxnSymmetry` (and the two analogous
two-set constructions in MCMVCC / MCOCC) with the direct product of
the per-domain permutation groups, built via the TLC `@@` (function
union) operator.

Gemini's HIGH-priority review on PR #865 flagged this in MCComposed:
TLC's `SYMMETRY` option requires the supplied set to be a *group* of
permutations, closed under composition. The set-theoretic union of
the disjoint per-domain permutation groups is NOT closed under
composition — a permutation that touches both Keys and Groups (for
example) is not in the union — so TLC's symmetry-reduction
algorithm can silently elide states and report a falsely-clean run.

The same construction was duplicated in:
  * tla/composed/MCComposed.tla (the gemini-flagged file)
  * tla/mvcc/MCMVCC.tla     (KeySymmetry \cup ValSymmetry)
  * tla/occ/MCOCC.tla       (TxnSymmetry \cup KeySymmetry)

All three are fixed in one commit so the symmetry contract is
consistent across the spec suite. The single-domain modules
(MCHLC, MCRoutes) already supplied a single `Permutations(S)` directly,
which is already a group, so they are unchanged.

The construction `{ kSym @@ … : kSym \in KeySymmetry, … }` works
because the per-domain constants are *disjoint* model values (Keys,
Vals, Groups, TxnIds, etc.), so `@@` over them is plain function
union — every element of the resulting set is a bijection on the
disjoint union of the constant sets, and the family is closed under
function composition (the product of independent permutations on
disjoint domains).

Verification: `make tla-check` post-fix —
  HLC      safe pass  + HLC-4 gap fail
  OCC      safe pass  + OCC-1 gap fail
  MVCC     safe pass  + MVCC-4 gap fail
  Routes   safe pass  + Routes-4 gap fail
  Composed safe pass (68 distinct states, depth 4) + Composed-1 gap fail
All ten outcomes match the design contract.

No spec semantics or invariant statements change — only the symmetry
hint expressed to TLC.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented May 29, 2026

Gemini HIGH addressed in f8d97b3:

  • MCComposed / MCMVCC / MCOCC all converted from \\cup of per-domain permutation sets to the direct-product group via @@ (function union over disjoint domains).
  • Rationale baked into the symmetry block comment so future contributors don't re-introduce the bug.
  • Cross-applied to MVCC and OCC because the same construction was duplicated there; single-domain MCHLC / MCRoutes are unaffected (a single Permutations(S) is already a group).
  • Verified: make tla-check still produces 5 safe PASS + 5 gap FAIL with all design-contract counterexamples intact.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented May 29, 2026

Claude finished @bootjp's task in 3m 36s —— View job


Code Review — M5 Composed.tla + Symmetry Fix (f8d97b3)

  • Load context and understand request
  • Read changed files
  • Verify symmetry group fix
  • Review Composed.tla spec logic
  • Review MCComposed.cfg / MCComposed_gap.cfg
  • Review tla-check.sh changes
  • Review tla/README.md updates

Symmetry fix (f8d97b3) — Correct ✅

The Gemini HIGH was correctly addressed. The fix in all three MC*.tla files:

\* OLD (wrong): union of per-domain permutation sets — not closed under composition
Symmetry == KeySymmetry \cup GroupSymmetry \cup TxnSymmetry

\* NEW (correct): direct product via @@  — closed under composition
Symmetry == { kSym @@ gSym @@ tSym :
                  kSym \in KeySymmetry,
                  gSym \in GroupSymmetry,
                  tSym \in TxnSymmetry }

Why the fix is sound:

  • @@ is function union. When domains are disjoint, kSym @@ gSym @@ tSym is a bijection on Keys \cup Groups \cup TxnIds.
  • The configs use distinct namespaces (k1/k2, g1/g2, t1/t2) so the domains are disjoint. Verified:
    • MCComposed.cfg: Keys={k1,k2}, Groups={g1,g2}, TxnIds={t1,t2}
    • MCOCC.cfg: Keys={k1,k2}, TxnIds={t1,t2}
    • MCMVCC.cfg: Keys={k1,k2}, Vals={v1,v2}
  • The resulting set has |Keys|! × |Groups|! × |TxnIds|! = 2!×2!×2! = 8 elements — exactly the direct product S₂ × S₂ × S₂, which is a group.
  • Single-domain modules (MCHLC, MCRoutes) correctly identified as unaffected — Permutations(S) already yields a group.
  • The comment at tla/composed/MCComposed.tla:14-33 is thorough and encodes the reasoning so future contributors don't re-introduce the bug. Well done.

One edge case to confirm is already correct: the sentinel NoGroup = "no_group" is a string literal, not a TLC model value in any of the three domains, so TLC leaves it unchanged when applying the permutations. Nothing to fix here.


Composed.tla spec logic — Sound ✅

Composed-1 (Composed1_CommitToOwningGroup):

\A t \in CommittedTxns :
    \A k \in txnWriteSet[t] :
        OwnerAt(txnObservedVer[t], k) = txnCommitGroup[t]

The invariant is stated precisely: it binds the routing check to txnObservedVer[t] (the catalog snapshot the txn read at BeginTxn), not the current catalogVersion. This is the correct semantic — it captures the TOCTOU window where a route change races with an in-flight transaction.

The Commit action enforces this at tla/composed/Composed.tla:184:

\/ \A k \in txnWriteSet[t] : OwnerAt(txnObservedVer[t], k) = g

The gap config removes this guard, TLC finds the canonical 4-step counterexample, and the harness validates the exact invariant string.

Composed-3 (Composed3_DistinctCommitTs):
Holds by construction — tsCounter is strictly incremented on every Commit (line 185: tsCounter' = tsCounter + 1). The additional Composed3_TsAction PROPERTY strengthens this to an action-level claim. The two together correctly decompose the serializability bound.

routes initialization and ProposeRouteChange:
routes initializes all versions 0..MaxVersions to InitGroup (line 99). ProposeRouteChange builds version catalogVersion+1 by copying from catalogVersion and overriding key k (line 125-126). This is correct — it means all versions above the current catalogVersion hold a "future placeholder" value, and ProposeRouteChange fills them in order. The guard catalogVersion < MaxVersions ensures catalogVersion+1 is always in range.

One design note (not a bug): Commit and Abort do not increment or guard opCount. This means terminal actions are unbounded — a transaction can always finalize regardless of opCount. The canonical 4-step CE needs 3 counted ops (BeginTxn + WriteIntent + ProposeRouteChange) + 1 uncounted Commit, which fits within MaxOps = 4. This is intentional and reasonable design, but it's implicit — the spec comment at line 172 explains the role of opCount < MaxOps in Commit only for tsCounter < MaxTs, not for opCount. Not a correctness issue, just a subtle choice worth being aware of.


Configs (MCComposed.cfg / MCComposed_gap.cfg) — Well-formed ✅

  • Bounds (MaxVersions=2, MaxTs=4, MaxOps=4, 2 keys, 2 groups, 2 txns) are tight enough for fast TLC and wide enough to cover the 4-step counterexample. The PR's 1,684-state result is consistent with these bounds.
  • CHECK_DEADLOCK FALSE is correct — quiescence (all txns terminal, opCount/tsCounter at bounds) is a valid terminal state, not a bug.
  • Both configs list all 3 PROPERTIES. In the gap run TLC will fail on Composed1_CommitToOwningGroup (an INVARIANT checked at every state) before any PROPERTY could be violated — the tla-check.sh harness greps for the exact invariant string to confirm this.

scripts/tla-check.sh — Correct ✅

  • "composed" added to TLA_MODULES in dependency order (after routes).
  • gap_invariant_for and mc_basename case statements correctly handle "composed""Invariant Composed1_CommitToOwningGroup is violated" / "MCComposed".
  • The TitleCase naming for MCComposed (vs MCHLC/MCOCC/MCMVCC) is already handled by the mc_basename switch — no regex-based uppercasing to worry about.
  • The harness's double-check pattern (both exit-code inversion AND grep match for the gap config) is already in place from prior PRs; no change needed here.

tla/README.md — Accurate ✅

M5 row updated to Landed. The Composed module description, invariant table, and counterexample trace are all consistent with the spec. The README correctly notes the INSTANCE-free projection rationale.


Summary

The symmetry fix is mathematically correct, applied consistently across all affected modules, and well-documented. The Composed.tla spec accurately captures the cross-module seam between OCC and Routes. No issues found — PR looks good to merge.

@bootjp bootjp merged commit c88f669 into main May 29, 2026
10 checks passed
@bootjp bootjp deleted the feat/tla-m5-composed-spec branch May 29, 2026 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant