docs(spec): formula DSL extension survey — predicate gaps + decisions (closes #256)#285
Conversation
…closes #256) Surveys 10 predicate gaps in the current formula DSL, decides extend-pre-launch vs defer-post-launch per gap, locks the predicate floor for v1.0.0 baseline catalogs. Pre-launch (4 extensions, all cheap propositional logic): - G1: lt / lte / between (numeric range bounds) - G2: member_of (finite-set membership; closes PHP type-juggling gap) - G3: or (logical disjunction) - G4: not (negation) Deferred to post-launch (6 follow-up issues): - G5: structural typing (TS-impactful, research-grade) - G6: effect tracking (async / throws / IO, all-language) - G7: aliasing / ownership (rust / c / c++ / zig) - G8: dynamic dispatch (Python / Ruby; minor stdlib impact) - G9: temporal predicates (research-grade) - G10: fixed-width arithmetic (minor stdlib impact) No substrate-level changes (CDDL / Rust canonical / envelope shape) — G1-G4 are just adding constructors to each kit's slab DSL. Includes implementation plan: spec → rust reference → mirror across other 11 kits (parallelizable per the per-kit agent pattern), each baseline disclaimer addendum names deferred shapes per-language. Companion to #254 baseline rubric and #255 federation mechanism. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 30 minutes and 21 seconds.Comment |
There was a problem hiding this comment.
Pull request overview
Adds a new survey doc intended to define which formula-DSL predicate gaps should be filled before v1.0.0 baseline catalogs ship, and which should be deferred. The doc is meant to guide follow-on spec/kit work, but it currently conflicts with the existing shared IR/spec surface in several important places.
Changes:
- Adds
docs/contributing/dsl-extension-survey.mdas a new baseline-planning/spec survey. - Records ten gap decisions, splitting work into pre-launch extensions vs post-launch deferrals.
- Proposes an implementation sequence and disclaimer-addendum template for baseline catalogs.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| What every kit's slab can express today (per `Slab.java`, `slab.h/c`, equivalents): | ||
|
|
||
| | Constructor / operator | Shape | Example | | ||
| |------------------------|-------|---------| | ||
| | `forall(binder, sort, body_fn)` | universal quantification | `forall("s", SORT_STRING, s => eq(ctor("len", s), num(0)))` | | ||
| | `eq(lhs, rhs)` | equality | `eq(ctor("type_of", x), strConst("int"))` | | ||
| | `gte(lhs, rhs)` | greater-than-or-equal | `gte(ctor("len", s), num(1))` | | ||
| | `ctor(op, args...)` | function / constructor application | `ctor("blake3_512", b)` | | ||
| | `num(n)` | integer constant | `num(64)` | | ||
| | `strConst(s)` | string constant | `strConst("ed25519:")` | | ||
| | `startsWith(haystack, prefix)` | string prefix check | `startsWith(s, strConst("blake3-512:"))` | | ||
|
|
||
| Sorts: only `SORT_STRING` is wired through. The DSL accepts additional sorts at the type level but no kit currently quantifies over them. | ||
|
|
||
| Operations (`ctor` ops): a kit-extensible set. The cross-kit shared subset includes `len`, `len_bytes`, `blake3_512`, `blake3_digest`, `jcs_encode_utf8`, `cbor_encode_*`, `ed25519_sign_*`, `proof_envelope_*`, etc. Each kit can add its own ops. | ||
|
|
||
| The DSL is shape-bound: it expresses type signatures, length bounds, identifier-prefix patterns, equality / lower-bound comparisons of constructed values. This is enough for most stdlib predicates that real programs depend on. | ||
|
|
||
| ## The gaps | ||
|
|
||
| For each gap: a description, an example builtin it blocks, the impact, and the decision (extend / defer). | ||
|
|
||
| ### G1: Numeric range bounds (lt, lte, between) | ||
|
|
||
| The DSL has `gte` but no `lt` / `lte` / `between(lo, hi)`. Bounds like "result is non-negative and fits in i64" can be expressed with `gte` only on one side. | ||
|
|
||
| **Example builtins blocked**: `String::len` returning a value in `[0, usize::MAX]`; `Math.floor()` returning something in `[INT64_MIN, INT64_MAX]`; `array_pop()` reducing length by exactly 1 (need `lte` to bound the change). | ||
|
|
||
| **Impact**: ~20% of stdlib builtins across all 12 languages have a natural upper bound that's currently inexpressible. | ||
|
|
||
| **Decision: EXTEND pre-launch.** Cheap addition (mirrors `gte`'s implementation), broad applicability, no substrate-level changes. Add `lt`, `lte`, `between(lo, hi)` to the formula DSL across all kits. | ||
|
|
||
| ### G2: Finite-set membership | ||
|
|
||
| No predicate for "x is in {a, b, c}" beyond chained `eq` with `or` (and there's no `or`). Useful for finite enums and coercion classes. | ||
|
|
||
| **Example builtins blocked**: PHP's falsy class — `is_falsy(x)` iff `x ∈ {false, 0, 0.0, "", "0", null, []}`. Without finite-set membership, the predicate is inexpressible. |
| **Example builtins blocked**: `String::len` returning a value in `[0, usize::MAX]`; `Math.floor()` returning something in `[INT64_MIN, INT64_MAX]`; `array_pop()` reducing length by exactly 1 (need `lte` to bound the change). | ||
|
|
||
| **Impact**: ~20% of stdlib builtins across all 12 languages have a natural upper bound that's currently inexpressible. | ||
|
|
||
| **Decision: EXTEND pre-launch.** Cheap addition (mirrors `gte`'s implementation), broad applicability, no substrate-level changes. Add `lt`, `lte`, `between(lo, hi)` to the formula DSL across all kits. |
| No predicate for "x is in {a, b, c}" beyond chained `eq` with `or` (and there's no `or`). Useful for finite enums and coercion classes. | ||
|
|
||
| **Example builtins blocked**: PHP's falsy class — `is_falsy(x)` iff `x ∈ {false, 0, 0.0, "", "0", null, []}`. Without finite-set membership, the predicate is inexpressible. | ||
|
|
||
| **Impact**: PHP baseline is partial without it (~15-20 builtins involve type juggling). Smaller impact on other languages. | ||
|
|
||
| **Decision: EXTEND pre-launch.** Cheap addition. The implementation is `member_of(value, [a, b, c, ...])` — array of constants on the right. Closes the PHP type-juggling gap at floor density. |
|
|
||
| Each is a per-kit code addition to the formula DSL. The shape across kits should match (cross-kit byte-equivalence depends on it). Suggested order: | ||
|
|
||
| 1. **Spec the new constructors** in `protocol/specs/` — JCS Value tree shape for `lt`, `lte`, `between`, `member_of`, `or`, `or_n`, `not`. Land as a contract memento in the catalog format spec. (~1 PR) |
|
|
||
| 1. **Spec the new constructors** in `protocol/specs/` — JCS Value tree shape for `lt`, `lte`, `between`, `member_of`, `or`, `or_n`, `not`. Land as a contract memento in the catalog format spec. (~1 PR) | ||
|
|
||
| 2. **Reference impl in rust** — extend `provekit-self-contracts/src/lib.rs` slab DSL helpers. Mint a test catalog using each new constructor. Validate byte-equivalence stays clean. (~1 PR) |
| ## What this rubric is NOT | ||
|
|
||
| - It is not a full type system. The DSL captures operational predicates programs rely on; it doesn't model the full semantics of any language. | ||
| - It is not the predicate ceiling. It's the floor for v1.0.0; future DSL extensions can land per the deferred-issue follow-ups. | ||
| - It is not language-agnostic. Each language has predicate shapes only it cares about; this rubric covers the cross-cutting gaps. |
… builtins (closes TSavo#257) First-mover pilot validating the per-language foundation-baseline workflow locked by TSavo#254 (rubric), TSavo#255 (federation), TSavo#256 (DSL survey). Adds: - `implementations/rust/provekit-baseline-rust-std/` — orchestrator crate with seven slabs (string, vec, option, result, slice, hashmap, iter) authoring 157 ContractDecls across 58 distinct std::* builtins. Each builtin meets the rubric's 2-predicate floor (type signature + determinism); aspirational structural predicates added where the current DSL surface (forall/eq/gte/ctor/num/strConst) reaches. - `src/bin/mint-rust-std-baseline.rs` — orchestrator binary signs with foundation v0 ed25519 seed, asserts byte-determinism, enforces rubric §"Compliance checklist" floors before exit. - Disclaimer ships as a `kind=disclaimer` v1.2 layered memento member of the proof envelope; envelope metadata pins disclaimer_cid + baseline.{version,language,language_version,kit_version} per §3. - `.provekit/baselines/rust-std-baseline-v1.proof` — the signed catalog (275 KB; bundle CID `blake3-512:60dc813e…14ec`, contractSetCid `blake3-512:76c278af…d5e4`). - `docs/baselines/README.md` + `docs/baselines/rust.md` — per-rubric index + per-language addendum (steward: rust-lang team; signature available: no; gaps: G6 effect tracking, G7 aliasing). - Regression tests pin contractSetCid + disclaimer_cid; substrate-load test asserts the verifier round-trips every member with zero load errors (the AC "substrate verifies the bytes" gate). DSL surface honored: only forall/eq/gte/ctor/num/strConst per TSavo#285's pre-launch lock. G1-G4 (lt/lte/between/member_of/or/not) land in a follow-up PR after the cross-kit DSL extension. G5-G10 are research-grade and explicitly deferred in the rust addendum. Foundation v0 seed unchanged ([0x42; 32]); other kits untouched; catalog CIDs unchanged. Workspace gains one new member. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Surveys 10 predicate gaps in the formula DSL, decides extend-pre-launch vs defer-post-launch per gap, locks the predicate floor for v1.0.0. Closes #256.
Pre-launch (4 extensions, all cheap)
After G1-G4 the DSL covers full propositional logic over equality / range / membership predicates. No substrate-level changes (CDDL, Rust canonical, envelope shape) — just adding constructors to each kit's slab DSL.
Deferred post-launch (6 follow-up issues)
Each deferred gap gets a follow-up issue and an explicit note in the relevant per-language baseline disclaimer addendum.
Implementation plan for G1-G4
Companion to
Three blocker specs all done. Unblocks rust pilot (#257) and the 12 per-kit baselines.
Test plan
🤖 Generated with Claude Code