Skip to content

docs(spec): formula DSL extension survey — predicate gaps + decisions (closes #256)#285

Merged
TSavo merged 1 commit into
mainfrom
spec/dsl-extension-survey
May 4, 2026
Merged

docs(spec): formula DSL extension survey — predicate gaps + decisions (closes #256)#285
TSavo merged 1 commit into
mainfrom
spec/dsl-extension-survey

Conversation

@TSavo
Copy link
Copy Markdown
Owner

@TSavo TSavo commented May 4, 2026

Summary

Surveys 10 predicate gaps in the formula DSL, decides extend-pre-launch vs defer-post-launch per gap, locks the predicate floor for v1.0.0. Closes #256.

Pre-launch (4 extensions, all cheap)

Gap Constructor Why pre-launch
G1 `lt` / `lte` / `between` ~20% of stdlib has natural upper bounds; mirrors existing `gte`
G2 `member_of` Closes PHP type-juggling gap; finite-set predicate
G3 `or` / `or_n` Cross-cutting (nullable returns); pairs with G1/G2
G4 `not` Trivial; completes propositional logic

After G1-G4 the DSL covers full propositional logic over equality / range / membership predicates. No substrate-level changes (CDDL, Rust canonical, envelope shape) — just adding constructors to each kit's slab DSL.

Deferred post-launch (6 follow-up issues)

Gap Decision Per-language disclaimer impact
G5: structural typing DEFER TS baseline names gap; ~30+ ts builtins impacted
G6: effect tracking DEFER All-language disclaimers; ~15-20% impact
G7: aliasing / ownership DEFER Systems-lang disclaimers (rust / c / c++ / zig)
G8: dynamic dispatch DEFER Minor stdlib impact (Python / Ruby)
G9: temporal predicates DEFER Research-grade
G10: fixed-width arith DEFER Minor stdlib impact

Each deferred gap gets a follow-up issue and an explicit note in the relevant per-language baseline disclaimer addendum.

Implementation plan for G1-G4

  1. Spec new constructors (JCS Value tree shape) → contract memento in catalog spec
  2. Rust reference implementation
  3. Mirror across other 11 kits (parallelizable per the per-kit agent pattern)
  4. Per-kit baseline catalogs use the extended DSL; rust pilot ([kit:rust] rust-std-baseline pilot — first-mover, validates the catalog workflow #257) validates

Companion to

Three blocker specs all done. Unblocks rust pilot (#257) and the 12 per-kit baselines.

Test plan

  • Doc-only — no code change in this PR
  • G1-G4 implementation lands as separate per-kit PRs after this rubric is locked

🤖 Generated with Claude Code

…closes #256)

Surveys 10 predicate gaps in the current formula DSL, decides
extend-pre-launch vs defer-post-launch per gap, locks the predicate
floor for v1.0.0 baseline catalogs.

Pre-launch (4 extensions, all cheap propositional logic):
  - G1: lt / lte / between (numeric range bounds)
  - G2: member_of (finite-set membership; closes PHP type-juggling gap)
  - G3: or (logical disjunction)
  - G4: not (negation)

Deferred to post-launch (6 follow-up issues):
  - G5: structural typing (TS-impactful, research-grade)
  - G6: effect tracking (async / throws / IO, all-language)
  - G7: aliasing / ownership (rust / c / c++ / zig)
  - G8: dynamic dispatch (Python / Ruby; minor stdlib impact)
  - G9: temporal predicates (research-grade)
  - G10: fixed-width arithmetic (minor stdlib impact)

No substrate-level changes (CDDL / Rust canonical / envelope shape) —
G1-G4 are just adding constructors to each kit's slab DSL.

Includes implementation plan: spec → rust reference → mirror across
other 11 kits (parallelizable per the per-kit agent pattern), each
baseline disclaimer addendum names deferred shapes per-language.

Companion to #254 baseline rubric and #255 federation mechanism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 4, 2026 14:40
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Warning

Rate limit exceeded

@TSavo has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 30 minutes and 21 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 290bf043-32ce-487d-bbce-81504dd61842

📥 Commits

Reviewing files that changed from the base of the PR and between 61305d0 and 48c16df.

📒 Files selected for processing (1)
  • docs/contributing/dsl-extension-survey.md
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch spec/dsl-extension-survey

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 30 minutes and 21 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new survey doc intended to define which formula-DSL predicate gaps should be filled before v1.0.0 baseline catalogs ship, and which should be deferred. The doc is meant to guide follow-on spec/kit work, but it currently conflicts with the existing shared IR/spec surface in several important places.

Changes:

  • Adds docs/contributing/dsl-extension-survey.md as a new baseline-planning/spec survey.
  • Records ten gap decisions, splitting work into pre-launch extensions vs post-launch deferrals.
  • Proposes an implementation sequence and disclaimer-addendum template for baseline catalogs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +13 to +49
What every kit's slab can express today (per `Slab.java`, `slab.h/c`, equivalents):

| Constructor / operator | Shape | Example |
|------------------------|-------|---------|
| `forall(binder, sort, body_fn)` | universal quantification | `forall("s", SORT_STRING, s => eq(ctor("len", s), num(0)))` |
| `eq(lhs, rhs)` | equality | `eq(ctor("type_of", x), strConst("int"))` |
| `gte(lhs, rhs)` | greater-than-or-equal | `gte(ctor("len", s), num(1))` |
| `ctor(op, args...)` | function / constructor application | `ctor("blake3_512", b)` |
| `num(n)` | integer constant | `num(64)` |
| `strConst(s)` | string constant | `strConst("ed25519:")` |
| `startsWith(haystack, prefix)` | string prefix check | `startsWith(s, strConst("blake3-512:"))` |

Sorts: only `SORT_STRING` is wired through. The DSL accepts additional sorts at the type level but no kit currently quantifies over them.

Operations (`ctor` ops): a kit-extensible set. The cross-kit shared subset includes `len`, `len_bytes`, `blake3_512`, `blake3_digest`, `jcs_encode_utf8`, `cbor_encode_*`, `ed25519_sign_*`, `proof_envelope_*`, etc. Each kit can add its own ops.

The DSL is shape-bound: it expresses type signatures, length bounds, identifier-prefix patterns, equality / lower-bound comparisons of constructed values. This is enough for most stdlib predicates that real programs depend on.

## The gaps

For each gap: a description, an example builtin it blocks, the impact, and the decision (extend / defer).

### G1: Numeric range bounds (lt, lte, between)

The DSL has `gte` but no `lt` / `lte` / `between(lo, hi)`. Bounds like "result is non-negative and fits in i64" can be expressed with `gte` only on one side.

**Example builtins blocked**: `String::len` returning a value in `[0, usize::MAX]`; `Math.floor()` returning something in `[INT64_MIN, INT64_MAX]`; `array_pop()` reducing length by exactly 1 (need `lte` to bound the change).

**Impact**: ~20% of stdlib builtins across all 12 languages have a natural upper bound that's currently inexpressible.

**Decision: EXTEND pre-launch.** Cheap addition (mirrors `gte`'s implementation), broad applicability, no substrate-level changes. Add `lt`, `lte`, `between(lo, hi)` to the formula DSL across all kits.

### G2: Finite-set membership

No predicate for "x is in {a, b, c}" beyond chained `eq` with `or` (and there's no `or`). Useful for finite enums and coercion classes.

**Example builtins blocked**: PHP's falsy class — `is_falsy(x)` iff `x ∈ {false, 0, 0.0, "", "0", null, []}`. Without finite-set membership, the predicate is inexpressible.
Comment on lines +39 to +43
**Example builtins blocked**: `String::len` returning a value in `[0, usize::MAX]`; `Math.floor()` returning something in `[INT64_MIN, INT64_MAX]`; `array_pop()` reducing length by exactly 1 (need `lte` to bound the change).

**Impact**: ~20% of stdlib builtins across all 12 languages have a natural upper bound that's currently inexpressible.

**Decision: EXTEND pre-launch.** Cheap addition (mirrors `gte`'s implementation), broad applicability, no substrate-level changes. Add `lt`, `lte`, `between(lo, hi)` to the formula DSL across all kits.
Comment on lines +47 to +53
No predicate for "x is in {a, b, c}" beyond chained `eq` with `or` (and there's no `or`). Useful for finite enums and coercion classes.

**Example builtins blocked**: PHP's falsy class — `is_falsy(x)` iff `x ∈ {false, 0, 0.0, "", "0", null, []}`. Without finite-set membership, the predicate is inexpressible.

**Impact**: PHP baseline is partial without it (~15-20 builtins involve type juggling). Smaller impact on other languages.

**Decision: EXTEND pre-launch.** Cheap addition. The implementation is `member_of(value, [a, b, c, ...])` — array of constants on the right. Closes the PHP type-juggling gap at floor density.

Each is a per-kit code addition to the formula DSL. The shape across kits should match (cross-kit byte-equivalence depends on it). Suggested order:

1. **Spec the new constructors** in `protocol/specs/` — JCS Value tree shape for `lt`, `lte`, `between`, `member_of`, `or`, `or_n`, `not`. Land as a contract memento in the catalog format spec. (~1 PR)

1. **Spec the new constructors** in `protocol/specs/` — JCS Value tree shape for `lt`, `lte`, `between`, `member_of`, `or`, `or_n`, `not`. Land as a contract memento in the catalog format spec. (~1 PR)

2. **Reference impl in rust** — extend `provekit-self-contracts/src/lib.rs` slab DSL helpers. Mint a test catalog using each new constructor. Validate byte-equivalence stays clean. (~1 PR)
Comment on lines +172 to +176
## What this rubric is NOT

- It is not a full type system. The DSL captures operational predicates programs rely on; it doesn't model the full semantics of any language.
- It is not the predicate ceiling. It's the floor for v1.0.0; future DSL extensions can land per the deferred-issue follow-ups.
- It is not language-agnostic. Each language has predicate shapes only it cares about; this rubric covers the cross-cutting gaps.
@TSavo TSavo merged commit 511a7f0 into main May 4, 2026
8 checks passed
ConstCorrectness pushed a commit to ConstCorrectness/provekit that referenced this pull request May 6, 2026
… builtins (closes TSavo#257)

First-mover pilot validating the per-language foundation-baseline workflow
locked by TSavo#254 (rubric), TSavo#255 (federation), TSavo#256 (DSL survey). Adds:

- `implementations/rust/provekit-baseline-rust-std/` — orchestrator crate
  with seven slabs (string, vec, option, result, slice, hashmap, iter)
  authoring 157 ContractDecls across 58 distinct std::* builtins. Each
  builtin meets the rubric's 2-predicate floor (type signature +
  determinism); aspirational structural predicates added where the
  current DSL surface (forall/eq/gte/ctor/num/strConst) reaches.
- `src/bin/mint-rust-std-baseline.rs` — orchestrator binary signs with
  foundation v0 ed25519 seed, asserts byte-determinism, enforces
  rubric §"Compliance checklist" floors before exit.
- Disclaimer ships as a `kind=disclaimer` v1.2 layered memento member
  of the proof envelope; envelope metadata pins disclaimer_cid +
  baseline.{version,language,language_version,kit_version} per §3.
- `.provekit/baselines/rust-std-baseline-v1.proof` — the signed catalog
  (275 KB; bundle CID `blake3-512:60dc813e…14ec`, contractSetCid
  `blake3-512:76c278af…d5e4`).
- `docs/baselines/README.md` + `docs/baselines/rust.md` — per-rubric
  index + per-language addendum (steward: rust-lang team; signature
  available: no; gaps: G6 effect tracking, G7 aliasing).
- Regression tests pin contractSetCid + disclaimer_cid; substrate-load
  test asserts the verifier round-trips every member with zero load
  errors (the AC "substrate verifies the bytes" gate).

DSL surface honored: only forall/eq/gte/ctor/num/strConst per TSavo#285's
pre-launch lock. G1-G4 (lt/lte/between/member_of/or/not) land in a
follow-up PR after the cross-kit DSL extension. G5-G10 are
research-grade and explicitly deferred in the rust addendum.

Foundation v0 seed unchanged ([0x42; 32]); other kits untouched;
catalog CIDs unchanged. Workspace gains one new member.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[spec] formula DSL extension survey — what predicates can't current DSL express

2 participants