RFC: ADR 0011 — interaction guarantee contract (path × guarantee matrix)#1080
Size Report
Startup median (7 runs, lower is better):
Top changed chunks: no changes in the largest emitted chunks.
…and gate
Design for making interaction guarantees hold across every dispatch path (runtime selector/ref, direct iOS selector, native ref, coordinate, maestro fallback) instead of eroding at path boundaries one incident at a time: every interaction bug this week was a (path, guarantee) cell nobody was watching.
Three layers (ADR 0011): declare the path × guarantee matrix as a typed registry whose completeness is a compile error; share one implementation per rule on both sides of the wire, with golden fixture tables proving TS/Swift parity; prove every non-waived cell with contract scenarios generated from the registry.
This lands Layer 1: the registry with an HONEST initial classification, in which ten cells are acknowledged gap waivers (direct-path disambiguation/occlusion/nonHittable/responseFields/errorTaxonomy, native-ref guards, coordinate bounds), plus the gate test that keeps entries truthful: referenced TS symbols must be exported, runner symbols must exist in the Swift sources, delegations must land on paths that actually enforce the guarantee, and the gap list is pinned so it can only change explicitly in a reviewed diff.
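A minimal sketch of how a registry like this can make completeness a compile error, assuming a `Record` keyed over string-literal unions. The path names, guarantee names, symbol names (`checkOcclusion`, `rankCandidates`), the `listGaps` helper, and the `gap:` reason prefix convention are all illustrative assumptions; the real definitions live in `src/contracts/interaction-guarantees.ts` and may differ.

```typescript
// Hypothetical, cut-down unions; the real registry covers more paths and guarantees.
type DispatchPath = "runtime-selector" | "direct-ios-selector" | "native-ref";
type Guarantee = "occlusion" | "disambiguation";

// Each cell must declare exactly one stance.
type Cell =
  | { kind: "runtime"; via: string }                       // shared TS symbol
  | { kind: "runner"; via: string; parityTable?: string }  // Swift symbol
  | { kind: "delegated"; to: DispatchPath }
  | { kind: "inapplicable" }
  | { kind: "waived"; reason: string };

type Registry = Record<DispatchPath, Record<Guarantee, Cell>>;

// Omitting any path or any guarantee below fails type-checking,
// because Record requires every key of both unions.
const registry: Registry = {
  "runtime-selector": {
    occlusion: { kind: "runtime", via: "checkOcclusion" },
    disambiguation: { kind: "runtime", via: "rankCandidates" },
  },
  "direct-ios-selector": {
    occlusion: { kind: "waived", reason: "gap: no occlusion check on the direct path" },
    disambiguation: { kind: "delegated", to: "runtime-selector" },
  },
  "native-ref": {
    occlusion: { kind: "waived", reason: "gap: native-ref guards not implemented" },
    disambiguation: { kind: "inapplicable" },
  },
};

// Collect acknowledged gaps as "path/guarantee" ids, sorted for stable comparison.
function listGaps(reg: Registry): string[] {
  const gaps: string[] = [];
  for (const path of Object.keys(reg) as DispatchPath[]) {
    for (const guarantee of Object.keys(reg[path]) as Guarantee[]) {
      const cell = reg[path][guarantee];
      if (cell.kind === "waived" && cell.reason.startsWith("gap:")) {
        gaps.push(`${path}/${guarantee}`);
      }
    }
  }
  return gaps.sort();
}
```

Adding a new member to either union would then surface every missing cell as a type error rather than as a later incident.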
4bb8fda to a99a303
- Frame Layer 1 as an honesty/completeness gate, not a truth gate: it proves every path declared a stance and referenced symbols exist; behavioral parity starts with the Layer-2/3 fixture and scenario work.
- Split responseFields into responseConstruction (one shared response construction site, a single Layer-2 refactor) and responseIdentity (which identity fields a path can provide, per-path capability work); note the anticipated errorTaxonomy split (codes vs diagnostics).
- Encode the hybrid gap-closure strategy: runner-side parity for geometry-local rules, delegation-on-error for semantic failures (with the explicit caveat that delegation-on-error is NOT success-path parity), and a shared runtime preflight for native-ref where a silent backend success means delegation never triggers.
- Gap waivers now require a trackingIssue (gate-enforced URL); all 16 pinned gaps link the umbrella issue #1081. The honest reclassification grew the pin list from 10 to 16: responseConstruction is a gap on every path including runtime ones, which is exactly the partial progress the coarser guarantee was hiding.
- Align ADR wording with the code: parityTable is optional until Layer 3, required once a runner cell claims parity.
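The pinned gap list and the gate-enforced trackingIssue could be checked roughly as follows. This is a sketch under stated assumptions: the `GapWaiver` shape, the `checkGaps` and `isIssueUrl` helpers, the URL pattern, and the `example.com` URLs are hypothetical and not taken from the actual gate test.

```typescript
// Hypothetical waiver shape; only `trackingIssue` is named in the commit message.
interface GapWaiver {
  cell: string;          // e.g. "direct-ios-selector/occlusion"
  reason: string;
  trackingIssue: string; // must be an issue URL
}

// Assumed URL rule: https, some host, and a path ending in /issues/<number>.
function isIssueUrl(value: string): boolean {
  return /^https:\/\/[^\s\/]+\/\S+\/issues\/\d+$/.test(value);
}

// Returns a list of violations; an empty list means the gate passes.
function checkGaps(waivers: GapWaiver[], pinned: string[]): string[] {
  const errors: string[] = [];
  const actual = waivers.map((w) => w.cell).sort();
  const expected = [...pinned].sort();
  // The pin must match exactly, so gaps can only change by editing the pin in a reviewed diff.
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    errors.push(`gap list drifted from pin: got [${actual.join(", ")}], pinned [${expected.join(", ")}]`);
  }
  for (const w of waivers) {
    if (!isIssueUrl(w.trackingIssue)) {
      errors.push(`${w.cell}: trackingIssue is not an issue URL`);
    }
  }
  return errors;
}
```

Comparing against an exact pin (rather than a maximum count) means that closing a gap also requires touching the pin, so progress is recorded in the same diff that makes it.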
Design review applied in e43ce24 (review given out-of-band; recording the disposition here):
Marked ready for review. Status stays |
Review pass after latest changes: the revised framing is much stronger, and the Layer-1 gate passes locally on I found two things I would address before treating the registry as review-ready:
No other blockers from the ADR/registry/test pass. The issue-linking and responseConstruction/responseIdentity split addressed the earlier design concerns well.
…d-scoped verify
1. maestro-non-hittable-fallback/disambiguation was overclaimed: the guarantee is defined as visible-first/deepest/smallest ranking, but findElement only implements unique-or-ambiguous scanning. Reclassified as an intentional waiver (deliberate Maestro-semantics divergence), mirroring how the direct path keeps its success-path parity gap.
2. verifyEvidence was claimed path-wide on paths that dispatch longpress, which has no --verify. Cells can now be command-scoped via appliesTo (non-empty strict subset of the path's commands, gate-enforced), and the three affected cells scope to press/click/fill.
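The appliesTo rule in finding 2 (non-empty, strict subset of the path's commands) can be sketched as a small validator. The function name `validateAppliesTo`, its signature, and its error strings are assumptions for illustration, not the gate's actual code.

```typescript
// Returns null when the scoping is valid, otherwise a human-readable violation.
function validateAppliesTo(
  pathCommands: readonly string[],
  appliesTo: readonly string[],
): string | null {
  if (appliesTo.length === 0) {
    return "appliesTo must not be empty";
  }
  const known = new Set(pathCommands);
  const unknown = appliesTo.find((command) => !known.has(command));
  if (unknown !== undefined) {
    return `appliesTo names a command the path does not dispatch: ${unknown}`;
  }
  // Every scoped command is known, so equal sizes mean the scope covers the whole path.
  // A strict subset is required; a path-wide cell should simply omit appliesTo.
  if (new Set(appliesTo).size === known.size) {
    return "appliesTo must be a strict subset of the path's commands";
  }
  return null;
}
```

For a path that dispatches press, click, fill and longpress, scoping a verifyEvidence cell to press/click/fill passes, while listing all four commands is rejected as a disguised path-wide claim.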
Both findings addressed in the latest commit:
Gate is now 8 tests; suites/fallow/typecheck green.
Re-review on latest head
Validation:
Stacked on #1075; #1075 has since merged, so this branch is now a single commit on main (ADR + registry + gate only). The gate test passing post-rebase confirms every symbol the registry references survived the squash-merge.
Why
Every interaction bug this week was the same shape: a guarantee enforced on one dispatch path and silently absent on a sibling. Offscreen refusal on `@ref` but not selectors and not the runner's native tap (#1075); `--verify` evidence on `press @ref` but dropped by `fill @ref`'s hand-built response (#1064); adb classification on thrown errors but not `allowFailure` results (#1067); wait budgets honored for snapshot/replay envelopes but not `wait` itself, which additionally sat on the daemon-reset timeout path (#1075). Bandaids fix cells. Nothing watches the matrix.
What
ADR 0011 (in this PR) proposes three layers; this PR lands Layer 1 so the design is falsifiable rather than aspirational:
- `src/contracts/interaction-guarantees.ts`: 6 dispatch paths × 7 guarantees, every cell classified as `runtime` (points at the shared TS symbol), `runner` (Swift symbol + future parity table), `delegated` (e.g. direct path → runtime on `--verify` and on `ELEMENT_OFFSCREEN`), `inapplicable`, or `waived` (reason). Completeness is a compile error (`Record` over the guarantee union).
- `interaction-guarantees.test.ts`: `via` symbols must actually be exported (TS) or present (Swift sources); delegations must land on a path that enforces the guarantee; waivers need substantive reasons; and the `gap:` list is pinned, so it can only change in a reviewed diff that edits the pin. The initial classification is honest: 10 acknowledged gaps, mostly on the direct-iOS and native-ref fast paths (no occlusion check, no nonHittable promotion/annotation, missing refLabel/selectorChain, error shapes without hints).
- … `buildInteractionResponseData` construction site + hand-rolled-literal guard; golden JSON fixture tables consumed by vitest AND the runner's Swift unit tests AND the provider harness's fake runner (cross-language parity without simulators); a contract scenario suite whose coverage of non-waived cells is generated from the registry; and descriptor-declared timeout policy replacing the two hand-maintained client lists.
The pattern is deliberately the one this repo already trusts: the integration-progress flag gate (which caught #1064's unclassified `--verify`) and the ADR-0009 apple-leak guard. Make the gap declare itself.
Ask
Design review of the ADR — especially: (a) the guarantee list (right granularity?), (b) whether the direct-path gaps should be closed by delegation-on-error (cheap, one more round trip on failure only) vs Swift parity (fast, more machinery), (c) whether the pinned gap list should require linked issues from day one.
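To make option (b) concrete, here is a rough sketch of delegation-on-error, assuming a synchronous result shape. `tapWithDelegation`, `TapResult`, and the callback signatures are hypothetical names for illustration; only the `ELEMENT_OFFSCREEN` code comes from the description above.

```typescript
// Hypothetical result shape for an interaction attempt.
type TapResult =
  | { ok: true; path: string }
  | { ok: false; code: string };

// Try the fast direct path first; pay the extra round trip only when the
// direct path fails with a code the runtime path knows how to handle.
function tapWithDelegation(
  direct: () => TapResult,
  runtime: () => TapResult,
  delegateOn: readonly string[],
): TapResult {
  const first = direct();
  if (first.ok || !delegateOn.includes(first.code)) {
    // Note: a direct success returns here without any runtime-side checks.
    return first;
  }
  return runtime();
}
```

The early return on success is the cost of the cheap option: guarantees that only the runtime path enforces are never consulted when the direct path succeeds, which is why delegation-on-error does not give success-path parity.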