test(core): add T2 stable id spec for parse-to-hf id contract (before R1)#1245
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
731bc78 to
c4540f8
Compare
0c2a999 to
ee015a5
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
T2 — clean spec/baseline split, [spec] prefix convention is clear. The "adding an element before existing ones does not change existing ids" test is the strongest one here and the most important contract to lock in before R1.
Two things to keep in mind for R1 implementation:
- ID length:
^hf-[a-z0-9]{4}$= ~1.68M combinations. Should be fine for typical compositions (<100 elements), but if brand-kit compositions can get large, consider 6 chars (^hf-[a-z0-9]{6}$= ~2.1B) for safety. - Same duplicate
maxEndTime/serializehelper as #1240 — extract topackages/core/src/parsers/test-utils.tsor similar before R1 adds more test files.
✅ Approve.
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Hey Vance — strongest test design in the stack. The red/green split is clean, the failure-mode comments are loaded, and the counter-shift test (#3) targets exactly the bug R1 is supposed to fix. No blockers. A few sharpening notes.
What I verified
Walked each test against "what could R1 silently break that this catches first":
Red (spec — intentionally failing until R1):
elements without an id get an hf- prefixed id at parse— pins R1's main contract. Load-bearing.generated hf- ids match /^hf-[a-z0-9]{4}$/— pins the format. Important because if R1 landshf-ids but uses 6 chars or includes uppercase, the spec test catches the format drift before downstream code (CSS selectors, SDK targeting) breaks on the assumption.prepending a new element does not change existing elements' ids(counter-shift bug) — the sharpest test in this PR. Today'selement-1/element-2counter-based scheme would fail this; R1's content-derived scheme should pass. The inline comment naming both behaviors (element-1→element-2= FAILS,hf-xxxx= PASSES) is excellent — it tells the next reader exactly what's being asserted and why.
Green (baseline — already pass, must not regress):
- Existing id preserved — pins R1 doesn't auto-rename when an id is already there. Critical for backwards-compat with hand-written compositions.
- Determinism (same input → same ids) — pins R1's id derivation is content-aware, not random. (See Concerns.)
- Uniqueness within document — pins the obvious safety property.
- Survives serialize → re-parse round-trip — pins ids are emitted in a way the parser re-reads identically. Without this, the round-trip suite in T1 (#1240) wouldn't be valid.
The .todo for sub-composition scoping (compositionId/hf-x) is correctly punted — it depends on the SDK session API.
Concerns
- The
^hf-[a-z0-9]{4}$format gives 36⁴ ≈ 1.68M IDs. For document-local uniqueness with typical composition sizes (tens of elements), collision probability is negligible if the hash is content-aware. But for the determinism test to pass and uniqueness to hold across, say, two identical empty<div>s, the hashing input has to include position or a sibling counter. Today's^hf-[a-z0-9]{4}$regex doesn't pin the input shape — it only pins the output shape. Worth flagging: if R1 uses a content+index hash, the spec is satisfied. If R1 uses pure content hash, two identical elements collide. The test currently doesn't differentiate; consider an assertion liketwo identical <div data-name="X"> elements get distinct hf- ids. - The cross-machine determinism question. Test 5 ("same input → same ids on re-parse") is in-process and would pass either with a content-hash or with random+memoization. If R1 uses
crypto.randomUUID+ a per-parse cache, the test passes locally but ids differ across machines / processes — which would silently break SDK ops that reference an id from one render in another. Worth one test asserting "running parseHtml in two completely fresh contexts produces the same ids" — e.g. via avi.resetModules()between calls, or by calling a fresh import.
Nits
- Test 2's filter
elements.filter((e) => !e.id.includes("stage"))— exists to skip the stage div itself. Could be more robust: use.filter((e) => e.id !== "stage")(exact match) so a future element withid="stage-overlay"doesn't get filtered out. Cheap. (nit) - The 4-char
[a-z0-9]{4}regex is tight. If R1 ever needs to widen (e.g. to 6 chars after a collision incident), every existing test asserting this regex needs updating. Consider^hf-[a-z0-9]{4,8}$to give R1 some room, or document explicitly that 4 chars is a hard contract. (nit) [spec]prefix in test names is great. Worth carrying that convention into the other 5 PRs in the stack for consistentvitest --grep '\[spec\]'filtering during R1 review. (nit, applies stack-wide)- No test for elements inside
<template>blocks. Sub-compositions live in templates; if R1's id assignment runs on template content, the green tests pass on flat documents but R1's id stability on nested compositions is uncovered. Out of scope (the.todomentions sub-comp scoping), but worth a note. (nit) - The
withPrependHTML in test 3 has a subtle whitespace difference frombase(the new element + indentation). Should not matter for the test (data-name lookup is exact), but if R1's hash includes any whitespace-sensitive content, the existing AlphaEl element would also shift. The current assertion catches this; just flagging that the test's signal depends on R1 normalizing whitespace before hashing. (nit)
Questions
- The 4-char id space (
[a-z0-9]{4}) — collision probability is ~50% at ~1300 elements per document (birthday bound). For typical compositions that's fine; for projects with hundreds of layers across many templates, collisions become non-trivial. Is there a plan for collision detection (re-hash on collision) or longer ids beyond a threshold? Worth pinning now even if not yet enforced. - "Stable across SDK ops" — if a
moveop produces a patch that adds a new element, does the new element get anhf-id that's stable on subsequent parses? The current tests cover parse-time and round-trip-time stability; mid-edit stability isn't covered. Probably T4 / R5 territory but worth confirming the boundary. - Header note says "tests 1, 2, 3" red and "tests 4, 5, 6, 7" green — would
// red — spec/// green — baselinecomments per test help vs the top-of-file numbering, which goes stale if anyone reorders? My read: the[spec]prefix already does this load — just keep using it. (already partially answered above)
What I didn't verify
- The HeyGenverse plan doc (HTTP 403 unauthenticated). Trusted the PR body's T2 framing and the R1 description.
- Current
parseHtmlbehavior on un-id'd elements — assumed it assignselement-Ncounter ids per the comment in test 3. - Whether the SDK's id contract requires
hf-ids globally or only at the studio-editor boundary. The tests asserthf-at parse, but if SDK ops can reference non-hf-ids (e.g. user-setid="my-anchor"), the targeting logic needs to accept both. - The hashing algorithm R1 plans to use (content-based, content+index, content+sibling-position) — see Concerns above.
— Review by Rames D Jusso
c4540f8 to
c317301
Compare
ee015a5 to
6d3df7d
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Building on Rames's review.
Rames's two identical elements collision concern is the one I'd flag as a blocker risk for R1: if two <div data-name="X"> elements appear in the same composition and R1 uses a pure content hash (no position component), they'd get the same id, failing the uniqueness contract. The existing uniqueness test uses elements with distinct data-name values (A, B, C) — it doesn't cover two identical elements. Add one test: two elements with identical markup get distinct hf- ids. If R1's hashing includes index/position, this passes trivially; if not, it's a P1 catch.
Cross-process determinism: the determinism test (test 5) runs two parseHtml calls in the same process, same module, same import. That passes with randomUUID + memo as easily as with a content hash. A cross-context assertion (via vi.resetModules() between calls, or a Worker boundary) would make this test actually pin the property Rames described (content-derived, not process-state-derived). Worth the effort given how load-bearing stable ids are for SDK targeting.
The !e.id.includes("stage") filter nit from Rames: straightforward change to e.id !== "stage". Make it before R1 since R1 will add hf- id elements that could plausibly include the string "stage" in a name.
c317301 to
eaba44c
Compare
6d3df7d to
85f4622
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Follow-up: both concerns addressed.
- ✅
!e.id.includes("stage")→e.id !== "stage"(exact match, no false positives) - ✅ New test: "two elements with identical markup get distinct ids" — pins that R1's hashing must include position/sibling-counter, not just content. This was the P1 collision risk.
✅ Re-approve.
|
Addressed in latest push:
|
eaba44c to
d5e1cac
Compare
85f4622 to
3a2e237
Compare
d5e1cac to
e1260cc
Compare
3a2e237 to
990da47
Compare
94fb3cd to
afa1ab4
Compare
990da47 to
0d2e471
Compare

T2 — Stable id spec (before R1)
Spec tests defining the
hf-id contract that R1 must satisfy. Written before the implementation so the refactor is test-driven and any regression is immediately visible.What this gates
R1 switches element targeting from CSS selectors (
#el-aaa) to stablehf-ids (hf-a1b2). Without stable ids, two sequential ops to the same element can silently target different elements if the document changes between them.Test split
Red (spec — intentionally failing until R1):
hf-prefixed id at parse/^hf-[a-z0-9]{4}$/Green (baseline — must not regress through R1):
File
packages/core/src/parsers/stableIds.test.ts(new)🤖 Generated with Claude Code