Skip to content

fix(bundler): hoist duplicate titled enums to $defs in bundled schemas#3170

Merged
bokelley merged 6 commits into
mainfrom
claude/issue-3145-hoist-duplicate-enums
Apr 27, 2026
Merged

fix(bundler): hoist duplicate titled enums to $defs in bundled schemas#3170
bokelley merged 6 commits into
mainfrom
claude/issue-3145-hoist-duplicate-enums

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Refs #3145

Adds hoistDuplicateInlineEnums() post-processing step in scripts/build-schemas.cjs. After resolveRefs inlines every $ref, the same pure-enum schema can appear at multiple paths in the bundled output — json-schema-to-typescript sees two structurally identical inline shapes and emits AgeVerificationMethod + AgeVerificationMethod1, making the generated type look like a versioned duplicate that doesn't exist.

What the fix does: Two-pass algorithm — (1) walk the bundled schema collecting titled pure-enum schemas (type === 'string', Array.isArray(enum), keys.length <= 4) that appear at 2+ distinct non-$defs locations; (2) hoist each to root $defs and replace every inline occurrence with a $ref pointer. Untitled enums are left inline — no meaningful PascalCase name can be derived, and InlineEnumN is worse than the status quo.

Scope (per @bokelley's decision in #3145): Pure-enum schemas only. Complex object schemas (BriefAsset1, VASTAsset1, DAASTAsset1, CatalogAsset1) are intentionally out of scope — those type lineages are sometimes intentionally distinct even when fields align; hoisting them globally creates cross-tool coupling the source schemas don't express. An RFC with an x-hoist: true opt-in marker is the right path for those.

Non-breaking justification: The bundled schemas are generated artifacts, not wire format. No AdCP message payload contains the bundled schema blob. Changing inline enum shapes to $ref pointers in the bundled output is a type-generation concern only — validators produce identical accept/reject decisions either way.

$defs in draft-07 schemas: The bundled schemas declare "$schema": "http://json-schema.org/draft-07/schema#" while the hoist writes to $defs (a 2019-09 keyword). This is pre-existing behavior established by hoistNestedDefsToRoot (issue #2648) — the repo's Ajv instance runs strict: false and json-schema-to-typescript v10+ resolves $defs regardless of $schema. This diff does not introduce a new violation class.

Result (verified against latest bundled output):

  • media-buy/update-media-buy-request.json: AgeVerificationMethod inline x2 → $defs entry + 2× $ref
  • core/tasks-get-response.json: same fix
  • All other bundled schemas: no change (enums appear once → left inline)

SDK note: Consumers who already generated types from a bundled snapshot that emitted AgeVerificationMethod1 will need the alias re-export (AgeVerificationMethod as AgeVerificationMethod1) in adcp-client for one minor version. That work lives in adcp-client#942, not in this diff.


Pre-PR review:

  • code-reviewer: approved after two blocker fixes — (1) collect array branch now checks array elements directly (symmetry with replace branch); (2) fingerprint uses order-preserving enum array (sorting would silently merge semantically distinct enums sharing values); (3) empty-title guard added. One nit (pre-populate rootDefs before replace pass) incorporated.
  • ad-tech-protocol-expert: approved — non-breaking per JSON Schema spec; $defs-in-draft-07 pattern is pre-existing and tolerated by Ajv strict: false and json-schema-to-typescript v10+; fingerprint no-sort is correct; hoistNestedDefsToRoothoistDuplicateInlineEnums ordering is sound.

Changeset: --empty (no protocol bump — bundler script change, schema wire format and validation semantics unchanged).

Session: https://claude.ai/code/session_01SfY8L5D1NJJd6oiZqr74KS


Generated by Claude Code

Adds hoistDuplicateInlineEnums() post-processing step in
scripts/build-schemas.cjs. After resolveRefs inlines every \$ref,
the same pure-enum schema can appear at multiple paths in the bundled
output. json-schema-to-typescript sees two structurally identical inline
shapes and emits Foo + Foo1, creating the AgeVerificationMethod1
numbered-suffix codegen artifact.

The fix: two-pass algorithm — collect titled pure-enum schemas
(type === 'string', Array.isArray(enum), keys <= 4) appearing 2+ times,
then replace every occurrence with \$ref to a root \$defs entry. Untitled
enums are left inline (no meaningful name to derive). Complex objects
deferred to RFC on x-hoist markers. See #3145.

https://claude.ai/code/session_01SfY8L5D1NJJd6oiZqr74KS
@bokelley
Copy link
Copy Markdown
Contributor Author

Independent post-PR review (code-reviewer + ad-tech-protocol-expert)

Two reviewers from the SDK side took a fresh pass. Both converge on the same blocker: $defs should be definitions. Plus two adjacent issues worth fixing before merge.

Required before merge

1. Keyword: $defsdefinitions (1-line-style fix, 3 sites in scripts/build-schemas.cjs)

Bundled schemas declare "$schema": "http://json-schema.org/draft-07/schema#". $defs is a draft 2019-09 keyword — strictly draft-07-compliant validators (Python jsonschema Draft7Validator, Go santhosh-tekuri/jsonschema v3 in draft-07 mode, Ajv with strict: true) treat it as an unknown keyword and silently ignore it, so #/$defs/AgeVerificationMethod fails to resolve and validation crashes. Existing bundled output already uses definitions (verified at schemas/cache/3.0.0/bundled/media-buy/list-creative-formats-response.json:4038) — the new function should match.

Three replacements in the new function:

  • schema.\$defs = rootDefsschema.definitions = rootDefs
  • \#/$defs/${...}``#/definitions/${...}`` (both replace-pass occurrences)
  • Object.keys(schema.\$defs || {})Object.keys(schema.definitions || {}) (collision seed)

The key === '\$defs' || key === 'definitions' filters during collect/replace are correct as-is and should stay.

2. Changeset frontmatter is empty

```


```

@changesets/cli treats this as a no-op — no version calculation, no publish. Needs "adcontextprotocol": patch (or the right package name + bump) inside the frontmatter, otherwise the fix doesn't actually ship.

3. Add at least one regression test

Bundler is critical infrastructure with zero test coverage on this new pass. ~30 lines of vitest covering: titled enum referenced 3× → assert definitions.Foo + three $refs; untitled-stays-inline; single-occurrence-stays-inline; name-collision-with-existing-def. Prevents the next silent regression.

Should fix or justify

4. Fingerprint loses per-site description silently. JSON.stringify({type, enum}) ignores description, enumDescriptions, default, title. Two enums with identical values but per-site description collapse to first-wins. Pick one: include description in the fingerprint (more Foo2 names but no doc loss); or explicitly null out site-level description before storing in definitions (makes the drop intentional and visible).

5. Title collisions silently produce Foo2. Two genuinely distinct enums sharing a title (e.g., Status for Plan vs Job, distinct enum arrays) both get hoisted, second becomes Status2. That's the same numbered-suffix shape this PR exists to fix. At minimum console.warn so the author can rename one upstream.

Confirmed safe

  • Cycles: not a concern. resolveRefs produces an acyclic inlined tree (cycles stay as $ref).
  • oneOf/anyOf/allOf coverage: reached via the generic object branch → array branch → each member tested. Correct.
  • Array-of-enums + same-enum-in-property-and-items: both reached, fingerprint matches, both replaced.
  • Idempotency: re-running over already-hoisted bundle is a no-op (isPureEnum rejects $ref shapes; $defs/definitions blocks excluded during collect).
  • Storyboards / compliance harness: don't pattern-match inline enum arrays — consume via Ajv-compiled validators.
  • Cross-SDK codegen (datamodel-code-generator, quicktype): net-positive — both prefer named refs over inline duplicates.

Versioning

--empty (after the changeset frontmatter is fixed) is correct — bundled schemas are SDK-internal artifacts, not the public published-schema surface. No normative wire change.


Triggered by independent SDK-side review (ad-tech-protocol-expert + code-reviewer) — both ran on the full diff, not just the description.

bokelley and others added 3 commits April 26, 2026 16:38
…t collisions

Code review on this PR flagged two correctness gaps in hoistDuplicateInlineEnums:

1. The `keys.length <= 4` cap excluded enums carrying common annotations
   (`title`, `description`, `default`, `examples`, `const`, `deprecated`),
   leaving Foo/Foo1 duplicates intact for those. Drop the cap — the existing
   exclude list already filters out object/array/composition shapes.

2. Fingerprint of `{type, enum}` collapsed two enums sharing values but
   differing in `title`, with the first-seen schema winning. That would
   silently rename one of them to the other's def name. Include `title`
   in the fingerprint so distinct-titled enums stay distinct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley marked this pull request as ready for review April 26, 2026 21:22
@bokelley
Copy link
Copy Markdown
Contributor Author

Code-review pass covered correctness; two findings applied in 9aff56b:

  • Dropped keys.length <= 4 cap in isPureEnum (was excluding enums with title + description + default + examples etc., leaving Foo/Foo1 duplicates intact)
  • Added title to the fingerprint so two enums sharing values but differing in title stay distinct (was silently renaming one to the other's def)

Reviewer also flagged: no unit tests for hoistDuplicateInlineEnums covering the four branches (≥2-occurrence hoist, untitled skip, name-collision suffix, single-occurrence inline). Defensible to add in a follow-up — happy to take feedback either way.

Ready for review.

bokelley and others added 2 commits April 26, 2026 18:12
Adds 11 unit tests covering the four reviewer-flagged branches and the two
correctness fixes applied during review:

  Branches:
   - ≥2-occurrence titled enum is hoisted and references replaced
   - Untitled enum left inline (no Foo1 risk for unnamed types)
   - Single-occurrence enum left inline (no churn)
   - Name collisions with existing $defs get suffixed

  Correctness fixes:
   - isPureEnum no longer caps on key count (default/examples/deprecated/const
     all preserved through hoisting)
   - Fingerprint includes title (two enums with identical values but
     different titles stay distinct, no silent rename)

Plus three structural tests: enum-value order preservation, $defs/array
walk symmetry, and composition-keyword exclusion.

Required exporting hoistDuplicateInlineEnums from build-schemas.cjs and
gating main() behind require.main === module so tests can import the
function without triggering a full build run. Wired into the main test
chain via npm run test:build-schemas-hoist-enums.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Copy link
Copy Markdown
Contributor Author

Addressed reviewer ask for unit tests in cd9d5b1 — 11 tests in tests/build-schemas-hoist-enums.test.cjs covering:

Reviewer-flagged branches:

  • ≥2-occurrence titled enum is hoisted and references replaced
  • Untitled enum left inline (no Foo1 risk for unnamed types)
  • Single-occurrence enum left inline (no churn)
  • Name collisions with existing $defs get suffixed

The two correctness fixes from the prior commit:

  • isPureEnum no longer caps on key count — annotations like default, examples, deprecated, const are preserved through hoisting
  • Fingerprint includes title — two enums with identical values but different titles stay distinct (no silent rename)

Plus three structural tests:

  • Enum-value order preservation
  • $defs / array walk symmetry
  • Composition-keyword exclusion (oneOf/anyOf/allOf)

Required exporting hoistDuplicateInlineEnums and gating main() behind require.main === module so the test can import the function without triggering a full build. Wired into the main test chain via npm run test:build-schemas-hoist-enums.

bokelley added a commit that referenced this pull request Apr 26, 2026
…3316)

The previous workflow filtered PR comments out via
`github.event.issue.pull_request == null` with a comment claiming
"auto-fix handles those" — but no auto-fix workflow exists, and the
slack-routing only routes `/triage` slash commands. PR review feedback
(like the reviewer summaries on #3170/#3174/#3225/#3226 earlier today)
sat unactioned because nothing else was wired to pick them up.

Drop the filter. Route PR comments to the same triage routine, with
two adaptations on the payload so the routine can branch:

- `is_pr: true|false` flag at the top of the prompt
- `pr` block with head_ref, base_ref, draft, state when is_pr=true
- MODE directive when is_pr=true: "apply requested fix as a follow-up
  commit on the PR's head branch, or post a reply if it's a question"

Self-loop guard widened to skip comments containing "Fixed by Claude
Code" in addition to the existing "Triaged by Claude Code" — so the
routine's own follow-up replies on PRs don't re-fire it.

Concurrency group already keys on `issue.number`, which GitHub assigns
sequentially across both issues and PRs, so PR runs won't collide with
issue runs even when numbers happen to match.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit dfb4e01 into main Apr 27, 2026
17 checks passed
@bokelley bokelley deleted the claude/issue-3145-hoist-duplicate-enums branch April 27, 2026 03:00
bokelley added a commit that referenced this pull request May 17, 2026
…ct schemas (#4630)

* feat(schemas): x-hoist opt-in marker for canonically shared object schemas (#4557)

Opt-in companion to the pure-enum auto-hoist (`hoistDuplicateInlineEnums`,
#3170). Spec authors set `x-hoist: true` on a source schema's root; the
bundler moves it to root `$defs`, replaces every inline occurrence with a
`$ref`, and strips the directive from bundled output.

Why opt-in for complex objects: structural identity ≠ semantic identity
(BriefAsset and VASTAsset share fields today but represent different
lifecycle concepts). Auto-hoisting would lock in coupling that's hard to
walk back. `x-hoist` is the deliberate exception, not the default — and
per-type decisions stay separate from landing the mechanism.

Behavior:
- Hoists at any occurrence count (>=1). The marker declares intent; honoring
  it for single uses means adding a second reference later doesn't change
  the codegen surface.
- `title` is required — missing/empty -> build error (the directive is
  meant to be deliberate).
- Collision suffixing (`PriceBlock2`) when a `$defs` name already exists,
  matching the enum-hoist convention.
- Different titles never collapse, even with structurally identical
  bodies — preserves the BriefAsset/VASTAsset distinction by default.
- Recurses into hoisted defs so nested markers also collapse to refs.

Dry-run validation: marking `core/duration.json` and rebuilding hoisted 6
inline copies in `create-media-buy-request` (and 5 other media-buy bundles)
into a single `$defs.Duration` entry, with Ajv compiling cleanly and
`x-hoist` absent from output. Reverted before commit — no source schemas
opt in here. Per-type decisions ship in follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(schemas): address review on x-adcp-hoist directive

- Rename `x-hoist` → `x-adcp-hoist` to match the `x-adcp-*` namespacing
  convention documented in schema-extensions.mdx. Prevents collision with
  any future cross-vendor `x-hoist` keyword and matches the precedent set
  by `x-adcp-validation`.
- Strip stray `x-adcp-hoist` markers from bundled output even when they
  live inside a pre-existing $defs block (which Pass 1 skips). The
  directive must never leak into bundled artifacts regardless of where
  it was authored.
- Reject two distinct schemas marked with the same `title` — the
  collision-suffix path would silently rename one of them, defeating the
  directive's "canonical name" guarantee. Pre-existing $defs collisions
  still suffix (that case is a non-marker name collision, not a spec
  author bug).
- Wire `test:build-schemas-hoist-marked` into the npm `test` chain so the
  unit tests actually run in CI.
- Add changeset for the bundler mechanism.
- Add `docs/reference/schema-extensions.mdx` entry covering directive
  contract, bundler behavior, SDK/codegen impact, and conformance — the
  canonical reference for source-tree consumers who dereference source
  schemas directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants