Draft php-transformer package #1

Merged: chubes4 merged 49 commits into trunk from draft/php-transformer-monorepo on Jun 19, 2026

chubes4 commented Jun 19, 2026

Contributor

Summary

Introduces php-transformer as the origin-clean PHP package for Blocks Engine transformation primitives. The package lives at php-transformer/, publishes automattic/blocks-engine-php-transformer, and uses the Automattic\BlocksEngine\PhpTransformer\ namespace.

This PR establishes the reusable transformer layer for HTML, declared content formats, and generated website artifact bundles. It intentionally excludes product workflows such as importer UI, ZIP intake, theme activation, Studio orchestration, deployment behavior, and generation-loop policy.

Public API

Current public entrypoints:

  • Contract\TransformerResult for the serializable result envelope.
  • Contract\TransformationOptions for generic transform context, strict/fallback policy, and source provenance.
  • HtmlToBlocks\HtmlTransformer for supported HTML to parsed block arrays and serialized block markup.
  • FormatBridge\FormatBridge for html, markdown, and serialized blocks normalization/conversion.
  • FormatBridge\FormatAdapterInterface for package-level format extension.
  • ArtifactCompiler\ArtifactCompiler for generated website artifact bundle normalization.
  • WordPress\Runtime for WordPress calls that must work inside and outside WordPress.

Other classes are implementation details unless the README explicitly marks them public or they are injected adapter contracts.

What Is Included

  • Composer package scaffold, autoloading, lockfile, README, package docs, and install proof.
  • Homeboy-managed release config in php-transformer/homeboy.json with independent package version state in php-transformer/VERSION.
  • Shared result envelope with status, blocks, serialized blocks, documents, assets, diagnostics, fallbacks, provenance, coverage, context, metrics, and migration mapping metadata.
  • HTML transformer coverage for core text/media/action primitives, inline preservation, wrapper/group preservation, definition lists, forms as fallbacks, unsupported-element fallbacks, and recursive metrics.
  • Format bridge adapters for HTML, Markdown, and serialized blocks.
  • Artifact compiler normalization for generated site bundles, source documents, frontmatter, MDX component metadata, block.json discovery, dependency reports, asset manifests, safe SVG/image handling, and image reference provenance.
  • Parity fixture harness with 17 canonical fixtures.
  • Explicit migration evidence and downstream wrapper plans under consumer/migration docs, kept separate from canonical package docs/API.
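To make the shared result envelope listed above easier to picture, here is a minimal, language-neutral sketch in Python. The field names follow the list in this PR; the class name, types, defaults, the `to_json` helper, and the `"success"` status value are assumptions for illustration and are not the package's PHP API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class TransformerResultSketch:
    """Hypothetical stand-in for the serializable result envelope."""

    status: str
    blocks: List[Dict[str, Any]] = field(default_factory=list)
    serialized_blocks: str = ""
    documents: List[Dict[str, Any]] = field(default_factory=list)
    assets: List[Dict[str, Any]] = field(default_factory=list)
    diagnostics: List[Dict[str, Any]] = field(default_factory=list)
    fallbacks: List[Dict[str, Any]] = field(default_factory=list)
    provenance: Dict[str, Any] = field(default_factory=dict)
    coverage: Dict[str, Any] = field(default_factory=dict)
    context: Dict[str, Any] = field(default_factory=dict)
    metrics: Dict[str, Any] = field(default_factory=dict)
    migration_mapping: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Every field is plain data, so the envelope serializes without custom encoders.
        return json.dumps(asdict(self), sort_keys=True)


result = TransformerResultSketch(
    status="success",  # assumed status value, not taken from the package
    blocks=[{"blockName": "core/paragraph", "attrs": {}, "innerBlocks": []}],
    serialized_blocks="<!-- wp:paragraph --><p>Hi</p><!-- /wp:paragraph -->",
)
decoded = json.loads(result.to_json())
```

Keeping every field as plain lists, dictionaries, and strings is what lets the envelope survive a JSON round trip unchanged.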

Release Direction

php-transformer is independently versioned from the repository container. Homeboy tracks php-transformer/VERSION; Composer stays VCS/tag-driven and does not carry an explicit version field, so composer validate --strict remains clean.

Actual releases should run from the default branch after merge. A release dry-run from this draft branch reached Homeboy planning, then stopped on the default-branch preflight as expected for non-default release branches.

Migration Direction

Review this as a new standalone package, not as a renamed old repository. Existing HTML converter, format bridge, artifact compiler, and Static Site Importer paths are downstream consumers, wrappers, or migration evidence.

Reusable transformation behavior should move into php-transformer. Existing repositories should become thin downstream entrypoints first, then be archived when public package names, functions, hooks, CLI commands, abilities, and product integrations no longer need compatibility surfaces. The intended completion criterion is that the old implementation repositories no longer carry canonical transformation logic; php-transformer is the canonical implementation.

static-site-importer should remain a product/plugin consumer. It should consume transformer APIs through product-owned adapters rather than become part of this package.

Verification

Latest verification from php-transformer/ passed:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:examples
  • composer run test:migration:legacy-parity
  • git diff --check
  • homeboy component show php-transformer
  • homeboy release php-transformer --dry-run --skip-publish --no-github-release (reached release planning, then stopped on the default-branch preflight, as expected from this draft branch)

Opt-in local migration parity with existing local old-repo paths also passed: 17 fixtures, 1 migration comparison, 16 explicit skips.

Remaining Follow-Ups

These should be smaller follow-up PRs after this foundation lands:

  • More complete safe SVG/image serialization behavior.
  • Richer class/style/layout provenance where it is generic transformer behavior.
  • Additional selector/source provenance beyond image references.
  • Downstream wrapper PRs for old public entrypoints once this package has a stable tag/constraint.
  • Product-level Static Site Importer adapter work outside this package.

AI assistance

  • AI assistance: Yes
  • Tool(s): OpenCode (GPT-5.5)
  • Used for: Drafting and implementing substantial portions of the package scaffold, transformer contracts, tests, docs, release config, and PR text. Chris remains responsible for review, verification, and final submission.

chubes4 commented Jun 19, 2026

Contributor Author

Fleet wave integrated into this draft branch.

Added:

  • Minimal HtmlToBlocks\HtmlTransformer implementation for headings, paragraphs, lists, containers, fallbacks, provenance, and serialized output.
  • FormatBridge primitives: adapter interface, registry, normalizer, supported formats, and conversion stubs through a block pivot.
  • Richer ArtifactCompiler result contract with status, components, block types, source reports, and legacy BAC mapping.
  • Wrapper migration docs and compatibility example skeletons for the four existing repos.
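The "conversion through a block pivot" idea in the FormatBridge bullet above can be sketched as follows. This is a toy illustration in Python, not the package's PHP code: the class names, method names, and the two toy formats (`text` and `html`, paragraphs only) are assumptions chosen to keep the example self-contained.

```python
import html
import re
from typing import Any, Dict, List, Protocol

Block = Dict[str, Any]


class FormatAdapter(Protocol):
    """Hypothetical adapter contract: each format only talks to blocks."""

    def to_blocks(self, content: str) -> List[Block]: ...
    def from_blocks(self, blocks: List[Block]) -> str: ...


def paragraph(text: str) -> Block:
    return {"blockName": "core/paragraph", "attrs": {"content": text}, "innerBlocks": []}


class TextAdapter:
    """Toy format: every non-empty line is one paragraph."""

    def to_blocks(self, content: str) -> List[Block]:
        return [paragraph(line) for line in content.splitlines() if line.strip()]

    def from_blocks(self, blocks: List[Block]) -> str:
        return "\n".join(block["attrs"]["content"] for block in blocks)


class HtmlParagraphAdapter:
    """Toy format: only flat <p>...</p> runs are understood."""

    def to_blocks(self, content: str) -> List[Block]:
        return [paragraph(html.unescape(m)) for m in re.findall(r"<p>(.*?)</p>", content, re.S)]

    def from_blocks(self, blocks: List[Block]) -> str:
        return "".join(f"<p>{html.escape(block['attrs']['content'])}</p>" for block in blocks)


class FormatBridgeSketch:
    def __init__(self) -> None:
        self._adapters: Dict[str, FormatAdapter] = {}

    def register(self, name: str, adapter: FormatAdapter) -> None:
        self._adapters[name] = adapter

    def convert(self, content: str, source: str, target: str) -> str:
        for name in (source, target):
            if name not in self._adapters:
                raise ValueError(f"unsupported format: {name}")
        # Pivot: source format -> blocks -> target format.
        blocks = self._adapters[source].to_blocks(content)
        return self._adapters[target].from_blocks(blocks)


bridge = FormatBridgeSketch()
bridge.register("text", TextAdapter())
bridge.register("html", HtmlParagraphAdapter())
as_html = bridge.convert("Hello\nA & B", "text", "html")
as_text = bridge.convert(as_html, "html", "text")
```

With a pivot, each new format needs one adapter rather than a dedicated converter for every other format.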

Verification from php-transformer/:

  • composer install --no-interaction
  • composer test

Both passed after integrating the parallel slices.

chubes4 commented Jun 19, 2026

Contributor Author

Second parallel fleet integrated.

Added:

  • Expanded HtmlToBlocks parity slice: quote, pullquote, code/preformatted, table, image, buttons/button, shortcode, stronger fallback capture.
  • Real FormatBridge adapters: BlocksAdapter, HtmlAdapter, default adapter wiring, runtime parse/serialize/render fallbacks.
  • BAC normalization safeguards: aliases, shorthand CSS/JS files, MIME/kind/role/intent inference, hashes, limits, richer source reports.
  • Parity fixture harness with JSON fixture schema and initial html/markdown/artifact/fallback fixtures.
  • Concrete downstream consumer PR plans for the four existing repos.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • git diff --check

All passed. Parity harness result: 4 fixtures passed, 3 legacy comparisons skipped by metadata.

chubes4 commented Jun 19, 2026

Contributor Author

Third parallel fleet integrated.

Added:

  • Markdown adapter with league/commonmark and league/html-to-markdown, wired into FormatBridge for Markdown -> blocks and blocks -> Markdown.
  • HtmlToBlocks BlockFactory extraction to keep the transformer maintainable while preserving current behavior.
  • ArtifactCompiler document/frontmatter extraction for Markdown/MDX source documents, document metadata, diagnostics, provenance, and safe fallback block markup.
  • Optional local-only legacy parity runner support with explicit opt-in and skip reasons.
  • Packaging/release plan covering Composer naming, monorepo install options, prefixing policy, versioning, wrapper release order, and draft-exit criteria.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • git diff --check

All passed. Parity harness result: 6 fixtures passed, 0 legacy comparisons, 6 legacy comparisons skipped by metadata.

chubes4 commented Jun 19, 2026

Contributor Author

Fourth parallel fleet integrated.

Added:

  • WordPress runtime adapter contract with explicit parse/serialize/render, shortcode, strip-tags, escaping, JSON, no-WP diagnostics, and stubbed-WP tests.
  • Structured HTML transform coverage fixtures and coverage matrix for supported/context-required transforms.
  • Artifact block/component metadata: richer block.json discovery, file/dependency refs, JSX/TSX component candidates, semantic component candidates, and provenance.
  • Executable compatibility wrapper examples with isolated smoke calls for h2bc, BFB, BAC, and SSI adapter examples.
  • Static Site Importer adapter contract mapping TransformerResult to SSI/BAC-shaped envelopes for convert_fragment, compile_website_artifact, and blocks_to_html.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • git diff --check

All passed. Parity harness result: 9 fixtures passed, 0 legacy comparisons, 9 legacy comparisons skipped by metadata.

chubes4 commented Jun 19, 2026

Contributor Author

Sixth parallel fleet integrated.

This wave tightened the intended direction: php-transformer is origin-clean and does not depend on, name, or promise permanent compatibility with the old plugins. The old repos are downstream consumers/wrappers during migration.

Added/changed:

  • Canonical docs now frame php-transformer as a standalone product primitive, not a convergence layer that carries old repo identity.
  • Public API docs narrowed to the canonical surfaces: HtmlTransformer, FormatBridge, ArtifactCompiler, TransformerResult, Runtime, and FormatAdapterInterface.
  • Migration/downstream docs now describe temporary wrappers, thin-shim/archive exits, and no permanent compatibility guarantee.
  • Test scripts are organized into canonical contracts, downstream example smoke tests, parity fixtures, and an explicit opt-in migration comparison script.
  • PR review guide added for human review focus areas and draft readiness.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Default parity now reports only canonical fixture results. Opt-in migration parity with local old repo paths also passed: 15 fixtures, 1 migration comparison, 14 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Seventh parallel fleet integrated.

This wave made the repo-consolidation strategy explicit: collapse implementation into blocks-engine/php-transformer, but do not immediately archive popular old repos. Old repos remain downstream entrypoints/temporary consumers until usage and compatibility evidence say they can become thin shims or be archived.

Added/changed:

  • Repo consolidation policy: keep-open vs archive criteria, repo-by-repo fate, issue routing, package discoverability, and archive safety gates.
  • Composer/GitHub continuity guidance: old package names stay downstream; php-transformer should not replace/provide old package names.
  • Wrapper release playbooks for H2BC, BFB, BAC, and Static Site Importer: SemVer, release-note text, smoke tests, rollback paths, archive/thin-shim gates.
  • First consumer plan: html-to-blocks-converter is the recommended first downstream wrapper PR, with exact delegation plan and expected test commands.
  • PR body draft in docs/pr-body-draft.md with origin-clean direction, implementation-collapse vs repo-archive distinction, current API, verification, next steps, and AI disclosure.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Opt-in migration parity with local old repo paths also passed: 15 fixtures, 1 migration comparison, 14 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Eighth cooking wave integrated.

Direction reinforced:

  • php-transformer remains origin-clean.
  • Downstream repos can know and consume php-transformer; php-transformer should not know or depend on those repos.

Upstream changes pushed:

  • Generic contract gap fixes discovered by the first downstream dry run: richer fallback metadata, stable list-shaped block arrays from FormatBridge::toBlocks(), and single-pass convertResult() option forwarding.
  • Origin-clean cleanup: migration examples/docs moved under consumer migration docs, canonical docs/tests avoid reading as if the package knows its ancestors.
  • Package install proof documented in php-transformer/docs/install-proof.md.
  • PR body draft clarified in docs/pr-body-draft.md; GitHub PR body was updated by the wave agent.

Downstream dry run:

  • Created local-only html-to-blocks-converter@cook-consume-php-transformer-dry-run.
  • Dependency wiring works through a local path repo.
  • Local dry-run commit: 7deaa03de805033d812cf9d598df4da768969eac in the H2BC worktree, not pushed.
  • composer validate --strict passed.
  • PHP lint passed for edited files.
  • Smoke script sweep: 13 passed, 56 failed. Many failures are non-standalone test environment gaps or legacy implementation-shape expectations; real upstream gaps were captured.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Opt-in migration parity with local old repo paths also passed: 15 fixtures, 1 migration comparison, 14 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Ninth cooking wave integrated.

Focus: generic upstream gaps from the downstream dry run, without making php-transformer aware of downstream packages.

Upstream changes pushed:

  • Added generic TransformationOptions contract and context result metadata for strict/fallback policy plus source/scope provenance.
  • Added generic metrics envelope: input/output bytes, recursive block count, fallback count, diagnostic count, transform duration.
  • Expanded generic HTML primitives: definition lists, inline mark preservation, class/style preservation, form fallback diagnostics, decorative/group wrappers.
  • Further isolated migration evidence from canonical package scripts/docs; migration examples moved under explicit migration paths.
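As a rough illustration of the metrics envelope described above, the sketch below computes the same kinds of numbers in Python. The key names and the `build_metrics` helper are assumptions; only the `innerBlocks` nesting key mirrors the WordPress parsed-block shape.

```python
import time
from typing import Any, Dict, List

Block = Dict[str, Any]


def count_blocks(blocks: List[Block]) -> int:
    """Count every block, including blocks nested under innerBlocks."""
    return sum(1 + count_blocks(block.get("innerBlocks", [])) for block in blocks)


def build_metrics(
    source: str,
    serialized: str,
    blocks: List[Block],
    fallbacks: List[Dict[str, Any]],
    diagnostics: List[Dict[str, Any]],
    started_at: float,
) -> Dict[str, Any]:
    return {
        # Bytes, not characters: multibyte text counts by its UTF-8 size.
        "input_bytes": len(source.encode("utf-8")),
        "output_bytes": len(serialized.encode("utf-8")),
        "block_count": count_blocks(blocks),
        "fallback_count": len(fallbacks),
        "diagnostic_count": len(diagnostics),
        "duration_ms": (time.perf_counter() - started_at) * 1000.0,
    }


started = time.perf_counter()
blocks = [
    {
        "blockName": "core/group",
        "innerBlocks": [
            {"blockName": "core/paragraph", "innerBlocks": []},
            {"blockName": "core/html", "innerBlocks": []},
        ],
    }
]
metrics = build_metrics(
    source="<div><p>é</p></div>",
    serialized="<!-- wp:group --><!-- /wp:group -->",
    blocks=blocks,
    fallbacks=[{"reason": "unsupported_element"}],
    diagnostics=[],
    started_at=started,
)
```

Counting recursively matters because a single top-level group can hide most of the document's blocks.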

Downstream continuation:

  • Local-only H2BC dry-run branch html-to-blocks-converter@cook-consume-php-transformer-dry-run-2 has commit 2e68eef restoring downstream wrapper metrics emission.
  • Validation/lint passed; standalone smoke sweep now 13 passed / 53 failed.
  • Remaining failures classified as environment-only, legacy implementation-shape, downstream wrapper bug fixed locally, or generic upstream gaps.
  • New generic gaps still open: safe SVG artifact output, selector/source provenance, resized SVG/image serialization, wrapper class/layout provenance, and richer unsupported-element diagnostics.

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:examples
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Canonical parity now has 16 fixtures. Opt-in migration parity with local old repo paths also passed: 16 fixtures, 1 migration comparison, 15 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Tenth cooking wave integrated.

Focus: close the first generic artifact/media gap from the downstream dry run without introducing downstream package awareness.

Upstream change pushed:

  • Added an artifact-local image asset transform path. Entry HTML remains raw core/html by default, but when it contains <img> references that resolve to artifact image files, the compiler now runs the canonical HTML transformer and exposes structured core/image blocks.
  • Added safe SVG payload handling: safe text SVG image assets keep inspectable content in the asset manifest; scriptable SVG markup is diagnosed and not exposed inline.
  • Artifact metrics now count transformed entry blocks/fallbacks when the asset-aware path is used.
  • Added parity fixture artifact-image-assets, bringing canonical parity coverage to 17 fixtures.
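The safe SVG rule above (safe SVG keeps inspectable content in the manifest, scriptable SVG is diagnosed and withheld) might look roughly like the Python sketch below. The specific checks, reason strings, and manifest keys are assumptions for illustration; this is not a complete sanitizer and not the package's actual rule set, and a hardened XML parser would be preferable for untrusted input.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List

# Assumed deny-list for the sketch; a real policy would likely be broader.
SCRIPTABLE_TAGS = {"script", "foreignObject"}


def local_name(qualified: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag or attribute name."""
    return qualified.rsplit("}", 1)[-1]


def svg_safety_reasons(svg_text: str) -> List[str]:
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return ["unparseable_svg"]
    reasons: List[str] = []
    for element in root.iter():
        name = local_name(element.tag)
        if name in SCRIPTABLE_TAGS:
            reasons.append(f"scriptable_element:{name}")
        for attr, value in element.attrib.items():
            attr_name = local_name(attr)
            if attr_name.lower().startswith("on"):
                reasons.append(f"event_handler:{attr_name}")
            if attr_name == "href" and value.strip().lower().startswith("javascript:"):
                reasons.append("javascript_href")
    return reasons


def manifest_entry(path: str, svg_text: str) -> Dict[str, Any]:
    reasons = svg_safety_reasons(svg_text)
    entry: Dict[str, Any] = {"path": path, "mime": "image/svg+xml", "safe": not reasons}
    if reasons:
        entry["diagnostics"] = reasons  # diagnosed, content withheld
    else:
        entry["content"] = svg_text  # safe text SVG stays inspectable
    return entry


safe_entry = manifest_entry(
    "img/dot.svg", '<svg xmlns="http://www.w3.org/2000/svg"><circle r="4"/></svg>'
)
unsafe_entry = manifest_entry(
    "img/bad.svg",
    '<svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)"><script>alert(1)</script></svg>',
)
```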

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:examples
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Opt-in migration parity with local old repo paths also passed: 17 fixtures, 1 migration comparison, 16 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Eleventh cooking wave integrated.

Focus: generic source/reference provenance for artifact image handling.

Upstream change pushed:

  • Added source_reports.artifact.image_references for entry HTML image references.
  • Each reference reports source_path, simple selector, original src, resolved artifact path, matched asset path, MIME type, bytes, and safety status when an artifact asset is found.
  • Tightened relative-reference resolution so absolute/external URLs do not get misleading artifact paths.
  • Extended artifact-image-assets parity coverage for the provenance contract.
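The tightened relative-reference resolution can be sketched in Python as below. The function names and report keys other than `source_path` are assumptions, as is the choice to treat root-relative (`/...`) references and references that escape the artifact root as unresolved.

```python
import posixpath
from typing import Any, Dict, Optional
from urllib.parse import urlsplit


def resolve_artifact_reference(source_path: str, src: str) -> Optional[str]:
    """Resolve an <img src> against the entry document, or return None."""
    parts = urlsplit(src)
    # Absolute and external references (https:, data:, //host/...) get no artifact path.
    if parts.scheme or parts.netloc or src.startswith("/") or not parts.path:
        return None
    base_dir = posixpath.dirname(source_path)
    resolved = posixpath.normpath(posixpath.join(base_dir, parts.path))
    # Anything that climbs out of the artifact root is not an artifact file.
    if resolved == ".." or resolved.startswith("../"):
        return None
    return resolved


def image_reference_report(
    source_path: str, selector: str, src: str, assets: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    resolved = resolve_artifact_reference(source_path, src)
    report: Dict[str, Any] = {
        "source_path": source_path,
        "selector": selector,
        "src": src,
        "resolved_path": resolved,
    }
    asset = assets.get(resolved) if resolved else None
    if asset is not None:
        # Asset details are only reported when a matching artifact file exists.
        report.update(
            matched_asset_path=resolved,
            mime=asset["mime"],
            bytes=asset["bytes"],
            safety=asset["safety"],
        )
    return report


assets = {"pages/img/logo.png": {"mime": "image/png", "bytes": 2048, "safety": "safe"}}
local = image_reference_report("pages/about/index.html", "img", "../img/logo.png?v=2", assets)
external = image_reference_report("index.html", "img", "https://cdn.example.com/a.png", assets)
```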

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:examples
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Opt-in migration parity with local old repo paths also passed: 17 fixtures, 1 migration comparison, 16 explicit skips.

chubes4 commented Jun 19, 2026

Contributor Author

Twelfth cooking wave integrated.

Focus: richer generic unsupported-element diagnostics from the downstream dry-run gap list.

Upstream change pushed:

  • Unsupported HTML fallbacks now include deterministic selector paths, reason, attributes, text length, and child element count.
  • Form/runtime-required fallbacks now carry the same metadata.
  • Diagnostics created from fallbacks now expose selector and reason so callers can connect diagnostics back to fallback records.
  • Extended existing unsupported HTML parity fixtures instead of adding a narrow smoke-only test.
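One way to picture the fallback metadata above is the Python sketch below. It assumes well-formed markup so the standard XML parser can stand in for an HTML parser, and the supported-tag set, the `nth-of-type` selector format, and the key names are all assumptions rather than the package's actual output.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List

SUPPORTED_TAGS = {"div", "p", "h1", "h2", "ul", "li"}  # assumed for the sketch


def selector_segment(parent: ET.Element, child: ET.Element) -> str:
    """Position among same-tag siblings makes the path deterministic."""
    same_tag = [sibling for sibling in parent if sibling.tag == child.tag]
    return f"{child.tag}:nth-of-type({same_tag.index(child) + 1})"


def collect_fallbacks(
    element: ET.Element, path: str, fallbacks: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    for child in element:
        child_path = f"{path} > {selector_segment(element, child)}"
        if child.tag in SUPPORTED_TAGS:
            collect_fallbacks(child, child_path, fallbacks)
        else:
            fallbacks.append(
                {
                    "selector": child_path,
                    "reason": "unsupported_element",
                    "attributes": dict(child.attrib),
                    "text_length": len("".join(child.itertext())),
                    "child_element_count": len(child),
                }
            )
    return fallbacks


def diagnostics_from(fallbacks: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    # Selector and reason are repeated so a diagnostic can be joined to its fallback.
    return [{"selector": f["selector"], "reason": f["reason"]} for f in fallbacks]


root = ET.fromstring(
    '<div><p>Intro</p><video controls="controls"><source src="a.mp4"/>No video</video>'
    "<p>Outro</p><video/></div>"
)
fallbacks = collect_fallbacks(root, "div", [])
diagnostics = diagnostics_from(fallbacks)
```

Because the selector depends only on document structure, running the same input twice yields the same paths, which is what lets fixtures assert on them.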

Verification from php-transformer/:

  • composer install --no-interaction
  • composer validate --strict
  • composer test
  • composer run test:migration:examples
  • composer run test:migration:legacy-parity
  • git diff --check

All passed. Opt-in migration parity with local old repo paths also passed: 17 fixtures, 1 migration comparison, 16 explicit skips.

chubes4 marked this pull request as ready for review June 19, 2026 19:02
chubes4 merged commit 10b5950 into trunk Jun 19, 2026
chubes4 deleted the draft/php-transformer-monorepo branch June 19, 2026 19:02
borkweb added a commit that referenced this pull request Jun 30, 2026
…ry + metrics) (#426)

* blocks-engine: add convert report envelope design spec

Design for porting php-transformer's structured result envelope into packages/blocks-engine as item #1 of docs/to-port-to-js.md: a new canonical convertReport() returning { schema, status, blockMarkup, fallbacks, diagnostics, metrics }, with convert() projected to .blockMarkup. Covers the tolerant ConversionFinding shape, core/html-as-fallback-source inventory, metrics, a pure unit-testable buildReport core, and an assertConvertReport contract. Scopes ratchet integration and hallucination findings out to later items.

* blocks-engine: fold deep-review deltas into convert report spec

Incorporates the four decisions from the plan deep review: (1) block analysis moves into the worker (Approach A) so buildReport stays pure and WP-free and the no-wordpress-runtime-deps contract holds, reusing walkBlocks and adding blockCount/htmlIslands to FixResult; (2) conversion_degraded diagnostic when the pool sentinel fires; (3) unconverted_html inventory capped at 100 with a fallback_inventory_truncated diagnostic while metrics.fallbackCount keeps the true total; (4) snippet sanitized, char-based (multibyte-safe) truncation, documented as untrusted. Updates sections 3-8 and the testing matrix accordingly.

* blocks-engine: freeze convert report contracts

* blocks-engine: add structured convert report

* blocks-engine: add content dropped finding code

* blocks-engine: harden convert report honesty

* blocks-engine: freeze theme conversion diagnostics contract

* blocks-engine: surface conversion diagnostics in themes

* blocks-engine: record deferred convert-report follow-ups

Capture the 6 review-deferred follow-ups (invalid-block severity, fallbackCount/length doc, inventory perf gate per council C, snippet render-safety, failed-status decision, purity import-graph guard) in the spec so nothing is lost after the branch lands. Docs only.

* blocks-engine: make theme conversion diagnostics field optional

Declare ThemeDiagnostics.conversion optional so adding it is backward-compatible for external code that constructs a ThemeBuildResult/ThemeDiagnostics literal (e.g. downstream test fixtures), which would otherwise have to add the new field to compile. siteToTheme still always populates it, so readers are unaffected at runtime. Keeps the change purely additive for consumers like data-liberation-agent.

* blocks-engine: never commit superpowers docs; ignore nested path

The root .gitignore pattern 'docs/superpowers/' is anchored to the repo root, so it did not cover 'packages/blocks-engine/docs/superpowers/' — which let a design spec get committed by mistake. Switch to '**/docs/superpowers/' so the path is ignored at any depth, and remove the spec doc that should not have been committed.

* blocks-engine: ignore docs/superpowers at any depth

Root pattern 'docs/superpowers/' is anchored to the repo root and did not cover 'packages/blocks-engine/docs/superpowers/', which let a design spec get committed by mistake. Use '**/docs/superpowers/' so superpowers docs are ignored at any depth.
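
The inventory cap and snippet truncation described in the "fold deep-review deltas" commit message above can be sketched as follows. This is a Python illustration of the stated rules only (cap of 100, a `fallback_inventory_truncated` diagnostic, `fallbackCount` keeping the true total, character-based truncation); the function names, the record shape, the snippet length, and the use of `unconverted_html` as a code value are assumptions, and the real implementation lives in the JavaScript package.

```python
from typing import Any, Dict, List

INVENTORY_CAP = 100  # from the commit message
SNIPPET_MAX_CHARS = 16  # assumed limit for the sketch


def truncate_snippet(text: str, max_chars: int = SNIPPET_MAX_CHARS) -> str:
    # Python slices by code point, so a multibyte character is never cut in half.
    return text if len(text) <= max_chars else text[:max_chars]


def build_report(html_islands: List[str]) -> Dict[str, Any]:
    fallbacks = [
        {"code": "unconverted_html", "snippet": truncate_snippet(island)}
        for island in html_islands[:INVENTORY_CAP]
    ]
    diagnostics: List[Dict[str, Any]] = []
    if len(html_islands) > INVENTORY_CAP:
        diagnostics.append({"code": "fallback_inventory_truncated"})
    return {
        "fallbacks": fallbacks,
        "diagnostics": diagnostics,
        # The metric reports the true total even when the inventory is capped.
        "metrics": {"fallbackCount": len(html_islands)},
    }


report = build_report([f"<marquee>héllo wörld {i}</marquee>" for i in range(130)])
```

Callers that need the real number of fallbacks should read the metric rather than the length of the inventory.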