docs: rewrite tooling in PHP, drop Python dependency #263

Merged
adamziel merged 2 commits into trunk from docs-tooling-php-rewrite
May 3, 2026

@adamziel (Collaborator) commented May 3, 2026

Summary

The toolkit now documents itself using only its own runtime and its own structured-data parsers. No Python in CI; no hand-rolled regex over markdown or HTML.

What changed

| Was | Now |
| --- | --- |
| `bin/_docs_components.py` (200 lines, dead-code dicts) | deleted |
| `bin/_load_catalog.py` (370 lines: hand-rolled YAML-subset parser, regex section splitter, regex snippet extractor) | replaced with proper parsers |
| `bin/build-reference.py` (220 lines) | `bin/build-reference.php` |
| `bin/run-snippets.py` (230 lines) | `bin/run-snippets.php` |
| `bin/serve-docs.py` (50 lines) | `bin/serve-docs.php` (uses `php -S` with router) |

Parsers

| Surface | Parser |
| --- | --- |
| README YAML frontmatter (slug, title, install, credit_*, see_also) | Webuni\FrontMatter\FrontMatter::parse(), already vendored under components/Markdown/vendor-patched/ for the Markdown component. Single-line, multi-line `key: |
| Markdown body → AST | `League\CommonMark\Parser\MarkdownParser` + walking the document. Section boundaries = `Heading` nodes at level 2. Snippets = `HtmlBlock` (`<!-- snippet: -->`) → `FencedCode` (info=php) tuples. Expected-output = `HtmlBlock` (`<!-- expected-output -->`) → `FencedCode` pair. Body content rendered via `HtmlRenderer::renderNodes()` so raw HTML round-trips verbatim. |
| Pitfall callouts (`HtmlBlock`s of the form `<p>Footgun: …</p>`) | `WP_HTML_Tag_Processor`: walks tokens, confirms a `<p>` opener, finds the first inner `#text` node, classifies, strips the `Footgun:` / `Gotcha:` prefix via `set_modifiable_text()`, then slices off the outer `<p>...</p>` by length (no regex). |
| Lede paragraph → inline HTML (no outer `<p>`) | Render the lede `Paragraph` node's inline children directly via `HtmlRenderer::renderNodes()` instead of slicing afterward. |
| Snippet metadata comment (`<!-- snippet: filename: x.php\nrunnable: true\n-->`) | String slicing of the literal `<!--` / `-->` delimiters, no regex. |
| `--update` writing captured stdout back into a README | CommonMark AST locates the snippet's exact line range; line-by-line splice. No regex over the README. |
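
To illustrate the "string slicing, no regex" row for snippet metadata comments, here is a minimal sketch. The function name `parse_snippet_comment()` and the returned array shape are assumptions made for illustration, not the code in `bin/build-reference.php`; the sketch also assumes each metadata line is a simple `key: value` pair.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the PR): parse a snippet metadata comment
 * using only string functions. Returns null when the block is not a
 * "<!-- snippet: ... -->" comment.
 */
function parse_snippet_comment(string $html): ?array
{
    $html  = trim($html);
    $open  = '<!--';
    $close = '-->';

    // Confirm the literal delimiters at both ends before slicing them off.
    if (strncmp($html, $open, strlen($open)) !== 0
        || substr($html, -strlen($close)) !== $close) {
        return null;
    }
    $inner = trim(substr($html, strlen($open), strlen($html) - strlen($open) - strlen($close)));

    // Only comments that start with the "snippet:" marker qualify.
    $marker = 'snippet:';
    if (strncmp($inner, $marker, strlen($marker)) !== 0) {
        return null;
    }

    // Each remaining line is treated as "key: value"; lines without a colon are skipped.
    $meta = [];
    foreach (explode("\n", substr($inner, strlen($marker))) as $line) {
        $colon = strpos($line, ':');
        if ($colon === false) {
            continue;
        }
        $meta[trim(substr($line, 0, $colon))] = trim(substr($line, $colon + 1));
    }
    return $meta;
}
```

Values come back as plain strings (for example `'true'`, not a boolean), so a caller would still need to interpret `runnable` itself.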

Behavioural parity

  • 87/87 snippets match their captured stdout. The PHP normalizer mirrors the Python regex set 1:1, so existing expected-output blocks stay valid.
  • docs/reference/*.html render correctly: 9 <php-snippet> + 9 fallback + 9 expected-output triples on html.html, 4 pitfall callouts (the bold-lead pattern is preserved), see-also list intact.
  • --update verified end-to-end:
    • When an expected-output block has drifted, --update rewrites it so that its content is byte-for-byte the original again.
    • When no expected-output exists for a snippet, --update inserts a fresh block in the right place; resulting README is byte-identical to one with the block authored manually.
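
The line-by-line splice behind `--update` can be sketched roughly as follows. This assumes the 1-indexed, inclusive line range has already been located by walking the CommonMark AST; `splice_lines()` is a hypothetical name for illustration, not the function in `bin/run-snippets.php`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the PR): replace lines $start..$end
 * (1-indexed, inclusive) of $source with the given replacement lines.
 * Everything outside that range is left byte-for-byte untouched.
 */
function splice_lines(string $source, int $start, int $end, array $replacement): string
{
    $lines = explode("\n", $source);
    // array_splice() works on 0-indexed offsets, hence the "- 1".
    array_splice($lines, $start - 1, $end - $start + 1, $replacement);
    return implode("\n", $lines);
}
```

Because `explode()` and `implode()` round-trip the trailing newline, splicing in identical content yields an identical file, which is the property the no-drift check relies on.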

Frontmatter format change: see_also is a proper YAML list

# Before (repeated keys — not standard YAML)
see_also: a | A | reason
see_also: b | B | reason

# After
see_also:
  - a | A | reason
  - b | B | reason

Webuni\FrontMatter\FrontMatter correctly types it as a sequence; any frontmatter-aware tool reading the README sees the same shape. All 18 component READMEs migrated.
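
Once the frontmatter parser hands back `see_also` as a sequence, each entry still needs splitting on `|`. A minimal sketch of that step follows; the field names `slug`, `title`, and `reason` are assumptions read off the `a | A | reason` placeholder, and `parse_see_also()` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the PR): turn see_also entries of the form
 * "a | A | reason" into associative arrays. Field names are assumed.
 */
function parse_see_also(array $entries): array
{
    $links = [];
    foreach ($entries as $entry) {
        // Limit to 3 parts so a "|" inside the reason text survives.
        $parts = array_map('trim', explode('|', (string) $entry, 3));
        if (count($parts) !== 3) {
            continue; // Skip malformed entries rather than guessing.
        }
        [$slug, $title, $reason] = $parts;
        $links[] = ['slug' => $slug, 'title' => $title, 'reason' => $reason];
    }
    return $links;
}
```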

Workflows simplified

  • snippet-tests.yml drops the actions/setup-python step.
  • docs.yml swaps python3 bin/build-reference.py for php bin/build-reference.php.

Remaining preg_* calls (all on plain text, not HTML)

  • slugify() — heading text → URL-safe slug.
  • normalize() — scrubs noise from snippet stdout (tempfile paths, git hashes, timestamps).
  • One pattern in run-snippets that matches the require '...autoload.php'; line in the snippet's PHP source to inject the local-prelude polyfill.

These operate on plain strings, not HTML, so they're not what the "no regex over HTML" rule was about.
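
For a sense of what plain-text helpers like these can look like, here is a rough sketch. The patterns and placeholder tokens (`<TMP>`, `<HASH>`, `<TIMESTAMP>`) are assumptions chosen for illustration; the actual regex set in `bin/run-snippets.php` mirrors the old Python one and is not reproduced here.

```php
<?php
declare(strict_types=1);

/** Sketch: heading text to a URL-safe slug (ASCII-only assumption). */
function slugify(string $heading): string
{
    $slug = strtolower(trim($heading));
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug) ?? '';
    return trim($slug, '-');
}

/** Sketch: scrub run-specific noise from snippet stdout (patterns are assumed). */
function normalize(string $stdout): string
{
    $patterns = [
        '#/tmp/[A-Za-z0-9._-]+#'                        => '<TMP>',       // tempfile paths
        '/\b[0-9a-f]{40}\b/'                            => '<HASH>',      // full git hashes
        '/\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\b/'  => '<TIMESTAMP>', // ISO-like timestamps
    ];
    return preg_replace(array_keys($patterns), array_values($patterns), $stdout) ?? $stdout;
}
```

Both functions take and return plain strings, so neither ever sees markup.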

Test plan

  • Verify docs snippets workflow passes (87/87).
  • Deploy docs to GitHub Pages runs cleanly on push to trunk.
  • Local preview: bash bin/build-docs-bundle.sh && php bin/serve-docs.php, then http://localhost:8787 renders all reference pages with snippets.
  • php bin/run-snippets.php --update (no-op, no drift) leaves all READMEs untouched.

🤖 Generated with Claude Code

Base automatically changed from docs-drop-refinement-prefix to trunk May 3, 2026 21:36
@adamziel adamziel force-pushed the docs-tooling-php-rewrite branch 7 times, most recently from e8fab36 to 93b466c Compare May 3, 2026 22:33
Stacked on top of #262 (the heading-prefix change). Replaces 1,128 lines of
Python tooling with 762 lines of PHP. The toolkit now documents itself
using only its own runtime — no Python in CI, no second language to
context-switch into when iterating on the docs.

Replaces:
  bin/_docs_components.py        (dead-code dicts, 200 lines)
  bin/_load_catalog.py           (frontmatter + section + snippet parser, 370 lines)
  bin/build-reference.py         (docs/reference/<slug>.html generator, 220 lines)
  bin/run-snippets.py            (snippet runner + expected-output checker, 230 lines)
  bin/serve-docs.py              (CORS-enabled local preview server, 50 lines)

With:
  bin/build-reference.php        (parser + renderer, 480 lines — also the
                                  catalog-loading library, since
                                  run-snippets.php requires it)
  bin/run-snippets.php           (runner + expected-output writer, 250 lines)
  bin/serve-docs.php             (php -S router with CORS headers, 50 lines)

Behavioural parity:
  - 87/87 snippets match captured stdout (same expected-output blocks
    as before; the new parser produces byte-identical normalization and
    fence handling).
  - docs/reference/*.html renders byte-equivalent output: same HTML
    structure, same snippet/fallback/expected-output triples, same
    pitfall extraction, same see-also rendering.
  - --update writes captured stdout back into the same expected-output
    fence in the README via the same slug → directory map.

Format change: see_also frontmatter switched from repeated keys
(`see_also: a | A | r` × 3) to a proper YAML list:

    see_also:
      - a | A | reason
      - b | B | reason

This is standard YAML — readable by any frontmatter-aware tool, and
correctly typed as a sequence in GitHub's README renderer. The Python
parser never accepted the YAML-list form; the new PHP parser only
accepts that form.

Workflows simplified:
  - snippet-tests.yml drops the `actions/setup-python` step.
  - docs.yml swaps `python3 bin/build-reference.py` for `php
    bin/build-reference.php`.

Per the prompt in docs-changes.md ("Change all the markdown-handling
python tooling to use PHP. Reuse components from this repo."): the new
PHP code uses standard library facilities (preg_*, proc_open) — none of
the toolkit components needed any new feature, so no stack-base PR was
required.
@adamziel adamziel force-pushed the docs-tooling-php-rewrite branch from 93b466c to c09e068 Compare May 3, 2026 22:38
The bin/build-reference.php, bin/run-snippets.php, and bin/serve-docs.php
scripts are tooling that runs in CI on PHP 8.3, not library code. They
don't need PHP 7.2 compatibility or WordPress-coding-standards compliance,
matching how /bin/build-phar is already excluded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adamziel adamziel merged commit eb82d80 into trunk May 3, 2026
28 of 29 checks passed
@adamziel adamziel deleted the docs-tooling-php-rewrite branch May 3, 2026 23:20