docs: fix inaccurate API claims surfaced by an audit pass#257
Merged
Conversation
I cross-checked every prose claim in bin/_docs_components/<slug>.md
against the actual source under components/<Name>/ by running each
relevant API in PHP. Most claims hold, but five real misstatements in
html.md and one in markdown.md would mislead a reader who took them at
face value. Each is fixed below.
bin/_docs_components/html.md
* The pitfall "Tag closers are visited too. next_tag() stops on both
opening and closing tags" was wrong. WP_HTML_Tag_Processor::matches()
explicitly skips closing tags unless `stop_on_tag_closers` is set
(class-wp-html-tag-processor.php ~line 4279). So a guard like
`! $tags->is_tag_closer()` inside a next_tag() loop never fires.
Rewrote the pitfall to describe the actual behavior and point at
next_token() for code that wants to see closers.
* The pitfall "If you call get_breadcrumbs() on it, you'll get a thin
shape that doesn't reflect HTML5 tree construction" was wrong.
WP_HTML_Tag_Processor doesn't expose get_breadcrumbs() at all —
calling it raises `Call to undefined method`. Rewrote the pitfall
to say so explicitly.
* sanitize-html.php had `! $tags->is_tag_closer()` inside a next_tag()
loop (dead code per the corrected pitfall). Removed the guard and
added a one-line comment explaining why none is needed. Snippet
output unchanged.
* csp-nonce.php had the same dead guard. Removed.
* decode-entities.php demonstrated `attribute_starts_with()` with
`'java	script:alert(1)'` (an encoded TAB, not an encoded
letter) and var_dumped `bool(false)`. The comment said "respects
encoded colons" but `	` is a tab, and this method does NOT
handle ASCII-whitespace bypasses anyway — browsers strip ASCII
whitespace from URL attributes, but `attribute_starts_with()`
doesn't. Reader's takeaway from the misleading example was
"this function catches javascript:-with-tab", which is false.
Replaced with `'javascript:alert(1)'` (entity-encoded `j`),
which the method DOES correctly recognize as starting with
`javascript:` — the actual claim the doc is making.
* "When to use which" table row for attribute_starts_with() updated
to match: the function decodes character references, so
`javascript:` (where `a` = `a`) is correctly recognized.
bin/_docs_components/markdown.md
* Claimed "the result is a BlocksWithMetadata object" without noting
that the class lives in WordPress\DataLiberation\DataFormatConsumer,
not WordPress\Markdown. A reader who tried `use
WordPress\Markdown\BlocksWithMetadata` got a missing-class error.
Added the canonical namespace inline.
Other components audited (zip, blockparser, xml, filesystem, bytestream,
httpclient, dataliberation, encoding, git, merge, polyfill, blueprints,
cli, corsproxy, httpserver, coding-standards) — class names, method
signatures, return shapes, and constants all match the source.
Verified by re-running every snippet with bin/run-snippets.py --check:
87/87 still pass.
2 tasks
adamziel
added a commit
that referenced
this pull request
May 3, 2026
## Summary Second-round accuracy pass on `bin/_docs_components/<slug>.md`. After the first pass (#257) fixed six load-bearing inaccuracies in `html.md` and `markdown.md`, this round drilled deeper by exercising more of the underlying APIs to make sure the prose still matches reality. One small markup bug in `xml.md` falls out; everything else holds. ## What got fixed `bin/_docs_components/xml.md` — two pitfall callouts had unbalanced `<strong>` tags, an artifact of an earlier "bold the lead sentence" reformat that didn't notice the existing `<strong>#1:</strong>` prefix. The rendered HTML had one open and two closes per pitfall, so the bold span effectively ran from the `#N:` colon through the rest of the paragraph (browsers forgive the orphan close, but it doesn't render the way every other pitfall in the catalog does). ```diff - <p>Footgun: <strong>#1:</strong> namespace-aware methods use the namespace - name declared in <code>xmlns</code>, not the prefix written in the tag.</strong> + <p>Footgun: <strong>Namespace-aware methods use the namespace URI, not the + prefix written in the tag.</strong> ``` Dropped the `#N:` numbering (it added nothing — each Footgun renders as its own `<aside>`) and rewrote the bold span to wrap a single lead sentence, matching the style every other pitfall on the site uses. ## What got checked and found accurate For each component I: - **Resolved every `use WordPress\…` import** via `class_exists` / `interface_exists`. Every one resolves. - **Resolved every `->method()` call** in every snippet via reflection. Every one exists on the class returned by the preceding factory/constructor. - **Ran behavioural spot-checks** on the load-bearing claims: | Component | Claim | How I checked | | --- | --- | --- | | ByteStream | `pull → peek (non-advancing) → consume (advancing)` | Walked `abcdefghij` through `pull(5)`, `peek(3)` twice, `consume(3)`, `peek(2)` — semantics match. | | Polyfill | Filter priority ordering (lower runs first), variadic arg passing, action capture | Three filters at priorities 5/10/20 → runs 5 → 10 → 20; `do_action('x', 'a', 'b')` → 2-arg callback gets both. | | Polyfill | `esc_html`, `esc_attr`, `esc_url`, `__()` outputs | All match prose. | | Encoding | `wp_is_valid_utf8` accepts `'hello'`, rejects `"\xC3\x28"`; `wp_scrub_utf8` produces a U+FFFD replacement | All match. | | Filesystem | `LocalFilesystem::create($root)` chroot — `../escape.txt` cannot escape | Verified: `../` resolves to root-relative; no file appears outside `$root`. | | Filesystem | `SQLiteFilesystem::create(':memory:')` round-trips file contents | `put_contents('a.txt','data')` then `get_contents('a.txt')` returns `'data'`. | | ByteStream | `DeflateReadStream` / `InflateReadStream` round-trip | 550 bytes → 32 compressed → 550 bytes back, byte-for-byte. | | BlockParser | Block array has exactly five keys: `blockName, attrs, innerBlocks, innerHTML, innerContent` | Confirmed via `array_keys()`. | | BlockParser | `blockName === null` for loose HTML between blocks | Confirmed across three loose-HTML positions (before, between, after). | | Markdown | `MarkdownConsumer::consume()` returns `BlocksWithMetadata` | `get_class()` confirms the canonical namespace. | | Markdown | Round-trip via `MarkdownConsumer` + `MarkdownProducer` works for headings/lists | Confirmed; output preserves structure, with only trailing-newline drift. | | HttpClient | `Request($url, ['method' => 'POST', 'headers' => […]])` constructor accepts options array | Confirmed. | | HttpClient | `Response::from_http_headers($raw, $request)` → `Response` with `status_code` | Confirmed. | | DataLiberation | `ImportEntity('post', […])` constructor signature; `WXRWriter::append_entity` exists | Confirmed via reflection. | | Git | `GitRepository::commit(options)` exists | Confirmed in the public method list. | | XML | `XMLProcessor::create_from_string()` parses, walks `next_tag()`; `get_tag_namespace()` returns the URI not the prefix | Confirmed. | | Zip | `ZipDecoder::COMPRESSION_NONE = 0`, `COMPRESSION_DEFLATE = 8` | Confirmed via `ReflectionClass::getConstant()`. | | CLI | `CLI::parse_command_args_and_options()` exists, returns `[positionals, options]` | Confirmed. | | Blueprints | `HumanFriendlySchemaValidator` constructor takes `$schema` (required), `$options` (optional) | Confirmed. | | HttpServer | `ResponseWriteStream` is an interface (not a class) — `use` statements still valid | Confirmed via `interface_exists`. | ## Test plan - [ ] `Verify docs snippets` workflow passes (87/87). - [ ] Pitfall callouts in `xml.md` render with a single bold lead, matching the rest of the catalog. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
I audited every prose claim in
bin/_docs_components/<slug>.mdagainst the actual code undercomponents/<Name>/by running each relevant API in PHP. Most claims hold, but five misstatements inhtml.mdand one inmarkdown.mdwould mislead a reader who took them at face value.What was wrong, what's now correct
html.mdnext_tag()stops on both opening and closing tags."<p>hi</p>withnext_tag()— onlyPwas visited, never/P.WP_HTML_Tag_Processor::matches()explicitly skips closing tags unlessstop_on_tag_closersis set (class-wp-html-tag-processor.php).next_tag()only stops on opening tags; closers and text are skipped, so! $tags->is_tag_closer()inside anext_tag()loop is harmless but never fires. Pointed readers atnext_token()if they need to visit closers.get_breadcrumbs()onWP_HTML_Tag_Processor, you'll get a thin shape that doesn't reflect HTML5 tree construction"(new WP_HTML_Tag_Processor('<x>'))->get_breadcrumbs()raisesError: Call to undefined method. The method doesn't exist on Tag_Processor.get_breadcrumbs()at all — calling that raises aCall to undefined methoderror. Breadcrumbs and HTML5 tree construction live only onWP_HTML_Processor."sanitize-html.phpsnippet containedif ('SCRIPT' === $tags->get_tag() && ! $tags->is_tag_closer())is_tag_closer()guard is dead code undernext_tag(). Output is identical with or without it.csp-nonce.phpsnippet contained the same dead guarddecode-entities.phpexample:attribute_starts_with('java	script:alert(1)', 'javascript:')→bool(false), with comment "safe URL prefix check that respects encoded colons (a classic XSS vector)"	is a TAB, not an encoded colon. The method does not handle whitespace-in-the-middle attacks (browsers do strip ASCII whitespace from URL attributes, but this method does not). The misleading example told readers the function catchesjava<TAB>script:— it doesn't.javascript:alert(1)(entity-encodedj), which the method does correctly recognize as starting withjavascript:. Comment updated to describe the actual protection: "decodes character references while comparing."attribute_starts_with()— same misleading examplejavascript:(whereais the lettera) is correctly recognized as starting withjavascript:."markdown.mdBlocksWithMetadataobject" (no namespace given)get_class((new MarkdownConsumer('# h'))->consume())→WordPress\DataLiberation\DataFormatConsumer\BlocksWithMetadata. A reader who triesuse WordPress\Markdown\BlocksWithMetadatagets a missing-class error.BlocksWithMetadata(defined inWordPress\DataLiberation\DataFormatConsumer— the shared shape everyDataFormatConsumerin the toolkit emits)".What I checked and found accurate
For every component, I verified that:
use WordPress\…import in a snippet resolves to a real class or interface.->method()call hits a method that actually exists on the class returned by the preceding factory/constructor.COMPRESSION_NONE = 0,COMPRESSION_DEFLATE = 8, etc.) matchgetConstant()output.get_attribute()returnsnullfor missing attrs,WP_Block_Parser::parse()returns blocks withblockName/attrs/innerBlocks/innerHTML/innerContentkeys, loose HTML appears asblockName === null, etc.) match what running the code actually produces.Components other than
htmlandmarkdowncame back clean:zip,blockparser,xml,filesystem,bytestream,httpclient,dataliberation,encoding,git,merge,polyfill,blueprints,cli,corsproxy,httpserver,coding-standards.Test plan
Verify docs snippetsworkflow passes: 87/87 snippets still match (the two snippets I changed had their expected-output blocks re-captured automatically).decode-entities.phpsnippet display correctly.is_tag_closer()references inhtml.mdare now consistent — every guard either sits inside anext_token()loop (where it correctly fires) or is removed.🤖 Generated with Claude Code