Generic document support #166

Merged
kepano merged 9 commits into main from generic-document-support
Mar 13, 2026

kepano (Owner) commented Mar 13, 2026

  • Allow defuddle/node to accept any DOM Document (linkedom, JSDOM, happy-dom, etc.) instead of requiring JSDOM. Callers choose their own DOM implementation.
  • Fixed linkedom compatibility issues: replaced isConnected checks with parentNode (linkedom doesn't support isConnected on cloned documents), and added normalize() to merge fragmented text nodes from HTML entity parsing.
  • Precompiled the partial selector regex, taken from PR #157 "Add Linkedom Node target" (@pedraal).
  • Switch CLI and tests from JSDOM to linkedom.
  • Tests can run against both DOM implementations: npm test (linkedom) and npm run test:jsdom.
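The two linkedom fixes in the list above can be illustrated without any DOM library. This is a minimal sketch under stated assumptions: `NodeLike`, `Child`, `isAttached`, and `mergeAdjacentText` are hypothetical names on a simplified node shape, not the code in standardize.ts, which this PR description does not show.

```typescript
// Hypothetical, simplified node shape for illustration only.
interface NodeLike {
  parentNode: NodeLike | null;
  isConnected?: boolean; // may be missing or unreliable on cloned documents
}

// Attachment check based on parentNode instead of isConnected.
function isAttached(node: NodeLike): boolean {
  return node.parentNode !== null;
}

// Simplified child list: either a text fragment or an element.
type Child = { kind: 'text'; data: string } | { kind: 'element'; tag: string };

// Mimics the text-related effect of normalize(): adjacent text fragments
// are merged into one, and empty text fragments are dropped.
function mergeAdjacentText(children: Child[]): Child[] {
  const merged: Child[] = [];
  for (const child of children) {
    const last = merged[merged.length - 1];
    if (child.kind === 'text' && last !== undefined && last.kind === 'text') {
      // Previous entry is text too: replace it with the concatenation.
      merged[merged.length - 1] = { kind: 'text', data: last.data + child.data };
    } else if (child.kind === 'text' && child.data === '') {
      // Empty text fragment with no text neighbour before it: drop it.
      continue;
    } else {
      merged.push(child);
    }
  }
  return merged;
}
```

Note that a `parentNode` check is weaker than `isConnected`: a node inside a detached subtree still has a parent. The PR does not say whether that difference matters for defuddle's use, so treat the sketch as an illustration of the swap rather than a statement about its correctness in every case.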

Backward compatibility

Passing an HTML string or JSDOM instance to Defuddle() still works but is deprecated. Pass a Document instead:

// Before (deprecated)
await Defuddle(htmlString);
await Defuddle(jsdomInstance);

// After (linkedom)
import { parseHTML } from 'linkedom';
const { document } = parseHTML(htmlString);
await Defuddle(document, 'https://example.com/article');

// After (JSDOM)
import { JSDOM } from 'jsdom';
const dom = new JSDOM(htmlString, { url: 'https://example.com/article' });
await Defuddle(dom.window.document, 'https://example.com/article');

pedraal and others added 9 commits March 12, 2026 17:50
Avoids rebuilding the 534-pattern regex and attribute selector string
on every parse call.

From PR #157 by @pedraal.

- node.ts now accepts `string | Document` — pass a Document from
  linkedom, happy-dom, or any DOM implementation directly
- JSDOM is dynamically imported when a string is passed, with a
  helpful error if not installed
- Move jsdom from peerDependencies to optionalDependencies
- Fix `isConnected` → `parentNode` in standardize.ts (linkedom's
  cloneNode doesn't set isConnected)
- Add `normalize()` after cloning to merge fragmented text nodes
  from DOM implementations that split HTML entities
- Add linkedom test suite verifying output matches expected fixtures

Switch CLI, tests, and all DOM parsing to linkedom. This removes
JSDOM as a dependency entirely, halving test runtime and simplifying
the dependency tree.
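The `string | Document` handling described in the commit notes above (use a Document directly, load JSDOM lazily only when a string is passed, and fail with a helpful message if it is missing) could look roughly like this. It is a sketch, not the contents of node.ts: `DocumentLike`, `HtmlParser`, `loadJsdomParser`, `resolveDocument`, and the error text are all assumptions made for illustration.

```typescript
// Minimal structural stand-in for a DOM Document (illustrative only).
interface DocumentLike {
  documentElement: unknown;
}

// A function that turns an HTML string into a document.
type HtmlParser = (html: string, url?: string) => DocumentLike;

// Default loader: resolves jsdom at call time, so the package is only
// needed when a caller actually passes an HTML string.
async function loadJsdomParser(): Promise<HtmlParser> {
  const moduleName: string = 'jsdom'; // non-literal specifier, resolved at runtime
  const mod = await import(moduleName);
  const JSDOM = mod.JSDOM ?? mod.default?.JSDOM;
  return (html, url) => new JSDOM(html, url ? { url } : {}).window.document;
}

// Accepts either a ready-made document or an HTML string.
async function resolveDocument(
  input: string | DocumentLike,
  url?: string,
  loadParser: () => Promise<HtmlParser> = loadJsdomParser,
): Promise<DocumentLike> {
  if (typeof input !== 'string') {
    return input; // already a Document: use it as-is
  }
  let parse: HtmlParser;
  try {
    parse = await loadParser();
  } catch {
    // Hypothetical message; the real wording is not shown in this PR.
    throw new Error(
      'Received an HTML string but jsdom could not be loaded. ' +
        'Install jsdom, or parse the HTML yourself and pass a Document.',
    );
  }
  return parse(input, url);
}
```

Making the loader a parameter keeps the string path testable without jsdom installed, and it means callers who always pass a Document never trigger the import at all.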
kepano merged commit f0b9e78 into main on Mar 13, 2026