A lightweight, DOM-like HTML and CSS parser for Node.js that creates a simple tree structure (Simple Object Model - SOM) for easy manipulation and serialization back to HTML/CSS strings. 21kb minified, zero dependencies.
- HTML Parsing: Parse HTML into a tree structure with proper handling of nested elements
- CSS Parsing: Parse inline <style> tags with support for modern CSS features
- DOM Manipulation: Insert, move, replace, and remove nodes
- Query Selectors: Find elements using CSS-like selectors
- Preserves Formatting: Maintains whitespace and indentation when manipulating nodes
- No Dependencies: Pure JavaScript implementation
Add to your project via pnpm or npm:
pnpm install simple-html-parser
# or
npm install simple-html-parser
Or include it manually by downloading the minified ESM file dist/simple-html-parser.min.js.
import { SimpleHtmlParser } from 'simple-html-parser';
const parser = new SimpleHtmlParser();
const dom = parser.parse('<div id="app"><h1>Hello World</h1></div>');
// Query elements
const app = dom.querySelector('#app');
const heading = dom.querySelector('h1');
// Manipulate
heading.setAttribute('class', 'title');
// Output
console.log(dom.toHtml());
// <div id="app"><h1 class="title">Hello World</h1></div>
Parses an HTML string into a SOM tree structure.
const parser = new SimpleHtmlParser();
const dom = parser.parse('<div>Hello</div>');
Returns the parser version.
The core building block of the SOM tree. Every element, text node, and comment is a Node.
- type: 'root' | 'tag-open' | 'tag-close' | 'text' | 'comment'
- name: Tag name (for element nodes)
- attributes: Object containing element attributes
- children: Array of child nodes
- parent: Reference to parent node
- content: Text content (for text/comment nodes)
Find the first element matching a CSS selector.
const div = dom.querySelector('div');
const byId = dom.querySelector('#myId');
const byClass = dom.querySelector('.myClass');
const complex = dom.querySelector('div.container > p');
Supported selectors:
- Tag names: div, p, span
- IDs: #myId
- Classes: .myClass, .class1.class2
- Attributes: [data-id], [data-id="value"]
- Descendant: div p (p inside div)
- Pseudo-classes: :not(selector)
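To make the selector list concrete, here is a minimal sketch of how a compound selector (tag, #id, .class, [attr]) could be tested against a single node. It is not the library's implementation: `matchesCompound` and the plain-object node are assumptions made for illustration, and combinators and :not() are left out.

```javascript
// Hypothetical helper, not part of simple-html-parser.
// Matches the simple parts of a compound selector such as 'div.card[data-id="7"]'.
const SIMPLE_PART = /([#.]?[\w-]+)|\[([\w-]+)(?:="([^"]*)")?\]/g;

function matchesCompound(node, selector) {
    const classes = (node.attributes.class || '').split(/\s+/).filter(Boolean);
    for (const part of selector.matchAll(SIMPLE_PART)) {
        const [, token, attrName, attrValue] = part;
        if (attrName !== undefined) {
            // [attr] only needs the attribute to exist; [attr="value"] needs an exact match.
            if (!(attrName in node.attributes)) return false;
            if (attrValue !== undefined && node.attributes[attrName] !== attrValue) return false;
        } else if (token.startsWith('#')) {
            if (node.attributes.id !== token.slice(1)) return false;
        } else if (token.startsWith('.')) {
            if (!classes.includes(token.slice(1))) return false;
        } else if (node.name !== token) {
            return false;
        }
    }
    return true;
}

// Plain object standing in for a parsed element (assumed shape).
const node = {
    type: 'tag-open',
    name: 'div',
    attributes: { id: 'app', class: 'card active', 'data-id': '7' }
};

console.log(matchesCompound(node, 'div#app'));        // true
console.log(matchesCompound(node, '.card.active'));   // true
console.log(matchesCompound(node, '[data-id="8"]'));  // false
```

Every part of the compound selector must match, so a single failing part short-circuits the check.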
Find all elements matching a CSS selector.
const allDivs = dom.querySelectorAll('div');
const allLinks = dom.querySelectorAll('a[href]');
Find all nodes with a specific attribute.
const withDataId = dom.findAllByAttr('data-id');
Add child nodes to this node.
const div = dom.querySelector('div');
const p = new Node('tag-open', 'p', {}, div);
div.appendChild(p);
Insert nodes before this node (outside the element).
Note: target.insertBefore(node) inserts node before target.
const b = dom.querySelector('#B');
const a = dom.querySelector('#A');
a.insertBefore(b); // Inserts B before A
Insert nodes after this node (outside the element).
Note: target.insertAfter(node) inserts node after target.
const a = dom.querySelector('#A');
const b = dom.querySelector('#B');
b.insertAfter(a); // Inserts A after B
Replace this node with other nodes.
const old = dom.querySelector('#old');
const newNode = dom.querySelector('#new');
old.replaceWith(newNode); // Removes old, replaces with new
Remove this node from the tree. Automatically removes matching closing tags.
const div = dom.querySelector('div');
div.remove();
Get an attribute value.
const href = link.getAttribute('href');
Set an attribute value.
div.setAttribute('class', 'container');
Remove an attribute.
div.removeAttribute('class');
Append to an attribute value.
div.updateAttribute('class', 'active'); // class="container active"
CSS methods are available when parsing <style> tags.
Find at-rules (@media, @keyframes, @supports, etc.) in the CSS tree.
// Find all @media rules
const mediaRules = style.cssFindAtRules('media');
// Find all at-rules
const allAtRules = style.cssFindAtRules();
Find CSS rules matching a selector.
Options:
- includeCompound (default: true) - Include compound selectors like .card.active
- shallow (default: false) - Exclude nested children and descendant selectors
// Find all .card rules (includes .card.active)
const cardRules = style.cssFindRules('.card');
// Find only exact .card rules
const exactCard = style.cssFindRules('.card', { includeCompound: false });
// Find #wrapper rules, excluding nested rules
const wrapperOnly = style.cssFindRules('#wrapper', { shallow: true });
Find a specific CSS variable (custom property) by name.
// Find --primary-color
const primary = style.cssFindVariable('--primary-color');
// Find variable without -- prefix
const spacing = style.cssFindVariable('spacing');
Find all CSS variables with their scope paths.
Options:
- includeRoot (default: false) - Include 'root' in scope path for root-level variables
const vars = style.cssFindVariables();
// [{ name: '--primary', value: '#007bff', scope: ':root', rule: Node }]
Convert CSS rules to a formatted CSS string.
Behavior:
- Called with nodes: Converts those specific nodes
- Called on HTML node: Finds and combines all <style> tags
- Called on CSS/style node: Converts this node's CSS tree
Options:
- includeComments (default: false) - Include CSS comments
- includeNestedRules (default: true) - Include nested rules within parent rules
- flattenNested (default: false) - Flatten nested rules to separate top-level rules with full selectors
- includeBraces (default: true) - Include { } around declarations
- includeSelector (default: true) - Include the selector
- combineDeclarations (default: true) - Merge declarations from multiple rules
- singleLine (default: false) - Output on single line
- indent (default: 0) - Indentation level in spaces
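The flattenNested option is described as turning nested rules into separate top-level rules with full selectors. The sketch below illustrates that idea on a plain object tree. `flattenRules` and the `{ selector, declarations, children }` shape are assumptions for illustration and say nothing about how the library stores or flattens rules internally. It follows the CSS nesting convention where `&` stands for the parent selector and a nested selector without `&` is treated as a descendant; comma-separated selector lists are not handled.

```javascript
// Hypothetical rule shape for this sketch: { selector, declarations, children }.
function flattenRules(rules, parentSelector = '') {
    const flat = [];
    for (const rule of rules) {
        let fullSelector = rule.selector;
        if (parentSelector) {
            fullSelector = rule.selector.includes('&')
                ? rule.selector.split('&').join(parentSelector) // '&:hover' -> '.card:hover'
                : `${parentSelector} ${rule.selector}`;         // '.title' -> '.card .title'
        }
        // Rules that only act as containers produce no output of their own.
        if (rule.declarations.length > 0) {
            flat.push({ selector: fullSelector, declarations: rule.declarations });
        }
        flat.push(...flattenRules(rule.children || [], fullSelector));
    }
    return flat;
}

const nested = [{
    selector: '.card',
    declarations: ['padding: 1rem'],
    children: [
        { selector: '&:hover', declarations: ['color: red'], children: [] },
        { selector: '.title', declarations: ['font-weight: bold'], children: [] }
    ]
}];

const output = flattenRules(nested)
    .map((rule) => `${rule.selector} { ${rule.declarations.join('; ')}; }`)
    .join('\n');

console.log(output);
// .card { padding: 1rem; }
// .card:hover { color: red; }
// .card .title { font-weight: bold; }
```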
// Convert entire style tag
const style = dom.querySelector('style');
const flatCss = style.cssToString({ flattenNested: true });
// Convert specific rules
const rules = style.cssFindRules('.card');
const cardCss = style.cssToString(rules, { includeNestedRules: false });
// Combine all styles in document
const allCss = dom.cssToString();
// Just declarations
const declarations = style.cssToString(rules, {
    includeSelector: false,
    includeBraces: false
});
// "background: white; padding: 1rem;"
Convert the node tree back to an HTML string.
const html = dom.toHtml();
const htmlWithComments = dom.toHtml(true);
Alias for toHtml(true).
Nodes are iterable, allowing depth-first traversal:
for (const node of dom) {
    if (node.type === 'tag-open') {
        console.log(node.name);
    }
}
const table = dom.querySelector('table');
const rowA = dom.querySelector('#rowA');
const rowB = dom.querySelector('#rowB');
// Swap rows - insert B before A
rowA.insertBefore(rowB); // B now comes before A
const div = new Node('tag-open', 'div', { class: 'new' });
const text = new Node('text');
text.content = 'Hello';
div.appendChild(text);
const parent = dom.querySelector('#parent');
parent.appendChild(div);
const style = dom.querySelector('style');
// Get all CSS variables
const variables = style.cssFindVariables();
console.log(variables);
// [{ name: '--primary', value: '#007bff', scope: ':root', rule: Node }]
// Find specific variable
const primaryColor = style.cssFindVariable('--primary-color');
// Get .card rules (shallow - no nested)
const rules = style.cssFindRules('.card', { shallow: true });
// Convert to CSS string without nested rules
const css = style.cssToString(rules, { includeNestedRules: false });
// ".card { background: white; padding: 1rem; }"
The parser treats certain tags specially:
- Void elements (img, br, hr, input, etc.): No closing tag created
- Style tags: Contents parsed as CSS
- Script tags: Can be configured via the specialTags parameter
const parser = new SimpleHtmlParser(['script', 'custom-tag']);
The parser creates a tree where:
- Opening and closing tags are siblings in the parent's children array
- Element content is in the opening tag's children array
- Text nodes (including whitespace) are preserved
Example:
<div>
  <p>Hello</p>
</div>
Becomes:
root
└─ <div>
├─ text "\n "
├─ <p>
│ └─ text "Hello"
├─ </p>
├─ text "\n"
└─ </div>
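This layout can be mimicked with plain objects to see why serialization reduces to a depth-first walk. The following is a sketch only: `makeNode`, `append`, `walk`, and `serialize` are hypothetical helpers rather than the library's API, and it assumes, following the description of the tree, that each closing tag is stored as the next sibling of its opening tag while element content lives in the opening tag's children array.

```javascript
// Hypothetical node factory using the property names listed for Node.
function makeNode(type, props = {}) {
    return { type, name: '', attributes: {}, content: '', children: [], parent: null, ...props };
}

// Attach nodes to a parent and record the back-reference.
function append(parent, ...nodes) {
    for (const node of nodes) {
        node.parent = parent;
        parent.children.push(node);
    }
    return parent;
}

// Hand-built tree for: <div>\n  <p>Hello</p>\n</div>
const root = makeNode('root');
const div = makeNode('tag-open', { name: 'div' });
const p = makeNode('tag-open', { name: 'p' });
append(p, makeNode('text', { content: 'Hello' }));
append(
    div,
    makeNode('text', { content: '\n  ' }),
    p,
    makeNode('tag-close', { name: 'p' }),
    makeNode('text', { content: '\n' })
);
append(root, div, makeNode('tag-close', { name: 'div' }));

// Depth-first, pre-order traversal: a node first, then its children in order.
function* walk(node) {
    yield node;
    for (const child of node.children) {
        yield* walk(child);
    }
}

// Because closing tags are ordinary nodes, output is one pass over the walk.
function serialize(tree) {
    let out = '';
    for (const node of walk(tree)) {
        if (node.type === 'tag-open') {
            const attrs = Object.entries(node.attributes)
                .map(([key, value]) => ` ${key}="${value}"`)
                .join('');
            out += `<${node.name}${attrs}>`;
        } else if (node.type === 'tag-close') {
            out += `</${node.name}>`;
        } else if (node.type === 'text') {
            out += node.content;
        }
    }
    return out;
}

console.log(serialize(root) === '<div>\n  <p>Hello</p>\n</div>'); // true
```

Keeping whitespace as ordinary text nodes is what lets the round trip reproduce the input exactly.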
- Regex patterns are extracted to module-level constants for reuse
- Whitespace-only text nodes are only checked during manipulation, not parsing
- Methods use private helpers to avoid duplication
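As an illustration of the first two points, a pattern can be compiled once at module level and the whitespace check deferred until a node is actually being manipulated. `WHITESPACE_ONLY` and `isWhitespaceText` are hypothetical names for this sketch; the library's actual constants and helpers are not shown in this document.

```javascript
// Compiled once when the module loads, then reused by every call.
// No `g` flag, so test() keeps no lastIndex state between calls.
const WHITESPACE_ONLY = /^\s*$/;

// Hypothetical helper: called only when a node is moved or removed,
// so parsing itself never pays for the check.
function isWhitespaceText(node) {
    return node.type === 'text' && WHITESPACE_ONLY.test(node.content);
}

console.log(isWhitespaceText({ type: 'text', content: '\n  ' }));   // true
console.log(isWhitespaceText({ type: 'text', content: 'Hello' }));  // false
console.log(isWhitespaceText({ type: 'tag-open', name: 'div' }));   // false
```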
Commons Clause with MIT
Contributions welcome! Please ensure all tests pass and add tests for new features.
Christopher Keers - caboodle-tech