diff --git a/README.md b/README.md index ac9de08e..6e67e723 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # HTML Minifier Next (HTMLMinifier) -[](https://www.npmjs.com/package/html-minifier-next) +[](https://www.npmjs.com/package/html-minifier-next) (This project is based on [Terser’s html-minifier-terser](https://github.com/terser/html-minifier-terser), which in turn is based on [Juriy Zaytsev’s html-minifier](https://github.com/kangax/html-minifier). It was set up because as of May 2025, both html-minifier-terser and html-minifier seem unmaintained. **This project is currently under test.** If it seems maintainable to me, [Jens](https://meiert.com/), even without community support, the project will be updated and documented further. The following documentation largely matches the original project.) @@ -70,7 +70,7 @@ How does HTMLMinifier compare to other solutions — [HTML Minifier from Will Pe | [W3C](https://www.w3.org/) | 51 | **36** | 42 | n/a | | [Wikipedia](https://en.wikipedia.org/wiki/Main_Page) | 114 | **100** | 107 | n/a | -## Options Quick Reference +## Options quick reference Most of the options are disabled by default. @@ -78,6 +78,7 @@ Most of the options are disabled by default. | --- | --- | --- | | `caseSensitive` | Treat attributes in case sensitive manner (useful for custom HTML tags) | `false` | | `collapseBooleanAttributes` | [Omit attribute values from boolean attributes](http://perfectionkills.com/experimenting-with-html-minifier#collapse_boolean_attributes) | `false` | +| `customFragmentQuantifierLimit` | Set maximum quantifier limit for custom fragments to prevent ReDoS attacks | `200` | | `collapseInlineTagWhitespace` | Don’t leave any spaces between `display:inline;` elements when collapsing. 
Must be used in conjunction with `collapseWhitespace=true` | `false` | | `collapseWhitespace` | [Collapse white space that contributes to text nodes in a document tree](http://perfectionkills.com/experimenting-with-html-minifier#collapse_whitespace) | `false` | | `conservativeCollapse` | Always collapse to 1 space (never remove it entirely). Must be used in conjunction with `collapseWhitespace=true` | `false` | @@ -92,6 +93,7 @@ Most of the options are disabled by default. | `ignoreCustomFragments` | Array of regexes that allow to ignore certain fragments, when matched (e.g. `<?php ... ?>`, `{{ ... }}`, etc.) | `[ /<%[\s\S]*?%>/, /<\?[\s\S]*?\?>/ ]` | | `includeAutoGeneratedTags` | Insert tags generated by HTML parser | `true` | | `keepClosingSlash` | Keep the trailing slash on singleton elements | `false` | +| `maxInputLength` | Maximum input length to prevent ReDoS attacks (disabled by default) | `undefined` | | `maxLineLength` | Specify a maximum line length. Compressed output will be split by newlines at valid HTML split-points | | `minifyCSS` | Minify CSS in style elements and style attributes (uses [clean-css](https://github.com/jakubpawlowicz/clean-css)) | `false` (could be `true`, `Object`, `Function(text, type)`) | | `minifyJS` | Minify JavaScript in script elements and event attributes (uses [Terser](https://github.com/terser/terser)) | `false` (could be `true`, `Object`, `Function(text, inline)`) | @@ -154,6 +156,63 @@ Output of resulting markup (e.g. `<p>foo</p>`) HTMLMinifier can’t know that original markup was only half of the tree; it does its best to try to parse it as a full tree and it loses information about tree being malformed or partial in the beginning. As a result, it can’t create a partial/malformed tree at the time of the output. +## Security + +### ReDoS protection + +This minifier includes protection against regular expression denial of service (ReDoS) attacks: + +* Custom fragment quantifier limits: The `customFragmentQuantifierLimit` option (default: 200) prevents exponential backtracking by replacing unlimited quantifiers (`*`, `+`) with bounded ones in regular expressions. + +* Input length limits: The `maxInputLength` option allows you to set a maximum input size to prevent processing of excessively large inputs that could cause performance issues. + +* Enhanced pattern detection: The minifier detects and warns about various ReDoS-prone patterns including nested quantifiers, alternation with quantifiers, and multiple unlimited quantifiers. + +**Important:** When using custom `ignoreCustomFragments`, ensure your regular expressions don’t contain unlimited quantifiers (`*`, `+`) without bounds, as these can lead to ReDoS vulnerabilities. + +(Further improvements are needed. Contributions welcome.)
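As a rough illustration of the quantifier limit and the warning check described above, the sketch below mirrors the bounding step from the `src/htmlminifier.js` hunk in this diff. `buildCustomIgnorePattern` and `hasUnboundedQuantifier` are hypothetical helper names introduced here for illustration, not part of the package API; the default limit of 200 is taken from the README text.

```javascript
// Hypothetical helper: wraps the custom fragments in a pattern whose
// surrounding whitespace and repetition count are bounded by `limit`.
function buildCustomIgnorePattern(fragments, limit = 200) {
  const sources = fragments.map((re) => re.source);
  const whitespace = `\\s{0,${limit}}`;
  return new RegExp(
    `${whitespace}(?:${sources.join('|')}){1,${limit}}${whitespace}`,
    'g'
  );
}

// Hypothetical helper: the same coarse check as in the hunk. It flags any
// `*` or `+` in the pattern source, including escaped literal ones.
function hasUnboundedQuantifier(fragments) {
  return fragments.some((re) => /[*+]/.test(re.source));
}

// Fragments written with explicit bounds, as the README recommends.
const fragments = [/<%[\s\S]{0,1000}?%>/, /\{\{[^}]{0,500}\}\}/];
const reCustomIgnore = buildCustomIgnorePattern(fragments, 200);

const sample = '<p>  <% code %>  {{ name }}  </p>';
const matches = sample.match(reCustomIgnore);
console.log(matches);
console.log(hasUnboundedQuantifier(fragments));
```

In this sketch, as far as the hunk shows, only the surrounding whitespace and the repetition count are bounded; quantifiers inside the user-supplied fragments are passed through unchanged, which is why explicit bounds in custom patterns still matter.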
+ +#### Custom fragment examples + +**Safe patterns** (recommended): + +```javascript +ignoreCustomFragments: [ + /<%[\s\S]{0,1000}?%>/, // JSP/ASP with explicit bounds + /<\?php[\s\S]{0,5000}?\?>/, // PHP with bounds + /\{\{[^}]{0,500}\}\}/ // Handlebars without nested braces +] +``` + +**Potentially unsafe patterns** (will trigger warnings): + +```javascript +ignoreCustomFragments: [ + /<%[\s\S]*?%>/, // Unlimited quantifiers + /<!--[\s\S]*?-->/, // Could cause issues with very long comments + /\{\{.*?\}\}/, // Nested unlimited quantifiers + /(script|style)[\s\S]*?/ // Multiple unlimited quantifiers +] +``` + +**Template engine configurations:** + +```javascript +// Handlebars/Mustache +ignoreCustomFragments: [/\{\{[\s\S]{0,1000}?\}\}/] + +// Liquid (Jekyll) +ignoreCustomFragments: [/\{%[\s\S]{0,500}?%\}/, /\{\{[\s\S]{0,500}?\}\}/] + +// Angular +ignoreCustomFragments: [/\{\{[\s\S]{0,500}?\}\}/] + +// Vue.js +ignoreCustomFragments: [/\{\{[\s\S]{0,500}?\}\}/] +``` + +**Important:** When using custom `ignoreCustomFragments`, the minifier automatically applies bounded quantifiers to prevent ReDoS attacks, but you can also write safer patterns yourself using explicit bounds. + ## Running benchmarks Benchmarks for minified HTML: diff --git a/cli.js b/cli.js index d085c8a2..3ccfcc22 100755 --- a/cli.js +++ b/cli.js @@ -101,6 +101,7 @@ function parseString(value) { const mainOptions = { caseSensitive: 'Treat attributes in case sensitive manner (useful for SVG; e.g. 
viewBox)', collapseBooleanAttributes: 'Omit attribute values from boolean attributes', + customFragmentQuantifierLimit: ['Set maximum quantifier limit for custom fragments to prevent ReDoS attacks (default: 200)', parseInt], collapseInlineTagWhitespace: 'Collapse white space around inline tag', collapseWhitespace: 'Collapse white space that contributes to text nodes in a document tree.', conservativeCollapse: 'Always collapse to 1 space (never remove it entirely)', @@ -115,6 +116,7 @@ const mainOptions = { ignoreCustomFragments: ['Array of regex\'es that allow to ignore certain fragments, when matched (e.g. <?php ... ?>, {{ ... }})', parseJSONRegExpArray], includeAutoGeneratedTags: 'Insert tags generated by HTML parser', keepClosingSlash: 'Keep the trailing slash on singleton elements', + maxInputLength: ['Maximum input length to prevent ReDoS attacks', parseInt], maxLineLength: ['Max line length', parseInt], minifyCSS: ['Minify CSS in style elements and style attributes (uses clean-css)', parseJSON], minifyJS: ['Minify Javascript in script elements and on* attributes', parseJSON], @@ -304,4 +306,4 @@ if (inputDir || outputDir) { process.stdin.on('data', function (data) { content += data; }).on('end', writeMinify); -} \ No newline at end of file +} diff --git a/package-lock.json b/package-lock.json index 291d3012..88cbc325 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "html-minifier-next", - "version": "1.0.1", + "version": "1.1.0", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "html-minifier-next", - "version": "1.0.1", + "version": "1.1.0", "license": "MIT", "dependencies": { "change-case": "^4.1.2", diff --git a/package.json b/package.json index 356d0cf4..2b4a4dd1 100644 --- a/package.json +++ b/package.json @@ -90,5 +90,5 @@ "test:web": "NODE_OPTIONS='--experimental-vm-modules --no-warnings' jest --verbose --environment=jsdom" }, "type": "module", - "version": "1.0.1" + "version": "1.1.0" } \ No newline at end of 
file diff --git a/src/htmlminifier.js b/src/htmlminifier.js index 6c2d6ddf..6482a7b6 100644 --- a/src/htmlminifier.js +++ b/src/htmlminifier.js @@ -844,6 +844,11 @@ async function createSortFns(value, options, uidIgnore, uidAttr) { } async function minifyHTML(value, options, partialMarkup) { + // Check input length limitation to prevent ReDoS attacks + if (options.maxInputLength && value.length > options.maxInputLength) { + throw new Error(`Input length (${value.length}) exceeds maximum allowed length (${options.maxInputLength})`); + } + if (options.collapseWhitespace) { value = collapseWhitespace(value, options, true, true); } @@ -888,8 +893,24 @@ async function minifyHTML(value, options, partialMarkup) { return re.source; }); if (customFragments.length) { - const reCustomIgnore = new RegExp('\\s*(?:' + customFragments.join('|') + ')+\\s*', 'g'); - // temporarily replace custom ignored fragments with unique attributes + // Warn about potential ReDoS if custom fragments use unlimited quantifiers + for (let i = 0; i < customFragments.length; i++) { + if (/[*+]/.test(customFragments[i])) { + options.log('Warning: Custom fragment contains unlimited quantifiers (* or +) which may cause ReDoS vulnerability'); + break; + } + } + + // Safe approach: Use bounded quantifiers instead of unlimited ones to prevent ReDoS + const maxQuantifier = options.customFragmentQuantifierLimit || 200; + const whitespacePattern = `\\s{0,${maxQuantifier}}`; + + // Use bounded quantifiers to prevent ReDoS - this approach prevents exponential backtracking + const reCustomIgnore = new RegExp( + whitespacePattern + '(?:' + customFragments.join('|') + '){1,' + maxQuantifier + '}' + whitespacePattern, + 'g' + ); + // Temporarily replace custom ignored fragments with unique attributes value = value.replace(reCustomIgnore, function (match) { if (!uidAttr) { uidAttr = uniqueId(value); diff --git a/tests/minifier.spec.js b/tests/minifier.spec.js index 067ef273..ec3f9c93 100644 --- 
a/tests/minifier.spec.js +++ b/tests/minifier.spec.js @@ -2658,7 +2658,7 @@ test('conservative collapse', async () => { })).toBe(output); }); -test('collapse preseving a line break', async () => { +test('collapse preserving a line break', async () => { let input, output; input = '\n\n\n \n\n' + @@ -3594,3 +3594,54 @@ test('minify Content-Security-Policy', async () => { input = ''; expect(await minify(input)).toBe(input); }); + +test('ReDoS prevention in custom fragments processing', async () => { + // Test long sequences of whitespace that could trigger ReDoS + const longWhitespace = ' '.repeat(10000); + const phpFragments = [/<%[\s\S]*?%>/g, /<\?[\s\S]*?\?>/g]; + + // Test case 1: Long whitespace before custom fragment + const input1 = `