Remove default styles #71

joswhite · 2022-05-11T12:56:02Z

getComputedStyle returns about 340 CSS styles for an element, almost all of which are default styles. According to my tests, omitting these default styles from raster image captures makes screenshots 60% faster and makes the image data URIs 11x shorter. (By "raster" I mean the captures that do not require CSS reduction, i.e. all methods besides toSvg, which shortens CSS styles even further but takes much longer to do so.)

This is similar to PR #37 (which added the CSS reduction for toSvg), but the code caches the default styles for each type of HTML element, so the calculations are significantly faster and actually speed up screen captures.

This would close #70. It also helps close dom-to-image #245.

Performance comparisons according to local tests:

Test case: Capture a webpage with 5675 HTML elements using toBlob, which generates a raster image (does not use CSS reduction).

Baseline: The task took 7.9 seconds (average between Firefox, Chrome, and Edge), which involved copying 1929500 styles (340 per element) from the webpage to the clone being captured. The clone's data URI length was 32561735 characters (31.1 MB).

With this PR: The task takes 3.1 seconds, which involves copying 39051 styles (6.9 per element) from the webpage to the clone. The clone's data URI length is 2843600 characters (2.7 MB). This is 60% faster, copies 49x fewer styles, and produces a data URI that is 11x shorter.
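The ratios quoted for the toBlob test case can be recomputed from the raw figures. The snippet below is only an arithmetic check on the numbers reported above; the `compare` helper and the variable names are made up for illustration and are not part of the library.

```javascript
// Figures copied from the toBlob test case above.
const elements = 5675;
const baseline = { seconds: 7.9, styles: 1929500, uriChars: 32561735 };
const withPr = { seconds: 3.1, styles: 39051, uriChars: 2843600 };

// Hypothetical helper: derives the relative improvements from two measurements.
function compare(before, after, elementCount) {
    return {
        // Fraction of the baseline time that is saved.
        timeSaved: (before.seconds - after.seconds) / before.seconds,
        // How many times fewer styles are copied.
        styleRatio: before.styles / after.styles,
        // How many times shorter the data URI is.
        uriRatio: before.uriChars / after.uriChars,
        stylesPerElementBefore: before.styles / elementCount,
        stylesPerElementAfter: after.styles / elementCount,
    };
}

const summary = compare(baseline, withPr, elements);
console.log(summary);
```

The computed values (about 0.61 for time saved, about 49.4 for styles, about 11.5 for URI length) agree with the rounded "60% faster", "49x" and "11x" figures.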

Test case: Same as above, but capture using toSvg, which generates a non-raster image (uses CSS reduction). This task uses the CSS reduction code from PR #37, which this PR does not alter significantly.

Baseline and with this PR (the performance is the same): The task takes 38.5 seconds, which involves copying 24374 styles (3.5 per element) from the webpage to the clone being captured. The clone's data URI length is 2174916 characters (2.1 MB), which is 15x shorter than the toBlob baseline, at the expense of 5x longer compute time.

Implementation details

For the raster image captures (all methods besides toSvg), the code uses a new copyUserComputedStyleFast function to copy styles from an HTML node to the clone. copyUserComputedStyleFast first obtains the default styles for HTML elements with the same tag name as the node being copied. (It does this by querying the computed styles for an HTML element within a hidden <iframe>. The default styles are identical to the computed styles, except for an adjustment we make for width and height; see note [1] below.) It then compares the node's styles with the default element styles and copies only the styles that differ to the clone.
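The caching-and-diffing idea can be sketched roughly as follows. This is not the code from the PR: `createDefaultStyleCache` and `nonDefaultStyles` are hypothetical names, and plain objects stand in for the results of getComputedStyle so that the sketch runs outside a browser.

```javascript
// Hypothetical helper: remembers the default styles per tag name so the
// (expensive) lookup runs only once for each kind of element.
function createDefaultStyleCache(computeDefaults) {
    var cache = new Map();
    return function getDefaults(tagName) {
        var key = tagName.toUpperCase();
        if (!cache.has(key)) {
            cache.set(key, computeDefaults(key));
        }
        return cache.get(key);
    };
}

// Hypothetical helper: keeps only the styles whose value differs from the default.
function nonDefaultStyles(computed, defaults) {
    var result = {};
    Object.keys(computed).forEach(function (name) {
        if (computed[name] !== defaults[name]) {
            result[name] = computed[name];
        }
    });
    return result;
}

// Usage with stand-in data instead of real computed styles.
var lookups = 0;
var getDefaults = createDefaultStyleCache(function (tagName) {
    lookups += 1; // in a browser, the hidden <iframe> would be queried here
    return { display: 'block', color: 'rgb(0, 0, 0)', width: 'auto' };
});

var nodeStyles = { display: 'block', color: 'rgb(255, 0, 0)', width: '120px' };
var toCopy = nonDefaultStyles(nodeStyles, getDefaults('div'));
getDefaults('DIV'); // same tag name, so this is served from the cache
console.log(toCopy, lookups);
```

Only `color` and `width` end up in `toCopy`, and the stand-in lookup runs once even though the defaults for the tag are requested twice.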

For the non-raster image captures (toSvg), the code uses a new copyUserComputedStyle function in place of the previous getUserComputedStyle. The previous function obtained the non-default styles for an HTML element, then assigned these styles to a dummy DOM element so it could return them. The new function is equivalent, except that it no longer needs to assign styles to a dummy DOM element, since it only copies styles rather than returning them. This might be slightly faster, although performance was identical in my tests.

One more minor change: Before this PR, toCanvas used CSS reduction, which I assumed was accidental, so I changed it to not require CSS reduction.

How does the Node environment differ from the browser environment? I'm asking because the code uses iframe.contentWindow.document.body and iframe.contentWindow.getComputedStyle while getting the default styles, which works great on Firefox, Edge, and Chrome (see MDN). I'm not sure whether a different syntax is needed for Node. Feel free to change any implementation details of this PR. Thanks for all your work on this amazing library!

[1]: For block elements, width, height, block-size, inset-block, transform-origin, and perspective-origin are "special" in that the initial and computed values don't match. For example, for width and height, the default style (initial value) is always auto, but the computed value (as returned by getComputedStyle) is an absolute value measured in px. If the code returned this computed value as the default style, then it wouldn't clone width for certain nodes (block elements with a set width that happened to match the measurement), which would be a bug. The same goes for height and the other 4 CSS properties listed above.

To remedy this, the code returns auto as the default style for width and height, since that is always the initial value. The other 4 CSS properties are used infrequently, and the likelihood that they would be used and happen to match the measurement is very slim, so the code doesn't remedy the issue for them.

If you do want to make the code more robust to accommodate these (very rare) situations, let me know and I'd be happy to change the PR to return the initial value for these other 4 properties as well. This could be done programmatically, for example, by creating two HTML elements of different sizes in the hidden <iframe> instead of one, and only returning the default styles that match between the two elements.
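The two-element idea from note [1] could look something like the sketch below. It is one possible reading of that suggestion, not code from the PR: `stableDefaults` is a hypothetical name, and the two plain objects stand in for the computed styles of two default elements of different sizes. Properties whose values differ between the two are treated as size-dependent and left out of the defaults, so a later diff against these defaults would always copy them.

```javascript
// Hypothetical helper: keeps only the properties that have the same value
// on both probe elements, i.e. values that do not depend on layout size.
function stableDefaults(stylesA, stylesB) {
    var defaults = {};
    Object.keys(stylesA).forEach(function (name) {
        if (stylesA[name] === stylesB[name]) {
            defaults[name] = stylesA[name];
        }
    });
    return defaults;
}

// Stand-in data for two default <div> elements of different widths.
var smallProbe = {
    display: 'block',
    'margin-top': '0px',
    width: '100px',
    'transform-origin': '50px 9px'
};
var largeProbe = {
    display: 'block',
    'margin-top': '0px',
    width: '200px',
    'transform-origin': '100px 9px'
};

var defaults = stableDefaults(smallProbe, largeProbe);
console.log(defaults);
```

With this stand-in data, `width` and `transform-origin` drop out of the defaults while `display` and `margin-top` remain.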

getComputedStyle returns about 340 CSS styles, almost all of which are default styles. Omitting these default styles from the raster image computations makes screenshots 60% faster and shortens the image data URIs by 11x in length.

`raster=true` is now applied to all methods besides `toSvg`, which was the original intention.

IDisposable · 2022-05-12T10:05:31Z

src/dom-to-image-more.js

+            sandbox.style.position = 'fixed';
+            document.body.appendChild(sandbox);
+            // Ensure that the iframe is rendered in standard mode
+            sandbox.contentWindow.document.write('<!DOCTYPE html><meta charset="UTF-8"><title>sandbox</title><body>');


I wonder if it's worthwhile reaching up to ensure the meta charset assumption is valid

Why not? Let's change this to

sandbox.contentWindow.document.write('<!DOCTYPE html><meta charset="' + (document.characterSet || 'UTF-8') + '"><title>sandbox</title><body>');

According to tests with the latest Chrome, Edge, and Firefox, document.characterSet is always defined, so we could use document.characterSet in place of (document.characterSet || 'UTF-8'), but I haven't been able to test earlier versions of these browsers so keeping the default as UTF-8 seems to be a good idea.

Do we need a PR for this or would you like to change it?

IDisposable · 2022-05-12T10:06:27Z

src/dom-to-image-more.js

+        var defaultElement = document.createElement(tagName);
+        sandbox.contentWindow.document.body.appendChild(defaultElement);
+        // Ensure that there is some content, so that properties like margin are applied.
+        defaultElement.textContent = '.';


Maybe set this to a zero-width non-breaking space instead?

As in defaultElement.textContent = ' ';?

For block elements like div, that results in the defaultElement having a height of 0px but still having a width. Since we override height to auto, it only changes the block-size, perspective-origin, and transform-origin values returned on the default style. For inline elements like span, that results in defaultElement having a width of 0px but still having a height. In this case it only changes the perspective-origin and transform-origin values returned on the default style.

Not sure whether this is desirable or not. Seems to be equivalent. The element being cloned would need to have the CSS property set, and the value would have to match the defaultElement computed value, in order for it to cause problems (i.e. omitting the style from the cloned HTML). Very slim chance this would happen.

I'm inclined to stick with some text content (rather than spaces), due to the StackOverflow answer from which I derived this section of code: https://stackoverflow.com/questions/42025329/how-to-get-the-applied-style-from-an-element/42068963#42068963. In his code snippet the author includes

// ensure that there is some content, so that e.g. margin is applied
elVanilla.textContent = 'foo';

I'm fine with either though. Feel free to change it to a single space if you'd like.

A small PITA from using '.': it's the start of a CSS class in a <style> tag, or an invalid dot operator in an embedded <script> tag. I plan to change it to ; as a minimum in #102.

IDisposable · 2022-05-12T10:10:20Z

Going to merge this as-is, but would appreciate further discussion on the two comments noted.

IDisposable · 2022-05-12T10:46:12Z

Released in 2.10.1

joswhite · 2022-05-12T15:28:30Z

Thanks for merging this!

zm-cttae · 2022-12-19T17:20:17Z

I've identified a regression in the original implementation of copyUserComputedStyle - #90.

The other methods were self-cleaning since 1904labs#71

joswhite added 2 commits May 10, 2022 19:56

Make toCanvas not use CSS reduction

673851b

`raster=true` is now applied to all methods besides `toSvg`, which was the original intention.

IDisposable reviewed May 12, 2022


IDisposable merged commit 6a72047 into 1904labs:master May 12, 2022

zm-cttae mentioned this pull request Dec 19, 2022

toSvg regression for explicit CSS values equal to inherit #90

Closed

2 tasks

zm-cttae mentioned this pull request Dec 21, 2022

[Question] Is that possible to run dom-to-image on NodeJS? #48

Open

zm-cttae mentioned this pull request Jan 7, 2023

Trusted Type API error for pages that rely on a private policy #99

Closed

2 tasks

joswhite deleted the remove-default-styles branch January 30, 2023 12:30

zm-cttae added a commit to zm-cttae/dom-to-image-more that referenced this pull request Jan 30, 2023

[nit] Remove sandbox iframe for toSvg

878f66c

The other methods were self-cleaning since 1904labs#71

zm-cttae mentioned this pull request Jan 30, 2023

Fix quirks mode regression #110

Closed

