New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove default styles #71
Conversation
getComputedStyle returns about 340 CSS styles, almost all of which are default styles. Omitting these default styles from the raster image computations makes screenshots 60% faster and shortens the image data URIs by 11x in length.
`raster=true` is now applied to all methods besides `toSvg`, which was the original intention.
sandbox.style.position = 'fixed'; | ||
document.body.appendChild(sandbox); | ||
// Ensure that the iframe is rendered in standard mode | ||
sandbox.contentWindow.document.write('<!DOCTYPE html><meta charset="UTF-8"><title>sandbox</title><body>'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's worthwhile reaching up to ensure the meta charset
assumption is valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? Let's change this to
sandbox.contentWindow.document.write('<!DOCTYPE html><meta charset="' +
(document.characterSet || 'UTF-8') + '"><title>sandbox</title><body>');
According to tests with the latest Chrome, Edge, and Firefox, document.characterSet
is always defined, so we could use document.characterSet
in place of (document.characterSet || 'UTF-8')
, but I haven't been able to test earlier versions of these browsers so keeping the default as UTF-8 seems to be a good idea.
Do we need a PR for this or would you like to change it?
var defaultElement = document.createElement(tagName); | ||
sandbox.contentWindow.document.body.appendChild(defaultElement); | ||
// Ensure that there is some content, so that properties like margin are applied. | ||
defaultElement.textContent = '.'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe set this to a zero-width non-breaking space instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in defaultElement.textContent = ' ';
?
For block elements like div
, that results in the defaultElement having a height of 0px
but still having a width. Since we override height to auto
, it only changes the block-size
, persective-origin
, and transform-origin
being returned on the default style, to a different number. For inline elements like span
, that results in defaultElement having a width of 0px
but still having a height. In this case it only changes persective-origin
and transform-origin
being returned on the default style.
Not sure whether this is desirable or not. Seems to be equivalent. The element being cloned would need to have the CSS property set, and the value would have to match the defaultElement
computed value, in order for it to cause problems (i.e. omitting the style from the cloned HTML). Very slim chance this would happen.
I'm inclined to stick with some text content (rather than spaces), due to the StackOverflow answer from which I derived this section of code: https://stackoverflow.com/questions/42025329/how-to-get-the-applied-style-from-an-element/42068963#42068963. In his code snippet the author includes
// ensure that there is some content, so that e.g. margin is applied
elVanilla.textContent = 'foo';
I'm fine with either though. Feel free to change it to a single space if you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small PITA from using .
... it's the start of a CSS class in a <style>
tag, or a invalid dot operator in a embedded <script>
tag. I plan to change it to ;
as a minimum in #102.
Going to merge this as-is, but would appreciate further discussion on the two comments noted. |
Released in 2.10.1 |
Thanks for merging this! |
I've identified a regression in the original implementation of |
The other methods were self-cleaning since 1904labs#71
getComputedStyle returns about 340 CSS styles for an element, almost all of which are default styles. According to my tests, omitting these default styles from the raster image captures makes screenshots 60% faster and shortens the image data URIs by 11x in length. (By "raster" I mean the types of captures not requiring CSS reduction, i.e. all methods besides
toSvg
, which shortens CSS styles even further but takes much longer to do so).This is similar to PR #37 (which added the CSS reduction for
toSvg
), but the code caches the default styles for each type of HTML element, so the calculations are significantly faster and actually speed up screen captures.This would close #70. It also helps close dom-to-image #245.
Performance comparisons according to local tests:
Test case: Capture a webpage with 5675 HTML elements using
toBlob
, which generates a raster image (does not use CSS reduction).Baseline: The task took 7.9 seconds (average between Firefox, Chrome, and Edge), which involved copying 1929500 styles (340 per element) from the webpage to the clone being captured. The clone's data URI length was 32561735 characters (31.1 MB).
With this PR: The task takes 3.1 seconds, which involves copying 39051 styles (6.9 per element) from the webpage to the clone. The clone's data URI length is 2843600 characters (2.7 MB). This is 60% faster, has 49x less styles to copy, and the data URI length is 11x shorter in length.
Test case: Same as above, but capture using
toSvg
, which generates a non-raster image (uses CSS reduction). This task uses the CSS reduction code from PR #37, which has not been altered significantly by the PR. The baseline, as well as the performance with this PR is the same: The task takes 38.5 seconds, which involves copying 24374 styles (3.5 per element) from the webpage to the clone being captured. The clone's data URI length is 2174916 characters (2.1 MB), which is 15x shorter than thetoBlob
baseline, at the expense of 5x longer compute time.Implementation details
For the raster image captures (all methods besides
toSvg
), the code uses a newcopyUserComputedStyleFast
function to copy styles from an HTML node to the clone.copyUserComputedStyleFast
first obtains the default styles for HTML elements with the same tag name as the node being copied. (It does this by querying the computed styles for an HTML element within a hidden<iframe>
. The default styles are identical to the computed styles, except an adjustment we make forwidth
andheight
- see note [1] below.) It compares the node with the default element styles, then only copies the styles that differ to the clone.For the non-raster image captures (
toSvg
), the code uses a newcopyUserComputedStyle
function in place of the previousgetUserComputedStyle
. The previous function obtained the non-default styles for an HTML element, then assigned these styles to a dummy DOM element so it could return them. The new function is equivalent, except it no longer needs to assign styles to a dummy DOM element since it only copies (not returns) styles. This might be a tad bit faster, although performance was identical in my tests.One more minor change: Before this PR,
toCanvas
used CSS reduction, which I assumed was accidental, so I changed it to not require CSS reduction.How does the Node environment differ from the browser environment? I'm asking because the code uses
iframe.contentWindow.document.body
andiframe.contentWindow.getComputedStyle
while getting the default styles, which works great on Firefox, Edge, and Chrome (see MDN). I'm not sure whether a different syntax is needed for Node. Feel free to change any implementation details of this PR. Thanks for all your work on this amazing library![1]: For block elements,
width
,height
,block-size
,inset-block
,transform-origin
, andperspective-origin
are "special" in that the initial and computed values don't match. Let me explain. For example, forwidth
andheight
, the default style (initial value) is alwaysauto
. However, the computed value (as returned bygetComputedStyle
) is an absolute value measured inpx
. If the code returned this as the default style, then it wouldn't clonewidth
for certain nodes (block elements with a set width that matched the measurement), which would be a bug. The same goes forheight
and the other 4 CSS properties listed above. To remedy this, the code returnsauto
as the default style forwidth
andheight
, since that is always the initial value. The other 4 CSS properties are used infrequently, and the likelihood that they would be used and happen to match the measurement is very slim, so the code doesn't remedy the issue for them. If you do want to make the code more robust to accommodate these (very rare) situations, let me know and I'd be happy to change the PR to return the initial value for these other 4 properties as well. This could be done programmatically, for example, by simply creating two HTML elements in the hidden<iframe>
instead of one, of different sizes, and only returning the default styles that matched between the two elements.