Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capture dynamically inserted CSS rules #45

Open
Treora opened this issue May 11, 2019 · 5 comments
Open

Capture dynamically inserted CSS rules #45

Treora opened this issue May 11, 2019 · 5 comments
Labels
snapshot quality Improving fidelity/size/durability/etc of the output

Comments

@Treora
Copy link
Contributor

Treora commented May 11, 2019

Scripts can modify stylesheets using the CSSOM functions like CSSStylesheet.insertRule(), used by e.g. emotion.js.

The contents of a <style> element appear to not be updated to reflect the new rules, so I suppose we will thus have to do this ourselves, by going through the rules using CSSOM. Or we take another approach altogether to preserve styles.

In the wild, I observed the problem with images on NYTimes articles, which become a mess (they use emotion.js).

@Treora Treora added the snapshot quality Improving fidelity/size/durability/etc of the output label May 11, 2019
@Gozala
Copy link
Contributor

Gozala commented Mar 10, 2020

I was about to report this (as I keep running into it a lot) and turns out there is already an issue. I was wondering if capturing styling by serializing document.styleSheets had being considered at all. Potentially it could fall back to fetching contents for styleSheets who's rules can't be accessed at runtime.

I think this also has benefit of not needing to deal with import rules and etc.. as I believe imported sheets would also be listed.

@Gozala
Copy link
Contributor

Gozala commented Mar 10, 2020

Alternative strategy might be to inline all the styles into elements themselves by obtaining those through getComputedStyle API. But that is likely to increase overall page size.

@Treora
Copy link
Contributor Author

Treora commented Mar 10, 2020

I was wondering if capturing styling by serializing document.styleSheets had being considered at all.

Yes, something to be tried out. I have not investigated how rich the CSSOM is, i.e. whether it lets us easily serialise the stylesheet again to get a string; ideally it would have (or we could create) an equivalent of Element.outerHTML. But I vaguely remember that unlike with the HTML DOM, the comments cannot be accessed in CSSOM, quite unfortunately! (I think that put me off originally, opting to use postcss instead)

If we could always use the CSSOM, we could even remove the hassle with (and dependency on) post-css, which is now used to parse stylesheets in extract-links/from-css.js.

But as you say, we may need to fall back to fetching the contents, if the rules cannot be accessed through the CSSOM. Similar to how we deal with frames.

@Treora
Copy link
Contributor Author

Treora commented Mar 10, 2020

I think this also has benefit of not needing to deal with import rules and etc.. as I believe imported sheets would also be listed.

You mean we would, for example, create two <link …> elements, instead of keeping one link to a sheet that imports another? This could be an optimisation to flatten the resource tree (and in the current data: URL approach, save space by avoiding base64-encoding an already base64-encoded string).

However, this approach would also manipulate the DOM unnecessarily. For various use cases, keeping things as equal as possible is desirable (e.g. for archiving or to keep xpath pointers intact). Moreover, some tricks would be needed to fix semantics; for example, if an stylesheet <link href="print.css" rel="stylesheet" media="print"> has an @import (orientation.landscape) we should ensure that the newly created <link> (or style) element combines both media queries.

So I would probably opt to just leave the import structure intact. I do not see a difficulty in dealing with import rules (they have a .styleSheet property to walk down the CSSOM), but I may miss somtehing. Instead of using document.styleSheets, we could perhaps query all the style and link[rel~=stylesheet i] elements, and then take their .sheet property.

By the way, one reason for reading document.styleSheets: I see here that it also includes stylesheets defined in an HTTP Link header. Not that anybody uses that (I think?), but it may nice to support some day.

@Gozala
Copy link
Contributor

Gozala commented Mar 10, 2020

You mean we would, for example, create two <link …> elements, instead of keeping one link to a sheet that imports another? This could be an optimisation to flatten the resource tree (and in the current data: URL approach, save space by avoiding base64-encoding an already base64-encoded string).

However, this approach would also manipulate the DOM unnecessarily. For various use cases, keeping things as equal as possible is desirable (e.g. for archiving or to keep xpath pointers intact). Moreover, some tricks would be needed to fix semantics; for example, if an stylesheet <link href="print.css" rel="stylesheet" media="print"> has an @import (orientation.landscape) we should ensure that the newly created <link> (or style) element combines both media queries.

I don't have specifics I just noticed that there is ownerRule property on CSSStyleSheet that according to MDN:

The read-only CSSStyleSheet property ownerRule returns the CSSImportRule corresponding to the @import at-rule which imported the stylesheet into the document. If the stylesheet wasn't imported into the document using @import, the returned value is null.

Which seems to imply that imported sheets would also be in the document.styleSheets but may not be in HTML tree.

I guess you don't necessarily need to create link elements for imported sheets. Imported sheets could be just serialized and and import rule could be updated instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
snapshot quality Improving fidelity/size/durability/etc of the output
Projects
None yet
Development

No branches or pull requests

2 participants