Creating a consistent DOM snapshot (including iframes)? #3658
Comments
Hey @tolmasky, You can capture an MHTML of the page using the Will this work for you? cc +@psybuzz
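The method name did not survive in the comment above. As a hedged sketch, one way to get MHTML out of Puppeteer is the DevTools Protocol method `Page.captureSnapshot`; that this is the call the comment meant is an assumption. `captureMhtml` is a hypothetical helper name, and `page` is assumed to be a Puppeteer `Page`.

```javascript
// Sketch: capture the whole page (frames included) as one MHTML string.
// Assumptions: `page` is a Puppeteer Page, and Page.captureSnapshot is the
// protocol method in use (it is marked experimental in the protocol).
async function captureMhtml(page) {
  // Open a raw DevTools Protocol session for this page's target.
  const client = await page.target().createCDPSession();
  try {
    // The protocol returns the serialized page in the `data` field.
    const { data } = await client.send('Page.captureSnapshot', { format: 'mhtml' });
    return data;
  } finally {
    // Release the session even if the capture fails.
    await client.detach();
  }
}
```

Usage would look like `const mhtml = await captureMhtml(page);` after `page.goto(...)` has settled. The capture happens in a single protocol call rather than by walking the frame tree from the script.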
Hi @aslushnikov, Thanks for the quick reply. The I tried looking online a bit for a good description of the internals of MHTML, but haven't found anything definitive yet (I would certainly appreciate a link if one exists!). With regard to iframes and shadow DOM, how are they "translated" into the single-file format? Is the iframe turned into a div (with appropriate overflow styles, etc., to simulate a replaced element)? Similarly, is the shadow DOM merely inlined along with the rest of the DOM? And are scripts simply discarded? (That would be ideal in my case, since I want them to be "dead" when I display the snapshot later.)
Check this out: https://goo.gl/GYT7Br
iframes are serialized as iframes. Shadow DOM gets a special "shadowmode" attribute, which afaik is Chrome-specific, so saving a page with shadow DOM as MHTML and opening it in Firefox will not work.
Yes, this seems to be the case. Beware though: MHTML is experimental. I'd try playing around with it to see how it works.
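To see how iframes end up in the file, it can help to list the parts of a captured MHTML string. MHTML is a multipart MIME document, so the sketch below splits on the declared boundary and reads each part's headers. `listMhtmlParts` is a hypothetical helper, and it does not handle folded part headers or decode bodies. Chrome-generated files typically give each frame its own part with a `Content-ID` that the parent's iframe references through a `cid:` URL; treat that detail as an assumption to verify against your own capture.

```javascript
// Sketch: list the parts of an MHTML (multipart/related) string.
// Assumption: part headers are unfolded, one "Name: value" pair per line.
function listMhtmlParts(mhtml) {
  // The boundary is declared in the top-level Content-Type header.
  const match = /boundary="?([^"\r\n;]+)"?/i.exec(mhtml);
  if (!match) throw new Error('no multipart boundary found');
  const delimiter = '--' + match[1];

  // Chunk 0 is the top-level header block; every later chunk is a part,
  // except the closing delimiter, which leaves a chunk starting with "--".
  return mhtml
    .split('\n' + delimiter)
    .slice(1)
    .filter((chunk) => !chunk.startsWith('--'))
    .map((chunk) => {
      // Headers run from the start of the part up to the first blank line.
      const headerBlock = chunk.replace(/^\r?\n/, '').split(/\r?\n\r?\n/)[0];
      const headers = {};
      for (const line of headerBlock.split(/\r?\n/)) {
        const idx = line.indexOf(':');
        if (idx > 0) {
          headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
        }
      }
      return {
        type: headers['content-type'],
        location: headers['content-location'],
        id: headers['content-id'],
      };
    });
}
```

Running this over a real capture should show one `text/html` part for the main document plus further parts for frames, stylesheets, and images.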
@tolmasky It may be too late, but it looks like SingleFile would fulfill your needs; more info here: https://github.com/gildas-lormeau/SingleFile/tree/master/cli. For your information, SingleFile serializes shadow DOM elements into iframes. It's far from perfect, but it works well for embedded tweets today. Edit: it now serializes them into templates (instead of iframes) and adds a small script to attach them to the shadow root.
I would like to take a "snapshot" of the DOM (minus scripts) that takes iframe contents into account, ideally all at the same time. By that I mean I would prefer not to climb the iframe tree through promises in the top-level script (mainFrame().children...), which could result in different parts of the snapshot being captured at different times. I'm curious whether the current API offers a way to run one long blocking operation that does a "deep" outerHTML of sorts on a page (replacing each iframe with a div equivalent containing its associated HTML) and removes all scripts of any sort. If not, is this something that could be considered for Chromium to enable through the protocol? Essentially, I want something similar to Safari's web archive feature, but generating one big string.