The document may have a <meta charset="..."> tag in the <head>, but that tag becomes obsolete because we work with the parsed document and later stringify it again. I suppose we could/should delete it from the DOM when capturing it.
Conversely, we may want to add the appropriate <meta charset="..."> tag to the snapshot; but this seems a task for the application invoking freeze-dry, as we do not know in which encoding the application will store the string.
We could thus:
Leave the snapshot without a charset declaration and tell callers to add it themselves. But they won't have the parsed DOM, making this a hassle.
Easier, then, is to let the application pass the desired encoding as an option to freezeDry(...).
Alternatively, we could encode all non-ASCII characters as HTML character references so that our string contains only plain ASCII, which I presume (rightly or wrongly?) removes the need to declare the charset.
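To make the last option concrete, here is a minimal sketch of escaping everything outside ASCII as numeric character references. The function name escapeNonAscii is hypothetical and not part of freeze-dry. Note the caveat that makes this less attractive than it sounds: character references are not decoded inside <script> and <style> contents, so applying this to a whole serialised page could alter inline scripts and styles.

```javascript
// Hypothetical helper, not part of freeze-dry: replace every character
// outside the ASCII range with a hexadecimal numeric character reference.
function escapeNonAscii(html) {
  // Array.from iterates by code point, so astral characters (e.g. emoji)
  // are handled as one unit rather than as two surrogate halves.
  return Array.from(html, ch => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0x7f
      ? `&#x${codePoint.toString(16).toUpperCase()};`
      : ch;
  }).join('');
}

// Usage: the result contains only ASCII characters.
const escaped = escapeNonAscii('<p>caf\u00e9</p>');
console.log(escaped);
```

An ASCII-only string decodes identically under any ASCII-compatible encoding, but not under, say, UTF-16, so this would reduce rather than fully remove the dependence on the chosen encoding.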
Resolved in commit cefd79c, which adds the encoding declaration requested by the caller (the second option above) and presumptively defaults to utf-8. My reasoning, as put in the commit message:
Since we return a string, how the user will encode that string should
ideally not matter to us. However, as HTML has the remarkable approach
of declaring the encoding somewhere inside the string, the user would
need to parse part of the DOM again to insert the declaration at the
right spot. If the user already knows how it will encode the string
afterward, I suppose we can help by inserting the declaration already.
In any case, we should remove any encoding declarations that the page
originally had, because the file is always reencoded.
Regarding the default action, an intuitive behaviour would be to not add
any meta tag. But because utf-8 is the most widespread and officially
recommended encoding for web documents, and also because many javascript
APIs use it as the default (or only) encoding (e.g. the Blob
constructor), it feels like a helpful default.
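To illustrate the point about JavaScript APIs: TextEncoder only ever produces UTF-8, and strings passed to the Blob constructor are likewise encoded as UTF-8. The sketch below (encodeSnapshot is a hypothetical name) shows the bytes an application would typically end up storing.

```javascript
// Hypothetical helper: turn a snapshot string into the bytes that would be
// written to storage. TextEncoder has no encoding option; it is always UTF-8.
function encodeSnapshot(html) {
  return new TextEncoder().encode(html);
}

const snapshot = '<p>caf\u00e9</p>';
const bytes = encodeSnapshot(snapshot);

// The string has 11 UTF-16 code units, but the non-ASCII letter takes
// two bytes in UTF-8 (0xC3 0xA9), so the byte array is one longer.
console.log(snapshot.length, bytes.length);

// Decoding with the same encoding restores the original string.
const roundTripped = new TextDecoder('utf-8').decode(bytes);
console.log(roundTripped === snapshot);
```

If a reader later decodes these bytes under a different assumed encoding, the two bytes of the accented letter are interpreted separately, which is exactly the mismatch a correct charset declaration prevents.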
I suppose that snapshots have so often worked fine so far simply because many web pages have a utf-8 declaration, which we did not remove, while applications (at least the WebMemex browser extension) also use utf-8 encoding.
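For reference, the behaviour described above could be sketched roughly as follows. This is not the code from the commit; the function name setCharsetDeclaration and its parameters are assumptions made for illustration. It removes whatever encoding declarations the page had, then inserts the requested one at the start of the <head>, since the declaration has to appear early in the document to be picked up.

```javascript
// Illustrative sketch only; names and signature are hypothetical.
// `doc` is a parsed Document (or a clone of one); `charset` is the encoding
// the caller will use to store the string, or null to add no declaration.
function setCharsetDeclaration(doc, charset = 'utf-8') {
  // Remove declarations the page originally had, in both common forms:
  // <meta charset="..."> and <meta http-equiv="Content-Type" content="...">.
  const selector = 'meta[charset], meta[http-equiv="content-type" i]';
  doc.querySelectorAll(selector).forEach(element => element.remove());

  if (charset !== null) {
    const meta = doc.createElement('meta');
    meta.setAttribute('charset', charset);
    // Put the declaration first in <head> so it comes before any text.
    doc.head.prepend(meta);
  }
}
```

In a browser this would be called on the document clone just before it is stringified, e.g. setCharsetDeclaration(clonedDoc, 'utf-8'), where clonedDoc stands for whatever copy of the DOM is being captured.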