Handle encoding of subresources #46

Open
Treora opened this issue Jun 1, 2019 · 0 comments
Labels
snapshot quality Improving fidelity/size/durability/etc of the output

Treora commented Jun 1, 2019

Freeze-dry produces garbled output when a stylesheet or framed document is encoded in UTF-16, UTF-32, or possibly other non-UTF-8 encodings. We use FileReader.readAsText to decode these resources, which by default assumes UTF-8 encoding. This assumption is adequate most of the time, but when it is wrong the resource is effectively unreadable.
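To illustrate the failure mode, here is a minimal sketch that uses TextDecoder as a stand-in for the decoding step (it is not freeze-dry's actual code). The sample bytes are a made-up UTF-16LE stylesheet without a byte order mark.

```javascript
// Made-up sample: the CSS text "a{}" encoded as UTF-16LE, without a BOM.
const utf16Bytes = new Uint8Array([0x61, 0x00, 0x7b, 0x00, 0x7d, 0x00]);

// Decoding with the wrong (default) encoding keeps every second byte as a
// NUL character, so the stylesheet no longer parses as intended.
const decodedAsUtf8 = new TextDecoder('utf-8').decode(utf16Bytes);

// Decoding with the right label recovers the original text.
const decodedAsUtf16 = new TextDecoder('utf-16le').decode(utf16Bytes);

console.log(JSON.stringify(decodedAsUtf8));
console.log(JSON.stringify(decodedAsUtf16));
```

FileReader.readAsText also accepts an optional second argument naming the encoding, so the real question is how to find the right label to pass.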

I do not know enough about the standards, but I suppose the decoder should look at the HTTP Content-Type header, the file’s byte order mark (BOM), and in-document declarations (@charset in CSS, <meta charset=…> in HTML).
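As a rough sketch of what such a detector could look like for a stylesheet: the function names below are hypothetical, and the precedence (BOM first, then HTTP header, then @charset, then UTF-8) is an assumption of this sketch that should be checked against the CSS and Encoding specs. UTF-32 is not handled here.

```javascript
// Hypothetical helper: recognise a byte order mark at the start of the bytes.
function sniffBom(bytes) {
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8';
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be';
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le';
  return null;
}

// Hypothetical helper: extract the charset parameter of a Content-Type value.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: read an @charset rule, which has to be the very first
// thing in the file. The leading bytes are interpreted one byte per character.
function charsetFromCssRule(bytes) {
  const head = String.fromCharCode(...bytes.subarray(0, 1024));
  const match = /^@charset "([^"]+)";/.exec(head);
  return match ? match[1].toLowerCase() : null;
}

// Assumed precedence: BOM, then HTTP header, then @charset, then UTF-8.
function guessCssEncoding(bytes, contentType) {
  return sniffBom(bytes)
    || charsetFromContentType(contentType)
    || charsetFromCssRule(bytes)
    || 'utf-8';
}
```

Here `bytes` is expected to be a Uint8Array of the raw response body. The returned label could then be passed to TextDecoder or to FileReader.readAsText. An HTML variant would look for a meta charset declaration instead of @charset.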

This detection-and-decoding problem seems so generic that it should not have to burden this repo, but I have not yet discovered the right tool. Some options I thought of:

  • The browser’s fetch, which unfortunately appears not to help with decoding; its Response.text() is spec'd to "return the result of running UTF-8 decode on bytes".
  • XMLHttpRequest.responseText does seem to respect HTTP header and BOM, though I am not sure about in-document declarations. And it feels a little outdated, as I think fetch was supposed to make it obsolete; but perhaps not.
  • Some JavaScript module? I have not yet found anything that comes close.
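One more possibility, sketched here under assumptions: keep using fetch, but read the body as raw bytes with Response.arrayBuffer() and decode them with TextDecoder, using the charset from the Content-Type header. The function names are hypothetical. This only covers the HTTP header, not BOM-based switching or in-document declarations.

```javascript
// Hypothetical helper: decode raw bytes using the charset named in a
// Content-Type header value, falling back to UTF-8.
function decodeBody(buffer, contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || '');
  let decoder;
  try {
    decoder = new TextDecoder(match ? match[1] : 'utf-8');
  } catch (error) {
    // TextDecoder throws for encoding labels it does not recognise.
    decoder = new TextDecoder('utf-8');
  }
  return decoder.decode(buffer);
}

// Hypothetical wrapper: fetch a subresource and decode it ourselves instead
// of relying on Response.text(), which always decodes as UTF-8.
async function fetchAndDecode(url) {
  const response = await fetch(url);
  const buffer = await response.arrayBuffer();
  return decodeBody(buffer, response.headers.get('Content-Type'));
}
```

TextDecoder strips a BOM that matches its own encoding, but it does not switch encodings based on a BOM, so BOM sniffing would still have to happen before choosing the label.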

Tips welcome.

Note this issue is similar to issue #29, but that one concerns the DOM that the browser has already decoded for us; this issue is about subresources we fetch.

@Treora Treora added the snapshot quality Improving fidelity/size/durability/etc of the output label Jun 1, 2019