-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add browser-style text encoding detection #203
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTML is never assumed to be UTF-8, that has to be declared explicitly. This is a bit iffy, it'll definitely break some real world output. But it's consistent with how browsers handle it. See https://hsivonen.fi/utf-8-detection/.
One argument against this is that fetch and XHR can assume UTF-8 even if loading as a normal webpage can't. What do you think?
Being consistent with how browsers handle it seems reasonable to me. If has_charset_utf8()
and chardetng
both fail, users will always be able to set the charset with --response-charset
.
I think I changed my mind about HTML and UTF-8. HTTPie specializes in APIs, and you'd usually talk to those using some method that can assume UTF-8. |
82e2729
to
994676a
Compare
Can you update the commit description, especially the bit about HTML and UTF-8? |
- If there's no explicit encoding it's detected using BOM sniffing or using chardetng, which was made for Firefox. - BOM sniffing is no longer used when the encoding is explicit. - Streaming and non-streaming decodes now behave the same.
994676a
to
adcd194
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved 🎉
Can you also update the Pull request's description? thanks
One argument against this is that fetch and XHR can assume UTF-8 even if loading as a normal webpage can't. What do you think?
I had to do a bunch of research for this one, I hope it's not too hard to follow.