Setting a default encoding ignores encoding set by web page #19

Open
Polda18 opened this issue Sep 29, 2020 · 1 comment

Comments

@Polda18
Polda18 commented Sep 29, 2020

Hello. I installed this add-on to replace the one I used before, because it lets me set a default encoding for pages that don't declare one. Such pages, most often plain text files, are encoded in UTF-8 but display incorrectly on Windows machines. I set the default encoding to UTF-8 because that is what most pages use; Google Chrome on Windows seems to default to Windows-1250 (Central Europe), which isn't ideal when a page isn't configured properly.

However, I found that the add-on ignores the encoding declared by the HTML tag <meta charset="utf-8" />, where the charset value can be any encoding. I'd like the charset declared by the page to be kept rather than overwritten by my default. Most pages are unaffected because they use the same encoding (UTF-8), but some use a different one, such as Windows-1250 or ISO-8859-2 (both for Central European languages). Right now the add-on ignores the encoding declared by the page and displays the page in the wrong encoding, which corrupts the text.

@jinliming2
Owner

Hi, the charset set in the <meta> tag is not reliable.
The <meta> tag can be read only if the HTML file is stored in an ASCII-compatible encoding. If the file is not decoded with the correct encoding, the <meta> tag may be unreadable, and the charset it declares is then ignored.
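
To illustrate the point about ASCII-compatible encodings, here is a minimal sketch in TypeScript. It is not the extension's code and not the full HTML prescan algorithm; `sniffMetaCharset`, `encodeAscii`, and `encodeUtf16le` are hypothetical helpers, and the 1024-byte limit is an assumption. It reads the leading bytes as if they were ASCII and looks for a charset declaration.

```typescript
// Hypothetical helper: scan the first bytes of a document for a <meta> charset.
// Simplified on purpose; a browser's real prescan handles many more cases.
function sniffMetaCharset(bytes: Uint8Array, limit: number = 1024): string | null {
  const end = Math.min(bytes.length, limit);
  let head = "";
  for (let i = 0; i < end; i++) {
    // Treat every byte as one character; this is only exact for ASCII bytes.
    head += String.fromCharCode(bytes[i]);
  }
  const match = /<meta[^>]*?charset\s*=\s*["']?([A-Za-z0-9_\-]+)/i.exec(head);
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: one byte per character (input assumed to be ASCII).
function encodeAscii(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    out[i] = text.charCodeAt(i) & 0xff;
  }
  return out;
}

// Hypothetical helper: two bytes per UTF-16 code unit, little-endian, no BOM.
function encodeUtf16le(text: string): Uint8Array {
  const out = new Uint8Array(text.length * 2);
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out[2 * i] = unit & 0xff;
    out[2 * i + 1] = unit >> 8;
  }
  return out;
}

const tag = '<meta charset="windows-1250">';
const fromAscii = sniffMetaCharset(encodeAscii(tag));     // declaration is found
const fromUtf16 = sniffMetaCharset(encodeUtf16le(tag));   // declaration is not found
```

In the UTF-16 case every ASCII character is followed by a zero byte, so the byte-level scan never sees the literal text `<meta` and the declaration goes unnoticed.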

Currently, this extension modifies the charset in the HTTP response header before the web page starts loading, so the HTML content is loaded using the specified encoding.
At the moment the character encoding is modified, the HTML content has not been loaded yet, and it may not be possible to change the character encoding after the page has started loading.
So it's hard to detect whether a charset was set in the <meta> tag.
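
The header rewrite described above could look roughly like the following TypeScript sketch. It is an illustration under assumptions, not the extension's actual code: `withCharset` is a hypothetical helper, and it assumes the header involved is `Content-Type`. In a real extension, a function like this would presumably be called from a response-header hook, where the page body is not yet available.

```typescript
// Hypothetical helper: force a charset parameter onto a Content-Type value.
function withCharset(contentType: string, charset: string): string {
  // Split "text/html; charset=iso-8859-2" into the media type and its parameters.
  const parts = contentType
    .split(";")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  // Assumption for this sketch: fall back to text/html if the header value is empty.
  const mediaType = parts.length > 0 ? parts[0] : "text/html";
  // Drop any existing charset parameter and keep all other parameters.
  const others = parts.slice(1).filter((p) => !/^charset\s*=/i.test(p));
  return [mediaType].concat(others, ["charset=" + charset]).join("; ");
}

const added = withCharset("text/html", "utf-8");
const replaced = withCharset("text/html; charset=ISO-8859-2", "utf-8");
```

Because a charset in the HTTP header takes precedence over a `<meta>` declaration when the browser picks an encoding, a rewrite of this kind overrides whatever the page declares in its markup, which matches the behaviour reported in this issue.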

Any suggestion?
