Encoding issue #20

Open
grifx opened this issue May 18, 2022 · 0 comments

grifx commented May 18, 2022

Not sure if this helps, but the character › was sometimes rendered as â€º on my website.

Context

The charset was not declared correctly in the HTML document (a typo), and the CSS file where the character was used did not declare a charset either.

Data

›   -> (UTF-8)    E2   80   BA
â€º -> (UTF-16) 00E2 20AC 00BA

›   -> 1110 0010                     1000 0000 1011 1010
â€º -> 1110 0010 0010 0000 1010 1100 0000 0000 1011 1010

Again, I'm not sure whether this is what causes the wrong charset detection or misleads the algorithm, since the binary doesn't match 1:1. But since 0x80 seems to be a special value in the algorithm, I thought this might be an edge case that could be fixed.
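
For what it's worth, the data above can be reproduced with a few lines of Python. This is only a sketch under one assumption of mine: that the UTF-8 bytes were decoded as Windows-1252 (cp1252), where byte 0x80 maps to the euro sign U+20AC. I have not confirmed that this is what the detection algorithm actually picked.

```python
# Assumption (not confirmed): the page's UTF-8 bytes were decoded as
# Windows-1252 (cp1252) because no valid charset was declared.
char = "\u203a"  # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

# Encode the character the way the file stores it.
utf8_bytes = char.encode("utf-8")

# Decode those same bytes with the wrong (assumed) charset.
mojibake = utf8_bytes.decode("cp1252")

# Stored bytes, as in the first row of the data.
print(utf8_bytes.hex(" ").upper())

# Code points of the mis-decoded text, as in the second row of the data.
print(" ".join(f"{ord(c):04X}" for c in mojibake))

# Reversing the wrong decoding recovers the original character.
print(mojibake.encode("cp1252").decode("utf-8") == char)
```

If the assumption holds, 0x80 is the only byte of the three whose value changes when decoded (to U+20AC), which would explain why the binary doesn't match 1:1.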

Cheers,
