netsurf: handle encoding change parser error #154

Open
krichprollsch opened this issue Jan 12, 2024 · 1 comment

@krichprollsch
Member

When the HTML contains a META tag declaring an encoding different from the one used to parse the document, netsurf returns a c.DOM_HUBBUB_HUBBUB_ERR_ENCODINGCHANGE error.

In this case, we must restart parsing with the newly detected encoding.
The detected encoding is stored in the document, and we can retrieve it with documentGetInputEncoding().
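
The restart logic could look roughly like the sketch below. It is written in Python only to illustrate the control flow, not as the actual Zig/netsurf code. `EncodingChange`, `parse_with_restart`, and the `parse` callback are hypothetical names: the exception stands in for c.DOM_HUBBUB_HUBBUB_ERR_ENCODINGCHANGE, and its `detected` field stands in for the value we would read with documentGetInputEncoding().

```python
class EncodingChange(Exception):
    """Hypothetical stand-in for the encoding-change parser error."""

    def __init__(self, detected):
        super().__init__(detected)
        # Encoding the parser found in the document (e.g. from a META tag).
        self.detected = detected


def parse_with_restart(parse, data, encoding, max_restarts=1):
    """Call parse(data, encoding); on EncodingChange, retry with the detected encoding.

    `parse` is a hypothetical callback. The number of restarts is bounded so a
    document that keeps reporting a new encoding cannot loop forever.
    """
    for attempt in range(max_restarts + 1):
        try:
            return parse(data, encoding)
        except EncodingChange as err:
            # Give up if we are out of restarts or the parser reports the
            # encoding we already tried.
            if attempt == max_restarts or err.detected == encoding:
                raise
            encoding = err.detected
```

Note that this assumes the whole input is still available for the second pass, which a streaming reader does not guarantee.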

Related to #152 and #153.

Slack discussion: https://lightpanda.slack.com/archives/C05TRU6RBM1/p1705070165019409

@krichprollsch krichprollsch added the bug Something isn't working label Jan 12, 2024
@krichprollsch krichprollsch self-assigned this Jan 12, 2024
@krichprollsch
Member Author

A good solution would be to read the first 1 KB of data from the reader and try to extract the charset declaration in Zig.
If one is found, use it to parse the document instead of the charset from the HTTP response.

The parser must then receive a custom reader that yields the already-read buffer first, followed by the remaining data.
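
To make the idea concrete, here is a minimal sketch of the sniff-then-replay approach. It is in Python for illustration only; the real implementation would be in Zig. All names (`SNIFF_LIMIT`, `sniff_charset`, `PrefixedReader`, `open_with_sniffed_charset`) are hypothetical, and the regex is a deliberate simplification of real META charset detection.

```python
import io
import re

SNIFF_LIMIT = 1024  # the "first 1 KB" from the proposal

# Simplified pattern (an assumption, not the full HTML prescan algorithm):
# matches <meta charset="x"> and <meta ... content="text/html; charset=x">.
_CHARSET_RE = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_charset(prefix):
    """Return the lowercased charset declared in `prefix`, or None."""
    match = _CHARSET_RE.search(prefix)
    return match.group(1).decode("ascii").lower() if match else None


class PrefixedReader:
    """Reader that yields an already-read prefix first, then the underlying reader."""

    def __init__(self, prefix, rest):
        self._prefix = prefix
        self._rest = rest

    def read(self, size=-1):
        if size is None or size < 0:
            # Read everything: remaining prefix followed by the rest of the stream.
            data, self._prefix = self._prefix, b""
            return data + self._rest.read()
        if self._prefix:
            # Serve from the buffered prefix first (may be a short read).
            data, self._prefix = self._prefix[:size], self._prefix[size:]
            return data
        return self._rest.read(size)


def open_with_sniffed_charset(reader, http_charset):
    """Pick the META charset if present, else the HTTP one, and replay the prefix."""
    prefix = reader.read(SNIFF_LIMIT)
    charset = sniff_charset(prefix) or http_charset
    return charset, PrefixedReader(prefix, reader)
```

A possible usage: `charset, reader = open_with_sniffed_charset(io.BytesIO(body), "utf-8")`, then hand `reader` and `charset` to the parser. On a network stream, a single `read` may return fewer than 1 KB, so a real version would likely loop until the limit or end of input is reached.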

@francisbouvier francisbouvier added this to the Public Beta milestone Jan 26, 2024

2 participants