
Improve UTF-8 Decoding Accuracy for Binary Data #29

Open
HsuMsix opened this issue Mar 9, 2024 · 0 comments

Comments

HsuMsix commented Mar 9, 2024

Problem:

Currently, the application uses String.fromCharCode.apply(null, new Uint8Array(...)) to convert binary data from network requests into strings. Because this approach treats each byte as an individual UTF-16 code unit, it is unreliable for UTF-8 encoded text, especially multi-byte characters such as Chinese, and produces garbled output.

Proposed Solution:

I suggest switching to the TextDecoder API for decoding binary data. TextDecoder decodes byte sequences as UTF-8 rather than byte by byte, ensuring an accurate representation of all characters, including those outside the ASCII range.

For Example:

In background.js, line 109

var postedString = String.fromCharCode.apply(null, new Uint8Array(details.requestBody.raw[0].bytes));

change to

// Decode the raw request body bytes as UTF-8
var bytes = new Uint8Array(details.requestBody.raw[0].bytes);
var decoder = new TextDecoder('utf-8');
var postedString = decoder.decode(bytes);

Results:

(screenshot: fix)

If you have any questions, please let me know. Thank you!
