tool_cb_wrt: fix invalid unicode for windows console #10890
I can't compile the binary, but I can check the test binary if needed. Maybe there is a link to the compiled version?
You can download artifacts from the most recent AppVeyor CI jobs, for example https://ci.appveyor.com/project/curlorg/curl/builds/46705722 There are other ones you can download that do not require Visual C++ debugging DLLs, but you may need other DLLs like OpenSSL.
There are no more artifacts, everything is fine! Thank you! (I updated this comment because I tested the build incorrectly at first.)
The bug is defeated :-) I hope we'll see the fix in the release soon, thanks!
bdcbb30 to eb9cc11
- Suppress an incomplete UTF-8 sequence at the end of the buffer.

- Attempt to reconstruct incomplete UTF-8 sequence from prior call(s) in current call.

Prior to this change, in Windows console, UTF-8 sequences split between two or more calls to the write callback would cause invalid "replacement characters" U+FFFD to be printed instead of the actual Unicode character. This is because in Windows only UTF-16 encoded characters are printed to the console, therefore we convert the UTF-8 contents to UTF-16, which cannot be done with partial UTF-8 sequences.

Reported-by: Maksim Arhipov

Fixes curl#9841
Closes #xxxx
Why was this fix not included in the latest update? Can no one handle this?
This was held up because I don't have a test for it. I will give it another look to see if we can find some way to test for it. Our test apparatus doesn't cover situations like this. I think this is a worthwhile addition, and unfortunately I may have to add it without a test.
- Suppress an incomplete UTF-8 sequence at the end of the buffer.

- Attempt to reconstruct incomplete UTF-8 sequence from prior call(s) in current call.

Prior to this change, in Windows console, UTF-8 sequences split between two or more calls to the write callback would cause invalid "replacement characters" U+FFFD to be printed instead of the actual Unicode character. This is because in Windows only UTF-16 encoded characters are printed to the console, therefore we convert the UTF-8 contents to UTF-16, which cannot be done with partial UTF-8 sequences.

Reported-by: Maksim Arhipov

Fixes curl#9841
Closes curl#10890
- Suppress an incomplete UTF-8 sequence at the end of the buffer.
- Attempt to reconstruct incomplete UTF-8 sequence from prior call(s) in current call.
Prior to this change, in Windows console, UTF-8 sequences split between two or more calls to the write callback would cause invalid "replacement characters" U+FFFD to be printed instead of the actual Unicode character. This is because in Windows only UTF-16 encoded characters are printed to the console, therefore we convert the UTF-8 contents to UTF-16, which cannot be done with partial UTF-8 sequences.
Reported-by: Maksim Arhipov
Fixes #9841
Closes #xxx
Untested, WIP. Also I have no automated way to test for the issue this fixes because it only happens in the Windows console.
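For readers who want to see the idea in isolation, here is a minimal, platform-independent sketch of the two steps in the description: hold back an incomplete UTF-8 sequence at the end of a buffer, then prepend it to the next call's data. This is not the code in this PR. The names `utf8_incomplete_tail`, `utf8_feed` and `struct utf8_carry` are hypothetical, the sketch does no UTF-16 conversion or console output, and it assumes the caller provides an output buffer of at least `len + 3` bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only, not curl's implementation):
   return how many bytes at the end of buf[0..len) begin a UTF-8 sequence
   that is not yet complete. Returns 0 when the buffer ends on a complete
   character, or on bytes that cannot be a valid sequence prefix. */
static size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
  size_t i;
  /* A UTF-8 sequence is at most 4 bytes, so an incomplete one is at most 3. */
  for(i = 1; i <= 3 && i <= len; i++) {
    unsigned char c = buf[len - i];
    size_t need;
    if((c & 0xC0) == 0x80)
      continue; /* continuation byte: keep scanning backwards */
    if((c & 0xE0) == 0xC0)
      need = 2;
    else if((c & 0xF0) == 0xE0)
      need = 3;
    else if((c & 0xF8) == 0xF0)
      need = 4;
    else
      return 0; /* ASCII or an invalid lead byte */
    return (i < need) ? i : 0;
  }
  return 0;
}

/* Hypothetical state carried between write callback invocations. */
struct utf8_carry {
  unsigned char pending[4]; /* held-back bytes of an incomplete sequence */
  size_t npending;
};

/* Feed one chunk. Writes the held-back bytes followed by the new data into
   out, minus any incomplete sequence at the end, which is saved for the
   next call. Assumes out has room for at least len + 3 bytes.
   Returns the number of bytes written to out. */
static size_t utf8_feed(struct utf8_carry *st, const unsigned char *in,
                        size_t len, unsigned char *out)
{
  size_t n, tail;
  memcpy(out, st->pending, st->npending);
  n = st->npending;
  memcpy(out + n, in, len);
  n += len;
  tail = utf8_incomplete_tail(out, n);
  memcpy(st->pending, out + n - tail, tail);
  st->npending = tail;
  return n - tail;
}
```

With this sketch, feeding `"a\xE2\x82"` and then `"\xAC" "b"` yields `a` from the first call and the complete three-byte sequence plus `b` from the second, so a converter to UTF-16 would never see a partial sequence.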