-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix stdout and stderr buffering on windows #2734
Conversation
I tested my change on wine64 ver.
On tokio v0.2.22:
On this PR's head:
That's why I think that my PR actually fixed mentioned bug. |
I feel like it should be possible to write an OS-independent test for this by making a thing that verifies that whatever is written to it is valid utf-8. |
c68ef5c
to
02c536c
Compare
tokio/src/io/stdio_common.rs
Outdated
return Poll::Ready(Err(std::io::Error::new( | ||
std::io::ErrorKind::InvalidInput, | ||
"provided buffer does not contain utf-8 data", | ||
))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original issue says:
So, windows stdout as a console only accepts utf-16 characters. Stdlib's stdout detects whether the Stdout is a console in Windows, and then it assumes the byte buffer is encoded in utf-8 and then converts it to utf-16 so it can be printed.
So stdout
might not be a console, in which case non-utf8 is ok. If the data isn't utf-8, I think we should just try to forward it anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed trimming logic.
Now if buffer has utf8 error at start, or more that 8 bytes would be skipped, we assume that caller really wants to print non-utf8 data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems rather complicated. Why not just set a boolean flag if you've already printed non utf-8 data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think such a flag can lead to following scenario:
- User creates a
tokio::io::Stdout
instance. - User passes this instance to a library. That library tries to write some binary data, gets an error and ignores it. Wrapper observes it and sets flag to true.
- Now user tries to write long utf8 string into the same
Stdout
. However, since flag is set, wrapper no longer trims buffer, and write operation fails.
I.e. one binary write "poisons" Stdout instance, and following legitimate "text" writes will sometimes fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the code again. You've convinced me.
So, although the code appears correct, I think it could be sped up significantly by only utf-8 checking the last few bytes of the data? |
I have opened a new issue to remove the full utf-8 check. |
Oh, I've implemented more efficient validation (which considers at most 8 final bytes IIRC), but unfortunately my laptop broke at that very time :( |
Followup: #2888 |
Motivation
Should fix #2380 (But not tested on a Windows yet)
Solution
Wrap
Blocking
in Stdout and Stderr in a specall wrapper, which is completely transparent on Linux and interceptswrite
s on Windows:It should work, because in general it is ok for
AsyncWrite::write
to write less bytes than requested.I don't think I can add test for it, because on CI stdout is not handle to console, so no panic can occur.