
Fix stdout and stderr buffering on windows #2734

Merged · 9 commits · Sep 23, 2020

Conversation

MikailBag
Contributor

Motivation

Should fix #2380 (but not tested on Windows yet).

Solution

Wrap Blocking in Stdout and Stderr in a special wrapper, which is completely transparent on Linux and intercepts writes on Windows (a rough sketch is included at the end of this description):

  1. If the input buffer is larger than MAX_BUF, we shrink it to MAX_BUF.
  2. If the input buffer now has an incomplete char at the end, we shrink it further to drop that char.
    This should work, because in general it is OK for AsyncWrite::write to write fewer bytes than requested.

I don't think I can add a test for it, because on CI stdout is not a handle to a console, so the panic cannot occur.
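
Here is a rough sketch of the trimming idea (the wrapper name SplitWriter is made up for illustration; the real change may be organized differently):

use std::io;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::AsyncWrite;

const MAX_BUF: usize = 16 * 1024;

// Hypothetical wrapper: a pass-through everywhere except Windows consoles.
struct SplitWriter<W> {
    inner: W,
}

impl<W: AsyncWrite + Unpin> AsyncWrite for SplitWriter<W> {
    fn poll_write(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        mut buf: &[u8],
    ) -> Poll<io::Result<usize>> {
        let this = self.get_mut();
        if cfg!(windows) {
            // 1. Never hand more than MAX_BUF bytes to the blocking stdio layer.
            if buf.len() > MAX_BUF {
                buf = &buf[..MAX_BUF];
            }
            // 2. If the buffer is not valid UTF-8 (e.g. it ends with an
            //    incomplete character), cut it down to the valid prefix.
            //    Writing fewer bytes than requested is allowed by AsyncWrite.
            if let Err(e) = std::str::from_utf8(buf) {
                if e.valid_up_to() > 0 {
                    buf = &buf[..e.valid_up_to()];
                }
            }
        }
        Pin::new(&mut this.inner).poll_write(cx, buf)
    }

    fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        Pin::new(&mut self.get_mut().inner).poll_flush(cx)
    }

    fn poll_shutdown(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        Pin::new(&mut self.get_mut().inner).poll_shutdown(cx)
    }
}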

@Darksonn added the A-tokio (Area: The main tokio crate), C-enhancement (Category: A PR with an enhancement or bugfix), and M-io (Module: tokio/io) labels on Jul 31, 2020
@MikailBag
Contributor Author

MikailBag commented Aug 7, 2020

I tested my change on wine64, ver. wine-5.0.1.
The program I used:

use std::str;
use tokio::io::{self, AsyncWriteExt};

const MAX_BUF: usize = 16 * 1024;

#[tokio::main]
async fn main() {
    assert_eq!("█".len(), 3);
    // this will actually have size MAX_BUF * 3 bytes;
    // I could only trigger the error with a buffer this big!
    let string = str::repeat("█", MAX_BUF);
    let mut stdout = io::stdout();
    stdout.write_all(string.as_bytes()).await.unwrap();
    stdout.flush().await.unwrap();
}

On tokio v0.2.22:

(some chars)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidData, error: "Windows stdio in console mode does not support writing non-UTF-8 byte sequences" }', src/main.rs:13:5
stack backtrace:

On this PR's head:

(some chars, in a much larger amount, as expected)

That's why I think my PR actually fixes the mentioned bug.
cc @brunoczim as the original reporter.

@Darksonn
Contributor

Darksonn commented Sep 9, 2020

I feel like it should be possible to write an OS-independent test for this by making a writer that verifies that whatever is written to it is valid UTF-8.
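
For example, a minimal sketch of such a writer (the name is made up for illustration, this is not the test that ended up in the PR):

use std::io;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::AsyncWrite;

// Hypothetical sink that fails the test if any single write contains data
// that is not valid UTF-8, mimicking what a Windows console would reject.
struct Utf8CheckingSink;

impl AsyncWrite for Utf8CheckingSink {
    fn poll_write(
        self: Pin<&mut Self>,
        _cx: &mut Context<'_>,
        buf: &[u8],
    ) -> Poll<io::Result<usize>> {
        assert!(std::str::from_utf8(buf).is_ok(), "received a non-UTF-8 chunk");
        Poll::Ready(Ok(buf.len()))
    }

    fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        Poll::Ready(Ok(()))
    }

    fn poll_shutdown(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<io::Result<()>> {
        Poll::Ready(Ok(()))
    }
}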

Comment on lines 46 to 49
return Poll::Ready(Err(std::io::Error::new(
    std::io::ErrorKind::InvalidInput,
    "provided buffer does not contain utf-8 data",
)));
Contributor

The original issue says:

So, windows stdout as a console only accepts utf-16 characters. Stdlib's stdout detects whether the Stdout is a console in Windows, and then it assumes the byte buffer is encoded in utf-8 and then converts it to utf-16 so it can be printed.

So stdout might not be a console, in which case non-utf8 is ok. If the data isn't utf-8, I think we should just try to forward it anyway.

Contributor Author

I changed the trimming logic.
Now, if the buffer has a UTF-8 error at the start, or more than 8 bytes would have to be skipped, we assume that the caller really wants to print non-UTF-8 data.
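
Roughly, the check becomes something like this (a sketch of the heuristic described above, not the exact code from the PR):

// Hypothetical helper: decide how many bytes of `buf` to forward.
fn bytes_to_write(buf: &[u8]) -> usize {
    // Assumed threshold: a real "text" write never needs more than a few
    // trailing bytes trimmed off.
    const MAX_SKIP: usize = 8;
    match std::str::from_utf8(buf) {
        // Valid UTF-8: forward everything.
        Ok(_) => buf.len(),
        Err(e) => {
            let valid = e.valid_up_to();
            if valid == 0 || buf.len() - valid > MAX_SKIP {
                // Error at the very start, or far from the end: the caller is
                // probably writing binary data on purpose, so don't trim.
                buf.len()
            } else {
                // Otherwise assume the tail is an incomplete character and
                // stop right before it.
                valid
            }
        }
    }
}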

Contributor

This seems rather complicated. Why not just set a boolean flag if you've already printed non-UTF-8 data?

Contributor Author

MikailBag commented Sep 9, 2020

I think such a flag can lead to the following scenario:

  1. The user creates a tokio::io::Stdout instance.
  2. The user passes this instance to a library. That library tries to write some binary data, gets an error and ignores it. The wrapper observes this and sets the flag to true.
  3. Now the user tries to write a long UTF-8 string into the same Stdout. However, since the flag is set, the wrapper no longer trims the buffer, and the write operation fails.

I.e. one binary write "poisons" the Stdout instance, and subsequent legitimate "text" writes will sometimes fail.

Contributor

I read the code again. You've convinced me.

@Darksonn
Contributor

So, although the code appears correct, I think it could be sped up significantly by only UTF-8 checking the last few bytes of the data?
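
Something along those lines could look like this, for example (an illustrative sketch; a UTF-8 character is at most 4 bytes long, so only the very end of the buffer can be cut mid-character):

// Hypothetical tail-only check: find where to cut without validating the
// whole buffer, by locating the start of the last character among the final
// few bytes.
fn trim_incomplete_suffix(buf: &[u8]) -> &[u8] {
    let tail_start = buf.len().saturating_sub(4);
    for i in (tail_start..buf.len()).rev() {
        // Continuation bytes look like 0b10xxxxxx; the first byte that is not
        // a continuation byte starts the last character.
        if buf[i] & 0b1100_0000 != 0b1000_0000 {
            let needed = match buf[i] {
                b if b & 0b1000_0000 == 0b0000_0000 => 1, // ASCII
                b if b & 0b1110_0000 == 0b1100_0000 => 2,
                b if b & 0b1111_0000 == 0b1110_0000 => 3,
                _ => 4,
            };
            // Keep the buffer if the last character is complete, otherwise cut
            // it off. (A real implementation also has to decide what to do when
            // the data is not UTF-8 at all.)
            return if buf.len() - i >= needed { buf } else { &buf[..i] };
        }
    }
    buf
}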

@Darksonn
Contributor

I have opened a new issue to remove the full utf-8 check.

@MikailBag
Contributor Author

Oh, I've implemented a more efficient validation (which considers at most the 8 final bytes, IIRC), but unfortunately my laptop broke at that very time :(
I'll submit a PR as soon as possible.

@MikailBag
Contributor Author

Followup: #2888

Labels
A-tokio (Area: The main tokio crate), C-enhancement (Category: A PR with an enhancement or bugfix), M-io (Module: tokio/io)
Development

Successfully merging this pull request may close these issues.

Unicode characters are split when writing to windows terminal