-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
dd: get rid of line buffered stdout #10235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Line-buffered stdout causes partial write and read operations in dd, which is an issue when writing binary data to stdout. Partial writes can lead to data loss and require passing iflag=fullblock to ensure that the exact number of bytes is read.
(cherry picked from commit 0f7c531)
|
I believe we have a shared library that handles this behavior: OwnedFileDescriptorOrHandle. My understanding is that treating the Fd as a file even if its Stdout will also solve the issue of bypassing using the LineWriter. There is the performance tradeoff of cloning the Fd, but it also means we can avoid the unsafe and it wont close stdout on drop. I did a check to see if OwnedFileDescriptorOrHandle works and using the examples you provided it seems to solve the issue you are describing |
| not(target_os = "freebsd"), | ||
| feature = "printf" | ||
| ))] | ||
| fn test_no_dropped_writes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind using the test macros that we have to make this less verbose:
const BLK_SIZE: usize = 0x4000;
const COUNT: usize = 1000;
const NUM_BYTES: usize = BLK_SIZE * COUNT;
let result = new_ucmd!()
.args(&[
"if=/dev/urandom",
&format!("bs={BLK_SIZE}"),
&format!("count={COUNT}"),
])
.succeeds();
assert_eq!(result.stdout().len(), NUM_BYTES);
assert!(result.stderr_str().contains(&format!("{NUM_BYTES} bytes")));
tests/by-util/test_dd.rs
Outdated
| unix, | ||
| not(target_os = "macos"), | ||
| not(target_os = "freebsd"), | ||
| feature = "printf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this is here, is this an artifact from another test?
src/uu/dd/src/dd.rs
Outdated
| let raw_fd = io::stdout().as_raw_fd(); | ||
| unsafe { File::from_raw_fd(raw_fd) } | ||
| }; | ||
| let mut dst = Dest::Stdout(f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To give more of an example of what I was thinking of the shared library that is used to bypass the LineWriter
let fx = OwnedFileDescriptorOrHandle::from(io::stdout())?;
let mut dst = Dest::Stdout(fx.into_file());
OwnedFileDescriptorOrHandle can be used to bypass the LineWriter that is used by default for Stdout.
|
Thanks for the good points. I’ve applied the comments. |
Merging this PR will degrade performance by 27.39%
Performance Changes
Comparing Footnotes
|
|
Those memory tests were just added today, can ignore them |
Line-buffered stdout causes partial write and read operations in dd,
which is an issue when writing binary data to stdout. Partial writes can
lead to data loss and require passing
iflag=fullblockto ensure that theexact number of bytes is read.
It fixes partial writes mentioned in the PR #8840 and the issue #9119
Partial writes can be reproduced by the following calls
As you can see, each call copies a different number of bytes, and in some cases a broken pipe can occur. The broken pipe happens when the second dd closes the pipe too early.
The issue with line-buffered stdout is well-known:
https://ericswpark.com/blog/2025/2025-01-23-buffering-by-block-in-rust/
For the fix I was using the workaround shared here - rust-lang/rust#58326 (comment)
This is almost my first snippet in Rust, so please don’t judge too strictly.