Describe the bug
Piping in nushell can be an order of magnitude slower than an equivalent piped command in bash. High-throughput commands on my machine suffer a slowdown of up to 50x compared to similar redirects in bash, or to raw I/O (for example, redirecting to a file vs. a binary that writes to the file itself).
How to reproduce
I used this Rust code to perform the measurements:
```rust
use std::io::{stdout, Write};
use std::time::{Duration, Instant};

const BUF_SIZE: usize = 8 * 1024;
const WRITE_CNT: usize = 8 * 1024;
const BYTES_LEN: usize = BUF_SIZE * WRITE_CNT;
const SAMPLES: usize = 15;
const BUF: [u8; BUF_SIZE] = [0xFF; BUF_SIZE];

fn main() {
    let mut stdout = stdout().lock();
    let mut samples: Vec<_> = (0..SAMPLES).map(|_| run(&mut stdout)).collect();
    samples.sort();
    let min = samples[0];
    let max = samples[SAMPLES - 1];
    let med = samples[SAMPLES / 2];
    eprintln!("Elapsed: [{min:?}, {med:?}, {max:?}]");
    eprintln!(
        "Throughput: [{:.2} MB/s, {:.2} MB/s, {:.2} MB/s]",
        thpt(min),
        thpt(med),
        thpt(max)
    );
}

// Writes WRITE_CNT buffers of BUF_SIZE bytes to `out` and returns the elapsed time.
fn run<W: Write>(out: &mut W) -> Duration {
    let start = Instant::now();
    for _ in 0..WRITE_CNT {
        out.write_all(&BUF).unwrap();
    }
    start.elapsed()
}

// Throughput in MB/s (1 MB = 10^6 bytes).
fn thpt(duration: Duration) -> f64 {
    (BYTES_LEN as f64 / 1_000_000.0) / duration.as_secs_f64()
}
```
It writes 64 MB to its standard output in 8 KB buffers and measures the time it takes. It takes 15 samples to weed out spurious results, and prints the min, median, and max.
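For reference, I run it as a release build; assuming the code lives in a cargo project named out-thpt (which matches the binary paths below):

```
$ cargo new out-thpt
$ # paste the code above into out-thpt/src/main.rs
$ cd out-thpt
$ cargo build --release
```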
The overhead is most visible if the receiver of the output has high throughput – /dev/null is hard to beat for that.
nushell:
```
$ ./target/release/out-thpt out> /dev/null
Elapsed: [275.3485ms, 364.661123ms, 416.359778ms]
Throughput: [243.72 MB/s, 184.03 MB/s, 161.18 MB/s]
```

bash:
```
$ ./target/release/out-thpt > /dev/null
Elapsed: [3.679879ms, 5.63634ms, 11.30741ms]
Throughput: [18236.70 MB/s, 11906.46 MB/s, 5934.95 MB/s]
```
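Median to median, that is roughly a 65x slowdown (364.7 ms vs. 5.6 ms).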
The same effect is seen when writing to files, but with a smaller difference:
nushell:
```
$ ./target/release/out-thpt | save out.txt -f
Elapsed: [280.598264ms, 298.941421ms, 538.216298ms]
Throughput: [239.16 MB/s, 224.49 MB/s, 124.69 MB/s]
```

bash:
```
$ ./target/release/out-thpt > out.txt
Elapsed: [81.644023ms, 82.946274ms, 85.552982ms]
Throughput: [821.97 MB/s, 809.06 MB/s, 784.41 MB/s]
```
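That is still about a 3.6x slowdown on the medians. For comparison with the 'raw IO' baseline mentioned at the top, the benchmark can also write straight to a file instead of stdout; a minimal sketch, reusing run() and the constants from the code above:

```rust
use std::fs::File;

fn main() {
    // Same measurement, but the binary owns the file instead of going
    // through the shell's redirect.
    let mut file = File::create("out.txt").unwrap();
    let elapsed = run(&mut file);
    eprintln!("Elapsed: {elapsed:?}");
}
```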
Another, more useful example: piping the results to a high-throughput tool like ripgrep:
nushell:
```
$ ./target/release/out-thpt | rg 'x'
Elapsed: [190.961775ms, 278.648481ms, 652.648382ms]
Throughput: [351.43 MB/s, 240.84 MB/s, 102.83 MB/s]
```

bash:
```
$ ./target/release/out-thpt | rg 'x'
Elapsed: [24.366165ms, 62.707143ms, 381.635499ms]
Throughput: [2754.18 MB/s, 1070.19 MB/s, 175.85 MB/s]
```
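Roughly a 4.4x slowdown on the medians, though the variance is high in both shells.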
Expected behavior
I would expect nu to have some overhead on piping, but a 5x, 10x, or 100x slowdown is very harsh when working with large datasets.
It also creates a very unexpected slowdown when writing to a file: it's suddenly much worse to use the shell's redirect than a command's built-in file output (if the command has one).
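My guess, and it is purely an assumption since I haven't read nushell's internals, is that the stream is forwarded through the shell's own process instead of being handed to the kernel as a plain fd redirection. A per-chunk copy loop like the hypothetical sketch below would explain a per-byte cost that bash's dup2-style redirect doesn't pay:

```rust
use std::io::{self, Read, Write};

// Hypothetical sketch, not nushell's actual code: forwarding a stream
// through the shell process costs a read + write per chunk, whereas a
// plain fd redirection costs the shell nothing once the child is spawned.
fn forward(src: &mut impl Read, dst: &mut impl Write) -> io::Result<u64> {
    let mut buf = [0u8; 8 * 1024];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
}

fn main() -> io::Result<()> {
    let copied = forward(&mut io::stdin().lock(), &mut io::stdout().lock())?;
    eprintln!("forwarded {copied} bytes");
    Ok(())
}
```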
Screenshots
No response
Configuration
```
version | transpose key value | to md --pretty
```

| key | value |
| --- | --- |
| version | 0.83.1 |
| branch | |
| commit_hash | |
| build_os | linux-x86_64 |
| build_target | x86_64-unknown-linux-gnu |
| rust_version | rustc 1.71.0 (8ede3aae2 2023-07-12) |
| rust_channel | stable-x86_64-unknown-linux-gnu |
| cargo_version | cargo 1.71.0 (cfd3bbd8f 2023-06-08) |
| build_time | 2023-08-03 13:53:50 +01:00 |
| build_rust_channel | release |
| allocator | standard |
| features | default, sqlite, trash, which, zip |
| installed_plugins | |
Additional context
No response