cat: improve splice fast-path #11517

Open
oech3 wants to merge 1 commit into uutils:main from oech3:splice-pipe

oech3 commented Mar 27, 2026

Fixes #11516

$ yes-gf421d0112 | cat-p | pv >/dev/null
^C.4GiB 0:00:04 [23.1GiB/s]
$ yes-gf421d0112 | cat | pv >/dev/null
^C.6GiB 0:00:05 [6.33GiB/s]

oech3 force-pushed the splice-pipe branch 2 times, most recently from 73fdb00 to 75c98dc on March 27, 2026 04:31
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/basenc/bounded-memory is now being skipped but was previously passing.

codspeed-hq bot commented Mar 27, 2026

Merging this PR will improve performance by 13.59%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 299 untouched benchmarks
⏩ 46 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Memory | cp_recursive_deep_tree[(120, 4)] | 699.2 KB | 615.6 KB | +13.59% |

Comparing oech3:splice-pipe (e4331a4) with main (af6c8f2) [2]

Footnotes

  1. 46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (2b05f2f) during the generation of this report, so af6c8f2 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

oech3 mentioned this pull request Mar 27, 2026
oech3 marked this pull request as ready for review March 27, 2026 04:59
oech3 force-pushed the splice-pipe branch 2 times, most recently from e39243b to e4331a4 on March 27, 2026 08:54
@github-actions

GNU testsuite comparison:

Note: The gnu test tests/pr/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/pipe-f is now passing!

oech3 commented Mar 27, 2026

With a sparse file created by truncate -s 9EB huge, cat-p huge | pv >/dev/null is now 3 to 10 GiB/s faster than the zero-copy yes...

$ taskset -c 1 cat-p huge |taskset -c 2 pv>/dev/null
^C39GiB 0:00:06 [40.0GiB/s]
$ taskset -c 1 cat huge |taskset -c 2 pv>/dev/null
^C92GiB 0:00:15 [20.7GiB/s]
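
The sparse-file setup above can be tried at a smaller scale. This is a minimal sketch, assuming GNU coreutils (truncate, stat -c, head -c); the 1G size and the mktemp scratch file are illustrative choices, not taken from this PR. It only shows that a file created by truncate has its full apparent size and reads back as NUL bytes; it says nothing about how fast cat moves them.

```shell
#!/bin/sh
# Hypothetical scratch file; 1G is an arbitrary size picked for a quick run.
f=$(mktemp)
truncate -s 1G "$f"

apparent=$(stat -c %s "$f")   # logical size in bytes
blocks=$(stat -c %b "$f")     # blocks actually allocated on disk

# Read the first 1 MiB of the hole, drop NUL bytes, and count what is left.
nonzero=$(head -c 1048576 "$f" | tr -d '\0' | wc -c)

echo "apparent=$apparent blocks=$blocks nonzero=$nonzero"
rm -f "$f"
```

On a filesystem that supports holes, the allocated block count is expected to stay far below the apparent size, which is why a file like this is cheap to create even at very large sizes.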

oech3 commented Mar 29, 2026

I have no idea how to make yes match the performance of cat on a huge sparse file.
I expected that tee()ing a 1 MiB pipe filled with references to y would be equivalent to splice()ing a zero page, but it performs about the same as a real 1 MiB of y.

Development

Successfully merging this pull request may close these issues.

cat: splice fast-path is slower than GNU yes