[libc++][format] Adds print benchmarks. #129765
base: main
Conversation
P3107R5 "Permit an efficient implementation of std::print" should improve the performance of std::print. Before implementing these improvements it adds the benchmark provided in the paper. It replaces the `fmt::memory_buffer` with `std::vector<char>`, which is not a stack buffer. However there is not a good Standard type. (std::inplace_vector would be a better type.)
@llvm/pr-subscribers-libcxx
Author: Mark de Wever (mordante)
Full diff: https://github.com/llvm/llvm-project/pull/129765.diff
1 file affected:
diff --git a/libcxx/test/benchmarks/print.bench.cpp b/libcxx/test/benchmarks/print.bench.cpp
new file mode 100644
index 0000000000000..feed14138b9d9
--- /dev/null
+++ b/libcxx/test/benchmarks/print.bench.cpp
@@ -0,0 +1,79 @@
+//===----------------------------------------------------------------------===//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// REQUIRES: std-at-least-c++23
+
+// This benchmark writes the print and benchmark data to stdout. In order to
+// preserve the benchmark output it needs to be stored in a file (using the
+// console format). Another issue with the benchmark is the time it takes to
+// write output to the real terminal. In order to avoid that overhead write the
+// output to a fast "terminal", like /dev/null. For example, the printf
+// console 1546 ns
+// /dev/null 70.9 ns
+// An example of a good test invocation.
+// BENCHMARK_OUT=benchmark.txt BENCHMARK_OUT_FORMAT=console <exe> >/dev/null
+
+#include <print>
+
+#include <cstdio>
+#include <string>
+#include <vector>
+
+#include "benchmark/benchmark.h"
+
+void printf(benchmark::State& s) {
+ while (s.KeepRunning())
+ std::printf("The answer to life, the universe, and everything is %d.\n", 42);
+}
+BENCHMARK(printf);
+
+void vprint_string(std::string_view fmt, std::format_args args) {
+ auto s = std::vformat(fmt, args);
+ std::size_t result = fwrite(s.data(), 1, s.size(), stdout);
+ if (result < s.size())
+ throw std::format_error("fwrite error");
+}
+
+template <typename... T>
+void print_string(std::format_string<T...> fmt, T&&... args) {
+ vprint_string(fmt.get(), std::make_format_args(args...));
+}
+
+void print_string(benchmark::State& s) {
+ while (s.KeepRunning()) {
+ print_string("The answer to life, the universe, and everything is {}.\n", 42);
+ }
+}
+BENCHMARK(print_string);
+
+void vprint_stack(std::string_view fmt, std::format_args args) {
+ auto buf = std::vector<char>{};
+ std::vformat_to(std::back_inserter(buf), fmt, args);
+ std::size_t result = fwrite(buf.data(), 1, buf.size(), stdout);
+ if (result < buf.size())
+ throw std::format_error("fwrite error");
+}
+
+template <typename... T>
+void print_stack(std::format_string<T...> fmt, T&&... args) {
+ vprint_stack(fmt.get(), std::make_format_args(args...));
+}
+
+void print_stack(benchmark::State& s) {
+ while (s.KeepRunning()) {
+ print_stack("The answer to life, the universe, and everything is {}.\n", 42);
+ }
+}
+BENCHMARK(print_stack);
+
+void print_direct(benchmark::State& s) {
+ while (s.KeepRunning())
+ std::print("The answer to life, the universe, and everything is {}.\n", 42);
+}
+BENCHMARK(print_direct);
+
+BENCHMARK_MAIN();
Thanks for adding benchmarks for this. It's nice to add the benchmarks first and then be able to see the difference with the upcoming patch!
// preserve the benchmark output it needs to be stored in a file (using the
// console format). Another issue with the benchmark is the time it takes to
// write output to the real terminal. In order to avoid that overhead write the
// output to a fast "terminal", like /dev/null. For example, the printf
For example, the printf
This seems like an unterminated sentence?
// An example of a good test invocation.
// BENCHMARK_OUT=benchmark.txt BENCHMARK_OUT_FORMAT=console <exe> >/dev/null
What do you mean here? The canonical way of running benchmarks is
./libcxx/utils/libcxx-lit <build> -sv --param optimization=speed --show-all <benchmark>
Is this not how this benchmark should be run? If we should be adding additional environment variables to the benchmark when running it, we should find a way for a vanilla `lit` invocation to do it properly out of the box.
That's indeed the case. This tests `std::print`, which writes to `stdout`, which means part of the benchmark is benchmarking the terminal. Therefore this is needed. I'll have a look at it.
This patch was originally written a while ago for our non-lit benchmarking, so I haven't investigated the lit invocation.
One thing we could potentially do is ensure that any `RUN:` lines are still executed, and do something like `RUN: export BENCHMARK_OUT=%t/something`. Or we could have something like `ADDITIONAL_COMPILER_FLAGS`, but one that allows adding environment variables. Or maybe something else.
I think we could add a new lit keyword. The issue with this test is the same for other potential benchmarks that write to the standard output facilities. So something along the lines of `BENCHMARK_USES_TERMINAL`. Then we can parse this in `format.py`. Typically we parse this in `parseScript`; here I think it makes sense to parse it in `execute` and, based on its presence, use the current step or do
`%dbg(EXECUTED AS) %{exec} %t.exe --benchmark_out=%T/benchmark-result.json --benchmark_out_format=json >/dev/null`
(Obviously `/dev/null` needs to be a portable solution.)
Changing the output to a `.txt` file feels wrong; then not all tests provide a `json` file.
Another solution is to use `/dev/null` unconditionally, and then add a lit parameter `benchmark_out="csv|json|txt"` to select the output format. By default you get `json`; if you want human-readable output you select `txt`. Since the suite supports `csv`, I think we should add it. The file will always get the same name plus the appropriate extension.
IMO the cleanest solution is to always write the benchmark output to `/dev/null` and add the lit parameter to select the output format. WDYT?
void vprint_string(std::string_view fmt, std::format_args args) {
  auto s = std::vformat(fmt, args);
  std::size_t result = fwrite(s.data(), 1, s.size(), stdout);
Suggested change:
  std::size_t result = std::fwrite(s.data(), 1, s.size(), stdout);
Also below.
void print_string(benchmark::State& s) {
  while (s.KeepRunning()) {
I think the preferred way of using GoogleBenchmark is to use a range-based for-loop: https://github.com/google/benchmark/blob/main/docs/user_guide.md#a-faster-keep-running-loop
This can be re-written as
void print_string(benchmark::State& s) {
for (auto _ : s) {
print_string("The answer to life, the universe, and everything is {}.\n", 42);
}
}
P3107R5 "Permit an efficient implementation of std::print" should improve the performance of std::print. Before implementing these improvements it adds the benchmark provided in the paper.
It replaces the
fmt::memory_buffer
withstd::vector<char>
, which is not a stack buffer. However there is not a good Standard type. (std::inplace_vector would be a better type.)