
Enable PGO for benchmarks #18

Open
djkoloski opened this issue Jun 29, 2021 · 3 comments

@djkoloski (Owner)

Enabling profile-guided optimization would produce numbers that are about as good as each framework can get. It might be worth separating these out from the general numbers so users can get an idea of how much they stand to gain for the effort.
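
For reference, a minimal sketch of the instrumented PGO flow using plain rustc flags; the profile directory and the exact bench invocation are illustrative, not this repo's setup:

```sh
# 1. Build with instrumentation and run the benchmarks to collect profiles.
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo +nightly bench

# 2. Merge the raw .profraw files (llvm-profdata comes with system LLVM or the
#    llvm-tools-preview rustup component).
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data

# 3. Rebuild with the merged profile and re-run the benchmarks.
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo +nightly bench
```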

@djkoloski (Owner, Author)

Look at AutoFDO with perf record.
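
Roughly, the sampling-based (AutoFDO) flow would look like the sketch below. The bench binary path is a placeholder, the create_llvm_prof flags follow the AutoFDO project's documentation, and the rustc sample-profile flag is nightly-only as far as I know, so treat it as an assumption to verify against current rustc docs:

```sh
# 1. Build the release benchmarks without running them.
cargo +nightly bench --no-run

# 2. Sample branches with perf while the benchmark binary runs
#    (requires hardware branch-sampling support).
perf record -b -- ./target/release/deps/<bench-binary> --bench

# 3. Convert the perf samples into an LLVM sample profile with AutoFDO's create_llvm_prof.
create_llvm_prof --profile=perf.data --binary=./target/release/deps/<bench-binary> --out=rust.prof

# 4. Rebuild feeding the sample profile to rustc (unstable flag; name/stability
#    may differ by toolchain).
RUSTFLAGS="-Zprofile-sample-use=rust.prof" cargo +nightly bench
```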

@zamazan4ik

I do ongoing PGO research on different applications; all results are available at https://github.com/zamazan4ik/awesome-pgo. I ran some PGO benchmarks on rust_serialization_benchmark too and want to share my results here.

Test environment

  • Fedora 39
  • Linux kernel 6.7.6
  • AMD Ryzen 9 5900X
  • 48 GiB RAM
  • SSD Samsung 980 Pro 2 TiB
  • Compiler: rustc 1.76
  • Repo version: master branch at commit ce821970017832f43d00bc5110462ff1a2c38e17
  • Turbo Boost disabled to improve consistency across runs

Benchmark

Release benchmarks are done with taskset -c 0 cargo +nightly bench, the PGO training phase with taskset -c 0 cargo +nightly pgo bench, and PGO optimization with taskset -c 0 cargo +nightly pgo optimize bench. taskset -c 0 is used for better benchmark consistency. All PGO-related routines are done with cargo-pgo. All benchmarks are run on the same machine, with the same hardware/software, and with the same background "noise" (as much as I can guarantee, of course).
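
Spelled out as a script, the sequence is roughly the following (cargo-pgo subcommands as documented by the cargo-pgo project; nothing here is specific to this repo):

```sh
# One-time setup: cargo-pgo drives the instrument/merge/optimize steps and
# needs llvm-profdata from the llvm-tools-preview component.
cargo install cargo-pgo
rustup component add llvm-tools-preview

# Baseline, training, and optimized runs, all pinned to one core.
taskset -c 0 cargo +nightly bench
taskset -c 0 cargo +nightly pgo bench
taskset -c 0 cargo +nightly pgo optimize bench
```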

Results

Here are the results:

At least in the benchmarks provided by the project, there are measurable improvements in many cases. However, there are also some regressions.

> Look at AutoFDO with perf record.

I recommend starting with regular PGO via instrumentation. AutoFDO is the sampling-based PGO approach. Starting with instrumentation is generally a better idea, since it has wider platform support and is easier to enable for a project than sampling-based PGO.

@finnbear (Contributor)

Great job presenting the PGO results! In general, it seems like PGO increases average performance by ~5% but introduces noise: not necessarily from run to run, but from benchmark to benchmark and version to version. It might make more sense to average PGO results over all datasets (and just live with the fact that there are only 4, so the average isn't totally immune to noise). You could also average over all crates and report a single PGO result. The goal is to give users an accurate answer to 1) which crate to use and 2) whether to try PGO.
