Expand using Profile-Guided Optimization (PGO) and Post Link Optimization (PLO) in the Void Linux packages #47215

Closed
zamazan4ik opened this issue Nov 13, 2023 · 6 comments
@zamazan4ik

Hello.

I am currently investigating the effects of PGO and PLO on different kinds of software; all my results so far are available at https://github.com/zamazan4ik/awesome-pgo. According to these results, enabling PGO and PLO improves overall performance in many cases. I think expanding PGO and PLO usage in Void Linux packages would be a good idea.

PGO is already a well-known technique. All currently known PGO effects on performance can be found at https://github.com/zamazan4ik/awesome-pgo#pgo-showcases . Several OS distros have already enabled PGO for some packages such as GCC, Rustc, Chromium, Firefox, and others (the exact set depends on the distro, of course).

I think we can try to expand PGO usage across Void Linux packages. I think we can start by enabling PGO at least for the following projects:

  • Clang
  • GCC
  • Rust (rustc)
  • Firefox (however, I am not sure whether it is currently PGO-optimized in Void Linux)

I am sure it's possible to find more packages with upstream PGO support, since some Void Linux packages, like CPython, already support building with PGO.

As an example, PGO is already used in other Linux distributions for these packages:

  • GCC in Solus: link
  • Clang in Solus: link
  • Firefox in Fedora: link
  • Rust in Fedora: link

Regarding Post Link Optimization (PLO), right now there are two main tools: LLVM BOLT and Google Propeller.

According to the Facebook Research Paper (https://research.facebook.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/), LLVM BOLT (https://github.com/llvm/llvm-project/blob/main/bolt/README.md) helps achieve better performance for various packages like compilers and interpreters. I think it would be a good idea to enable LLVM BOLT for some packages to deliver faster binaries to users (BOLT rather than Propeller, since Propeller is less stable right now).

Here are some examples of how LLVM BOLT is already integrated into other projects:

So, at least for the projects above, the effects of LLVM BOLT have been tested and some preparation has already been done upstream. This should make it easier to enable BOLT for these packages.

For some projects right now there is ongoing work on integrating LLVM BOLT into the build scripts:

More about LLVM BOLT performance results for other projects can be found in:

Some distributions already use LLVM BOLT in their build scripts; see the Clang recipe in Solus.

I have not created an issue per project (like "Enable BOLT for Clang", "Enable PGO for GCC", etc.) because I think we first need to discuss the approach. If we agree on enabling BOLT or expanding PGO usage for some packages, we can create additional issues and use this one as a meta issue for discussing the overall PGO and PLO approach.

@dmarto
Contributor

dmarto commented Nov 15, 2023

On PGO for Firefox - #39652 (comment)

I would guess that extends to other large/heavy-to-build packages as well.

@zamazan4ik
Author

> On PGO for Firefox - #39652 (comment)
>
> I would guess that extends to other large/heavy-to-build packages as well.

I saw this comment, but I wanted to check whether the build resource situation has changed over the last year.

@classabbyamp
Member

the build servers have not changed in that time, afaik.

@ahesford
Member

ahesford commented Dec 3, 2023

I suspect the benefits of PGO to be dubious for precompiled packages, because timing results that select between alternatives will be determined by the microarchitecture of the build servers, which will not generally match that of users.

@zamazan4ik
Author

> I suspect the benefits of PGO to be dubious for precompiled packages, because timing results that select between alternatives will be determined by the microarchitecture of the build servers, which will not generally match that of users.

PGO does not depend on timings. Instrumentation PGO "just" collects counters from the instrumented program, so it makes no difference whether a program finishes the same workload in minutes or in hours: the counter values would be the same for almost all applications. The only exception is a program with internal time-dependent logic (like "run this subroutine every N seconds"), in which case the counters will differ. In my experience, such software is rare in practice, and such differences do not measurably affect the optimization results.
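To illustrate the point about counters, here is a toy model in Python. It is only a sketch under stated assumptions: the counters are written by hand to mimic what an instrumented binary would record, the `delay` parameter stands in for a slower machine, and all names (`classify`, `collect_profile`, the counter keys) are hypothetical. It is not how a compiler actually implements instrumentation.

```python
import time
from collections import Counter


def classify(values, counters, delay=0.0):
    """Toy workload with hand-written 'instrumentation' counters."""
    total = 0
    for v in values:
        counters["loop_iter"] += 1
        if v % 3 == 0:
            counters["branch_div3"] += 1
            total += v
        else:
            counters["branch_other"] += 1
            total -= 1
        if delay:
            time.sleep(delay)  # emulate a slower machine (assumption)
    return total


def collect_profile(delay):
    """Run the same workload and return (counters, elapsed seconds)."""
    counters = Counter()
    start = time.perf_counter()
    classify(range(300), counters, delay)
    elapsed = time.perf_counter() - start
    return counters, elapsed


fast_profile, fast_time = collect_profile(delay=0.0)
slow_profile, slow_time = collect_profile(delay=0.0005)

# The wall-clock times differ, but the recorded counters do not.
print(f"fast: {fast_time:.4f}s, slow: {slow_time:.4f}s")
print("profiles identical:", fast_profile == slow_profile)
```

Both runs record the same counts because the counts depend only on which paths the workload takes, not on how long each step takes. A program that branched on elapsed time would break this property, which is the exception mentioned above.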

Regarding (micro)architecture: PGO does not depend on architecture features. However, it is generally not recommended (check the answer for the fifth question) to reuse profiles between different architectures (e.g., from x86-64 to ARM).


github-actions bot commented Mar 2, 2024

Issues become stale 90 days after last activity and are closed 14 days after that. If this issue is still relevant bump it or assign it.

@github-actions github-actions bot added the Stale label Mar 2, 2024
@github-actions github-actions bot closed this as not planned Mar 16, 2024