Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LTO support in CMake build system. #17027

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Ferroin
Copy link
Member

@Ferroin Ferroin commented Feb 16, 2024

Summary

Internally, CMake calls LTO ‘Interprocedural Optimization’, abbreviated as ‘IPO’, and it provides functionality for checking for support for this as well as enabling it by default for targets. This leverages that support to auto-detect LTO and enable it if it’s supported.

Also includes a new option (DISABLE_LTO) to explicitly disable LTO without checking for support for it.

Test Plan

Preliminary testing consists of just verifying that CI passes on this PR.

Further testing requires manually confirming the compiler flags used on a variety of platforms to ensure that LTO is enabled correctly.

@github-actions github-actions bot added the area/build Build system (autotools and cmake). label Feb 16, 2024
Copy link
Contributor

@vkalintiris vkalintiris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I've added some non-required comments/asks inline.

However, LTO should be disabled by default.

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated
# Handling of LTO/IPO
#

option(DISABLE_LTO "Disable use of LTO when building. Defaults to being enabled if supported." FALSE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general tip, I prefer using ENABLE_XXX instead of DISABLE_XXX, because this avoids double negation in the code, eg. NOT DISABLE_XXX.

@Ferroin
Copy link
Member Author

Ferroin commented Feb 22, 2024

However, LTO should be disabled by default.

I was under the impression that we wanted it by default, at least for release builds, because we had determined that it was a significant performance improvement for our code (hence why we had it by default in the old build system).

@vkalintiris
Copy link
Contributor

I was under the impression that we wanted it by default, at least for release builds.

Correct. However, LTO is quite heavy on build times. We can either add this CMake option explicitly whenever we call cmake (ie. on release jobs), or add a check to see if the CMAKE_BUILD_TYPE is Debug.

@Ferroin
Copy link
Member Author

Ferroin commented Feb 22, 2024

I was under the impression that we wanted it by default, at least for release builds.

Correct. However, LTO is quite heavy on build times. We can either add this CMake option explicitly whenever we call cmake (ie. on release jobs), or add a check to see if the CMAKE_BUILD_TYPE is Debug.

OK, so the goal then is to just allow for fast debug builds. That’s doable without much issue.

@Ferroin
Copy link
Member Author

Ferroin commented Feb 22, 2024

Rebased to pick up the latest changes in the master branch.

@github-actions github-actions bot added the area/packaging Packaging and operating systems support label Feb 22, 2024
@Ferroin
Copy link
Member Author

Ferroin commented Feb 23, 2024

@vkalintiris I initially switched things to ENABLE_LTO as you requested, but I’ve decided to instead switch back to the option name being DISABLE_LTO and just inverting the case ordering so that we don’t have a NOT DISABLE_LTO case. Full explanation in the commit message, but the short version is that ENABLE_LTO would not behave like our other ENABLE_* options, and thus may cause confusion.

@Ferroin
Copy link
Member Author

Ferroin commented Feb 28, 2024

Rebased to resolve merge conflicts.

@Ferroin
Copy link
Member Author

Ferroin commented Mar 7, 2024

Rebased to pick up fixes for CI in master.

@Ferroin
Copy link
Member Author

Ferroin commented Mar 22, 2024

Rebased to resolve merge conflicts.

@Ferroin
Copy link
Member Author

Ferroin commented Apr 4, 2024

Rebased to pick up latest changes in master branch.

@ilyam8

This comment was marked as resolved.

@Ferroin
Copy link
Member Author

Ferroin commented Apr 5, 2024

Master vs this branch (./netdata-installer.sh ^Cinstall-prefix /opt):
test results

Edit: Not sure tests were correct, need to recheck.

My own testing with with no external components vendored, I suspect that’s what’s different here. I’ll need to look deeper into this.

I can 100% confirm though that enabling both LTO (with -flto -fno-lto-fat-objects) and -fdata-sections breaks the build on openSUSE with the build system as it is currently. It ends up choking on the first global variable it finds in the link order within the agent code itself (it seems to be fine with any vendored components).

@Ferroin
Copy link
Member Author

Ferroin commented Apr 5, 2024

Based on further investigation and discussion with @ilyam8:

  • The differences he was seeing were due to issues in how he was testing.
  • The functionally minimal difference that I was seeing in my initial testing of 9007816 was a result of the fact that I was not using any vendored libraries, and therefore there was almost zero dead code eliminated by section garbage collection during linking.
  • When forcing usage of vendored copies of libyaml and protobuf and testing in otherwise identical conditions (Ubuntu 22.04 on 64-bit x86, GCC, Ninja, most external plugins and the Prometheus Remote Write exporter disabled) to what I was testing with originally, the difference amounts to approximately 4.1 kiB in the main binary with and without 9007816
  • When testing with static builds, the difference in size for the main binary between this PR and the master branch is roughly 6 MiB, but there is no difference when testing the PR whether 9007816 is included or not, which suggests to me that the difference is entirely a result of using LTO.

Internally, CMake calls LTO ‘Interprocedural Optimization’, and it
provides functionality for checking for support for this as well as
enabling it by default for targets. This leverages that support to
auto-detect LTO and enable it if it’s supported.
Using `ENABLE_LTO` leads to a possibility for confusion among users, as it
does behave in the most intuitive manner. Instead of ensuring that LTO
is used (and thus behaving like every other `ENABLE_*` option we have),
it allows the usage LTO if it’s supported. Thus a build with
`-DENABLE_LTO=True` may not actually be built with LTO.

By instead using `DISABLE_LTO`, the behavior matches up directly with
how most people are likely to interpret the meaning, because a build
with `-DDISABLE_LTO=True` will _never_ have LTO flags added to the
compiler/linker flags.
On pretty much all RPM platforms, the RPM build process itself will
correctly add the required compiler flags when building, so we don’t
actually need to auto-detect LTO support in CMake here.

Additionally, on at least some RPM platforms, CMake’s auto-detection
for LTO support actually breaks the build when used.
On at least some systems, `-fdata-sections` combined with LTO reliably
causes failures at link time with our code.

The final binary size on systems where the combination _works_ differs by
no more than a few KiB on average (tested on 64-bit x86 on Ubuntu 22.04,
Debian 12, Fedora 39, and Rocky Linux 9), so we’re not actually getting
almost any benefit out of using both with things as they are now, but
LTO gives us a meaasurable performance improvement that per-function and
per-data sections do not.
@Ferroin
Copy link
Member Author

Ferroin commented Jun 7, 2024

Rebased to resolve merge conflicts and pick up the latest changes in the master branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/build Build system (autotools and cmake). area/packaging Packaging and operating systems support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants