Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LTO support #290

Merged
merged 15 commits into from
Jul 23, 2020
Merged

Add LTO support #290

merged 15 commits into from
Jul 23, 2020

Conversation

shibatch
Copy link
Owner

@shibatch shibatch commented Mar 19, 2020

With this patch, static libraries with LTO (with gcc) or ThinLTO (with clang) can be generated by specifying -DENABLE_LTO=TRUE option to cmake.

@shibatch
Copy link
Owner Author

@edelsohn @seiko2plus Feel free to review this PR.

Copy link
Collaborator

@fpetrogalli fpetrogalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @shibatch ,

thank you for working on this.

Do we need interprocedural optimization? Shouldn't this be needed when we link a program against libsleef?

Francesco

@shibatch
Copy link
Owner Author

Yes. Optimization is carried out when the generated library is linked.
This is basically gcc version of the llvm bitcode generation, which @xoofx made.

Configure.cmake Outdated
@@ -305,7 +307,35 @@ if(CMAKE_C_COMPILER_ID MATCHES "(GNU|Clang)")
# src/arch/helpervecext.h:88
string(CONCAT FLAGS_WALL ${FLAGS_WALL} " -Wno-psabi")
set(FLAGS_ENABLE_NEON32 "-mfpu=neon")

if (ENABLE_LTO AND NOT BUILD_SHARED_LIBS)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are CMAKE internals that automatically set up the right flags for the compiler to set up LTO: https://stackoverflow.com/questions/31355692/how-do-i-enable-link-time-optimization-lto-with-cmake

I think it would be better to use those instead of customizing the flags manually.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INTERPROCEDURAL_OPTIMIZATION property seems not working.
It just displays warnings saying that property is ignored.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you setting the property at command line invocation or checking it as in the example of the link?

cmake_minimum_required(VERSION 3.9.4)

include(CheckIPOSupported)
check_ipo_supported(RESULT supported OUTPUT error)

add_executable(example Example.cpp)

if( supported )
    message(STATUS "IPO / LTO enabled")
    set_property(TARGET example PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
else()
    message(STATUS "IPO / LTO not supported: <${error}>")
endif()

What version of cmake are you using?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake version is 3.10.2, which is the version that comes with Ubuntu bionic.
I just added set_property command.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this.

Compiler : gcc-8
-- The C compiler identification is GNU 8.4.0
-- Check for working C compiler: /usr/bin/gcc-8
-- Check for working C compiler: /usr/bin/gcc-8 -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.1.1")
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE - Success
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Success
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Success
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Success
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Success
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Success
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test COMPILER_SUPPORTS_OPENMP
-- Performing Test COMPILER_SUPPORTS_OPENMP - Success
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Success
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Success
-- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM
-- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM - Success
-- Unroll target for DP : unroll_0_vecextdp.c;unroll_2_vecextdp.c;unroll_0_sse2dp.c;unroll_2_sse2dp.c;unroll_0_avxdp.c;unroll_2_avxdp.c;unroll_0_avx2dp.c;unroll_2_avx2dp.c;unroll_0_avx512fdp.c;unroll_2_avx512fdp.c
-- Unroll target for SP : unroll_0_vecextsp.c;unroll_2_vecextsp.c;unroll_0_sse2sp.c;unroll_2_sse2sp.c;unroll_0_avxsp.c;unroll_2_avxsp.c;unroll_0_avx2sp.c;unroll_2_avx2sp.c;unroll_0_avx512fsp.c;unroll_2_avx512fsp.c
-- Configuring build for SLEEF-v3.5.0
   Target system: Linux-4.15.0-91-generic
   Target processor: x86_64
   Host system: Linux-4.15.0-91-generic
   Host processor: x86_64
   Detected C compiler: GNU @ /usr/bin/gcc-8
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math` to compile libsleef
-- Building shared libs : FALSE
-- MPFR : /usr/lib/x86_64-linux-gnu/libmpfr.so
-- MPFR header file in /usr/include
-- GMP : /usr/lib/x86_64-linux-gnu/libgmp.so
-- RT : /usr/lib/x86_64-linux-gnu/librt.so
-- FFTW3 : /usr/lib/x86_64-linux-gnu/libfftw3.so
-- OPENSSL : 1.1.1
-- SDE : SDE_COMMAND-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
-- A version of SLEEF compatible  with libm and libmvec in GNU libc will be produced (sleefgnuabi.so)
-- Configuring done
CMake Warning (dev) at src/libm/CMakeLists.txt:401 (add_library):
  Policy CMP0069 is not set: INTERPROCEDURAL_OPTIMIZATION is enforced when
  enabled.  Run "cmake --help-policy CMP0069" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.

  INTERPROCEDURAL_OPTIMIZATION property will be ignored for target
  'sleefgnuabiavx512fsp'.
This warning is for project developers.  Use -Wno-dev to suppress it.

Copy link
Collaborator

@fpetrogalli fpetrogalli Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not enough, you need to use the following:

include(CheckIPOSupported)

Have you checked cmake --help-policy CMP0069? It says the following:

CMake 3.9 and newer prefer to add IPO flags whenever the
``INTERPROCEDURAL_OPTIMIZATION`` target property is enabled and
produce an error if flags are not known to CMake for the current compiler.
Since a given compiler may not support IPO flags in all environments in which
it is used, it is now the project's responsibility to use the
``CheckIPOSupported`` module to check for support before enabling the
``INTERPROCEDURAL_OPTIMIZATION`` target property.  This approach
allows a project to conditionally activate IPO when supported.  It also
allows an end user to set the ``CMAKE_INTERPROCEDURAL_OPTIMIZATION``
variable in an environment known to support IPO even if the project does
not enable the property.

It seems to me that you are unable to use that property unless you use CheckIPOSupported.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it actually requires CMP0069 to be set to NEW.
However, even if I set CMP0069, it still does not support clang.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did more testing with cmake with clang, and I still think that cmake support for ThinLTO with clang is not complete.
Is it better not to support clang ThinLTO at this time?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK for not supporting clang. If you do the GCC route only, does it work with CheckIPOSupported as in the example provided on stack overflow?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please see the patch.
As you see, the amount of extra code for supporting clang is not much.
I think it is not a bad decision to include support for clang.

@shibatch
Copy link
Owner Author

shibatch commented Apr 3, 2020

It still requires setting flags for the archiver and C compiler in order to support clang.

@dnbaker
Copy link

dnbaker commented Apr 3, 2020

On my machine (OSX), I encounter the following with gcc-9:

cc1: error: '-fno-fat-lto-objects' are supported only with linker plugin

It compiles fine with clang, but I use gcc on all my other machines.

@shibatch
Copy link
Owner Author

shibatch commented Apr 6, 2020

@dnbaker I think there is no way to support gcc on mac.

@shibatch shibatch merged commit 493d0c6 into master Jul 23, 2020
@shibatch shibatch deleted the Add_LTO_support branch July 23, 2020 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants