New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weak linking CMake module #47

Closed
wants to merge 8 commits into
base: master
from

Conversation

4 participants
@opadron
Member

opadron commented Jul 6, 2016

Adapted from vtk#1511.

Adds a new experimental CMake module: targetLinkLibrariesWithDynamicLookup that allows one to portably indicate that a target uses symbols from another library, but that the resolution of symbols should be delayed until runtime, i.e.: the linker should tolerate undefined symbols when linking. "Weak linking" has been used to refer to this code sharing scheme as well as its many subtle variants.

Reviewers, please note the major differences in this PR when compared to vtk#1511.

  • This version uses two instances of a sample project with try_compile to infer the level of support for dynamic symbol resolution. The first determines if the linker responds to the dynamic lookup flag currently present in linker on OS X. The second determines if the linker and run-time loader can properly coalesce symbols that have been duplicated across link boundaries, such as ld/Linux. The original version only included the first check.
  • In addition, each sample project is tested after compiling to ensure that the test program produces the expected output as well as compile successfully. The original used a much simpler test program that would only compile.
  • In the original version, no link is performed when the linker is found to not support the dynamic lookup flag. In general, a proper linking is still necessary on platforms like Linux, whose loader can coalesce duplicate symbols at run-time. In this version, the link is performed when such a loader has been detected.
  • The original version's system checks were performed immediately upon module load and then cached. This version runs the system checks and caches on demand.

Also, includes a fix for when testing the tower-of-babel sample project under anaconda.

@AppVeyorBot

This comment has been minimized.

AppVeyorBot commented Jul 6, 2016

set(run_output)
if(project_compiles)
execute_process(COMMAND "${test_project_bin_dir}/main"

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor
execute_process(COMMAND ${CMAKE_CROSSCOMPILING_EMULATOR} "${test_project_bin_dir}/main"
if( NOT DEFINED HAS_DYNAMIC_LOOKUP_hash
OR NOT "${HAS_DYNAMIC_LOOKUP_hash}" STREQUAL "${cmake_flags_hash}")
unset(HAS_DYNAMIC_LOOKUP)

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor
unset(HAS_DYNAMIC_LOOKUP CACHE)

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor

Should HAS_DYNAMIC_LOOKUP_hash be unset too ?

if( NOT DEFINED HAS_SYMBOL_DEDUPE_hash
OR NOT "${HAS_SYMBOL_DEDUPE_hash}" STREQUAL "${cmake_flags_hash}")
unset(HAS_SYMBOL_DEDUPE)

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor
unset(HAS_SYMBOL_DEDUPE CACHE)

Should HAS_SYMBOL_DEDUPE_hash be unset ?

function(_test_weak_link_project projectName testName)
set(options DYNAMIC_LOOKUP SYMBOL_DEDUPE)
cmake_parse_arguments(_args "${options}" "" "" ${ARGN})

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor
# Sanity check
if(_args_DYNAMIC_LOOKUP AND _args_SYMBOL_DEDUPE)
  message(FATAL_ERROR "Option DYNAMIC_LOOKUP and SYMBOL_DEDUPE are mutually exclusive")
endif()
"${PROJECT_BINARY_DIR}/CMakeTmp/${projectName}/build")
file(MAKE_DIRECTORY ${test_project_src_dir})
file(MAKE_DIRECTORY ${test_project_bin_dir})

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor

Could you include the description of the test project ?

May be by adapting/copying part of the description that is available here: https://github.com/opadron/weak-linking-demo ?

set(project_works 1)
set(run_output)
if(project_compiles)

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor
set(do_try_run ${project_compiles})
if(do_try_run AND CMAKE_CROSSCOMPILING)
  if(NOT DEFINED CMAKE_CROSSCOMPILING_EMULATOR)
    set(do_try_run FALSE)
  else()
    # Display a message here ?
    set(project_works TRUE)
  endif()
endif()
if(do_try_run)

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor

We should implement the same behaviour as try_run....
See https://cmake.org/cmake/help/git-next/command/try_run.html#behavior-when-cross-compiling

HAS_DYNAMIC_LOOKUP
DYNAMIC_LOOKUP)
set(HAS_DYNAMIC_LOOKUP ${has_dynamic_lookup}

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor

Should the cache var be named LINKER_SUPPORT_DYNAMIC_LOOKUP ? or LINKER_HAS_DYNAMIC_LOOKUP ?

HAS_SYMBOL_DEDUPE
SYMBOL_DEDUPE)
set(HAS_SYMBOL_DEDUPE ${has_symbol_dedupe}

This comment has been minimized.

@jcfr

jcfr Jul 6, 2016

Contributor

Should the cache var be named LINKER_SUPPORT_SYMBOL_DEDUPE ? Or LINKER_HAS_SYMBOL_DEDUPE

@blowekamp

This comment has been minimized.

Contributor

blowekamp commented Jul 6, 2016

In the original version, no link is performed when the linker is found to not
support the dynamic lookup flag. In general, a proper linking is still necessary
 on platforms like Linux, whose loader can coalesce duplicate symbols at 
run-time. In this version, the link is performed when such a loader has been 
detected.

The goal of the original cmake file (in SimpleITK) was NOT to link again libpython on Linux and OSX as per distutils and Pypa recommendations. This sounds like you are trying to still link against the library?

I am not sure what the goal of this is now...

endfunction()
function(target_link_libraries_with_dynamic_lookup target)

This comment has been minimized.

@blowekamp

blowekamp Jul 6, 2016

Contributor

This is the original logic:

if ( ${CMAKE_SYSTEM_NAME} MATCHES "Darwin" )
     set_target_properties( ${target} PROPERTIES LINK_FLAGS "-undefined dynamic_lookup" )
   elseif(VTK_UNDEFINED_SYMBOLS_ALLOWED)
     # linker allows undefined symbols, let's just not link
   else()
     target_link_libraries ( ${target} ${ARGN}  )
   endif()

This is doing something different now.

@blowekamp

This comment has been minimized.

Contributor

blowekamp commented Jul 6, 2016

I am also going to point out that having a single executable linked to two shared libraries has quite a different scope for symbol resolution than an executable dynamically loading at runtime two separate libraries.

@jcfr

This comment has been minimized.

@blowekamp

This comment has been minimized.

Contributor

blowekamp commented Jul 6, 2016

This I think is how I began down this issue by naively linking against the shared library:

Perhaps it would be work testing on the manylinux1 docker image too?

@opadron

This comment has been minimized.

Member

opadron commented Jul 6, 2016

The logic change was on purpose, but clearly, it was a mistake if everything I read is true.

# linker allows undefined symbols, let's just not link

So, my main confusion stems from the fact that I could not replicate this behavior on Linux -- the linker always refuses to leave the symbols undefined. Is there a linker flag I should be using?

I am also going to point out that having a single executable linked to two shared libraries has quite a different scope for symbol resolution than an executable dynamically loading at runtime two separate libraries.

This sounds right to me, but I'm not sure what the point of bringing this up is? Neither of those scenarios are being tested in the test project. Should they be? Besides this tolerance for undefined symbols that I have yet to observe, is there anything we should be checking for that we're not?

@blowekamp

This comment has been minimized.

Contributor

blowekamp commented Jul 6, 2016

What is your platform and what exactly are you trying to do to test if "undefined" symbols are allowed? The original CMake file seems to pass on all my linux (elf binaries) systems I have tests [1].

I may have been mistaken about what this CMake code and the test project was trying to do. And I don't know why the test is constructed this way as opposed to other permutations of executable, linked library and runtime loaded library.

I also don't know what problem or issue the HAS_SYMBOL_DEDUPE is trying to address.

I thought I was understanding the linker process pretty well now, but I am guess I have more to read.

[1] https://github.com/SimpleITK/SimpleITK/blob/master/CMake/sitkTargetLinkLibrariesWithDynamicLookup.cmake#L44-L64

@opadron

This comment has been minimized.

Member

opadron commented Jul 6, 2016

I haven't actually tried that particular test code, but I was more referring to building actual Python extension modules. When trying to build a simple extension module, the link fails with undefined symbol errors on Linux. That is- when I don't link to any libpython.

The test project in this PR is based on my experiments here. There is a shared lib, L; a module lib, M; and the executable, E. E is linked against L. E also dynamically loads M. And M, itself uses code from L, but the main question is whether M should be linked against L. This test was supposed to emulate a typical example of building a python extension module. As an aside, I haven't been able to build M without linking it against L on Linux, I run into the exact same problems as I did with actual Python extension modules.

I figure besides library version mismatches, the only other potential source of problems with linking for both the module and the executable is that it may result in code duplication. If the library had some shared state, we should be able to observe the effects of such duplication by inspecting the shared state from the perspectives of E and M, and then comparing. If there is code duplication, the counting routines from each would have their own counters instead of sharing. If the loader coalesces the duplicate symbols, then they would share, which is the desired behavior. This is what I check for under the "SYMBOL_DEDUPING" name.

I'm going to check to see if I can get your simpler test code to compile and link. If so, then there's clearly something about how linking on Linux works that I'm not privy to.

@blowekamp

This comment has been minimized.

Contributor

blowekamp commented Jul 6, 2016

Another specific case where not linking solved a problem was on OS X. With binaries of SimpleITK compiled against the system Python 2.7 with explicit linking, when these would create a load error when run on Anaconda or home brew. However, if the "-undefined dynamic_lookup" was used and not linked against libpython, then the binaries would work!

SimpleITK is not linking against libpython, on every linux system I have tested RH5,RH6,RH7, Cent OS5, Some Debian, Ubuntu 12, Ubuntu 14....

I'd be interested in a minimal example and the link line used. I'm thinking of starting tests with gcc command lines here.

@opadron

This comment has been minimized.

Member

opadron commented Jul 6, 2016

Well, I've finally gotten extension modules to build and link on Linux as you and others describe -- without linking.

I have no idea what I'm doing differently, but I'm now a believer. 😅

I want to try to incorporate this into my weak-linking-demo and see how that changes things. Then, I'll come back to this and make the necessary corrections.

Thanks for keeping me honest. 👍

@opadron opadron referenced this pull request Jul 8, 2016

Merged

improve weak link handling #3

@opadron

This comment has been minimized.

Member

opadron commented Jul 8, 2016

Check out opadron/weak-linking-demo#3 for an updated version of the module and an updated discussion in the README.

@AppVeyorBot

This comment has been minimized.

AppVeyorBot commented Jul 16, 2016

opadron added some commits Jul 16, 2016

@AppVeyorBot

This comment has been minimized.

AppVeyorBot commented Jul 16, 2016

@opadron

This comment has been minimized.

Member

opadron commented Jul 16, 2016

This should be ready to go.

@AppVeyorBot

This comment has been minimized.

AppVeyorBot commented Jul 16, 2016

endif()
file(APPEND "${test_project_src_dir}/main.c" "
result = count() != 0 ? 1 :

This comment has been minimized.

@jcfr

jcfr Jul 16, 2016

Contributor

1 -> EXIT_FAILURE

endif()
if(NOT DEFINED ${cache_var})
if(NOT CMAKE_CROSSCOMPILING OR CMAKE_CROSSCOMPILING_EMULATOR)

This comment has been minimized.

@jcfr

jcfr Jul 16, 2016

Contributor

if(NOT CMAKE_CROSSCOMPILING OR (CMAKE_CROSSCOMPILING AND CMAKE_CROSSCOMPILING_EMULATOR))

x
@opadron

This comment has been minimized.

Member

opadron commented Jul 16, 2016

continued in #51

@opadron opadron closed this Jul 16, 2016

@opadron opadron deleted the opadron:weak-linking-module branch Jul 16, 2016

jcfr added a commit to jcfr/weak-linking-demo that referenced this pull request Sep 25, 2016

targetLinkLibrariesWithDynamicLookup: Backport changes from scikit-build
This commit integrates changes associated with the module from
scikit-build/scikit-build@a415df9

The corresponding changes were originally added in scikit-build/scikit-build#47
and scikit-build/scikit-build@a606dc7 (handle missing target case)

IMPORTANT:

This backported commit causes all tests to *FAIL*. It has been kept
like this so that the next commit can clearly illustrate what is
required to fix them.

It also mean that the implementation in "scikit-build" is broken.


For example, on Linux, output is:

$ rm  -rf _build/ && dockcross-manylinux-x64 ./testAllBuilds.bash

CASE  CONFIG BUILD  TEST
s00   fail   fail   RTE
s01   fail   fail   RTE
s10   fail   fail   RTE
s11   pass   pass   pass
d00   fail   fail   RTE
d01   fail   fail   RTE
d10   fail   fail   RTE
d11   pass   pass   pass


whereas the following is expected:

CASE  CONFIG BUILD  TEST
s00   pass   pass   RTE
s01   pass   pass   pass
s10   pass   pass   RTE
s11   pass   pass   pass
d00   pass   pass   RTE
d01   pass   pass   pass
d10   pass   pass   RTE
d11   pass   pass   pass

jcfr added a commit that referenced this pull request Sep 25, 2016

Fix targetLinkLibrariesWithDynamicLookup
This commit has two purposes:

(1) it fixes logic checking for cross-compilation (the regression
was introduced by #51 and #47

(2) It configure the text project setting CMAKE_ENABLE_EXPORTS to ON. Doing
so ensure the executable compiled in the test exports symbols (if supported
by the underlying platform).
See https://cmake.org/cmake/help/v3.6/prop_tgt/ENABLE_EXPORTS.html?highlight=enable_export
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment