dynamiccompile tests not successful on NixOS #2497

Closed
ThomasMader opened this issue Jan 13, 2018 · 20 comments

@ThomasMader
Contributor

I had similar problems with the asan tests in 1.5.0 and just removed them, but with 1.7.0 those tests work. Instead, I now get similar error messages for the dynamiccompile tests.

@JohanEngelen suggested that the dynamiccompile shared lib probably needs the same rpath logic as ASan in http://forum.dlang.org/post/nqbvuieabgedanbbqbnc@forum.dlang.org

The output of the build/test run:

CMake Warning (dev):
Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
--help-policy CMP0042" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.

MACOSX_RPATH is not specified for the following targets:

ldc-jit-rt-so

Here is the error for array:

771: FAIL: LDC :: dynamiccompile/array.d (140 of 190)
771: ******************** TEST 'LDC :: dynamiccompile/array.d' FAILED ********************
771: Script:
771: --
771: /tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/build/bin/ldc2 -enable-dynamic-compile -run /private/tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/tests/dynamiccompile/array.d
771: --
771: Exit Code: 2
771:
771: Command Output (stdout):
771: --
771: $ "/tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/build/bin/ldc2" "-enable-dynamic-compile" "-run" "/private/tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/tests/dynamiccompile/array.d"
771: # command stderr:
771: dyld: Library not loaded: libldc-jit.77.dylib
771: Referenced from: /var/folders/rw/hkyl0vdn02jfvt1s8_2wxvkr000x9j/T/array-30ec308-4bc080
771: Reason: image not found
771: Error: /var/folders/rw/hkyl0vdn02jfvt1s8_2wxvkr000x9j/T/array-30ec308-4bc080 failed with status: -2
771: Error: message: Abort trap: 6
771: Error: program received signal 2 (Interrupt: 2)
771:
771: error: command failed with exit status: 2

...

771: Testing Time: 29.06s
771: ********************
771: Failing Tests (15):
771: LDC :: dynamiccompile/array.d
771: LDC :: dynamiccompile/calls.d
771: LDC :: dynamiccompile/classes.d
771: LDC :: dynamiccompile/dump_handler.d
771: LDC :: dynamiccompile/empty_jit_modules.d
771: LDC :: dynamiccompile/globals.d
771: LDC :: dynamiccompile/globals_types.d
771: LDC :: dynamiccompile/lambdas.d
771: LDC :: dynamiccompile/params_ctors.d
771: LDC :: dynamiccompile/recursive_call.d
771: LDC :: dynamiccompile/simple.d
771: LDC :: dynamiccompile/struct_init.d
771: LDC :: dynamiccompile/thread_local.d
771: LDC :: dynamiccompile/throw.d
771: LDC :: dynamiccompile/tls_workaround_opt.d
771:
771: Expected Passes : 145
771: Unsupported Tests : 30
771: Unexpected Failures: 15
1/1 Test #771: lit-tests ........................***Failed 30.71 sec

@JohanEngelen
Member

@Hardcode84 Can you have a look at this?

@Hardcode84
Contributor

@JohanEngelen I don't have access to an OSX machine right now, so I can only try to blindly duplicate the asan logic.

@ThomasMader
Contributor Author

@Hardcode84 I can run the dynamiccompile unittests on OSX or even do more if you tell me what to do.

@Hardcode84
Contributor

@ThomasMader Thanks, I will prepare a branch with fixes.

@Hardcode84
Contributor

@ThomasMader #2503 Can you please try compiling and testing my branch https://github.com/Hardcode84/ldc/tree/jit_osx_link_fix

@ThomasMader
Contributor Author

@Hardcode84 I saw that @JohanEngelen was faster. See #2503 (comment)

@kinke
Member

kinke commented Apr 22, 2018

This should be working now.

@ThomasMader
Contributor Author

This is still an issue for me with ldc 1.14.0-beta1 with Nix on MacOS.

Could someone explain to me why libldc-jit.77.dylib is not found?
It's supposed to be created by ldc right?

@kinke
Member

kinke commented Jan 25, 2019

It's supposed to be created by ldc right?

It is; it's part of the all build target, see its CMake script. From our macOS CI job log:

[14/68] Linking CXX shared library lib/libldc-jit.2.0.84.dylib

@ThomasMader
Contributor Author

ls /private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit* gives me:

/private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit-rt.a
/private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit.2.0.84.dylib
/private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit.84.dylib
/private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit.dylib

I still get the following error:

dyld: Library not loaded: libldc-jit.84.dylib
  Referenced from: /private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/lit_tmp_TkpNsw/tls_workaround_opt-7ab51fb-077bc5
  Reason: image not found

I read through https://blogs.oracle.com/dipol/dynamic-libraries,-rpath,-and-mac-os and thought it might be a good idea to run the otool checks described there, but the referencing binary no longer exists.
Is it automatically deleted by the lit test?
Is there a way to keep that file to be able to diagnose the problem?

Anyway here is the output of otool -D /private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit.84.dylib:

/private/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-2/ldc-1.14.0-beta1-src/build/lib/libldc-jit.84.dylib:
libldc-jit.84.dylib
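
To make this check repeatable, the install name can be pulled out of the `otool -D` output and classified. This is only a minimal sketch and not part of LDC or Nix: the function names `dylib_id` and `classify_install_name` are made up for illustration, and the parser assumes the two-line output format shown above.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Read `otool -D <lib>` output on stdin and print the install name,
# which is the second line of that output.
dylib_id() {
    sed -n '2p'
}

# Classify an install name by the form dyld will see at load time.
classify_install_name() {
    case "$1" in
        @rpath/*) echo "rpath-relative" ;;
        @*)       echo "other-at-prefix" ;;  # e.g. @executable_path, @loader_path
        /*)       echo "absolute" ;;
        *)        echo "bare" ;;             # LC_RPATH entries are not consulted
    esac
}

# Usage on macOS (not executed here):
#   classify_install_name "$(otool -D build/lib/libldc-jit.84.dylib | dylib_id)"
```

For the output above this would report `bare`, which is the case where an RPATH in the executable does not help.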

@kinke
Member

kinke commented Jan 26, 2019

Is it automatically deleted by the lit test?
Is there a way to keep that file to be able to diagnose the problem?

It's all in your (old) output:

/tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/build/bin/ldc2 -enable-dynamic-compile -run /private/tmp/nix-build-ldcBuild-1.7.0.drv-0/ldc-1.7.0-src/tests/dynamiccompile/array.d

Just remove the -run, as that removes the executable after execution, and then you can check the RPATH.
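
A rough sketch of that check, assuming a macOS host with `otool` on the PATH. `list_rpaths` is a hypothetical helper name, the source path and output name in the usage comment are placeholders, and the awk filter assumes the usual `otool -l` layout in which each `cmd LC_RPATH` line is followed by a `path ...` line. Paths containing spaces would be cut short by this simple parser.

```shell
#!/bin/sh
# Hypothetical helper: read `otool -l <binary>` output on stdin and print
# the path of every LC_RPATH load command, one per line.
list_rpaths() {
    awk '
        $1 == "cmd"              { in_rpath = ($2 == "LC_RPATH") }
        in_rpath && $1 == "path" { print $2 }
    '
}

# Usage on macOS (not executed here). Without -run, ldc2 leaves the
# executable in place so it can be inspected:
#   ldc2 -enable-dynamic-compile tests/dynamiccompile/array.d -of=array
#   otool -l ./array | list_rpaths   # RPATH entries baked into the binary
#   otool -L ./array                 # install names the binary refers to
```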

@ThomasMader
Contributor Author

I found out that the test is working with the final package and noticed lines like this in the build process:

/nix/store/a8b8nq7a3knbbk3fz6jm467z0rph69hz-ldcBuild-1.14.0-beta1/lib/libldc-jit.2.0.84.dylib: fixing dylib

This led me to NixOS/nixpkgs@04fa8e0#diff-ae25cc0a6109438682d7b688b27a4b4aR23 and made me realize that it is necessary on Darwin with Nix that the dylibs are fixed.

I run the tests while the package is being built, and therefore the fixes have not yet been applied by Nix.

@ThomasMader
Contributor Author

I was wrong about NixOS/nixpkgs@04fa8e0#diff-ae25cc0a6109438682d7b688b27a4b4aR23 .

The difference between the installed dylib and the dylib used inside the package build is that the installed one has the full path as its id.
This can be set with install_name_tool -id "<id>" <name>.dylib.

One solution to the problem is to set the DYLD_LIBRARY_PATH to the build/lib directory so the dylib can be found.
That could also be done inside the D source file: // RUN: env DYLD_LIBRARY_PATH=/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-5/ldc-1.14.0-beta1-src/build/lib/ %ldc -enable-dynamic-compile -run %s

As far as I understand it, the id of the dylib is changed at the install step, so I don't get why the dylib is found when built on a normal macOS machine.
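
As a minimal sketch of the DYLD_LIBRARY_PATH workaround, a wrapper can prepend the build's lib directory for a single command only. `run_with_build_libs` is a hypothetical name, not something provided by LDC, lit or Nix. Whether a test harness forwards the variable to the processes it spawns is a separate question that this sketch does not address.

```shell
#!/bin/sh
# Hypothetical helper: run a command with LIB_DIR prepended to
# DYLD_LIBRARY_PATH, without changing the caller's environment.
run_with_build_libs() {
    lib_dir=$1
    shift
    (
        # ${VAR:+...} appends the old value only if it was set and non-empty.
        DYLD_LIBRARY_PATH="$lib_dir${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
        export DYLD_LIBRARY_PATH
        "$@"
    )
}

# Usage (paths are placeholders, not executed here):
#   run_with_build_libs "$PWD/build/lib" ./array
```

The subshell keeps the change local, so later commands in the same script still see the original value.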

@ThomasMader reopened this Jan 26, 2019

@kinke
Member

kinke commented Jan 26, 2019

What's supposed to happen is that the lib is simply found via the executable's RPATH. That RPATH is specified in ldc2.conf, is used automatically when linking with -enable-dynamic-compile, and is visible in the linker command line (add -v to the ldc2 command line). IIRC, the ID of the installed lib is @rpath/<name>.dylib (you could check that by inspecting the lib of an official OSX package); I'm not sure about the intermediate lib.

@ThomasMader
Contributor Author

The problem is also described under https://nixos.org/nixpkgs/manual/#sec-darwin as:

On darwin libraries are linked using absolute paths, libraries are resolved by their install_name at link time.
...
Even if the libraries are linked using absolute paths and resolved via their install_name correctly, tests can sometimes fail to run binaries. This happens because the checkPhase runs before the libraries are installed.

This can usually be solved by running the tests after the installPhase or alternatively by using DYLD_LIBRARY_PATH. More information about this variable can be found in the dyld(1) manpage.

The DYLD_LIBRARY_PATH would solve the issue for me but I needed to put it in the D source file for lit to set the variable properly.
Maybe one solution would be to set the variable in tests/lit.site.cfg.in in case of Darwin?
The proposed solution to run the tests after the installPhase doesn't work as the lit tests don't use the installed package but the files from the build.

This is the outcome in the package build:

otool -L /var/folders/wq/m1dnr42s42n5msqk2v8l0lfc0000gn/T/array-cc7d415-e1a32f
/var/folders/wq/m1dnr42s42n5msqk2v8l0lfc0000gn/T/array-cc7d415-e1a32f:
	libldc-jit.84.dylib (compatibility version 84.0.0, current version 2.0.84)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)

And this is the outcome after the package is installed and the dylib is changed and works correctly:

otool -L /var/folders/wq/m1dnr42s42n5msqk2v8l0lfc0000gn/T/array-f82c628-91a4aa
/var/folders/wq/m1dnr42s42n5msqk2v8l0lfc0000gn/T/array-f82c628-91a4aa:
	/nix/store/69d8zvgcd2zmk3zaxs530045k9z4ig16-ldcBuild-1.14.0-beta1/lib/libldc-jit.84.dylib (compatibility version 84.0.0, current version 2.0.84)
	/nix/store/zk205w9w6fd77mv80iydcnp7c8s1him6-Libsystem-osx-10.11.6/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
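
The difference between the two listings can be spotted mechanically. The following is a minimal sketch with a hypothetical helper name, `bare_deps`, that assumes the `otool -L` layout shown above (one header line, then one dependency per line). It prints only those dependencies recorded as a bare file name, which dyld does not resolve through the executable's RPATH entries.

```shell
#!/bin/sh
# Hypothetical helper: read `otool -L <binary>` output on stdin and print
# every dependency whose install name is neither absolute nor @-prefixed.
bare_deps() {
    awk 'NR > 1 && NF > 0 {
        first = substr($1, 1, 1)
        if (first != "/" && first != "@") print $1
    }'
}

# Usage on macOS (not executed here):
#   otool -L ./array | bare_deps
```

For the first listing this would print `libldc-jit.84.dylib`; for the second it would print nothing.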

@ThomasMader
Contributor Author

I don't see a problem in the ldc2.conf files.

ldc2.conf for package build:

    // default rpath when linking against the shared default libs
    rpath = "/tmp/nix-build-ldcBuild-1.14.0-beta1.drv-5/ldc-1.14.0-beta1-src/build/lib";

ldc2.conf for installed package:

    // default rpath when linking against the shared default libs
    rpath = "/nix/store/69d8zvgcd2zmk3zaxs530045k9z4ig16-ldcBuild-1.14.0-beta1/lib";

@ThomasMader
Contributor Author

You are right about the id for the normal release: @rpath/libldc-jit.84.dylib

Why is it libldc-jit.84.dylib in my package build? When should the @rpath be set?

@kinke
Member

kinke commented Jan 27, 2019

Maybe one solution would be to set the variable in tests/lit.site.cfg.in in case of Darwin?

No way, this is apparently purely NixOS specific (as many other issues of yours were). It's working for normal macOS, Linux etc.

When should the @rpath be set?

As part of ninja/make install, due to the MACOSX_RPATH CMake target property being enabled.

Look, the simplest workaround to get those tests running is changing the ID of that lib manually before running the lit-tests. The most important thing is that the final (installed) lib works for your users. For portability (and the option for your users to redistribute that lib), it should ideally feature the @rpath/<name> ID, just like the one in our official release package; if NixOS somehow doesn't support that, then so be it, and you and your users will have to live with absolute paths.

@ThomasMader
Contributor Author

I would like to just test the final binaries but that doesn't work as many of the tests are using the binaries from the build.
This certainly makes sense, as those tests are made for upstream development.
But I like the idea of having my package tested against all available tests as it brings up problems very fast.

I am now fixing the IDs of the libs with a bash script. But before that I need to build all test runners with make -j$NIX_BUILD_CORES all-test-runners, because this step produces dylibs too.
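
The script itself is not shown in this thread. Purely as an illustration of the idea, such a fix-up could look like the sketch below. The function name `fix_dylib_ids` is hypothetical, and the sketch assumes that the versioned `.dylib` names in build/lib are symlinks to one real file and that `otool -D` prints the install name on its second line.

```shell
#!/bin/sh
# Hypothetical sketch; the real script is not shown in this thread.
# For every regular (non-symlink) dylib in LIB_DIR whose install name is a
# bare file name, rewrite the ID to an absolute path inside LIB_DIR.
fix_dylib_ids() {
    lib_dir=$1
    for lib in "$lib_dir"/*.dylib; do
        # Skip symlinks, and the unexpanded glob if nothing matched.
        if [ ! -f "$lib" ] || [ -L "$lib" ]; then
            continue
        fi
        # The second line of `otool -D` output is the current install name.
        id=$(otool -D "$lib" | sed -n '2p')
        case "$id" in
            ''|/*|@*) continue ;;  # no ID found, or already qualified
        esac
        install_name_tool -id "$lib_dir/$id" "$lib"
    done
}

# Usage on macOS (path is a placeholder, not executed here):
#   fix_dylib_ids "$PWD/build/lib"
```

Changing the ID only affects binaries linked afterwards, so this would have to run before the lit tests link their executables.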

And now I am going a little off topic.

The funny thing though is that I am getting C++ linker errors when I try to test everything with ctest -V -j $NIX_BUILD_CORES --output-on-failure

1511: $ ":" "RUN: at line 3"
1511: $ "/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-34/ldc-1.14.0-beta1-src/build/bin/ldc2" "-conf=" "-I/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-34/ldc-1.14.0-beta1-src/tests/baremetal/inputs" "-run" "/private/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-34/ldc-1.14.0-beta1-src/tests/baremetal/classes.d"
1511: # command stderr:
1511: ld: library not found for -lc++

When I run them independently like in the semaphoreci sh script in the repo those errors vanish.
I don't have a clue what the cause of this could be.
Anyway, with the following sequence I am down to one failing test, it seems.

    # Build default lib test runners
    make -j$NIX_BUILD_CORES all-test-runners
    ${fixNames}

    # Build and run LDC D unittests.
    ctest --output-on-failure -R "ldc2-unittest"

    # Run LIT testsuite.
    ctest -V -R "lit-tests"

    # Run DMD testsuite.
    DMD_TESTSUITE_MAKE_ARGS=-j$NIX_BUILD_CORES DMD=${ldcBuild.out}/bin/ldmd2 CC=$CXX ctest -V -R "dmd-testsuite"

    # Run default lib unittests
    ctest -j$NIX_BUILD_CORES --output-on-failure -E "ldc2-unittest|lit-tests|dmd-testsuite|druntime-test-shared"

The last remaining error is:

1500/1504 Test #1496: druntime-test-shared ...............................................................***Failed    2.07 sec
make: Entering directory '/private/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/runtime/druntime/test/shared'
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/bin/ldmd2 -fPIC -shared   -w -I../../src -I../../import -Isrc -defaultlib= -debuglib= -dip1000 -L/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib -link-defaultlib-shared -O -release -of/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/lib.so src/lib.d -L-ldl
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/bin/ldmd2   -w -I../../src -I../../import -Isrc -defaultlib= -debuglib= -dip1000 -L/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib -link-defaultlib-shared -O -release -of/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/link src/link.d -L/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/lib.so
Testing link
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/link
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/bin/ldmd2   -w -I../../src -I../../import -Isrc -defaultlib= -debuglib= -dip1000 -L/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib -link-defaultlib-shared -O -release -of/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/load src/load.d -L-ldl
Testing load
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/load
clang -Wall -Wl,-rpath,/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib -o /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/linkD src/linkD.c /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/lib.so -ldl  -pthread
Testing linkD
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/linkD
clang -Wall -Wl,-rpath,/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib -o /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/linkDR src/linkDR.c /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib -ldl  -pthread
Testing linkDR
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/linkDR
clang -Wall -Wl,-rpath,/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib -o /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/loadDR src/loadDR.c -ldl  -pthread
Testing loadDR
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/loadDR /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/bin/ldmd2   -w -I../../src -I../../import -Isrc -defaultlib= -debuglib= -dip1000 -L/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/lib/libdruntime-ldc-shared.2.0.84.dylib -link-defaultlib-shared -O -release -of/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/finalize src/finalize.d -L-ldl
Testing finalize
/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/finalize
make: *** [Makefile:24: /tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/build/runtime/druntime-test-shared/finalize.done] Illegal instruction: 4
make: Leaving directory '/private/tmp/nix-build-ldcUnittests-1.14.0-beta1.drv-37/ldc-1.14.0-beta1-src/runtime/druntime/test/shared'

@ThomasMader
Contributor Author

Closing as the dynamiccompile tests work on MacOS now.
