LibreOffice unit tests crash when linked with mold #201

Closed
jmglogow opened this issue Dec 26, 2021 · 2 comments


@jmglogow
Contributor

So I was able to build LibreOffice (LO) with all unit tests after fixing the '?' glob handling. I've built mold_1.0.0-75-g04ad22d3 as a Debian package, and LO itself seems to run fine. "Seems", because I can start it, open Writer etc., but many (all?) of the unit tests crash.

My LO is current master with some additional patches, but mainly https://gerrit.libreoffice.org/c/core/+/127493 to use mold with gcc. It's built in an Ubuntu 20.04 schroot with the updates repo enabled (i.e. libc6:amd64 2.31-0ubuntu9.2). If I switch the linker to gold and rebuild just the comphelper module (which includes the library and the matching unit tests), the crash disappears, so from all I can tell it is a linker problem.

Example bt starts with (all frames in libtest_comphelper_parallelsort_test_mold.so.bt.txt):

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fdca80 in _dl_relocate_object (l=0x5555555760c0, scope=0x555555576428, reloc_mode=reloc_mode@entry=0, consider_profiling=consider_profiling@entry=0) at dl-reloc.c:227
227 dl-reloc.c: No such file or directory.
(gdb) bt
#0 0x00007ffff7fdca80 in _dl_relocate_object (l=0x5555555760c0, scope=0x555555576428, reloc_mode=reloc_mode@entry=0, consider_profiling=consider_profiling@entry=0) at dl-reloc.c:227
#1 0x00007ffff7fe4f5c in dl_open_worker (a=a@entry=0x7fffffff0e30) at dl-open.c:688

As the crash happens in the dynamic loader's relocation code (dl-reloc.c), I dumped the relocations of the broken and the working library with:

$ readelf -Wr libtest_comphelper_parallelsort_test_mold.so | cut -f10 -d' ' | sort > libtest_comphelper_parallelsort_test_mold.so_rel

and I got a longer diff than expected, with some __cxa / CXXABI symbols gone: relocations.diff.txt
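A sketch of the comparison described above, for anyone who wants to repeat it. Instead of `cut` on a fixed column, it picks the symbol-name field with `awk`, which assumes the usual wide (`-W`) x86-64 `readelf` layout of offset, info, type, symbol value, symbol name. The helper name `reloc_symbols` is made up for this example.

```shell
# Print the symbol names referenced by relocations, given `readelf -Wr`
# output on stdin. Assumption: field 3 is the relocation type (R_*) and
# field 5 is the symbol name; entries without a symbol (for example
# R_X86_64_RELATIVE) have fewer than 5 fields and are skipped.
reloc_symbols() {
    awk '$3 ~ /^R_/ && NF >= 5 { print $5 }' | sort
}

# Usage with the two attached libraries:
#   readelf -Wr libtest_comphelper_parallelsort_test_gold.so | reloc_symbols > gold.rel
#   readelf -Wr libtest_comphelper_parallelsort_test_mold.so | reloc_symbols > mold.rel
#   diff gold.rel mold.rel
```

Symbols that appear only on the gold side of the diff are the ones the mold-linked library no longer has relocations for.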

What is also strange is that file reports the same thing for both variants:

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=..., with debug_info, not stripped

But ldd claims that the mold variant is statically linked - WTF?

So my current guess is that something is wrong with the ELF tables / headers in the mold variant.
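One way to cross-check the ldd result without involving the dynamic loader is to read the dynamic section straight from the file. ldd gets its answer by asking the loader, whereas `readelf -d` only parses the ELF file, so comparing the two variants this way may narrow down which table differs. This is a sketch only; the helper name `needed_libs` is made up for this example.

```shell
# Print the library names of the DT_NEEDED entries, given `readelf -dW`
# output on stdin. Lines look like:
#   0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\([^]]*\)].*/\1/p'
}

# Usage with the two attached libraries:
#   readelf -dW libtest_comphelper_parallelsort_test_gold.so | needed_libs
#   readelf -dW libtest_comphelper_parallelsort_test_mold.so | needed_libs
```

If both variants list the same libraries here, the difference that confuses ldd would have to be elsewhere in the dynamic section or the program headers.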

Link command gold:

ccache g++ -pthread -shared -Wl,-z,noexecstack -Wl,-z,origin '-Wl,-rpath,$ORIGIN/../Library' -Wl,-rpath-link,$I/program -fuse-ld=gold -Wl,-z,defs -fstack-protector-strong -Wl,-rpath-link,/lib:/usr/lib -Wl,-z,combreloc -Wl,--hash-style=gnu -Wl,-Bsymbolic-functions -L$W/LinkTarget/StaticLibrary -L$I/sdk/lib -L$I/program -L$I/program -L$W/LinkTarget/Library -Wl,--gdb-index $W/CxxObject/comphelper/qa/unit/parallelsorttest.o -Wl,--start-group -L$W/UnpackedTarball/cppunit/src/cppunit/.libs -lcppunit -Wl,--end-group -Wl,--no-as-needed -lcomphelper -luno_cppuhelpergcc3 -luno_cppu -luno_sal -ltllo -o $W/LinkTarget/CppunitTest/libtest_comphelper_parallelsort_test.so

Link command with mold (picked up via -B$S/gcc_linker and -fuse-ld=lld):

ccache g++ -pthread -B$S/gcc_linker -shared -Wl,-z,noexecstack -Wl,-z,origin '-Wl,-rpath,$ORIGIN/../Library' -Wl,-rpath-link,$I/program -B$S/gcc_linker -fuse-ld=lld -Wl,-z,defs -fstack-protector-strong -Wl,-rpath-link,/lib:/usr/lib -Wl,-z,combreloc -Wl,--hash-style=gnu -Wl,-Bsymbolic-functions -L$W/LinkTarget/StaticLibrary -L$I/sdk/lib -L$I/program -L$I/program -L$W/LinkTarget/Library -Wl,--gdb-index $W/CxxObject/comphelper/qa/unit/parallelsorttest.o -Wl,--start-group -L$W/UnpackedTarball/cppunit/src/cppunit/.libs -lcppunit -Wl,--end-group -Wl,--no-as-needed -lcomphelper -luno_cppuhelpergcc3 -luno_cppu -luno_sal -ltllo -o $W/LinkTarget/CppunitTest/libtest_comphelper_parallelsort_test.so

gz compressed binaries:

libtest_comphelper_parallelsort_test_gold.so.gz
libtest_comphelper_parallelsort_test_mold.so.gz

I hope this is enough info, so you don't need to build LO. I certainly don't have a simple test case. I'll happily provide more info as needed.

@rui314
Owner

rui314 commented Dec 28, 2021

Thank you for your report! I tried to build LibreOffice on my machine and confirmed that the test crashes. Here is the stack trace.

(gdb) bt
#0  0x00007ffff7fdca80 in _dl_relocate_object (l=0x220d00, scope=0x221068, reloc_mode=reloc_mode@entry=0, consider_profiling=consider_profiling@entry=0) at dl-reloc.c:227
#1  0x00007ffff7fe4f5c in dl_open_worker (a=a@entry=0x7fffffffd120) at dl-open.c:688
#2  0x00007ffff7a548b8 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:208
#3  0x00007ffff7fe45fa in _dl_open (file=0x220790 "/home/ruiu/libreoffice/workdir/LinkTarget/CppunitTest.good/libtest_comphelper_parallelsort_test.so", mode=-2147483390, caller_dlopen=<optimized out>, nsid=-2, argc=8, argv=0x7fffffffdc48, env=0x7fffffffdc90) at dl-open.c:837
#4  0x00007ffff78ec34c in dlopen_doit (a=a@entry=0x7fffffffd340) at dlopen.c:66
#5  0x00007ffff7a548b8 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd2e0, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:208
#6  0x00007ffff7a54983 in __GI__dl_catch_error (objname=0x220c70, errstring=0x220c78, mallocedp=0x220c68, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:227
#7  0x00007ffff78ecb59 in _dlerror_run (operate=operate@entry=0x7ffff78ec2f0 <dlopen_doit>, args=args@entry=0x7fffffffd340) at dlerror.c:170
#8  0x00007ffff78ec3da in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#9  0x00007ffff7f9afe6 in CppUnit::DynamicLibraryManager::doLoadLibrary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/ruiu/libreoffice/workdir/UnpackedTarball/cppunit/src/cppunit/.libs/libcppunit-1.15.so.1
#10 0x00007ffff7f753af in CppUnit::DynamicLibraryManager::loadLibrary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/ruiu/libreoffice/workdir/UnpackedTarball/cppunit/src/cppunit/.libs/libcppunit-1.15.so.1
#11 0x00007ffff7f7534d in CppUnit::DynamicLibraryManager::DynamicLibraryManager(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/ruiu/libreoffice/workdir/UnpackedTarball/cppunit/src/cppunit/.libs/libcppunit-1.15.so.1
#12 0x00007ffff7f7c1ed in CppUnit::PlugInManager::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, CppUnit::PlugInParameters const&) () from /home/ruiu/libreoffice/workdir/UnpackedTarball/cppunit/src/cppunit/.libs/libcppunit-1.15.so.1
#13 0x0000000000205632 in (anonymous namespace)::ProtectedFixtureFunctor::run() const ()
#14 0x0000000000204c12 in main ()

I'll investigate and fix the issue.

rui314 added a commit that referenced this issue Dec 28, 2021
This reverts commit 5c35d2a because it causes a regression #201.

I'll fix #126 in another patch.
@rui314
Owner

rui314 commented Dec 28, 2021

It looks like this is a regression caused by 5c35d2a. I reverted it, so it should be fine now.

@rui314 rui314 closed this as completed Dec 29, 2021