Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary linking with libprotobuf.a got segmentation fault on CentOS 7 #1113

Closed
keyihao opened this issue Sep 26, 2023 · 15 comments
Closed

Binary linking with libprotobuf.a got segmentation fault on CentOS 7 #1113

keyihao opened this issue Sep 26, 2023 · 15 comments

Comments

@keyihao
Copy link

keyihao commented Sep 26, 2023

The following test program crashes (segmentation fault) when building and running on CentOS 7 x86_64 (with glibc 2.17, gcc 4.8.5 and mold 2.2.0)

#include <google/protobuf/stubs/statusor.h>
#include <google/protobuf/util/internal/datapiece.h>

using namespace google::protobuf::util::converter;
using namespace google::protobuf::util;

int main(int argc, char** argv) {
  uint64_t from = 100;
  DataPiece data(from);
  StatusOr<uint32_t> u32 = data.ToUint32();
  if (u32.ok()) {
    printf("convert ok, value %u\n", u32.ValueOrDie());
  }
  return 0;
}

Protobuf version is 3.5.1 and the static library is prebuilt on the same system.

Binary built command

#!/bin/bash
# assume that ld.mold has been installed into /usr/bin
mv /usr/bin/ld.gold /usr/bin/ld.gold.bak
ln -s /usr/bin/ld.mold /usr/bin/ld.gold

g++ -std=c++11 -isystem protobuf/src -c ./test_datapiece.cpp -o ./test_datapiece_mold.o
g++ -o ./test_datapiece_mold -static-libgcc -static-libstdc++ -B/usr/bin -fuse-ld=gold -Wl,--repro test_datapiece_mold.o ./protobuf/lib64/libprotobuf.a

rm /usr/bin/ld.gold
mv /usr/bin/ld.gold.bak /usr/bin/ld.gold

The binary works as expected when linking with gold, but got segmentation fault when linking with mold.

The backtraces are listed below

(gdb) bt
#0  0x000000000038f15c in std::string::assign(std::string const&) ()
#1  0x0000000000271416 in operator= (__str=..., this=<optimized out>) at /usr/local/include/c++/4.8.2/bits/basic_string.h:547
#2  google::protobuf::util::Status::operator= (this=this@entry=0x7fffffffe320, other=...) at google/protobuf/stubs/status.cc:105
#3  0x00000000003379da in StatusOr (value=<synthetic pointer>, this=0x7fffffffe320) at ./google/protobuf/stubs/statusor.h:207
#4  ValidateNumberConversion<unsigned int, unsigned long> (before=100, after=100) at google/protobuf/util/internal/datapiece.cc:65
#5  NumberConvertAndCheck<unsigned int, unsigned long> (before=100) at google/protobuf/util/internal/datapiece.cc:83
#6  google::protobuf::util::converter::DataPiece::GenericConvert<unsigned int> (this=<optimized out>) at google/protobuf/util/internal/datapiece.cc:327
#7  0x000000000033552b in google::protobuf::util::converter::DataPiece::ToUint32 (this=<optimized out>) at google/protobuf/util/internal/datapiece.cc:141
#8  0x000000000026e25b in main ()

The repro tar file(had to zip it because GitHub wouldn't allow to upload a tarball), source code and protobuf library are uploaded.
test_datapiece_mold.repro.tar.zip
mold_case.zip

@keyihao
Copy link
Author

keyihao commented Sep 26, 2023

cc @zonyitoo

@rui314
Copy link
Owner

rui314 commented Sep 29, 2023

It took a little bit of time to debug this thing, but it looks like your executable contains both .init_array and .ctors sections. Both serve the same purpose (they contain pointers to initializer functions), and .ctors has been superceded by .init_array. CentOS 7 was released 2014, and looks like your compiler is very old. That's actually too old that we don't support.

LLVM lld failed to produce a working executable, too.

I think we should make a change to mold so that it prints out a warning message if the output contains both .ctors and .init_array to inform the user that the output file may not work. Beyond that, I think I can only recommend upgrading to a recent version of the OS (it's EOL is very close now), sorry.

@keyihao
Copy link
Author

keyihao commented Oct 1, 2023

Thanks for your debug.
With gcc 7.5.0 and gcc 10.0, we can reproduce the same result. So it is related to the OS version?
I will try to reproduce on CentOS Stream this week.

@rui314
Copy link
Owner

rui314 commented Oct 1, 2023

Did you rebuild libprotobuf.a with the new compiler?

@keyihao
Copy link
Author

keyihao commented Oct 1, 2023

I rebuild libprotobuf.a with gcc 7.5, then it runs well. If only building the test program with gcc 7.5 or 10.0, but with old version libprotobuf.a, it still has segmentation fault.
So gcc 4.8.5 has some flaws, but ld and gold happens to be compatible with its output?

@keyihao
Copy link
Author

keyihao commented Oct 1, 2023

BTW, if mold can print warning under this circumstance, it would be great to know the result better.
If using gcc with -Werror -Wall -Wextra, could it treat this warning as error?

@rui314
Copy link
Owner

rui314 commented Oct 1, 2023

There's no flaw in gcc 4.8.5, that's just too old. The old compiler produces .ctors and .dtors sections instead of the modern .init_array and .fini_array sections. GNU ld and gold have a special logic to translate .ctors to .init_array and .dtors to .fini_array while lld and mold don't do that.

To turn warnings into errors, pass -Wl,--fatal-warnings at the link command line.

@keyihao
Copy link
Author

keyihao commented Oct 5, 2023

As for the warning message if the output contains both .ctors and .init_array, when could we expect using this new feature?

@rui314
Copy link
Owner

rui314 commented Oct 5, 2023

We could detect and address the situation by translating .ctors to .init_array just as the GNU linkers do. However, I'm not sure we want to implement this feature now and commit to its long-term maintenance, especially since it's most useful for systems nearing their EOL. It's simpler to advise users to continue using the old toolchain on their existing old systems. If you're keen on this feature and are willing to pay the implementation fee, we'd be happy to consider it. Otherwise, given our current priorities, this would be considered low priority. I apologize for any inconvenience.

@keyihao
Copy link
Author

keyihao commented Oct 9, 2023

We are planning to rebuild static libraries with newer gcc version to avoid having .ctors section. However, our existing code base is huge and a lot of libraries need to be rebuilt. So just the detecting of this situation is fine to catch potential segmentation faults.
I will submit a PR of detecting both .ctors and .init_array existing later and please help review it.

@rui314
Copy link
Owner

rui314 commented Oct 9, 2023

No need to send me a PR. Let me do that got you.

@rui314 rui314 closed this as completed in 3f88964 Nov 9, 2023
@rui314
Copy link
Owner

rui314 commented Nov 10, 2023

I've implemented a feature to translate .ctors/.dtors to .init_array/.fini_array, so that mold would work in your environment. Can you try again with the git head?

@keyihao
Copy link
Author

keyihao commented Nov 10, 2023

Thanks for supporting this, I will have a try

@keyihao
Copy link
Author

keyihao commented Nov 10, 2023

With the git head version, it works well in my environment. Thanks again for supporting this feature.

@rui314
Copy link
Owner

rui314 commented Nov 10, 2023

Cool! Thank you for verifying.

VitalyAnkh pushed a commit to VitalyAnkh/mold that referenced this issue Dec 23, 2023
.init_array/.fini_array haven't completely replace .ctors/.dtors, so
we need to convert .ctors/.dtors to .init_array/.fini_array.

Fixes rui314#1113
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants