build[python]: avoid symbols conflicts with other loaded libprotobuf.so (Linux) #3899

Closed
alalek opened this issue Nov 16, 2017 · 1 comment

alalek commented Nov 16, 2017

There are several libraries/tools that use protobuf by linking with the system-wide libprotobuf.so.
A protobuf version mismatch between the system protobuf binary and the protobuf Python extension (from PyPI) usually causes a fatal crash.

Reproducing the symbol conflict:

  • Fedora 26 with protobuf-devel package installed (protobuf version is 3.2.0).
  • Python virtualenv with the protobuf package installed (pip install protobuf, protobuf version is 3.4.0).
  • Simulate a library/tool that loads the system protobuf binary:

$ python -c 'from ctypes import *; CDLL("/lib64/libprotobuf.so", mode=RTLD_GLOBAL); from google.protobuf import api_pb2'
Segmentation fault (core dumped)

  • Using LD_PRELOAD=/lib64/libprotobuf.so instead of CDLL reproduces the crash too.
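
For repeated testing, the one-liner above can be wrapped so that the crash happens in a child interpreter and is reported rather than killing the shell session. This is only a sketch: the helper names (`describe_exit`, `run_repro`) are made up for illustration, the library path is the one from this report, and `subprocess.run` assumes Python 3.5+ (the original report used Python 2.7).

```python
import signal
import subprocess
import sys

# Path taken from the report above; adjust for your distribution.
SYSTEM_LIBPROTOBUF = "/lib64/libprotobuf.so"

# Same statements as the one-liner: load the system library globally,
# then import a module backed by the protobuf Python extension.
REPRO = (
    "from ctypes import CDLL, RTLD_GLOBAL; "
    "CDLL({path!r}, mode=RTLD_GLOBAL); "
    "from google.protobuf import api_pb2"
)


def describe_exit(returncode):
    """Translate a subprocess return code into a readable outcome."""
    if returncode < 0:
        # On POSIX a negative return code means "terminated by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_repro(lib_path=SYSTEM_LIBPROTOBUF):
    """Run the reproduction in a child interpreter and describe how it ended."""
    code = REPRO.format(path=lib_path)
    proc = subprocess.run([sys.executable, "-c", code])
    return describe_exit(proc.returncode)
```

Calling `print(run_repro())` in the affected environment should report a signal instead of a clean exit if the conflict is present.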

Crash stack trace shows protobuf symbols:

(gdb) bt
#0  std::_Hashtable<std::string, std::string, std::allocator<std::string>, std::__detail::_Identity, std::equal_to<std::string>, google::protobuf::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_deallocate_nodes (this=0x5555557b4178, __n=0x1)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/hashtable.h:763
#1  std::_Hashtable<std::string, std::string, std::allocator<std::string>, std::__detail::_Identity, std::equal_to<std::string>, google::protobuf::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::clear (this=0x5555557b4178)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/hashtable.h:1641
#2  0x00007fffeb038116 in std::unordered_set<std::string, google::protobuf::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> >::clear (this=<optimized out>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/unordered_set.h:472
#3  google::protobuf::DescriptorPool::FindFileByName (this=0x55555578d3b0, name="")
    at google/protobuf/descriptor.cc:1327
#4  0x00007fffeaff1f98 in google::protobuf::python::cdescriptor_pool::AddSerializedFile (self=0x7fffeb9cc1b8, serialized_pb=0x555555815e30)
    at google/protobuf/pyext/descriptor_pool.cc:510
#5  0x00007ffff7b12342 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0

However, frames 0, 1, and 2 belong to the system libprotobuf.so (3.2.0; see the /opt/rh/ marker or the gdb "info source" command), while frames 3 and 4 come from the protobuf Python extension (3.4.0).
These functions are not binary compatible, so they cannot safely call each other.
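
A quick way to confirm that two protobuf copies really live in the same process is to look at the memory map. The sketch below is an assumption-laden helper (the function names are hypothetical, and it relies on the Linux `/proc/self/maps` format) that lists every mapped file whose path mentions protobuf.

```python
def loaded_objects(maps_text, needle="protobuf"):
    """Return distinct file paths from /proc/<pid>/maps text that contain `needle`."""
    paths = []
    for line in maps_text.splitlines():
        # File-backed mappings have six fields:
        # address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        if needle in path and path not in paths:
            paths.append(path)
    return paths


def loaded_in_this_process(needle="protobuf"):
    """Inspect the current process (Linux only)."""
    with open("/proc/self/maps") as f:
        return loaded_objects(f.read(), needle)
```

If `loaded_in_this_process()` returns both a system libprotobuf path and the extension's `_message.so` after the import, both copies are present and symbol interposition between them becomes possible.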

Output of LD_DEBUG=files,symbols:
      1379:	file=/home/alalek/penv/lib/python2.7/site-packages/google/protobuf/pyext/_message.so [0];  dynamically loaded by /lib64/libpython2.7.so.1.0 [0]
      1379:	file=/home/alalek/penv/lib/python2.7/site-packages/google/protobuf/pyext/_message.so [0];  generating link map
      1379:	  dynamic: 0x00007f796de7bba8  base: 0x00007f796da87000   size: 0x00000000003ff648
      1379:	    entry: 0x00007f796db1d190  phdr: 0x00007f796da87040  phnum:                  7
      1379:	
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=python [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libpython2.7.so.1.0 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libpthread.so.0 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libdl.so.2 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libutil.so.1 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libm.so.6 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libc.so.6 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libprotobuf.so [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libz.so.1 [0]
      1379:	symbol=_ZTVN10__cxxabiv120__si_class_type_infoE;  lookup in file=/lib64/libstdc++.so.6 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=python [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libpython2.7.so.1.0 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libpthread.so.0 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libdl.so.2 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libutil.so.1 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libm.so.6 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libc.so.6 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libprotobuf.so [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libz.so.1 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libstdc++.so.6 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libgcc_s.so.1 [0]
      1379:	symbol=_ZTSN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/home/alalek/penv/lib/python2.7/site-packages/google/protobuf/pyext/_message.so [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=python [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libpython2.7.so.1.0 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libpthread.so.0 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libdl.so.2 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libutil.so.1 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libm.so.6 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libc.so.6 [0]
      1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
<OOPS> 1379:	symbol=_ZTIN6google8protobuf18DescriptorDatabaseE;  lookup in file=/lib64/libprotobuf.so [0]
      1379:	symbol=_ZTIN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=python [0]
      1379:	symbol=_ZTIN6google8protobuf6python20PyDescriptorDatabaseE;  lookup in file=/lib64/libpython2.7.so.1.0 [0]
... there are many OOPS cases ...
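
The OOPS cases can be picked out of a saved LD_DEBUG log mechanically. The following is a rough sketch, not a definitive tool: it assumes the log format shown above, treats the last "lookup in file=" line for a symbol as the object where the search ended (a heuristic; LD_DEBUG=bindings reports the actual binding directly), and uses hypothetical function names.

```python
import re

# Matches lines such as:
#   1379:  symbol=_ZTI...;  lookup in file=/lib64/libprotobuf.so [0]
LOOKUP = re.compile(r"symbol=(?P<sym>[^;]+);\s+lookup in file=(?P<file>\S+)")


def last_lookup_per_symbol(log_text):
    """Map each symbol to the last object the loader searched for it."""
    last = {}
    for line in log_text.splitlines():
        m = LOOKUP.search(line)
        if m:
            last[m.group("sym")] = m.group("file")
    return last


def suspicious_bindings(log_text, extension_suffix="_message.so"):
    """Return protobuf symbols whose search ended outside the Python extension."""
    result = {}
    for sym, path in last_lookup_per_symbol(log_text).items():
        # "6google8protobuf" is how the google::protobuf namespace
        # appears in Itanium C++ ABI mangled names.
        if "6google8protobuf" in sym and not path.endswith(extension_suffix):
            result[sym] = path
    return result
```

Fed the excerpt above, `suspicious_bindings` would flag `_ZTIN6google8protobuf18DescriptorDatabaseE` (search ends in /lib64/libprotobuf.so) but not the `PyDescriptorDatabase` type name, whose search ends in `_message.so`.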

Related SO question: https://stackoverflow.com/questions/7201667/ld-magically-overrides-statically-linked-symbols

A possible solution is to isolate the protobuf symbols inside the Python extension (assuming gcc + ld on Linux):

  1. Link the Python extensions with the -Bsymbolic linker flag.
  2. Don't export protobuf symbols, by using the -fvisibility=hidden compiler flag. However, this breaks the .so build.
  3. Don't export symbols from .a files: the -exclude-libs=ALL linker option (through the gcc driver this would be -Wl,--exclude-libs,ALL).
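
As an illustration of options 1 and 3, the linker flags could be collected in one place and handed to the extension build. This is a sketch only: `isolation_link_args` is a hypothetical helper, not something in protobuf's build scripts, and the `-Wl,` prefix assumes the flags are passed through the gcc driver to GNU ld.

```python
def isolation_link_args(use_bsymbolic=True, hide_static_lib_symbols=True):
    """Build linker arguments that keep protobuf symbols private to the extension."""
    args = []
    if use_bsymbolic:
        # Option 1: resolve the extension's references to its own
        # definitions before consulting other loaded objects.
        args.append("-Wl,-Bsymbolic")
    if hide_static_lib_symbols:
        # Option 3: do not export symbols that come from static archives.
        args.append("-Wl,--exclude-libs,ALL")
    return args


# Hypothetical usage in a setup.py (names and sources are placeholders):
#   from setuptools import Extension
#   ext = Extension("example._ext", sources=["example.cc"],
#                   extra_link_args=isolation_link_args())
```

Option 2 (-fvisibility=hidden) is left out on purpose, since it is reported above to break the .so build.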

There are many "magic" crashes related to conflicts between different protobuf versions, so let's try to resolve this issue.

Additional information:

  • A similar problem can be observed on Ubuntu.
  • Installing the same version of the Python extension (pip install protobuf==3.2.0) doesn't help (still SIGSEGV).

xfxyjwf commented Apr 9, 2018

I think we need a more realistic example that can lead to the crash. From what I can tell, few people load the .so directly as shown in the above example, and that isn't a use case we want to support either.
