dynamic_cast form pointers is not working when linked with libc++_shared (ndk r15, r16b1) #519

Closed
andreya108 opened this issue Sep 13, 2017 · 30 comments

@andreya108 andreya108 commented Sep 13, 2017

Description

Please help, I'm trying to figure out the reason, but still no luck. Maybe it is a bug.

Our project contains several .so files whose parts are built in different ways: for example, Boost and other third-party libraries with standalone toolchains, the main part (bundled in an AAR) with ndk-build, and the app itself and the JNI part with Gradle/CMake.

When I switched to NDK r16 beta 1 and libc++_shared, every dynamic_cast in our C++ code from a base_class* stored in a std::list<base_class*> to a derived class pointer started returning nullptr.

For example:

class A {}
class B : A {}
std::list < A * > aList;
aList.add( new B() );
A* aPtr = aList.begin().get();
B* bPtr = dynamic_cast<B*>(aPtr);
// => bPtr = nullptr

This happens only when libc++_shared is used.

I've tested with libc++_static, gnustl_shared and gnustl_static, and the problem does not appear there: bPtr is, as expected, a pointer to the B object added to the list.

Any ideas?
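For reference, the snippet above could be written as compilable C++ roughly as follows. This is only a sketch under assumptions the original does not state: A gets a virtual destructor (dynamic_cast needs a polymorphic source type), B inherits publicly, and the helper name cast_from_list_succeeds is made up for illustration.

```cpp
#include <cassert>
#include <list>

// Assumption: A is polymorphic; dynamic_cast from A* requires that.
struct A {
    virtual ~A() = default;
};

// struct inheritance is public by default, so the A -> B path is accessible.
struct B : A {};

// Stores a B in a std::list<A*>, reads it back and casts it down again.
// Returns true when the cast yields the original object.
bool cast_from_list_succeeds() {
    B b;
    std::list<A*> aList;
    aList.push_back(&b);
    A* aPtr = aList.front();
    B* bPtr = dynamic_cast<B*>(aPtr);
    return bPtr == &b;
}
```

Inside a single binary this cast is expected to succeed; the failure discussed in this thread only shows up once several shared libraries are involved.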

Environment Details

  • NDK Version: 16.0.4293906-beta1
  • Build system: ndk-build + cmake + standalone toolchain
  • Host OS: Ubuntu 16.04
  • Compiler: clang c++14
  • ABI: arm64-v8a
  • STL: libc++_shared
  • NDK API level: 21
  • Device API level: 26

Author

@andreya108 andreya108 commented Sep 13, 2017

Just tested: the same behavior when compiled with r15c.

@andreya108 andreya108 changed the title dynamic_cast is not working when linked with libc++_shared (ndk16b1) dynamic_cast is not working when linked with libc++_shared (ndk r15, r16b1) Sep 13, 2017
@DanAlbert DanAlbert self-assigned this Sep 13, 2017
@andreya108 andreya108 changed the title dynamic_cast is not working when linked with libc++_shared (ndk r15, r16b1) dynamic_cast form pointers from std::list is not working when linked with libc++_shared (ndk r15, r16b1) Sep 13, 2017

Author

@andreya108 andreya108 commented Sep 13, 2017

Sorry, at first I looked at another part of the code and wrote about smart pointers, but the container is actually std::list. The description is fixed now.

Collaborator

@DanAlbert DanAlbert commented Sep 13, 2017

Your test case is not valid C++. Could you upload a test case?

@DanAlbert DanAlbert removed their assignment Sep 13, 2017

Author

@andreya108 andreya108 commented Sep 15, 2017

I'm trying to reproduce the problem outside our environment...

Author

@andreya108 andreya108 commented Sep 18, 2017

I still have no luck reproducing the issue outside our environment.
Everything behaves as if -fno-rtti were enabled.

Here is the clang invocation:

/usr/bin/ccache /home/andrey/android-ndk-r16-beta1/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ 
-MMD -MP -MF ./obj/local/arm64-v8a/objs-debug/src/file.o.d -gcc-toolchain /home/andrey/android-ndk-r16-beta1/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64 
-target aarch64-none-linux-android -ffunction-sections -funwind-tables -fstack-protector-strong 
-fpic -Wno-invalid-command-line-argument -Wno-unused-command-line-argument 
-no-canonical-prefixes  -g -fno-exceptions -fno-rtti -O0 -UNDEBUG -fno-limit-debug-info  
-I/home/andrey/android-ndk-r16-beta1/sources/cxx-stl/llvm-libc++/include 
-I/home/andrey/android-ndk-r16-beta1/sources/cxx-stl/llvm-libc++/../llvm-libc++abi/include 
-I/home/andrey/android-ndk-r16-beta1/sources/android/support/include -std=c++11 
-DUSE_ANDROID_UNIFIED_HEADERS -fcolor-diagnostics -frtti -fexceptions -femulated-tls 
-std=c++14 -fstack-protector-strong -DANDROID_NDK_VERSION=16 -Werror=return-type 
-ffunction-sections -fdata-sections -fvisibility=hidden -D_LIBCPP_HAS_NO_OFF_T_FUNCTIONS 
-fcolor-diagnostics -Wno-deprecated-register -Wno-inline-new-delete 
-Wno-invalid-source-encoding -Wno-unused-value -Wno-parentheses -Wno-deprecated  
-DANDROID  -D__ANDROID_API__=21 -Wa,--noexecstack -Wformat -Werror=format-security 
-O -g -DNDEBUG  --sysroot /home/andrey/android-ndk-r16-beta1/sysroot 
-isystem /home/andrey/android-ndk-r16-beta1/sysroot/usr/include/aarch64-linux-android 
-c  /mnt/ssd2/Development/file.cpp -o ./obj/local/arm64-v8a/objs-debug/file.o

Can the build be configured without overriding flags? I mean that it inserts some default options such as -fno-rtti and -std=c++11, which are then overridden by APP_CPPFLAGS in Application.mk.

Collaborator

@DanAlbert DanAlbert commented Sep 18, 2017

For RTTI flags, the last one wins, so -frtti is in effect here. As long as you have -frtti in APP_CPPFLAGS it should be working.

If you are trying to cast something from a type in a library that's a prebuilt that was not built with RTTI, maybe that's the problem, but I don't think that's the case based on the command line above.
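A quick way to confirm which RTTI flag actually won for a given translation unit is to test the compiler's predefined macros. The sketch below assumes GCC or Clang; kBuiltWithRtti and built_with_rtti are hypothetical names, not part of the NDK.

```cpp
#include <cassert>

// __cpp_rtti is the standard feature-test macro; __GXX_RTTI is the older
// GCC/Clang spelling. Neither is defined when the file is built with -fno-rtti.
#if defined(__cpp_rtti) || defined(__GXX_RTTI)
constexpr bool kBuiltWithRtti = true;
#else
constexpr bool kBuiltWithRtti = false;
#endif

// Hypothetical helper so the result can also be logged at run time.
constexpr bool built_with_rtti() { return kBuiltWithRtti; }

// Uncomment to turn a missing -frtti into a compile error:
// static_assert(kBuiltWithRtti, "this file must be compiled with -frtti");
```

This only reports on the file it is compiled into, so it says nothing about prebuilt libraries that were built elsewhere.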

Author

@andreya108 andreya108 commented Sep 18, 2017

Yes, it is definitely compiled with -frtti.

By the way, can RTTI info be stripped somehow? We use -Wl,--gc-sections and -Wl,--version-script to strip all dead code and hide extra symbols.

In my test case I also used those options, and the problem still cannot be reproduced.

But in the debugger, the production code (I can attach a screenshot) looks like this:

A* a = * iter; // iter is std::list<A*>::iterator
B* b = dynamic_cast<B*>( a );

The debugger shows that a points to an object of a class derived from B (and B is derived from A), but b still becomes NULL. I have no idea why yet...

Author

@andreya108 andreya108 commented Sep 18, 2017

How can I ensure that a shared library really contains RTTI?

Collaborator

@DanAlbert DanAlbert commented Sep 26, 2017

We use -Wl,--gc-sections and -Wl,--version-script to strip all dead code and hide extra symbols.

I'm pretty sure that RTTI across libraries does require exposing the RTTI data (which is no different from any other symbol), so if you're not exposing that in your version script, that's probably your issue.
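On the source side, GCC and Clang give a class's vtable and typeinfo the visibility of the class itself, so a library built with -fvisibility=hidden has to mark its polymorphic classes explicitly. A minimal sketch, assuming GCC or Clang; MYLIB_EXPORT, Shape, Square and as_square are hypothetical names used only for illustration. A version script that hides everything by default would additionally need to list the corresponding typeinfo symbols as global.

```cpp
#include <cassert>

// Hypothetical export macro; expands to nothing on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define MYLIB_EXPORT __attribute__((visibility("default")))
#else
#define MYLIB_EXPORT
#endif

// The attribute on the class also applies to its vtable and typeinfo,
// even when the file is compiled with -fvisibility=hidden.
class MYLIB_EXPORT Shape {
public:
    virtual ~Shape();
    virtual int sides() const;
};

class MYLIB_EXPORT Square : public Shape {
public:
    int sides() const override;
};

// Out-of-line definitions, normally placed in one .cpp file of the library.
Shape::~Shape() {}
int Shape::sides() const { return 0; }
int Square::sides() const { return 4; }

// Returns the object as a Square, or nullptr if it is not one.
const Square* as_square(const Shape* s) {
    return dynamic_cast<const Square*>(s);
}
```

After building, the readelf check shown in this comment should list "typeinfo for Shape" and "typeinfo for Square" in the dynamic symbol table.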

How can I ensure that a shared library really contains RTTI?

$ readelf -sW libfoo.so | c++filt | grep typeinfo

Author

@andreya108 andreya108 commented Sep 28, 2017

It really was stripped.
But I've now completely removed all stripping and hiding (no more version script or -gc-sections; all symbols are exported)... and nothing changed. dynamic_cast is still not working.
I can't understand why it works with gnustl_shared.

How can I compare the dynamic_cast implementations in gnustl and libc++? Where is the actual source code used by the NDK located?

Another idea:
Here is the linking scheme in our project:

class StreamSocket : public NonBlockSocket -> libfoo.a (static)
libfoo.a + ...x... -> lib1.so
libfoo.a + ...y... -> lib2.so (2 shared libs both linked with the same static library)

java: System.loadLibrary( "c++_shared")
java: System.loadLibrary( "jniproxy")
libjniproxy.so -> dlopen ( lib1.so ) then dlopen( lib2.so )

We know that dynamic_cast<StreamSocket*>() fails in lib2.so, which is loaded after lib1.so.
Can the RTTI of lib1.so somehow interfere with that of lib2.so?

There are no dependencies between lib1.so and lib2.so, and no C++ objects of that type are passed between them. All communication goes through libjniproxy.so, which is not aware of libfoo or its contents.

andrey:~/build/libs/arm64-v8a$ readelf -sW lib1.so | c++filt | grep typeinfo | grep Socket
  2414: 000000000038cc20    17 OBJECT  GLOBAL DEFAULT   11 typeinfo name for NonBlockSocket
  3698: 00000000004b59b0    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for StreamSocket
  9515: 00000000004b58b8    16 OBJECT  GLOBAL DEFAULT   18 typeinfo for NonBlockSocket
 10173: 000000000038d324    15 OBJECT  GLOBAL DEFAULT   11 typeinfo name for StreamSocket

andrey:~/build/libs/arm64-v8a$ readelf -sW lib1.so | c++filt | grep vtable | grep Socket
   444: 00000000004b58c8   224 OBJECT  GLOBAL DEFAULT   18 vtable for StreamSocket
  6533: 00000000004b5818   160 OBJECT  GLOBAL DEFAULT   18 vtable for NonBlockSocket

andrey:~/build/libs/arm64-v8a$ readelf -sW lib2.so | c++filt | grep typeinfo | grep Socket
  5015: 0000000000d19d88    16 OBJECT  GLOBAL DEFAULT   18 typeinfo for NonBlockSocket
 17807: 00000000009f5580    17 OBJECT  GLOBAL DEFAULT   11 typeinfo name for NonBlockSocket
 19740: 00000000009f5c68    15 OBJECT  GLOBAL DEFAULT   11 typeinfo name for StreamSocket
 20106: 0000000000d19e80    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for StreamSocket

andrey:~/build/libs/arm64-v8a$ readelf -sW lib2.so | c++filt | grep vtable | grep Socket
  3601: 0000000000d19d98   224 OBJECT  GLOBAL DEFAULT   18 vtable for StreamSocket
 13639: 0000000000d19ce8   160 OBJECT  GLOBAL DEFAULT   18 vtable for NonBlockSocket

Author

@andreya108 andreya108 commented Sep 28, 2017

I'm trying to reproduce this in a test case, and there the typeinfo symbols of my test classes get LOCAL binding instead of GLOBAL as in the main project:

with -fvisibility=hidden

   207: 0000000000016c70    24 OBJECT  LOCAL  DEFAULT   18 typeinfo for B
   214: 000000000000504c     3 OBJECT  LOCAL  DEFAULT   11 typeinfo name for A
   216: 00000000000050dc     3 OBJECT  LOCAL  DEFAULT   11 typeinfo name for C
   225: 0000000000016c00    32 OBJECT  LOCAL  DEFAULT   18 typeinfo for A*
   232: 00000000000050d0     4 OBJECT  LOCAL  DEFAULT   11 typeinfo name for B*
   240: 0000000000005050    57 OBJECT  LOCAL  DEFAULT   11 typeinfo name for std::__ndk1::__shared_ptr_emplace<C, std::__ndk1::allocator<C> >
   254: 0000000000016b10    16 OBJECT  LOCAL  DEFAULT   18 typeinfo for A
   255: 0000000000016cb0    24 OBJECT  LOCAL  DEFAULT   18 typeinfo for C
   265: 00000000000050d8     3 OBJECT  LOCAL  DEFAULT   11 typeinfo name for B
   278: 0000000000016c20    32 OBJECT  LOCAL  DEFAULT   18 typeinfo for B*
   285: 0000000000016b60    24 OBJECT  LOCAL  DEFAULT   18 typeinfo for std::__ndk1::__shared_ptr_emplace<C, std::__ndk1::allocator<C> >
   289: 00000000000050cc     4 OBJECT  LOCAL  DEFAULT   11 typeinfo name for A*

without -fvisibility=hidden

    26: 0000000000019a30    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for B
    37: 0000000000007f0c     3 OBJECT  WEAK   DEFAULT   11 typeinfo name for A
    39: 0000000000007f9c     3 OBJECT  GLOBAL DEFAULT   11 typeinfo name for C
    53: 00000000000199c0    32 OBJECT  WEAK   DEFAULT   18 typeinfo for A*
    61: 0000000000007f90     4 OBJECT  WEAK   DEFAULT   11 typeinfo name for B*
    87: 00000000000198d0    16 OBJECT  WEAK   DEFAULT   18 typeinfo for A
    89: 0000000000019a70    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for C
   104: 0000000000007f98     3 OBJECT  GLOBAL DEFAULT   11 typeinfo name for B
   122: 00000000000199e0    32 OBJECT  WEAK   DEFAULT   18 typeinfo for B*
   136: 0000000000007f8c     4 OBJECT  WEAK   DEFAULT   11 typeinfo name for A*
   220: 0000000000019a30    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for B
   231: 0000000000007f0c     3 OBJECT  WEAK   DEFAULT   11 typeinfo name for A
   233: 0000000000007f9c     3 OBJECT  GLOBAL DEFAULT   11 typeinfo name for C
   247: 00000000000199c0    32 OBJECT  WEAK   DEFAULT   18 typeinfo for A*
   255: 0000000000007f90     4 OBJECT  WEAK   DEFAULT   11 typeinfo name for B*
   281: 00000000000198d0    16 OBJECT  WEAK   DEFAULT   18 typeinfo for A
   283: 0000000000019a70    24 OBJECT  GLOBAL DEFAULT   18 typeinfo for C
   298: 0000000000007f98     3 OBJECT  GLOBAL DEFAULT   11 typeinfo name for B
   316: 00000000000199e0    32 OBJECT  WEAK   DEFAULT   18 typeinfo for B*
   330: 0000000000007f8c     4 OBJECT  WEAK   DEFAULT   11 typeinfo name for A*

and there is no typeinfo for class pointers in the main project...

Collaborator

@DanAlbert DanAlbert commented Oct 5, 2017

Agreed that this looks like #533. Once I get the fix for that submitted, you should check your app against a canary build.

Author

@andreya108 andreya108 commented Oct 6, 2017

I've tested with both:

NDK r17 Canary Build 4380476 2017 Oct 6 05:27:34
and
NDK r16 Canary Build 4380053 2017 Oct 6 01:32:41

Still no luck. Maybe your fix is not there yet? So, looking forward to the next build.

Will it be available in r16 or should I check only r17 canary?

Collaborator

@DanAlbert DanAlbert commented Oct 6, 2017

Not in r16 yet, but it was in build 4380016 of r17 from a couple hours before the one you tried. I guess you managed to find an unrelated dynamic_cast issue :(

Keep trying to get a test case. If you manage to get a repro case I can take a look.

Collaborator

@DanAlbert DanAlbert commented Oct 11, 2017

I've posted an update on the other bug. Now that I understand the problem better, I think you do have a bug here:

87: 00000000000198d0    16 OBJECT  WEAK   DEFAULT   18 typeinfo for A

A doesn't have a key function. You need to add a non-inline non-pure virtual function to A. If you do that, the typeinfo will be GLOBAL DEFAULT instead of WEAK DEFAULT, and then dynamic_cast should work.
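The difference can be sketched as follows. All type names here are hypothetical; the point is only where the virtual member is defined. Per the Itanium C++ ABI, a class whose virtual functions are all inline has no key function, so its vtable and typeinfo are emitted (as weak symbols) in every object file that needs them; a class with an out-of-line virtual function gets them emitted once, next to that definition.

```cpp
#include <cassert>

// No key function: the only virtual member is defined inline, so every
// translation unit (and every shared library) using NoKey may carry its own
// weak copy of "typeinfo for NoKey".
struct NoKey {
    virtual ~NoKey() {}
};

// Key function: the destructor is declared here but defined out of line.
struct WithKey {
    virtual ~WithKey();
};

struct DerivedFromKey : WithKey {
    ~DerivedFromKey() override;
};

// In a real project these definitions live in exactly one .cpp file; the
// typeinfo objects are emitted only there, as strong (GLOBAL) symbols.
WithKey::~WithKey() {}
DerivedFromKey::~DerivedFromKey() {}

// Hypothetical check: downcast through the base pointer.
bool downcast_works() {
    DerivedFromKey d;
    WithKey* base = &d;
    return dynamic_cast<DerivedFromKey*>(base) == &d;
}
```

Within one binary both variants cast correctly; the distinction matters once the same class is compiled into more than one shared library.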

@DanAlbert DanAlbert closed this Oct 11, 2017

Author

@andreya108 andreya108 commented Oct 12, 2017

It looks like that is not the case.

For debugging purposes I've added a type-describing method to every class in the hierarchy, like this:

my.h:

class Basic {
public:
  Basic();
  ~Basic();
  virtual const char* Type();
//...
};

class Derived : public Basic {
public:
  Derived();
  ~Derived();
  const char* Type() override;
//...
};

my.cpp:

Basic::Basic() {}
Basic::~Basic() {}
const char* Basic::Type() { return "Basic"; }

Derived::Derived() {}
Derived::~Derived() {}
const char* Derived::Type() { return "Derived"; }

And later in the code:

void myfunc(Basic *obj)
{
  Derived* derived = dynamic_cast<Derived*>(obj);
  if (!derived)
  {
    log("Cannot cast from %s to Derived", obj->Type());
  }
}

//...
Derived obj;
myfunc(&obj);

Results:

Cannot cast from Derived to Derived

And objects do not pass through dlopen boundary. Every shared library is isolated (no internally defined types are exposed), and the libraries communicate with each other only by means of simple types and some std:: types (like string and list).

Author

@andreya108 andreya108 commented Oct 12, 2017

The readelf listings above with the A/B/C classes are from my test case, and dynamic_cast works well there despite WEAK DEFAULT.

Unfortunately I still cannot reproduce the problem in a test environment; the issue occurs only in production code.
I will try again...

@andreya108 andreya108 changed the title dynamic_cast form pointers from std::list is not working when linked with libc++_shared (ndk r15, r16b1) dynamic_cast form pointers is not working when linked with libc++_shared (ndk r15, r16b1) Oct 12, 2017
@DanAlbert DanAlbert reopened this Oct 12, 2017

Collaborator

@DanAlbert DanAlbert commented Oct 12, 2017

And objects do not pass through dlopen boundary

System.loadLibrary counts. If you

System.loadLibrary("a");
System.loadLibrary("b");

and libb.so depends on liba.so, you won't be able to dynamic_cast in libb.so for any types also defined in liba.so unless the type_infos in liba.so are non-weak.

With the code above added to each of your classes, you shouldn't be getting WEAK DEFAULT symbols. They should be GLOBAL DEFAULT. That's what I see in a trivial test case locally.

Author

@andreya108 andreya108 commented Oct 12, 2017

Both libraries depend only on system libraries and libc++_shared.so.

And do not depend on each other.

Collaborator

@DanAlbert DanAlbert commented Oct 12, 2017

Yeah, we'd already more or less shown that your bug was something different than the other one, but figured it was worth checking.

Author

@andreya108 andreya108 commented Nov 21, 2017

I removed all of the dlopen/dlclose calls and the problem has gone away...

When all libs are loaded once from Java, dynamic_cast works fine with libc++_shared.

Collaborator

@DanAlbert DanAlbert commented Nov 27, 2017

That's good to hear. I'd rather have a better understanding of the problem you were encountering, but given that you have a workaround and haven't managed to work out a shareable test case, I think we should just close this. Let us know if you get more information and we'll reopen.

@DanAlbert DanAlbert closed this Nov 27, 2017

@Cristo86 Cristo86 commented Dec 11, 2017

Is this solved in NDK update 16.1.4479499 (updated from the SDK Manager in Android Studio)?

Current setup: that NDK, clang and libc++_shared.

I'm still getting a null from dynamic_pointer_cast in code that worked with NDK r12, clang and gnustl, e.g.:

_touchscreenVirtualPadDevice = std::dynamic_pointer_cast<TouchscreenVirtualPadDevice>(inputPlugin);

Collaborator

@DanAlbert DanAlbert commented Dec 11, 2017

There's nothing we can do without a test case. If you have one, post it here and we'll reopen.

@Cristo86 Cristo86 commented Dec 11, 2017

I just wanted to know whether the fixes mentioned here made it into a stable r16 release. Either way, I'll definitely have to isolate the problem for a test case. Thanks.

Collaborator

@rprichard rprichard commented Dec 13, 2017

I wrote a tool that might be useful for debugging issues with RTTI and multiple C++ shared objects. It's a single-header-file C++ library that prints the shared object where an std::type_info object is located.

e.g. in @Cristo86's case, it should be possible to write something like this:

#include "rtti_dump.h"
...
// Dumps (into logcat) the shared library containing the std::type_info for the
// type we're casting *from*.
rtti_dump::dump_type(&typeid(decltype(*inputPlugin.get())), "src");

// Dumps the std::type_info for the type we're trying to cast *to*.
rtti_dump::dump_type(&typeid(TouchscreenVirtualPadDevice), "dst");

// Dumps a hierarchy of std::type_info objects, starting with the most-derived
// class of the inputPlugin object. __dynamic_cast traverses this hierarchy at
// run-time and expects to find both src and dst.
rtti_dump::dump_class_hierarchy(rtti_dump::runtime_typeid(inputPlugin.get()));

Assuming a class hierarchy like so...

struct TouchscreenVirtualPadDevice {
  virtual ~TouchscreenVirtualPadDevice() {}
};

struct OtherBase {
  virtual ~OtherBase() {}
};

struct Derived : TouchscreenVirtualPadDevice, OtherBase {};

std::shared_ptr<OtherBase> inputPlugin;

... it would dump something like this into the log:

src: type 9OtherBase:
src:     type_info obj:  0x4040d0 (in ./a.out)
src:     type_info name: 0x4040b8 (in ./a.out)
dst: type 27TouchscreenVirtualPadDevice:
dst:     type_info obj:  0x404100 (in ./a.out)
dst:     type_info name: 0x4040e0 (in ./a.out)
dump_class_hierarchy: type 7Derived:
dump_class_hierarchy:     type_info obj:  0x404080 (in ./a.out)
dump_class_hierarchy:     type_info name: 0x404060 (in ./a.out)
dump_class_hierarchy:     base classes:
dump_class_hierarchy:         type 27TouchscreenVirtualPadDevice:
dump_class_hierarchy:             type_info obj:  0x404100 (in ./a.out)
dump_class_hierarchy:             type_info name: 0x4040e0 (in ./a.out)
dump_class_hierarchy:         type 9OtherBase:
dump_class_hierarchy:             type_info obj:  0x4040d0 (in ./a.out)
dump_class_hierarchy:             type_info name: 0x4040b8 (in ./a.out)

type_info obj shows the address of an std::type_info object, and type_info name shows the address of the string returned from std::type_info::name(). In this case, all the objects are in ./a.out. TouchscreenVirtualPadDevice's type_info object is always at 0x404100, and OtherBase's object is always at 0x4040d0.
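The basic lookup behind this kind of output, mapping the address of a std::type_info object to the loaded file that contains it, can be approximated with dladdr. The following is a rough sketch of that idea only and is not taken from rtti_dump.h; object_containing, type_info_location and LocalProbe are hypothetical names, and on older glibc versions the program may need to be linked with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>
#include <typeinfo>

// Returns the path of the loaded object (executable or .so) that contains
// addr, or "<unknown>" if the dynamic linker cannot attribute the address.
std::string object_containing(const void* addr) {
    Dl_info info;
    if (dladdr(addr, &info) != 0 && info.dli_fname != nullptr) {
        return info.dli_fname;
    }
    return "<unknown>";
}

// Reports where the std::type_info object for T, as seen from this
// translation unit, is located.
template <class T>
std::string type_info_location() {
    return object_containing(&typeid(T));
}

// Hypothetical polymorphic type used to try the helper out.
struct LocalProbe {
    virtual ~LocalProbe() {}
};
```

Calling type_info_location<LocalProbe>() from two different shared libraries and comparing the results is one way to spot a duplicated type_info.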

The tool is documented here. Links:

  • rtti_dump.h -- the header file

  • solib_rtti_dump.tar.gz -- the entire solib_rtti_dump directory. Has a demo showing how things can go wrong when std::type_info objects are duplicated.

Let me know if this is helpful.

@Cristo86 Cristo86 commented Dec 13, 2017

Great tool, thanks @rprichard. I have this output, where it seems that InputPlugin (which here is a base class for TouchscreenVirtualPadDevice) is duplicated (being in different addresses) across libs, so that may be the problem? (I'm still figuring out why libgnustl would make it anyway).

rtti_dump: src: type 11InputPlugin:
rtti_dump: src:     type_info obj:  0x77e60e7350 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinterface.so)
rtti_dump: src:     type_info name: 0x77e5f63d38 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinterface.so)
rtti_dump: dst: type 27TouchscreenVirtualPadDevice:
rtti_dump: dst:     type_info obj:  0x77e68d0ab0 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dst:     type_info name: 0x77e68b8e00 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dump_class_hierarchy: type 27TouchscreenVirtualPadDevice:
rtti_dump: dump_class_hierarchy:     type_info obj:  0x77e68d0ab0 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dump_class_hierarchy:     type_info name: 0x77e68b8e00 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dump_class_hierarchy:     base classes:
rtti_dump: dump_class_hierarchy:         type 11InputPlugin:
rtti_dump: dump_class_hierarchy:             type_info obj:  0x77e68d0450 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dump_class_hierarchy:             type_info name: 0x77e68b85d4 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
rtti_dump: dump_class_hierarchy:             base classes:
rtti_dump: dump_class_hierarchy:                 type 6Plugin:
rtti_dump: dump_class_hierarchy:                     type_info obj:  0x77e83bbe70 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
rtti_dump: dump_class_hierarchy:                     type_info name: 0x77e83a56b0 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
rtti_dump: dump_class_hierarchy:                     base classes:
rtti_dump: dump_class_hierarchy:                         type 7QObject:
rtti_dump: dump_class_hierarchy:                             type_info obj:  0x77eab2a4a8 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libQt5Core.so)
rtti_dump: dump_class_hierarchy:                             type_info name: 0x77ea9aa4f4 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libQt5Core.so)

I'll double check if for any reason I'm mistakenly building one of the libs with a different stl.

Collaborator

@rprichard rprichard commented Dec 13, 2017

(I'm still figuring out why libgnustl would make it anyway).

gnustl treats different type_info objects as equivalent if they have the same name. libc++abi more strictly follows the "Itanium" C++ ABI. From http://itanium-cxx-abi.github.io/cxx-abi/abi.html#rtti-general: "It is intended that two type_info pointers point to equivalent type descriptions if and only if the pointers are equal. An implementation must satisfy this constraint, e.g. by using symbol preemption, COMDAT sections, or other mechanisms."
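The two comparison strategies described above can be illustrated with a small sketch. It is a simplification for discussion, not the actual code of either runtime, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Strict identity as described for libc++abi: two type_info objects describe
// the same type only if they are the same object.
bool same_type_by_address(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Name-based identity as described for gnustl: different objects still match
// when their mangled names are equal.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return &a == &b || std::strcmp(a.name(), b.name()) == 0;
}

// True for the situation discussed in this thread: the names match but the
// objects differ, which suggests the type_info was emitted more than once.
bool looks_duplicated(const std::type_info& a, const std::type_info& b) {
    return !same_type_by_address(a, b) && same_type_by_name(a, b);
}
```

With a duplicated type_info, same_type_by_name would return true while same_type_by_address returns false, which matches the observation that the cast works with gnustl but not with libc++_shared.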

I have this output, where it seems that InputPlugin (which here is a base class for TouchscreenVirtualPadDevice) is duplicated (being in different addresses) across libs, so that may be the problem?

Yes, that's the problem. libc++abi's __dynamic_cast needs to verify that there's a public inheritance path from InputPlugin to TouchscreenVirtualPadDevice, so it searches the class hierarchy looking for InputPlugin (0x77e60e7350 in libinterface.so). It sees InputPlugin (0x77e68d0450 in libinput-plugins.so), but the addresses are different so they're considered different types.

I expect that info.path_dst_ptr_to_static_ptr will be unknown here. If the NDK's libc++abi had been compiled with _LIBCXX_DYNAMIC_FALLBACK, then in your situation, __dynamic_cast would fall back to comparing types with strings, and then __dynamic_cast would return non-NULL. (_LIBCXX_DYNAMIC_FALLBACK isn't a general fix for dynamic_cast, though. e.g. If TouchscreenVirtualPadDevice were the duplicated type instead of InputPlugin, then the fallback mode wouldn't activate, and dynamic_cast would still return NULL.)

Suggestions:

  • If you can add a "key function" to InputPlugin, that should fix the problem. A key function is a non-inline, non-pure virtual function. The compiler will output a single std::type_info object in the C++ source file where the virtual function is defined. (It will also change the readelf -s type of the std::type_info symbol from WEAK to GLOBAL.)

  • If you're OK targeting Android M and up (unlikely?), and if you can ensure that all your shared libraries are loaded in a single dlopen / System.loadLibrary call, then the system linker should generally use a single std::type_info object for each type. I don't think this fix works prior to M.

@Cristo86 Cristo86 commented Dec 14, 2017

InputPlugin did not have a "key function", so adding an out-of-line destructor as the non-inline, non-pure virtual function made it work (I borrowed the idea from @DanAlbert's answer in #533, as I couldn't find a reason to invent a function that wasn't there).

Before

$ aarch64-linux-android-readelf  -sW libinterface.so | aarch64-linux-android-c++filt | grep typeinfo | grep InputPlugin
    20: 00000000007ff350    24 OBJECT  WEAK   DEFAULT   18 typeinfo for InputPlugin
 10597: 000000000067bd38    14 OBJECT  WEAK   DEFAULT   11 typeinfo name for InputPlugin

After

InputPlugin.h (addition)

class InputPlugin : public Plugin {
public:
	//...
	virtual ~InputPlugin();
};

InputPlugin.cpp (just to have that destructor)

#include "InputPlugin.h"

InputPlugin::~InputPlugin() {}

$ aarch64-linux-android-readelf  -sW libinterface.so | aarch64-linux-android-c++filt | grep typeinfo | grep InputPlugin
    20: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND typeinfo for InputPlugin

rtti_dump output:

src: type 11InputPlugin:
src:     type_info obj:  0x77e811ba20 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
src:     type_info name: 0x77e8104fa8 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
dst: type 27TouchscreenVirtualPadDevice:
dst:     type_info obj:  0x77e682aae0 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
dst:     type_info name: 0x77e6812830 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
dump_class_hierarchy: type 27TouchscreenVirtualPadDevice:
dump_class_hierarchy:     type_info obj:  0x77e682aae0 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
dump_class_hierarchy:     type_info name: 0x77e6812830 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libinput-plugins.so)
dump_class_hierarchy:     base classes:
dump_class_hierarchy:         type 11InputPlugin:
dump_class_hierarchy:             type_info obj:  0x77e811ba20 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
dump_class_hierarchy:             type_info name: 0x77e8104fa8 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
dump_class_hierarchy:             base classes:
dump_class_hierarchy:                 type 6Plugin:
dump_class_hierarchy:                     type_info obj:  0x77e811be50 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
dump_class_hierarchy:                     type_info name: 0x77e8105620 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libplugins.so)
dump_class_hierarchy:                     base classes:
dump_class_hierarchy:                         type 7QObject:
dump_class_hierarchy:                             type_info obj:  0x77eaafd4a8 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libQt5Core.so)
dump_class_hierarchy:                             type_info name: 0x77ea97d4f4 (in /data/app/io.highfidelity.hifiinterface-1/lib/arm64/libQt5Core.so)

So that's it, InputPlugin address matches and dynamic_pointer_cast worked!

Thanks again for the explanation about type_info objects treatment by different ABIs.

@mlfarrell mlfarrell commented Jun 3, 2019

I've lost hours and hours and hours and hours today trying to do this on the latest NDK.
Is this still an issue? How the heck can I pull this off????

System.loadLibrary("vrapi");
System.loadLibrary("assimp");
//System.loadLibrary("vglloader");
System.loadLibrary("vglpp"); //<--- dynamic_casts for anything from this lib is broken
System.loadLibrary("vnl");
System.loadLibrary("vui");

            cmake {
                arguments "-DANDROID_STL=c++_shared"
                cppFlags "-std=c++14 -DANDROID=1"
            }
        }

set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,-export-dynamic")