Crashes when linking against MaterialX libs from multiple shared libraries #485

Closed
sunyab opened this issue Sep 17, 2020 · 6 comments

Comments

@sunyab

sunyab commented Sep 17, 2020

On Linux with g++ 6.3.1, when I link the MaterialX static libraries into two separate shared libraries, then use both of those shared libraries in a program, the program crashes at exit. I get the following diagnostic in the terminal:

bash-4.2$ ./test
Creating MaterialX document in TestA
Creating MaterialX document in TestB
*** Error in `./test': double free or corruption (!prev): 0x0000000002055c30 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81299)[0x7f95634d5299]
/home/sunya/projects/HYD-2108/libtestA.so(_ZN9__gnu_cxx13new_allocatorIPNSt8__detail15_Hash_node_baseEE10deallocateEPS3_m+0x20)[0x7f95648c80ee]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt16allocator_traitsISaIPNSt8__detail15_Hash_node_baseEEE10deallocateERS3_PS2_m+0x2b)[0x7f95648c777c]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt8__detail16_Hashtable_allocISaINS_10_Hash_nodeISt4pairIKSsPFSt10shared_ptrIN9MaterialX5ValueEERS3_EELb1EEEEE21_M_deallocate_bucketsEPPNS_15_Hash_node_baseEm+0x5a)[0x7f956498046c]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt10_HashtableISsSt4pairIKSsPFSt10shared_ptrIN9MaterialX5ValueEERS1_EESaIS9_ENSt8__detail10_Select1stESt8equal_toISsESt4hashISsENSB_18_Mod_range_hashingENSB_20_Default_ranged_hashENSB_20_Prime_rehash_policyENSB_17_Hashtable_traitsILb1ELb0ELb1EEEE21_M_deallocate_bucketsEPPNSB_15_Hash_node_baseEm+0x42)[0x7f956497f0c6]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt10_HashtableISsSt4pairIKSsPFSt10shared_ptrIN9MaterialX5ValueEERS1_EESaIS9_ENSt8__detail10_Select1stESt8equal_toISsESt4hashISsENSB_18_Mod_range_hashingENSB_20_Default_ranged_hashENSB_20_Prime_rehash_policyENSB_17_Hashtable_traitsILb1ELb0ELb1EEEE21_M_deallocate_bucketsEv+0x2a)[0x7f956497e37c]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt10_HashtableISsSt4pairIKSsPFSt10shared_ptrIN9MaterialX5ValueEERS1_EESaIS9_ENSt8__detail10_Select1stESt8equal_toISsESt4hashISsENSB_18_Mod_range_hashingENSB_20_Default_ranged_hashENSB_20_Prime_rehash_policyENSB_17_Hashtable_traitsILb1ELb0ELb1EEEED1Ev+0x24)[0x7f956497da56]
/home/sunya/projects/HYD-2108/libtestA.so(_ZNSt13unordered_mapISsPFSt10shared_ptrIN9MaterialX5ValueEERKSsESt4hashISsESt8equal_toISsESaISt4pairIS4_S7_EEED1Ev+0x18)[0x7f956498e606]
/lib64/libc.so.6(__cxa_finalize+0x9a)[0x7f956348e05a]
/home/sunya/projects/HYD-2108/libtestB.so(+0x263853)[0x7f95642a4853]

To reproduce this, you can unpack the attached file, modify the build.sh script to point to your MaterialX install, then run it and the produced test program. repro.zip

I don't think this issue is specific to MaterialX itself. I'm not a linker expert, but I think this is an expected gotcha when static libraries that contain static variables with external linkage are linked into more than one shared library. I believe this could be avoided if I linked against a shared library build of MaterialX, but it appears only static builds are supported.

Could MaterialX allow shared library builds? Right now static builds are forced via the "STATIC" keyword specified in the add_library calls in the various CMakeLists.txt files, but removing that keyword let me build shared libraries that seemed to work OK in my limited testing.

@sunyab
Author

sunyab commented Sep 17, 2020

One obstacle to the shared library builds could be the need to explicitly export public API on Windows.
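For readers unfamiliar with that obstacle, here is a minimal sketch of the usual export-macro pattern. The names `EXAMPLE_API`, `EXAMPLE_BUILD_SHARED`, `EXAMPLE_USE_SHARED`, `ExampleDocument`, and `exampleLibraryVersion` are hypothetical and are not taken from MaterialX; only the `__declspec` and `visibility` attributes are real compiler features.

```cpp
#include <string>

// Hypothetical macro names, chosen for illustration only.
#if defined(_WIN32)
    #if defined(EXAMPLE_BUILD_SHARED)
        // Defined while compiling the DLL itself: export the symbols.
        #define EXAMPLE_API __declspec(dllexport)
    #elif defined(EXAMPLE_USE_SHARED)
        // Defined by clients of the DLL: import the symbols.
        #define EXAMPLE_API __declspec(dllimport)
    #else
        // Static build: no decoration needed.
        #define EXAMPLE_API
    #endif
#elif defined(__GNUC__)
    // GCC and Clang export symbols by default; this only matters when the
    // library is compiled with -fvisibility=hidden.
    #define EXAMPLE_API __attribute__((visibility("default")))
#else
    #define EXAMPLE_API
#endif

// Every class and free function in the public API carries the macro.
class EXAMPLE_API ExampleDocument
{
  public:
    std::string getName() const;
};

EXAMPLE_API std::string exampleLibraryVersion();

// Definitions, which would normally live in a .cpp file of the library.
std::string ExampleDocument::getName() const
{
    return "example";
}

std::string exampleLibraryVersion()
{
    return "0.0.0";
}
```

On Linux and macOS the macro is optional with default compiler flags, which is likely why a shared build can appear to work there without any source changes, while a Windows DLL build would need each public symbol annotated.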

@jstone-lucasfilm
Member

@sunyab Thanks for this detailed report! The first question that comes to my mind is: which global variable in the MaterialX library has external linkage? As a rule, we've aimed to declare all global variables as static or within anonymous namespaces (which should have the effect of giving them internal linkage in C++11). Do you happen to have access to the name of the global variable that is being destructed twice in your example?

@jstone-lucasfilm
Member

One possibility is that it's one of the two CreatorMap objects, which are declared as private static members of the Element and Value classes. If that's the case, then we could move these two variables to anonymous namespaces within Element.cpp and Value.cpp, which would give them internal linkage. Let us know, though, what variables seem to be triggering the crash in your example, and we'll proceed from there.
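As a rough illustration of that idea, the sketch below keeps a creator registry in an anonymous namespace instead of in a private static class member. It is a simplified stand-in, not the actual MaterialX code: `ExampleValue`, `registerCreator`, `create`, `creatorMap`, and `makeStringValue` are hypothetical names, and only the general shape of the map (type name to creator function) is inferred from the backtrace in the report.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

class ExampleValue;
using ExampleValuePtr = std::shared_ptr<ExampleValue>;

// Hypothetical stand-in for a value class with a creator registry.
class ExampleValue
{
  public:
    using CreatorFunction = ExampleValuePtr (*)(const std::string&);

    explicit ExampleValue(const std::string& data) : _data(data) {}
    const std::string& getData() const { return _data; }

    // The class exposes only functions; the map is no longer a static member.
    static void registerCreator(const std::string& type, CreatorFunction creator);
    static ExampleValuePtr create(const std::string& type, const std::string& data);

  private:
    std::string _data;
};

// Everything below would live in the .cpp file.
namespace
{

using CreatorMap = std::unordered_map<std::string, ExampleValue::CreatorFunction>;

// Internal linkage: the accessor and its map are private to this translation
// unit, so the dynamic linker cannot merge copies from different libraries.
CreatorMap& creatorMap()
{
    static CreatorMap map; // constructed on first use
    return map;
}

} // anonymous namespace

void ExampleValue::registerCreator(const std::string& type, CreatorFunction creator)
{
    creatorMap()[type] = creator;
}

ExampleValuePtr ExampleValue::create(const std::string& type, const std::string& data)
{
    const CreatorMap& map = creatorMap();
    auto it = map.find(type);
    if (it == map.end())
    {
        return nullptr; // unknown type name
    }
    return it->second(data);
}

// Example creator function for a "string" type.
ExampleValuePtr makeStringValue(const std::string& data)
{
    return std::make_shared<ExampleValue>(data);
}
```

After `ExampleValue::registerCreator("string", makeStringValue)`, a call to `ExampleValue::create("string", "hello")` returns a value holding "hello", and an unregistered type name yields a null pointer. One trade-off to keep in mind: with internal linkage, each shared library that embeds the static library may end up with its own independent registry.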

@sunyab
Author

sunyab commented Sep 18, 2020

In the repro case I posted, I was unfortunately not able to figure out which string(s) were triggering this, but valgrind did report invalid accesses to MaterialX::ElementRegistry and MaterialX::ValueRegistry, which sound like they could be related to the private static variables you mentioned. In addition, valgrind reported errors in the static destructors for Traversal.cpp and XmlIo.cpp. Here's the full valgrind report in case it's useful: valgrind.txt

I also ran valgrind on an internal test case, and MaterialX::UnitDef::UNITTYPE_ATTRIBUTE appeared to be one of the strings that valgrind flagged with an invalid free.

@jstone-lucasfilm
Member

Initial support for shared libraries on Linux and macOS has been added in #487. Let us know if this addresses the issue you're running into.

@jstone-lucasfilm
Member

@sunyab I'll close this issue out for now. Let us know if additional shared library functionality is needed in the future.
