Native maps crash (segfault) when calling LinkedBlockAllocator.deleteLast() #1064

Closed
ctubbsii opened this issue Mar 29, 2019 · 4 comments

Comments

@ctubbsii
Member

commented Mar 29, 2019

There appears to have been another change in the gcc STL implementation of std::vector, which our LinkedBlockAllocator uses (similar to #767). With libstdc++ 8.3.1, the following segfault occurs when running NativeMapIT:

(gdb) bt
#0  0x00007f8a37c6d53f in raise () from /lib64/libc.so.6
#1  0x00007f8a37c57895 in abort () from /lib64/libc.so.6
#2  0x00007f8a37083f7f in os::abort (dump_core=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.201.b09-2.fc29.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1575
#3  0x00007f8a37960ca3 in VMError::report_and_die (this=this@entry=0x7f8a36b2f010)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.201.b09-2.fc29.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1107
#4  0x00007f8a37755846 in JVM_handle_linux_signal (sig=sig@entry=11, info=info@entry=0x7f8a36b2f2b0, 
    ucVoid=ucVoid@entry=0x7f8a36b2f180, abort_if_unrecognized=abort_if_unrecognized@entry=1)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.201.b09-2.fc29.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#5  0x00007f8a3774888c in signalHandler (sig=11, info=0x7f8a36b2f2b0, uc=0x7f8a36b2f180)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.201.b09-2.fc29.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4555
#6  <signal handler called>
#7  LinkedBlockAllocator::deleteLast (p=0x7f8a33db2267, this=0x7f8a30bc7770) at /usr/include/c++/8/bits/stl_iterator.h:844
#8  SubKey::clear (this=<synthetic pointer>, lba=0x7f8a30bc7770) at nativeMap/SubKey.h:128
#9  NativeMap::update (mutationCount=0, val=0x7f8a36b2f888, del=<optimized out>, ts=4555391702396626017, cv=0x7f8a36b2f848, 
    cq=0x7f8a36b2f840, cf=0x7f8a36b2f838, env=0x7f8a3004ca60, cm=<optimized out>, this=0x7f8a30b300a0) at nativeMap/NativeMap.h:164
#10 Java_org_apache_accumulo_tserver_NativeMap_update (env=0x7f8a3004ca60, cls=<optimized out>, nm=140231499251872, 
    uid=140231545385440, cf=0x7f8a36b2f838, cq=0x7f8a36b2f840, cv=0x7f8a36b2f848, ts=4555391702396626017, del=0 '\000', 
    val=0x7f8a36b2f888, mutationCount=0) at nativeMap/org_apache_accumulo_tserver_NativeMap.cc:57
#11 0x00007f8a212ff4bc in ?? ()
#12 0x00007f8a36b2f848 in ?? ()
#13 0x3f38009e2a1ce461 in ?? ()
#14 0x0000000000000000 in ?? ()
@ctubbsii

Member Author

commented Mar 29, 2019

So, it looks like the problem is that we do not guard our vector against calls to back() when it is empty. Calling vector.back() on an empty vector is undefined behavior. Apparently, the behavior changed between libstdc++ 8.2 and 8.3, but it shouldn't have mattered: we should not have been calling back() when the vector was empty.
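
To illustrate the general idiom only: a minimal sketch of a checked accessor, assuming a hypothetical helper named backOrNull that is not part of Accumulo or the standard library. It just makes the empty case explicit instead of relying on whatever back() happens to do.

```cpp
#include <vector>

// Hypothetical helper (not in Accumulo or the STL): returns a pointer to the
// last element, or nullptr when the vector is empty, so the caller never
// invokes back() on an empty vector.
template <typename T>
T *backOrNull(std::vector<T> &v) {
  return v.empty() ? nullptr : &v.back();
}
```

A caller would test the result against nullptr before dereferencing it.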

Something in our code assumes that bigBlocks will not be empty when LinkedBlockAllocator::deleteLast is called, but that assumption is wrong in some circumstances. I haven't yet figured out which circumstances, but the following guard does cause the test to work:

diff --git a/server/native/src/main/c++/nativeMap/BlockAllocator.h b/server/native/src/main/c++/nativeMap/BlockAllocator.h
index b7eb60e22d..da8e53fa0d 100644
--- a/server/native/src/main/c++/nativeMap/BlockAllocator.h
+++ b/server/native/src/main/c++/nativeMap/BlockAllocator.h
@@ -124,6 +124,8 @@ struct LinkedBlockAllocator {
         blocks.back().rollback(p);
         lastAlloc = NULL;
         return;
+      }else if(bigBlocks.empty()){
+        return;
       }else if(bigBlocks.back().ptr == p){
         memused -= (sizeof(BigBlock) + bigBlocks.back().length);
         bigBlocks.pop_back();

See: http://www.cplusplus.com/reference/vector/vector/back/
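
For anyone who wants to exercise the guard outside of Accumulo, here is a rough, self-contained sketch. It assumes a heavily simplified stand-in for the allocator: TinyAllocator, allocateBig, and the bool return value are hypothetical, and the real LinkedBlockAllocator in BlockAllocator.h also tracks small blocks and lastAlloc, which are omitted here. It is not the actual patch, only the same empty-check applied to a toy version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Simplified stand-in for the big-block record (assumed layout).
struct BigBlock {
  unsigned char *ptr;
  std::size_t length;
};

// Hypothetical toy allocator; only the big-block path is modeled.
struct TinyAllocator {
  std::vector<BigBlock> bigBlocks;
  std::size_t memused = 0;

  // Allocates a big block and records it; returns nullptr on failure.
  void *allocateBig(std::size_t amount) {
    unsigned char *p = static_cast<unsigned char *>(std::malloc(amount));
    if (p == nullptr) return nullptr;
    bigBlocks.push_back(BigBlock{p, amount});
    memused += sizeof(BigBlock) + amount;
    return p;
  }

  // Releases p only if it is the most recent big allocation.
  // Returns true when something was released, false otherwise.
  bool deleteLast(void *p) {
    // Guard first: back() on an empty vector is undefined behavior.
    if (bigBlocks.empty()) return false;
    if (bigBlocks.back().ptr != p) return false;
    const std::size_t released = sizeof(BigBlock) + bigBlocks.back().length;
    assert(memused >= released);  // bookkeeping invariant
    memused -= released;
    std::free(bigBlocks.back().ptr);
    bigBlocks.pop_back();
    return true;
  }
};
```

With the guard in place, calling deleteLast on a freshly constructed TinyAllocator simply returns false instead of touching bigBlocks.back(). Note that this only hides the symptom; it does not explain why deleteLast is reached with no big blocks recorded.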

@ctubbsii

Member Author

commented Mar 29, 2019

@keith-turner Could you take a look at this?

@phrocker

Contributor

commented Mar 30, 2019

Are you testing with -Xcheck:jni? It will help identify issues with managing object lifetimes.

@ctubbsii

Member Author

commented Apr 2, 2019

@phrocker I haven't tried that. Would that flag help if the object itself isn't a JNI object, but a struct managed entirely within the logic of the native implementation (vs. something being exposed via the JNI interfaces)? I believe that's the case here.

@ctubbsii ctubbsii added this to Done in 1.9.3 Jun 14, 2019

@ctubbsii ctubbsii added this to Done in 2.0.0 Jun 14, 2019
