Native maps crash (segfault, SIGSEGV) on empty std::vector<BigBlock>.back() in LinkedBlockAllocator.deleteLast #767
Comments
FWIW, I think the code that calls this cleanup assumes too much about how the allocator is used. Specifically, it assumes we can know for certain that we were the last ones to use it, even though we share it with C library code whose implementation we cannot know. Refactoring to avoid this assumption would be good, I think.
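One way to picture a cleanup path that does not rely on that assumption is sketched below. This is only an illustration, not the project's code: `BigBlockSketch`, `BigBlockListSketch`, and `tryDeleteLast` are hypothetical names, and only the "big block" path of the allocator is modeled. The idea is that the delete call reports whether the pointer really was the most recent allocation, and never calls back() on an empty vector, so the caller can decide what to do instead of the process exiting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the BigBlock entries tracked by the allocator.
struct BigBlockSketch {
    unsigned char *ptr;
    std::size_t length;
};

// Hypothetical, simplified allocator: only large allocations are modeled.
struct BigBlockListSketch {
    std::vector<BigBlockSketch> bigBlocks;
    std::size_t memused = 0;

    BigBlockListSketch() = default;
    // Owning raw pointers, so copying is disabled to avoid double frees.
    BigBlockListSketch(const BigBlockListSketch &) = delete;
    BigBlockListSketch &operator=(const BigBlockListSketch &) = delete;

    void *allocate(std::size_t amount) {
        unsigned char *p = new unsigned char[amount];
        bigBlocks.push_back(BigBlockSketch{p, amount});
        memused += sizeof(BigBlockSketch) + amount;
        return p;
    }

    // Returns true only if p was the most recent allocation and was released.
    // Returns false for a null pointer, an empty list, or any other pointer,
    // leaving the allocator untouched. back() is reached only when the
    // vector is known to be non-empty.
    bool tryDeleteLast(void *p) {
        if (p == nullptr || bigBlocks.empty() || bigBlocks.back().ptr != p) {
            return false;
        }
        memused -= sizeof(BigBlockSketch) + bigBlocks.back().length;
        delete[] bigBlocks.back().ptr;  // delete[] matches the new[] above
        bigBlocks.pop_back();
        return true;
    }

    ~BigBlockListSketch() {
        for (BigBlockSketch &b : bigBlocks) {
            delete[] b.ptr;
        }
    }
};
```

With this shape, a caller that was not the last user of the allocator gets `false` back and can simply leave the memory in place until the allocator itself is destroyed.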
I applied the following patch to try to get a sense of what was going on:
diff --git a/server/native/src/main/c++/nativeMap/BlockAllocator.h b/server/native/src/main/c++/nativeMap/BlockAllocator.h
index 81c14d86ee..98772aebd3 100644
--- a/server/native/src/main/c++/nativeMap/BlockAllocator.h
+++ b/server/native/src/main/c++/nativeMap/BlockAllocator.h
@@ -43,6 +43,7 @@ struct Block {
}
void *allocate(size_t amount){
+ std::cerr<< "Block.allocate(" << amount << ")" << std::endl;
unsigned char *nextPos = currentPos + amount;
if(nextPos > end){
@@ -96,6 +97,7 @@ struct LinkedBlockAllocator {
}
void *allocate(size_t amount){
+ std::cerr<< "LinkedBlockAllocator.allocate(" << amount << ")" << std::endl;
if(amount > (size_t)bigBlockSize){
unsigned char *p = new unsigned char[amount];
@@ -121,12 +123,13 @@ struct LinkedBlockAllocator {
}
void deleteLast(void *p){
+ std::cerr<< "LinkedBlockAllocator.deleteLast(" << p << ")" << std::endl;
if(p != NULL){
if(p == lastAlloc){
blocks.back().rollback(p);
lastAlloc = NULL;
return;
- }else if(bigBlocks.back().ptr == p){
+ }else if(!bigBlocks.empty() && bigBlocks.back().ptr == p){
memused -= (sizeof(BigBlock) + bigBlocks.back().length);
bigBlocks.pop_back();
delete((unsigned char *)p);
@@ -135,7 +138,7 @@ struct LinkedBlockAllocator {
}
std::cerr << "Tried to delete something that was not last allocation " << p << " " << lastAlloc << std::endl;
- exit(-1);
+ //exit(-1);
}
size_t getMemoryUsed(){
@@ -146,7 +149,7 @@ struct LinkedBlockAllocator {
}
~LinkedBlockAllocator(){
- //std::cout << "Deleting " << blocks.size() << " blocks, memused : " << memused << std::endl;
+ std::cerr << "(~LinkedBlockAllocator) Deleting " << blocks.size() << " blocks, memused : " << memused << std::endl;
std::vector<Block>::iterator iter = blocks.begin();
while(iter != blocks.end()){
delete [] (iter->data);
This patch only adds the protective check for the empty vector, stops exiting when the last allocation is not the expected one, and adds a small amount of debugging output. The output shows a few interesting things. First, for some reason, there are allocation requests for blocks of size
To me, this looks like an alignment issue with the allocator. To reproduce, apply the patch above and run:
mvn clean verify -Dtest=blah -Dit.test=ShellIT -Dspotbugs.skip -Dcheckstyle.skip
for line in $(cat test/target/mini-tests/org.apache.accumulo.harness.SharedMiniClusterBase_*/logs/TabletServer_* | grep '^Tried to delete something' | cut -f10-11 -d' ' | tr ' ' '-'); do
  x=$(echo $line | cut -f1 -d-)
  y=$(echo $line | cut -f2 -d-)
  echo "$line: $y - $x = $(( $y - $x ))"
done
This fixes a bug that was found when building the native maps against Fedora 29. The cause of the bug was that an empty C++ map seemed to allocate a small amount of memory, which was a new behavior. This new behavior violated an assumption made by the custom allocator used by the native map code. The code was restructured to avoid this issue.
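Whether an empty map allocates anything depends on the standard library implementation, so it is worth measuring rather than assuming. The following is a minimal sketch of one way to check, using a counting allocator; `CountingAllocator`, `g_allocCalls`, `CountedMap`, and the two helper functions are hypothetical names introduced here for illustration and are not part of the native map code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical global counter of allocate() calls, for illustration only.
static std::size_t g_allocCalls = 0;

// Minimal standard-conforming allocator that counts how often the
// container asks for memory.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U> &) {}

    T *allocate(std::size_t n) {
        ++g_allocCalls;
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }
    void deallocate(T *p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) {
    return false;
}

using CountedMap = std::map<int, int, std::less<int>,
                            CountingAllocator<std::pair<const int, int>>>;

// Number of allocate() calls made just by constructing an empty map.
// The result is implementation-dependent and may be zero or non-zero.
std::size_t allocationsForEmptyMap() {
    g_allocCalls = 0;
    CountedMap m;
    return g_allocCalls;
}

// Number of allocate() calls after constructing a map and inserting one entry.
std::size_t allocationsAfterOneInsert() {
    g_allocCalls = 0;
    CountedMap m;
    m[1] = 2;
    return g_allocCalls;
}
```

If `allocationsForEmptyMap()` returns a non-zero value on a given platform, any code that expects its own allocation to be the most recent one after constructing a map cannot rely on that there.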
On Fedora 28 and 29, with the latest OpenJDK 8, TServer processes crash due to a segmentation fault in the native libs.
In BlockAllocator.h, the call to bigBlocks.back() is not preceded by a check for an empty vector. Calling back() on an empty vector is undefined behavior, and a behavior change seems to cause this to fail with a segmentation fault.
Native stack trace from gdb:
Java stack trace for the relevant thread:
The following diff adds the explicit protection for the empty vector, and retains side-effects, but is probably not correct in context: