Segfault in Throughput.cpp #76

Closed
TylerADavis opened this issue Jul 10, 2017 · 4 comments

@TylerADavis
Collaborator

The Throughput.cpp benchmark crashes with a segfault when attempting the 10 MB put test.

The offending line is

test_throughput<10   * 1024 * 1024>(num_runs / 100);

GDB output:
[screenshot of the GDB session, taken 2017-07-10; image not preserved]

@jcarreira
Owner

It's not clear what is causing this. Memory corruption is one possibility. Valgrind can help here (though the ibverbs stack emits a lot of false positives).

@TylerADavis
Collaborator Author

I'll give that a try. This error happens on TCP as well, so I'll try running it without the RDMA stack.

@TylerADavis
Collaborator Author

TylerADavis commented Jul 11, 2017

I ran with valgrind, and it seems that there were no memory issues until the segfault itself. The output I got is below. Is the warning about the SP changing indicative of a stack overflow? It does mention it is a possibility.

[tylerdavis@f1:/data/tyler/ddc]$ valgrind --leak-check=yes ./benchmarks/throughput
==12246== Memcheck, a memory error detector
==12246== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12246== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==12246== Command: ./benchmarks/throughput
==12246== 
==12246== Warning: client switching stacks?  SP change: 0xfff0009b8 --> 0xffe600470
==12246==          to suppress, use: --max-stackframe=10487112 or greater
==12246== Invalid write of size 4
==12246==    at 0x405067: void test_throughput<10485760ul>(int) (throughput.cpp:41)
==12246==  Address 0xffe60047c is on thread 1's stack
==12246== 
==12246== 
==12246== Process terminating with default action of signal 11 (SIGSEGV)
==12246==  Access not within mapped region at address 0xFFE60047C
==12246==    at 0x405067: void test_throughput<10485760ul>(int) (throughput.cpp:41)
==12246==  If you believe this happened as a result of a stack
==12246==  overflow in your program's main thread (unlikely but
==12246==  possible), you can try to increase the size of the
==12246==  main thread stack using the --main-stacksize= flag.
==12246==  The main thread stack size used in this run was 8388608.
==12246== Invalid write of size 8
==12246==    at 0x4A28680: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==12246==  Address 0xffe600468 is on thread 1's stack
==12246== 
==12246== 
==12246== Process terminating with default action of signal 11 (SIGSEGV)
==12246==  Access not within mapped region at address 0xFFE600468
==12246==    at 0x4A28680: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==12246==  If you believe this happened as a result of a stack
==12246==  overflow in your program's main thread (unlikely but
==12246==  possible), you can try to increase the size of the
==12246==  main thread stack using the --main-stacksize= flag.
==12246==  The main thread stack size used in this run was 8388608.
==12246== 
==12246== HEAP SUMMARY:
==12246==     in use at exit: 72,704 bytes in 1 blocks
==12246==   total heap usage: 1 allocs, 0 frees, 72,704 bytes allocated
==12246== 
==12246== LEAK SUMMARY:
==12246==    definitely lost: 0 bytes in 0 blocks
==12246==    indirectly lost: 0 bytes in 0 blocks
==12246==      possibly lost: 0 bytes in 0 blocks
==12246==    still reachable: 72,704 bytes in 1 blocks
==12246==         suppressed: 0 bytes in 0 blocks
==12246== Reachable blocks (those to which a pointer was found) are not shown.
==12246== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==12246== 
==12246== For counts of detected and suppressed errors, rerun with: -v
==12246== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
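
The numbers in the log above do seem consistent with a stack overflow. As a rough check, the sketch below takes the two SP values and the main thread stack size straight from the valgrind output and compares them. The constant and function names are made up for illustration and are not part of the benchmark.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the valgrind log above.
constexpr std::uint64_t kSpBefore      = 0xfff0009b8ULL;       // SP before the jump
constexpr std::uint64_t kSpAfter       = 0xffe600470ULL;       // SP after the jump
constexpr std::uint64_t kMainStackSize = 8388608ULL;           // 8 MiB, as reported
constexpr std::uint64_t kPayload       = 10ULL * 1024 * 1024;  // template argument

// Size of the stack frame that test_throughput<10485760ul> tried to reserve.
constexpr std::uint64_t frame_size() { return kSpBefore - kSpAfter; }

// True when that single frame is larger than the whole main thread stack.
constexpr bool frame_exceeds_stack() { return frame_size() > kMainStackSize; }

// The difference matches the value valgrind suggests for --max-stackframe.
static_assert(frame_size() == 10487112ULL, "frame size should match the log");
static_assert(frame_exceeds_stack(), "frame should not fit in an 8 MiB stack");
```

The frame is 10,487,112 bytes, which is the 10 MiB payload plus 1,352 bytes of other locals, and it cannot fit in the 8,388,608-byte main thread stack. The first write into the new frame therefore lands in unmapped memory, which is the "Invalid write of size 4" at throughput.cpp:41.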

@TylerADavis
Collaborator Author

I found that the segfault was caused by a std::array that was too large for the stack. I've fixed it in #75.
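
For anyone hitting the same problem, here is a minimal sketch of one common way to avoid it: keep the std::array type but allocate it on the heap. This is only an illustration of the general technique, not necessarily the change made in #75, and `fill_on_heap` is a hypothetical name that does not exist in the benchmark.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a benchmark body that needs a Size-byte buffer.
template <std::size_t Size>
std::size_t fill_on_heap() {
    // A local `std::array<char, Size> buf;` would place all Size bytes in this
    // function's stack frame, which overflows a typical 8 MiB stack at 10 MiB.
    // std::make_unique (C++14) puts the array on the heap instead; only the
    // pointer lives on the stack.
    auto buf = std::make_unique<std::array<char, Size>>();
    buf->fill('x');      // touch every byte, as a put of the whole buffer would
    return buf->size();  // number of bytes in the buffer
}
```

Calling `fill_on_heap<10 * 1024 * 1024>()` returns 10485760 without needing a larger stack. A `std::vector<char>` of the same size would work just as well if the compile-time size is not needed by the rest of the code.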
