Segfault in createFilter() #425
I did a bisect. The core dump first shows up with this commit: |
From the stack trace, one potential explanation is that the caller is
This would explain the crash you are seeing since background threads inside
On Sun, Nov 13, 2016 at 7:36 PM, michihenning notifications@github.com
|
Also, an easy thing to do might be to try running under address sanitizer
On Sun, Nov 13, 2016 at 7:47 PM, Sanjay Ghemawat sanjay@google.com wrote:
|
Thanks for the quick response! We are not using any filter policy. We use ReadOptions and WriteOptions. Both have static storage duration. For the read options, we set: For the write options, we don't set anything, the variable is default-constructed. We also pass an instance of leveldb::Options to DB::Open(). That option instance is on the stack, and the only flags set for it are options.paranoid_checks = true; Could any of these have something to do with this? |
One other point that might be relevant: we have three instances of the DB open, each one for a different purpose. |
Hmmm... Scanning through our code, we also use options.block_cache = ...; The lifetime of that block cache ends before the DB is destroyed. Could this be the cause? |
Address sanitizer says this: The address on the first line isn't always 0x0. It varies from run to run. About half the time, I get 0x0, the other half, I get this instead (the stack trace is the same otherwise): ==107173==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x7fcf567be24a bp 0x60d00002d7a0 sp 0x7fcf41958f80 T5) |
On Sun, Nov 13, 2016 at 8:27 PM, michihenning notifications@github.com
Could be. But from the code I found on the web, we have std::unique_ptr<leveldb::Cache> block_cache_; // Must be defined before db_! That should prevent the block_cache_ from being deleted until after the db_ I think the first thing to do would be to get more information from the It would also be good to know where the main thread was when the ASAN
|
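For anyone following the declaration-order argument above, here is a minimal sketch of why it holds. C++ destroys non-static data members in reverse order of declaration, so a member declared before db_ outlives it. FakeCache, FakeDb, Store, and DestructionOrder are hypothetical stand-ins invented for this illustration; none of them is leveldb or thumbnailer code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a block cache: records when it is destroyed.
struct FakeCache {
  explicit FakeCache(std::vector<std::string>* log) : log_(log) {}
  ~FakeCache() { log_->push_back("cache destroyed"); }
  std::vector<std::string>* log_;
};

// Hypothetical stand-in for a DB that borrows the cache without owning it.
struct FakeDb {
  FakeDb(FakeCache* cache, std::vector<std::string>* log)
      : cache_(cache), log_(log) {}
  ~FakeDb() { log_->push_back("db destroyed"); }
  FakeCache* cache_;  // Borrowed pointer; must stay valid until ~FakeDb()
  std::vector<std::string>* log_;
};

// Hypothetical owner class, modelled on the declaration quoted above.
class Store {
 public:
  explicit Store(std::vector<std::string>* log)
      : block_cache_(new FakeCache(log)),
        db_(new FakeDb(block_cache_.get(), log)) {}

 private:
  // Members are constructed top to bottom and destroyed bottom to top.
  std::unique_ptr<FakeCache> block_cache_;  // Must be defined before db_!
  std::unique_ptr<FakeDb> db_;
};

// Returns the order in which the two members were destroyed.
std::vector<std::string> DestructionOrder() {
  std::vector<std::string> log;
  { Store store(&log); }  // Store goes out of scope here
  return log;
}
```

With this layout db_ is torn down first and the cache second. Swapping the two declarations would reverse that, which is the situation the "Must be defined before db_!" comment guards against.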
My apologies about the block_cache_ lifetime, I wasn't thinking straight. Here is the asan output with line numbers: Here are the stacks for all the threads. The main thread is in the DB destructor, there are four threads waiting for an event to arrive, and there is a DB background thread that hits the segfault. |
BTW, to avoid any confusion, this is with leveldb compiled from commit ac1d69d |
I am confused by the two occurrences of ~DBImpl in the stack trace:
The top one is in the following loop: The second one (the caller???) is the end of the destructor: Is the program perhaps throwing exceptions when some leveldb library Perhaps you could add fprintfs in various parts of ~DBImpl to figure out which And maybe compile without optimization.
On Mon, Nov 14, 2016 at 1:47 PM, michihenning notifications@github.com
|
There are no exceptions anywhere near this.
I've seen this before--I think it's just an idiosyncrasy of gcc/gdb with inlined destructors? I inserted some trace into the constructor and destructor:

    DBImpl::~DBImpl() {
      std::cerr << "~DBImpl()" << std::endl;
      // Wait for background work to finish
      mutex_.Lock();
      shutting_down_.Release_Store(this);  // Any non-NULL value is ok
      std::cerr << "called Release_Store()" << std::endl;
      while (bg_compaction_scheduled_) {
        std::cerr << "waiting" << std::endl;
        bg_cv_.Wait();
      }
      mutex_.Unlock();
      std::cerr << "done waiting" << std::endl;

We have three instances of PersistentCache in the code; here is the trace I get:
|
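The handshake in the instrumented destructor above can be sketched with standard-library primitives only. This is a simplified illustration under my own assumptions, not leveldb's implementation: MiniDb, ScheduleBackgroundWork, BackgroundCall, and ShutdownWaitsForBackgroundWork are hypothetical names, and std::mutex and std::condition_variable stand in for leveldb's port classes.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical miniature of a DB whose destructor waits for background work.
class MiniDb {
 public:
  ~MiniDb() {
    std::unique_lock<std::mutex> lock(mutex_);
    // Tell the background thread to stop as soon as it can.
    shutting_down_.store(true, std::memory_order_release);
    // Wait for background work to finish.
    bg_cv_.wait(lock, [this] { return !bg_scheduled_; });
    lock.unlock();
    // Join so that the worker cannot touch members after they are destroyed.
    if (worker_.joinable()) worker_.join();
  }

  // Starts one background job; this sketch allows only a single job.
  void ScheduleBackgroundWork(std::atomic<bool>* finished) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (bg_scheduled_ || worker_.joinable()) return;
    bg_scheduled_ = true;
    worker_ = std::thread([this, finished] { BackgroundCall(finished); });
  }

 private:
  void BackgroundCall(std::atomic<bool>* finished) {
    // Simulated work that polls the shutdown flag between steps.
    for (int i = 0;
         i < 50 && !shutting_down_.load(std::memory_order_acquire); ++i) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    std::lock_guard<std::mutex> lock(mutex_);
    finished->store(true);
    bg_scheduled_ = false;
    bg_cv_.notify_all();  // Wake the destructor if it is waiting
  }

  std::mutex mutex_;
  std::condition_variable bg_cv_;
  std::atomic<bool> shutting_down_{false};
  bool bg_scheduled_ = false;  // Guarded by mutex_
  std::thread worker_;
};

// Returns true if the worker had finished by the time ~MiniDb() returned.
bool ShutdownWaitsForBackgroundWork() {
  std::atomic<bool> finished{false};
  {
    MiniDb db;
    db.ScheduleBackgroundWork(&finished);
  }  // ~MiniDb() blocks here until the worker reports completion
  return finished.load();
}
```

The wait only protects state that the destructor itself owns. Anything the background thread uses that is destroyed before this wait completes, or that the thread touches after it has signalled completion, would still be open to a shutdown race of the kind reported here.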
I'm getting segfaults with leveldb 1.19. This is in code that has been working perfectly for more than a year with 1.18. The segfault happens some time during shutdown and looks like a race condition. Valgrind has this to say:
http://pastebin.ubuntu.com/23459626/
gdb shows this stack trace:
http://pastebin.ubuntu.com/23459624/
I can produce the segfault on arm64, amd64, powerpc, s390x, ppc64el, armhf, and i386 simply by changing from libleveldb.so.1.18 to libleveldb.so.1.19
More details here: https://bugs.launchpad.net/ubuntu/+source/thumbnailer/+bug/1640326