Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failure in GetFilter() when opening db #2717

Closed
merlimat opened this issue Aug 10, 2017 · 22 comments
Closed

Assertion failure in GetFilter() when opening db #2717

merlimat opened this issue Aug 10, 2017 · 22 comments
Assignees

Comments

@merlimat
Copy link
Contributor

merlimat commented Aug 10, 2017

I am seeing assertion failures when opening a db that was not gracefully closed.

Assertion failed: (filter->size() > 0), function GetFilter, file table/block_based_table_reader.cc, line 1133.
Abort trap: 6

Db Logs prior to assert failure:

2017/08/07-14:16:55.549142 700000589000 [db/version_set.cc:2882] Recovered from manifest file:data/standalone/bookkeeper0/current/locations/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0
2017/08/07-14:16:55.549153 700000589000 [db/version_set.cc:2890] Column family [default] (ID 0), log number is 0
2017/08/07-14:16:55.549239 700000589000 EVENT_LOG_v1 {"time_micros": 1502140615549232, "job": 1, "event": "recovery_started", "log_files": [3]}
2017/08/07-14:16:55.549245 700000589000 [db/db_impl_open.cc:484] Recovering log #3 mode 2

Stack trace:

 thread #6, stop reason = signal SIGSTOP
    frame #0: 0x00007fffd445cd42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fffd454a457 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fffd43c2420 libsystem_c.dylib`abort + 129
    frame #3: 0x00007fffd4389893 libsystem_c.dylib`__assert_rtn + 320
    frame #4: 0x000000013da81279 librocksdbjni1259198247081385980.jnilib`rocksdb::BlockBasedTable::GetFilter(rocksdb::BlockHandle const&, bool, bool) const + 1017
    frame #5: 0x000000013da7e202 librocksdbjni1259198247081385980.jnilib`rocksdb::BlockBasedTable::Open(rocksdb::ImmutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::__1::unique_ptr<rocksdb::RandomAccessFileReader, std::__1::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long long, std::__1::unique_ptr<rocksdb::TableReader, std::__1::default_delete<rocksdb::TableReader> >*, bool, bool, int) + 4338
    frame #6: 0x000000013da7958a librocksdbjni1259198247081385980.jnilib`rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::TableReaderOptions const&, std::__1::unique_ptr<rocksdb::RandomAccessFileReader, std::__1::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long long, std::__1::unique_ptr<rocksdb::TableReader, std::__1::default_delete<rocksdb::TableReader> >*, bool) const + 90
    frame #7: 0x000000013d996912 librocksdbjni1259198247081385980.jnilib`rocksdb::TableCache::GetTableReader(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, bool, unsigned long, bool, rocksdb::HistogramImpl*, std::__1::unique_ptr<rocksdb::TableReader, std::__1::default_delete<rocksdb::TableReader> >*, bool, int, bool) + 546
    frame #8: 0x000000013d996dfa librocksdbjni1259198247081385980.jnilib`rocksdb::TableCache::FindTable(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, bool, bool, rocksdb::HistogramImpl*, bool, int, bool) + 602
    frame #9: 0x000000013d99716a librocksdbjni1259198247081385980.jnilib`rocksdb::TableCache::NewIterator(rocksdb::ReadOptions const&, rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::RangeDelAggregator*, rocksdb::TableReader**, rocksdb::HistogramImpl*, bool, rocksdb::Arena*, bool, int) + 394
    frame #10: 0x000000013d85b35d librocksdbjni1259198247081385980.jnilib`rocksdb::BuildTable(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::__1::unique_ptr<rocksdb::InternalIterator, std::__1::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::__1::vector<std::__1::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::__1::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::__1::allocator<std::__1::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::__1::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<unsigned long long, std::__1::allocator<unsigned long long> >, unsigned long long, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int) + 3981
    frame #11: 0x000000013d912099 librocksdbjni1259198247081385980.jnilib`rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*) + 1337
    frame #12: 0x000000013d910563 librocksdbjni1259198247081385980.jnilib`rocksdb::DBImpl::RecoverLogFiles(std::__1::vector<unsigned long long, std::__1::allocator<unsigned long long> > const&, unsigned long long*, bool) + 2355
    frame #13: 0x000000013d90f3ca librocksdbjni1259198247081385980.jnilib`rocksdb::DBImpl::Recover(std::__1::vector<rocksdb::ColumnFamilyDescriptor, std::__1::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool) + 1370
    frame #14: 0x000000013d91371e librocksdbjni1259198247081385980.jnilib`rocksdb::DB::Open(rocksdb::DBOptions const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<rocksdb::ColumnFamilyDescriptor, std::__1::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::__1::vector<rocksdb::ColumnFamilyHandle*, std::__1::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**) + 2238
    frame #15: 0x000000013d912c27 librocksdbjni1259198247081385980.jnilib`rocksdb::DB::Open(rocksdb::Options const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, rocksdb::DB**) + 1351
    frame #16: 0x000000013d83610c librocksdbjni1259198247081385980.jnilib`___lldb_unnamed_symbol106$$librocksdbjni1259198247081385980.jnilib + 28
    frame #17: 0x000000013d8302a3 librocksdbjni1259198247081385980.jnilib`rocksdb_open_helper(JNIEnv_*, long, _jstring*, std::__1::function<rocksdb::Status (rocksdb::Options const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, rocksdb::DB**)>) + 147
    frame #18: 0x000000013d8303a9 librocksdbjni1259198247081385980.jnilib`Java_org_rocksdb_RocksDB_open__JLjava_lang_String_2 + 73

Db archive where the problem reproduces: https://www.dropbox.com/sh/kyfotpyqgvbrsce/AADxlJo8OX0ZPWIiqCVfaL_Ia?dl=0&preview=locations.tar.gz

Tried with RocksDB: 5.5.5 and 5.6.1 with RocksDB-JNI from MacOS.

@yiwu-arbug
Copy link
Contributor

@maysamyabandeh any quick thoughts?

@merlimat
Copy link
Contributor Author

Tell me if you need any other information that might help here.

@maysamyabandeh
Copy link
Contributor

./sst_dump --file=/tmp/locations/000004.sst --show_properties | grep "filter block size"
  filter block size: 0
./sst_dump --file=/tmp/locations/000004.sst --show_properties | grep "data blocks"
  # data blocks: 0

The sst file is genuinely empty. The bug must be about dumping an empty sst file after a sudden crash.

@merlimat
Copy link
Contributor Author

@maysamyabandeh This error keeps popping up from time to time even in newer RocksDB version (currently we're using 5.8.6) though I haven't found a reliable way to reproduce it.

Under which condition can the SST be dumped as empty? Is there anything that can be done to simply ignore an empty SST left after a crash?

@sijie
Copy link

sijie commented Jan 10, 2018

@maysamyabandeh if the problem is coming from an empty sst file, is it possible to get around this issue, for example: skip the empty sst files?

@maysamyabandeh
Copy link
Contributor

If having an empty SST file is a legit scenario (and not itself a symptom of another issue), yes we should skip it on restart.

@sijie
Copy link

sijie commented Apr 25, 2018

@maysamyabandeh any updates of fixing this issue?

@ajkr
Copy link
Contributor

ajkr commented Apr 25, 2018

@sijie Do you use DeleteRange? That is one case where we may legitimately output an SST file without any data blocks and an empty filter block.

@ajkr
Copy link
Contributor

ajkr commented Apr 25, 2018

#3614 prevents us from creating SST files with empty filters. But it'd still fail to read files with empty filters, for example, if you try to read an old DB where such files already exist. We should really fix that...

@merlimat
Copy link
Contributor Author

@sijie Do you use DeleteRange? That is one case where we may legitimately output an SST file without any data blocks and an empty filter block.

Yes, we keep indexes for BookKeeper entries in RocksDB as individual records and then delete them with deleteRange().

#3614 prevents us from creating SST files with empty filters.

Nice! Was that change already released?

@ajkr
Copy link
Contributor

ajkr commented Apr 25, 2018

Looks like it just missed the cutoff for 5.13 so not released yet. But it's a bugfix so we should be able to backport the patch. Is backporting to 5.13 enough?

@merlimat
Copy link
Contributor Author

That would be great. We can pickup any release of rocksdbjni that's available in Maven central.
Thanks!

@sijie
Copy link

sijie commented Apr 26, 2018

thanks @ajkr

@merlimat
Copy link
Contributor Author

@ajkr Do know what's the ETA for having 5.13.1 available in Maven? The last available version is still 5.11 at this point.

sijie pushed a commit to apache/bookkeeper that referenced this issue Jun 6, 2018
Upgrade RocksDB version to include a fix for empty SSTs written by flushing with deleteRange() operations that cause assertion failures on DB open.

Description can be found at : facebook/rocksdb#2717

Author: Matteo Merli <mmerli@apache.org>

Reviewers: Jia Zhai <None>, Sijie Guo <sijie@apache.org>

This closes #1466 from merlimat/upgrade-rocksdb
sijie pushed a commit to apache/bookkeeper that referenced this issue Jun 6, 2018
Upgrade RocksDB version to include a fix for empty SSTs written by flushing with deleteRange() operations that cause assertion failures on DB open.

Description can be found at : facebook/rocksdb#2717

Author: Matteo Merli <mmerli@apache.org>

Reviewers: Jia Zhai <None>, Sijie Guo <sijie@apache.org>

This closes #1466 from merlimat/upgrade-rocksdb

(cherry picked from commit 7bbd3de)
Signed-off-by: Sijie Guo <sijie@apache.org>
@ajkr ajkr closed this as completed Jun 6, 2018
@ajkr
Copy link
Contributor

ajkr commented Jun 6, 2018

Looks like I didn't backport to 5.13. It's done now: https://github.com/facebook/rocksdb/commits/5.13.fb. Unfortunately the fix will be in 5.13.3 though.

@ajkr ajkr reopened this Jun 6, 2018
@ajkr
Copy link
Contributor

ajkr commented Jun 6, 2018

cc @adamretter who does our Java releases. Actually I'm a bit confused as I thought those release builds wouldn't have assertions enabled?

@adamretter
Copy link
Collaborator

@ajkr when 5.13.3 is ready let me know and I can release it.

When you say "assertions enabled", I assume you mean the C++ assert() macro? If so I am not sure why they are enabled, the Java build's build the RocksDB C++ part for for Linux/Unix/Mac using PORTABLE=1 which I was led to believe was the correct approach. Was the problem in the actual release version from Maven Central or in a custom build?

@ajkr
Copy link
Contributor

ajkr commented Jun 6, 2018

@adamretter The current head of 5.13.fb should be fine to release as 5.13.3.

Yes I mean failing on C++ assertion. @merlimat How is your application using RocksDB when it hits the assertion failure? Is it using the maven central version or something else?

@merlimat
Copy link
Contributor Author

merlimat commented Jun 6, 2018

@ajkr Yes, we have a dependency on rocksdbjni that's coming from Maven central.

@ajkr
Copy link
Contributor

ajkr commented Jun 6, 2018

@adamretter To disable assertions we usually set DEBUG_LEVEL=0. It looks like in the Makefile this is done automatically for some java build targets but not all (for example it doesn't look like it's done for "rocksdbjava").

@merlimat
Copy link
Contributor Author

Thanks @ajkr!

sijie pushed a commit to apache/pulsar that referenced this issue Jun 20, 2018
### Motivation

Fixes #1049 . The actual fix was included in RocksDB-5.13.3 facebook/rocksdb#2717 and it's now available in Maven central.

The commit that fixed the assertion for empty SSTs was at facebook/rocksdb@8d73964
@adamretter
Copy link
Collaborator

adamretter commented Jun 22, 2018

@ajkr so the target we use for release builds of RocksJava is PORTABLE=1 rocksdbjavastaticreleasedocker.

This uses the rocksdbjavastatic target directly for the Linux builds which are done inside Docker, which as far as I can tell does set DEBUG_LEVEL=0 when you call make rocksdbjavastatic.

However for the Mac build the rocksdbjavastatic is a sub-target of rocksdbjavastaticreleasedocker and so DEBUG_LEVEL=0 is not set by the Makefile... I will fix that.

@adamretter adamretter reopened this Jun 22, 2018
sagar0 pushed a commit that referenced this issue Jun 30, 2018
Summary:
Closes #2717
Closes #4040

Differential Revision: D8592058

Pulled By: sagar0

fbshipit-source-id: d01099a1067aa32659abb0b4bed641d919a3927e
rcane pushed a commit to rcane/rocksdb that referenced this issue Sep 13, 2018
Summary:
Closes facebook#2717
Closes facebook#4040

Differential Revision: D8592058

Pulled By: sagar0

fbshipit-source-id: d01099a1067aa32659abb0b4bed641d919a3927e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants