assert fail #5126

Open
bilifornew opened this issue Mar 29, 2019 · 4 comments

@bilifornew

assert(cfd->imm()->IsFlushPending());

When RocksDB runs a flush job, it picks all column families that need to be flushed and then flushes them one by one.

Consider the following scenario:

  1. Flush thread 1 picks column families a, b, and c, then flushes a (releasing the mutex lock).
  2. A user writes to column family c and its memtable fills up, so column family c is inserted into the flush queue.
  3. Flush thread 2 picks column family c, then flushes c (picking all available memtables).
  4. Flush thread 1 finishes column families a and b.
  5. Flush thread 2 finishes column family c.
  6. Flush thread 1 begins to flush column family c; there is no memtable left to flush, so the assertion fails.
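
The six steps above can be replayed sequentially in a toy model. This is only a sketch of the interleaving as described in this report, not RocksDB code: `ToyColumnFamily`, `PickAllMemtables`, and `ReplayReportedInterleaving` are hypothetical names, and the model assumes that thread 1 does not pick the memtables of c until it actually starts flushing c.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a column family's immutable memtable list.
// Names and behavior are simplified assumptions, not the RocksDB API.
struct ToyColumnFamily {
  std::string name;
  int pending_memtables;  // immutable memtables not yet picked for flush

  bool IsFlushPending() const { return pending_memtables > 0; }

  // Picks every available memtable, as step 3 of the scenario describes.
  int PickAllMemtables() {
    int picked = pending_memtables;
    pending_memtables = 0;
    return picked;
  }
};

// Replays steps 1-6 in order and returns what the flush-pending check
// would see when thread 1 finally reaches column family c.
bool ReplayReportedInterleaving() {
  ToyColumnFamily a{"a", 1}, b{"b", 1}, c{"c", 1};

  // Step 1: thread 1 records a, b, c, but only picks memtables of a.
  std::vector<ToyColumnFamily*> thread1_batch = {&a, &b, &c};
  thread1_batch[0]->PickAllMemtables();

  // Step 2: a user write fills c's memtable; c is queued again.
  c.pending_memtables += 1;
  std::vector<ToyColumnFamily*> thread2_batch = {&c};

  // Step 3: thread 2 picks all available memtables of c.
  thread2_batch[0]->PickAllMemtables();

  // Step 4: thread 1 moves on to b and finishes a and b.
  thread1_batch[1]->PickAllMemtables();

  // Step 5: thread 2 finishes c (no further state change in this model).

  // Step 6: thread 1 reaches c and checks whether a flush is pending.
  return thread1_batch[2]->IsFlushPending();
}
```

In this model `ReplayReportedInterleaving()` returns `false`, which corresponds to `assert(cfd->imm()->IsFlushPending())` firing. Whether real RocksDB can reach this state depends on when thread 1 picks the memtables of c, which the model simply assumes.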
@riversand963
Contributor

In MemTableList::PickMemtablesToFlush, memtables that have already been picked by flush thread 1 will not be picked by flush thread 2. Therefore, when flush thread 1 starts to flush column family c, it should still be able to see that it needs to flush some memtables.
Have you encountered such an assertion failure? If so, could you present a unit test that can reproduce this interleaved execution? That will be very helpful. :) Thanks!

@bilifornew
Author

bilifornew commented Apr 1, 2019

When thread 1 starts to flush column family c, all of its memtables have already been flushed by thread 2, even though thread 1 was scheduled earlier (in the flush job, max_memtable_id is set to null).

We encountered this failure with v5.18.3; after we rolled back to v5.16.6, everything was OK.

Before v5.17.2, a flush thread processed only one column family. We have not tested v5.17.2.

@riversand963
Contributor

Do you have a test that can reproduce this? Also, do you have the LOG file when this occurred?

@iFA88

iFA88 commented Mar 12, 2020

Hey, I'm also hitting this error occasionally while writing a WriteBuffer with many CFs. It breaks everything.

python: /home/rocksdb-6.6.4/db/db_impl/db_impl_compaction_flush.cc:147: rocksdb::Status rocksdb::DBImpl::FlushMemTableToOutputFile(rocksdb::ColumnFamilyData*, const rocksdb::MutableCFOptions&, bool*, rocksdb::JobContext*, rocksdb::SuperVersionContext*, std::vector<long unsigned int>&, rocksdb::SequenceNumber, rocksdb::SnapshotChecker*, rocksdb::LogBuffer*, rocksdb::Env::Priority): Assertion `cfd->imm()->IsFlushPending()' failed.

There is nothing in syslog/kernlog, and mcelog (ECC memory error log) is empty.

LOG and OPTIONS: https://www.fusionsolutions.io/doc/rocksdbissue.tar.gz
