Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BlobDB crashes in sequentially fill test #6338

Closed
burtonli opened this issue Jan 28, 2020 · 2 comments
Closed

BlobDB crashes in sequentially fill test #6338

burtonli opened this issue Jan 28, 2020 · 2 comments
Assignees

Comments

@burtonli
Copy link
Contributor

Note: Please use Issues only for bug reports. For questions, discussions, feature requests, etc. post to dev group: https://www.facebook.com/groups/rocksdb.dev

Expected behavior

Test finish without crash

Actual behavior

DB crashes at
BlobFile::LinkSstFile
BlobDBImpl::LinkSstToBlobFileImpl
BlobDBImpl::ProcessCompactionJobInfo

Steps to reproduce the behavior

Run db bench with:
--benchmarks=fillseq --use_blob_db=1 --blob_db_enable_gc=1 --blob_db_gc_cutoff=0.25 --write_buffer_size=2097152 --target_file_size_base=2097152 --max_bytes_for_level_base=2097152 --num_levels=3 --compaction_style=0 --mmap_read=0 --statistics=1 --histogram=1 --num=2400000 --threads=1 --value_size=32768 --block_size=4096 --cache_size=1048576 --bloom_bits=10 --cache_numshardbits=4 --open_files=-1 --verify_checksum=1 --db=F:\RocksDBPerfTest --sync=0 --disable_wal=1 --compression_type=none --stats_interval=1000000 --compression_ratio=0.25 --max_write_buffer_number=3 --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=48 --level0_stop_writes_trigger=64 --stats_per_interval=1 --use_existing_db=0 --use_direct_reads=1 --use_direct_io_for_flush_and_compaction=1 --compaction_readahead_size=262144 --writable_file_max_buffer_size=524288 --new_table_reader_for_compaction_inputs=1

Root cause:
@ltamasi @siying

There are two code paths to maintain blob->sst mappings:

  1. Flush
    BackgroundCallFlush->BackgroundFlush->FlushMemTablesToOutputFiles->FlushMemTableToOutputFile->NotifyOnFlushCompleted->ProcessFlushJobInfo->LinkSstToBlobFile->LinkSstToBlobFileImpl

  2. Compaction
    BackgroundCallFlush->BackgroundFlush->FlushMemTablesToOutputFiles->FlushMemTableToOutputFile->SchedulePendingCompaction and MaybeScheduleFlushOrCompaction ->BlobDBListenerGC::OnCompactionCompleted->
    ProcessCompactionJobInfo

Since these paths are running in different threads, and not synchronized, so it's possible 2nd path
calls UnlinkSstFromBlobFile (for all the compaction info.input_file_infos) before sst file was linked to blob file in the 1st code path. The issue can be easily repro in sequential fill test, as all the compaction in LSMs are trivial moves, which have higher possibility to finish before BlobDBImpl::ProcessFlushJobInfo.

In the debug version, it may crash at:
void UnlinkSstFile(uint64_t sst_file_number) {
auto it = linked_sst_files_.find(sst_file_number);
assert(it != linked_sst_files_.end()); <---- crash
linked_sst_files_.erase(it);
}

In the retail version, it crashes at:
Because linked_sst_files_ became invalid after erased by linked_sst_files_.end().
void LinkSstFile(uint64_t sst_file_number) {
assert(linked_sst_files_.find(sst_file_number) == linked_sst_files_.end());
linked_sst_files_.insert(sst_file_number); <---- crash
}

@ltamasi
Copy link
Contributor

ltamasi commented Jan 28, 2020

Thanks for reporting @burtonli ! Taking a look.

@ltamasi ltamasi self-assigned this Jan 28, 2020
ltamasi added a commit to ltamasi/rocksdb that referenced this issue Feb 7, 2020
…e mapping

Summary:
BlobDB keeps track of the mapping between SSTs and blob files using
the `OnFlushCompleted` and `OnCompactionCompleted` callbacks of
the `EventListener` interface: upon receiving a flush notification, a link
is added between the newly flushed SST and the corresponding blob file;
for compactions, links are removed for the inputs and added for the outputs.
The earlier code performed this link deletion and addition even for
trivially moved files; the new code walks through the two lists together
(in a fashion that's similar to merge sort) and skips such files.
This should mitigate facebook#6338,
wherein an assertion is triggered with the earlier code when a compaction
notification for a trivial move precedes the flush notification for the
moved SST.

Test Plan:
make check
facebook-github-bot pushed a commit that referenced this issue Feb 7, 2020
…e mapping (#6381)

Summary:
BlobDB keeps track of the mapping between SSTs and blob files using
the `OnFlushCompleted` and `OnCompactionCompleted` callbacks of
the `EventListener` interface: upon receiving a flush notification, a link
is added between the newly flushed SST and the corresponding blob file;
for compactions, links are removed for the inputs and added for the outputs.
The earlier code performed this link deletion and addition even for
trivially moved files; the new code walks through the two lists together
(in a fashion that's similar to merge sort) and skips such files.
This should mitigate #6338,
wherein an assertion is triggered with the earlier code when a compaction
notification for a trivial move precedes the flush notification for the
moved SST.
Pull Request resolved: #6381

Test Plan: make check

Differential Revision: D19773729

Pulled By: ltamasi

fbshipit-source-id: ae0f273ded061110dd9334e8fb99b0d7786650b0
@ltamasi
Copy link
Contributor

ltamasi commented Mar 22, 2021

Should be fixed now (see 1b4be4c).

@ltamasi ltamasi closed this as completed Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants