CheckConsistency fails after more than one file is ingested #5259

Open
00k opened this issue Apr 28, 2019 · 1 comment


00k commented Apr 28, 2019

We are running version 5.18.3. After calling 'IngestExternalFile' several times in quick succession, the process occasionally core dumps.
Here is one such case:
We ingest 4 files (15450, 15449, 15445, 15480), which are assigned global seqnos 97985466, 97985467, 97985470, and 97985568 respectively:

2019/04/28-12:50:28.948222 7fe6d6604700 [db/external_sst_file_ingestion_job.cc:250] [AddFile] External SST file 1163_000001.sst was ingested in L0 with path 015450.sst (global_seqno=97985466)
2019/04/28-12:50:28.958084 7fe678fe1700 [db/external_sst_file_ingestion_job.cc:250] [AddFile] External SST file 1155_000001.sst was ingested in L0 with path 015449.sst (global_seqno=97985467)
2019/04/28-12:50:30.440528 7fe6ece31700 [db/external_sst_file_ingestion_job.cc:250] [AddFile] External SST file 1208_000000.sst was ingested in L0 with path 015445.sst (global_seqno=97985470)
2019/04/28-12:50:33.563271 7fe6ece31700 [db/external_sst_file_ingestion_job.cc:250] [AddFile] External SST file 1208_000001.sst was ingested in L0 with path 015480.sst (global_seqno=97985568)

then the 4 files are compacted to file 15491:

2019/04/28-12:50:38.664788 7ff9577fe700 [db/compaction_job.cc:1688] [default] [JOB 185] Compacting 4@0 files to L0, score 10.26
2019/04/28-12:50:38.664802 7ff9577fe700 [db/compaction_job.cc:1692] [default] Compaction start summary: Base version 211 Base level 0, inputs: [15480(116MB) 15445(516MB) 15449(143MB) 15450(117MB)]
2019/04/28-12:50:49.359261 7ff9577fe700 [db/compaction_job.cc:1374] [default] [JOB 185] Generated table #15491: 893217 keys, 937243047 bytes
2019/04/28-12:50:49.370069 7ff9577fe700 [db/compaction_job.cc:1440] [default] [JOB 185] Compacted 4@0 files to L0 => 937243047 bytes

then memtable is flushed to file 15502:

2019/04/28-12:50:49.379557 7ff98d732700 [db/flush_job.cc:337] [default] [JOB 186] Level-0 flush table #15502: started
2019/04/28-12:50:49.382184 7ff98d732700 [db/flush_job.cc:377] [default] [JOB 186] Level-0 flush table #15502: 86869 bytes OK

then the consistency check runs against the flushed file 15502 and fails:
db/version_builder.cc:

177           } else if (f1->fd.smallest_seqno <= f2->fd.smallest_seqno) {
178             fprintf(stderr,
179                     "L0 files seqno %" PRIu64 " %" PRIu64 " vs. %" PRIu64
180                     " %" PRIu64 "\n",
181                     f1->fd.smallest_seqno, f1->fd.largest_seqno,
182                     f2->fd.smallest_seqno, f2->fd.largest_seqno);
183             abort();
184           }

backtrace:

#2  0x0000000001ff2c99 in rocksdb::VersionBuilder::Rep::CheckConsistency (this=0x7ff958176dc0, vstorage=0x7ff9581c7de0)
    at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_builder.cc:183
#3  0x0000000001ff3bb2 in rocksdb::VersionBuilder::Rep::SaveTo (this=0x7ff958176dc0, vstorage=0x7ff9581c7de0)
    at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_builder.cc:364
#4  0x0000000001ff22ea in rocksdb::VersionBuilder::SaveTo (this=0x7ff9586e55d0, vstorage=0x7ff9581c7de0) at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_builder.cc:449
#5  0x0000000001dbf578 in rocksdb::VersionSet::ProcessManifestWrites (this=0x1c8c1f90, writers=std::deque with 1 elements = {...}, mu=0x1c8ae370, db_directory=0x42116b0, 
    new_descriptor_log=false, new_cf_options=0x0) at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_set.cc:2949
#6  0x0000000001dc1eab in rocksdb::VersionSet::LogAndApply (this=0x1c8c1f90, column_family_datas=..., mutable_cf_options_list=..., edit_lists=..., mu=0x1c8ae370, 
    db_directory=0x42116b0, new_descriptor_log=false, new_cf_options=0x0) at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_set.cc:3298
#7  0x0000000001fdf227 in rocksdb::VersionSet::LogAndApply (this=0x1c8c1f90, column_family_data=0x1c8c7890, mutable_cf_options=..., edit_list=..., mu=0x1c8ae370, 
    db_directory=0x42116b0, new_descriptor_log=false, column_family_options=0x0) at /home/likang/projects/likang/bytekv/third/rocksdb/db/version_set.h:788
(gdb) p *f1
$29 = {fd = {table_reader = 0x0, packed_number_and_path_id = 15502, file_size = 86869, smallest_seqno = 97985448, largest_seqno = 97985619}, smallest = {
    rep_ = "\000\000\000\000\000\000\000\000@\016\001\000\000\005\000\001\000\001\000\000\000\000\000\000\000\001@$\327\005\000\000"}, largest = {
    rep_ = "\351\003\000\000\000\000\000\000@user9582269196245260\251\217\331\000\037\212\231\025\001\061$\327\005\000\000"}, table_reader_handle = 0x0, stats = {
    num_reads_sampled = {<std::__atomic_base<unsigned long>> = {_M_i = 0}, <No data fields>}}, compensated_file_size = 0, num_entries = 0, num_deletions = 0, raw_key_size = 0, 
  raw_value_size = 0, refs = 2, being_compacted = false, init_stats_from_file = false, marked_for_compaction = false}
(gdb) p *f2
$30 = {fd = {table_reader = 0x7fedbf351d20, packed_number_and_path_id = 15491, file_size = 937243047, smallest_seqno = 97985466, largest_seqno = 97985568}, smallest = {
    rep_ = "\351\003\000\000\000\000\000\000@user459091711894869371\177\360\276\314\321M\231\025\001\276#\327\005\000\000"}, largest = {
    rep_ = "\351\003\000\000\000\000\000\000@user8863435406276079852\273\005\036A\234E\231\025\001\273#\327\005\000\000"}, table_reader_handle = 0x7fea60323cd0, stats = {
    num_reads_sampled = {<std::__atomic_base<unsigned long>> = {_M_i = 0}, <No data fields>}}, compensated_file_size = 937243047, num_entries = 893217, num_deletions = 0, 
  raw_key_size = 42771785, raw_value_size = 898576302, refs = 2, being_compacted = false, init_stats_from_file = true, marked_for_compaction = false}

Expected behavior

CheckConsistency should pass

Actual behavior

CheckConsistency failed

Steps to reproduce the behavior

Put some data into the memtable, ingest at least 2 files whose key ranges do not overlap the memtable, compact the ingested files, then flush the memtable.


rhli commented Jul 26, 2019

Hi, can you still reproduce this if you use a lock to serialize calls to IngestExternalFile()? Thanks.
