
Corruption may occur for GC with a concurrent DeleteFilesInRange. #167

Closed
Connor1996 opened this issue May 28, 2020 · 0 comments · Fixed by #168
Connor1996 commented May 28, 2020

TiKV panics because Titan tries to delete an already deleted blob file:

[2020/05/27 21:31:06.177 +08:00] [FATAL] [lib.rs:480] ["rocksdb background error. db: kv, reason: compaction, error: Corruption: Blob file 256 has been deleted already"] [backtrace="stack backtrace:
   0: tikv_util::set_panic_hook::{{closure}}
             at components/tikv_util/src/lib.rs:479
   1: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:475
   2: rust_begin_unwind
             at src/libstd/panicking.rs:375
   3: std::panicking::begin_panic_fmt
             at src/libstd/panicking.rs:326
   4: <engine_rocks::event_listener::RocksEventListener as rocksdb::event_listener::EventListener>::on_background_error
             at components/engine_rocks/src/event_listener.rs:66
   5: rocksdb::event_listener::on_background_error
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/src/event_listener.rs:254
   6: _ZN24crocksdb_eventlistener_t17OnBackgroundErrorEN7rocksdb21BackgroundErrorReasonEPNS0_6StatusE
             at crocksdb/c.cc:2140
   7: _ZN7rocksdb7titandb11TitanDBImpl10SetBGErrorERKNS_6StatusE
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/librocksdb_sys/libtitan_sys/titan/src/db_impl.cc:1323
   8: _ZN7rocksdb7titandb11TitanDBImpl12BackgroundGCEPNS_9LogBufferEj
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/librocksdb_sys/libtitan_sys/titan/src/db_impl_gc.cc:237
   9: _ZN7rocksdb7titandb11TitanDBImpl16BackgroundCallGCEv
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/librocksdb_sys/libtitan_sys/titan/src/db_impl_gc.cc:139
  10: _ZN7rocksdb14ThreadPoolImpl4Impl8BGThreadEm
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/librocksdb_sys/rocksdb/util/threadpool_imp.cc:266
  11: _ZN7rocksdb14ThreadPoolImpl4Impl15BGThreadWrapperEPv
             at /rust/git/checkouts/rust-rocksdb-a9a28e74c6ead8ef/98aea25/librocksdb_sys/rocksdb/util/threadpool_imp.cc:307
  12: execute_native_thread_routine
  13: start_thread
  14: __clone
"] [location=components/engine_rocks/src/event_listener.rs:66] [thread_name=<unnamed>]

Titan log:

LOG.old.1590593095234019:2020/05/27-21:31:05.600112 7f6b93739480 [blob_file_set.cc:117] Blob files for CF 0 found: 234, 233, 231, 230, 229, 226, 225, 224, 223, 221, 219, 215, 216, 214, 209, 200, 199, 198, 197, 196, 195, 194, 193, 192, 190, 188, 187, 182, 176, 170, 172, 171, 161, 155, 141, 140, 134, 79, 78, 77, 205, 103, 109, 237, 98, 239, 240, 100, 241, 101, 242, 44, 243, 45, 244, 104, 201, 245, 51, 247, 53, 107, 248, 108, 249, 55, 254, 253, 266, 267, 69, 268, 105, 246, 71, 270, 76, 271, 272, 74, 273, 75, 274, 80, 251, 269, 70, 122, 263, 260, 265, 66, 60, 259, 264, 65, 117, 258, 262, 63, 255, 256, 261, 62, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87, 86, 85, 84, 213, 19, 218, 106, 9, 208, 13, 212, 227, 33, 232, 217, 23, 222, 238, 39, 102, 5, 257, 58, 29, 36, 99

LOG.old.1590593095234019:2020/05/27-21:31:05.860018 7f6b93739480 [blob_storage.cc:65] Get 1 blob files [256] in the range [7A7480000000000000FF335F698000000000FF0000010175736572FF39303236FF383231FF3335343235FF3730FF363533393600FE00FE, 7A7480000000000000FF335F728000000000FF06A5690000000000FA)

LOG.old.1590593095234019:2020/05/27-21:31:06.126564 7f6b62bff700 [blob_gc_job.cc:530] Titan add obsolete file [256] range [7A7480000000000000FF335F728000000000FF03B0A70000000000FAFA36BA00B443FF26, 7A7480000000000000FF335F728000000000FF06A5680000000000FAFA36B9FD63D3FF84]

LOG.old.1590593095234019:2020/05/27-21:31:06.126582 7f6b62bff700 [ERROR] [edit_collector.h:165] blob file 256 has been deleted already

LOG.old.1590593095234019:2020/05/27-21:31:06.126588 7f6b62bff700 (Original Log Time 2020/05/27-21:31:05.731084) [blob_gc_job.cc:138] [default] Titan GC candidates[256]

From the log, we can see that Titan GC had already selected blob file 256 as a candidate, while a concurrent DeleteFilesInRange deleted blob file 256 directly. So when the GC finishes and tries to delete blob file 256, it finds that the file has already been deleted and triggers the panic.
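The interleaving can be illustrated with a minimal, self-contained sketch. This is not Titan's actual code: `BlobFileTracker`, `MarkFileObsolete`, and the two thread bodies are hypothetical stand-ins for Titan's blob file bookkeeping, the GC job, and the DeleteFilesInRange path, assuming that retiring the same file twice yields a Corruption status. Whichever thread retires file 256 second sees it already gone, which in Titan's BackgroundGC becomes a background error via SetBGError (frame 7 of the backtrace above) and then a TiKV panic.

```cpp
// Illustrative sketch only -- not Titan's actual implementation.
#include <cstdint>
#include <cstdio>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct Status {
  bool ok = true;
  std::string msg;
  static Status OK() { return {}; }
  static Status Corruption(std::string m) { return {false, std::move(m)}; }
};

// Hypothetical stand-in for Titan's live blob file bookkeeping.
class BlobFileTracker {
 public:
  explicit BlobFileTracker(std::vector<uint64_t> files)
      : live_(files.begin(), files.end()) {}

  // Marking an already-removed file is treated as corruption, mirroring
  // the "blob file 256 has been deleted already" error in the log.
  Status MarkFileObsolete(uint64_t file_number) {
    std::lock_guard<std::mutex> guard(mu_);
    if (live_.erase(file_number) == 0) {
      return Status::Corruption("Blob file " + std::to_string(file_number) +
                                " has been deleted already");
    }
    return Status::OK();
  }

 private:
  std::mutex mu_;
  std::set<uint64_t> live_;
};

int main() {
  BlobFileTracker tracker({256});

  // Thread 1: GC has already picked file 256 as a candidate and, after
  // rewriting its live keys, tries to retire it.
  std::thread gc([&] {
    Status s = tracker.MarkFileObsolete(256);
    if (!s.ok) {
      // In Titan this path reports a background error (SetBGError), which
      // TiKV's RocksDB event listener turns into a panic.
      std::printf("GC background error: %s\n", s.msg.c_str());
    }
  });

  // Thread 2: a concurrent DeleteFilesInRange drops the same file directly.
  std::thread delete_files_in_range([&] { tracker.MarkFileObsolete(256); });

  gc.join();
  delete_files_in_range.join();
  // Whichever thread runs second finds the file already retired -- the race
  // this issue describes.
  return 0;
}
```

The sketch only models the bookkeeping race; it also suggests why the data itself stays intact, since the two paths disagree about who retires the file, not about its contents.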

Impact:

It makes TiKV panic, but it does not affect data correctness. After a restart (the panic_mark file needs to be deleted manually), everything works well again.
