log recycling is trying to reuse the deleted log file #1303

somnathr · 2016-08-28T14:32:21Z

Hi,
This is Somnath and I am a Ceph developer working on integrating rocksdb as a metadata store for Ceph's upcoming new backend. In the process we found this bug within rocksdb. If I enable log recycling with the rocksdb (recycle_log_file_num=16) after ~5-6 hours of run (1M seq but can happen to any workload I guess)) I am getting an error during db->WriteBatch(). Digging down further I got the following anomaly in the rocksdb log.

It seems the bug is around this portion.

2016-08-25 00:44:03.348710 7f7c117ff700 4 rocksdb: adding log 254 to recycle list

2016-08-25 00:44:03.348722 7f7c117ff700 4 rocksdb: adding log 256 to recycle list

2016-08-25 00:44:03.348725 7f7c117ff700 4 rocksdb: (Original Log Time 2016/08/25-00:44:03.347467) [default] Level-0 commit table #258 started
2016-08-25 00:44:03.348727 7f7c117ff700 4 rocksdb: (Original Log Time 2016/08/25-00:44:03.348225) [default] Level-0 commit table #258: memtable #1 done
2016-08-25 00:44:03.348729 7f7c117ff700 4 rocksdb: (Original Log Time 2016/08/25-00:44:03.348227) [default] Level-0 commit table #258: memtable #2 done
2016-08-25 00:44:03.348730 7f7c117ff700 4 rocksdb: (Original Log Time 2016/08/25-00:44:03.348239) EVENT_LOG_v1 {"time_micros": 1472111043348233, "job": 88, "event": "flush_finished", "lsm_state": [3, 4, 0, 0, 0, 0, 0], "immutable_memtables": 0}
2016-08-25 00:44:03.348735 7f7c117ff700 4 rocksdb: (Original Log Time 2016/08/25-00:44:03.348297) [default] Level summary: base level 1 max bytes base 5368709120 files[3 4 0 0 0 0 0] max score 0.75

2016-08-25 00:44:03.348751 7f7c117ff700 4 rocksdb: [JOB 88] Try to delete WAL files size 131512372, prev total WAL file size 131834601, number of live WAL files 3.

2016-08-25 00:44:03.348761 7f7c117ff700 10 bluefs unlink db.wal/000256.log
2016-08-25 00:44:03.348766 7f7c117ff700 20 bluefs _drop_link had refs 1 on file(ino 19 size 0x3f3ddd9 mtime 2016-08-25 00:41:26.298423 bdev 0 extents [0:0xc500000+200000,0:0xcb00000+800000,0:0xd700000+700000,0:0xe200000+800000,0:0xee00000+800000,0:0xfa00000+800000,0:0x10600000+700000,0:0x11100000+700000,0:0x11c00000+800000,0:0x12800000+100000])
2016-08-25 00:44:03.348775 7f7c117ff700 20 bluefs _drop_link destroying file(ino 19 size 0x3f3ddd9 mtime 2016-08-25 00:41:26.298423 bdev 0 extents [0:0xc500000+200000,0:0xcb00000+800000,0:0xd700000+700000,0:0xe200000+800000,0:0xee00000+800000,0:0xfa00000+800000,0:0x10600000+700000,0:0x11100000+700000,0:0x11c00000+800000,0:0x12800000+100000])
2016-08-25 00:44:03.348794 7f7c117ff700 10 bluefs unlink db.wal/000254.log
2016-08-25 00:44:03.348796 7f7c117ff700 20 bluefs _drop_link had refs 1 on file(ino 18 size 0x3f3d402 mtime 2016-08-25 00:41:26.299110 bdev 0 extents [0:0x6500000+400000,0:0x6d00000+700000,0:0x7800000+800000,0:0x8400000+800000,0:0x9000000+800000,0:0x9c00000+700000,0:0xa700000+900000,0:0xb400000+800000,0:0xc000000+500000])
2016-08-25 00:44:03.348803 7f7c117ff700 20 bluefs _drop_link destroying file(ino 18 size 0x3f3d402 mtime 2016-08-25 00:41:26.299110 bdev 0 extents [0:0x6500000+400000,0:0x6d00000+700000,0:0x7800000+800000,0:0x8400000+800000,0:0x9000000+800000,0:0x9c00000+700000,0:0xa700000+900000,0:0xb400000+800000,0:0xc000000+500000])

bluefs part of the log is within ceph.
So, log 254 is added to the recycle list and at the same time it is added for deletion. The following log entry is in response to rocksdb delete call.
2016-08-25 00:44:03.348794 7f7c117ff700 10 bluefs unlink db.wal/000254.log

It seems there is a race condition in this portion (?).

I was going through the rocksdb code and I found the following.

DBImpl::FindObsoleteFiles is the one that is responsible for populating log_recycle_files and log_delete_files. It is also deleting entries from alive_log_files_. But, this is always under mutex_ lock.
Log is deleted from DBImpl::DeleteObsoleteFileImpl which is not under lock , but iterating over log_delete_files. This is fishy but it shouldn't be the reason for same log number end up in two list.
I saw all the places but the following place alive_log_files_ (within DBImpl::WriteImpl) is accessed without lock.

4625 alive_log_files_.back().AddSize(log_entry.size());

Can it be reintroducing the same log number (254) , I am not sure.

If I run disabling rocksdb log recyling it is going through fine for me, no db error.

Disabling log recycling is probably not an option for us as I am seeing it is introducing spikes and the output is not very stable.

Appreciate any help resolving this issue.

Thanks & Regards
Somnath

If log recycling is enabled with the rocksdb (recycle_log_file_num=16) db->Writebatch is erroring out with keynotfound after ~5-6 hours of run (1M seq but can happen to any workload I guess).See my detailed bug report here (facebook#1303). This commit is the fix for this, a check is been added not to delete the log file if it is already there in the recycle list. Test Plan: Unit tested it and ran the similar profile. Not reproducing anymore.

If log recycling is enabled with the rocksdb (recycle_log_file_num=16) db->Writebatch is erroring out with keynotfound after ~5-6 hours of run (1M seq but can happen to any workload I guess).See my detailed bug report here (#1303). This commit is the fix for this, a check is been added not to delete the log file if it is already there in the recycle list. Test Plan: Unit tested it and ran the similar profile. Not reproducing anymore.

gfosco · 2018-01-10T22:18:11Z

Closing this via automation due to lack of activity. If discussion is still needed here, please re-open or create a new/updated issue.

somnathr mentioned this issue Sep 1, 2016

Bug fix on log recycle #1313

Merged

gfosco added the abandoned-or-aged-out label Jan 10, 2018

gfosco closed this as completed Jan 10, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

log recycling is trying to reuse the deleted log file #1303

log recycling is trying to reuse the deleted log file #1303

somnathr commented Aug 28, 2016

gfosco commented Jan 10, 2018

log recycling is trying to reuse the deleted log file #1303

log recycling is trying to reuse the deleted log file #1303

Comments

somnathr commented Aug 28, 2016

gfosco commented Jan 10, 2018