Mempurge instead of flush is initiated even if only one memtable is picked by flush job #9151
Labels: bug (Confirmed RocksDB bugs)

Comments
Should it be assigned, or is it up for grabs?
facebook-github-bot pushed a commit that referenced this issue on Nov 19, 2021:
Summary: After RocksDB 6.19 and before this PR, RocksDB FlushJob may pick more memtables to flush beyond synced WALs. This can be problematic if there are multiple column families, since it can prematurely advance the flushed column family's log_number. Should subsequent attempts fail to sync the latest WALs and the database goes through a recovery, it may detect a corrupted WAL number below the flushed column family's log number and complain about column family inconsistency.

To fix, we record the maximum memtable ID of the column family being flushed. Then we call SyncClosedLogs() so that all WALs closed at the time the memtable ID is recorded will be synced. I also disabled a unit test temporarily due to reasons described in #9151.

Pull Request resolved: #9142
Test Plan: make check
Reviewed By: ajkr
Differential Revision: D32299956
Pulled By: riversand963
fbshipit-source-id: 0da75888177d91905cf8c9d00605b73afb5970a7
@ajkr I'm happy to get this assigned to me.

@ajkr With regards to this bug, I have 2 questions:

Sorry, I'm not currently familiar with mempurge. @riversand963, are you able to help answer the questions?
ywave620 pushed a commit to ywave620/rocksdb that referenced this issue on Dec 28, 2022:
This can be a potential bug once we merge #9142, which is a fix for a legitimate bug causing DB::Open failure. Currently, before that fix, this bug is hidden.
In https://github.com/facebook/rocksdb/blob/6.26.fb/db/flush_job.cc#L233, a flush job will initiate a mempurge instead of a flush even if mems_.size() is 1. Consequently, the flush job does not reduce the number of immutable memtables, leading to a higher chance of write stall.

Expected behavior
When the number of immutable memtables reaches the threshold, a flush is scheduled and executed, resulting in a reduced number of immutable memtables. The DB will eventually get out of the write stall, even when there are a lot of writes.
Actual behavior
Currently, when the number of immutable memtables reaches the threshold, a mempurge may be scheduled even if only one memtable is picked. The new memtable is added back, so the write-stall condition is not mitigated. No further flush may be scheduled, because normally a flush is scheduled after an insertion, but insertion is currently stalled.
Steps to reproduce the behavior
Use #9150, restart the job "build-linux-non-shm-1" with SSH access, and manually run the following:
It will hang.