Fix interference between max_total_wal_size and db_write_buffer_size checks #1893

Closed
wants to merge 1 commit

Conversation

al13n321
Contributor

This is a trivial fix for OOMs we saw a few days ago in logdevice.

RocksDB gets into the following state:
(1) Write throughput is too high for flushes to keep up. Compactions are out of the picture: automatic compactions are disabled, and for manual compactions we don't care that much if they fall behind. We write to many CFs, with only a few L0 sst files in each, so compactions are not needed most of the time.
(2) total_log_size_ is consistently greater than GetMaxTotalWalSize(). It doesn't get smaller since flushes are falling ever further behind.
(3) The total size of memtables is way above db_write_buffer_size and keeps growing. But write_buffer_manager_->ShouldFlush() is not checked, because (2) prevents it (for no good reason, afaict; this is what this commit fixes; see the sketch after this list).
(4) Every call to WriteImpl() hits the MaybeFlushColumnFamilies() path. This keeps flushing the memtables one by one in order of increasing log file number.
(5) No write stalling trigger is hit. We rely on max_write_buffer_number to stall writes when flushes can't keep up, but (3) prevents it from kicking in. Memtables keep piling up in memory until we OOM.
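To make the interference concrete, here is a minimal, self-contained sketch of the check ordering, assuming the two triggers sit in an if/else-if chain inside WriteImpl() as the scenario above implies. The stub functions, values, and names like WriteBufferShouldFlush() and HandleWriteBufferPressure() are hypothetical stand-ins so the snippet compiles on its own; this is not the actual RocksDB source.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins; the real members live on DBImpl and WriteBufferManager.
uint64_t total_log_size_ = 2 << 20;
uint64_t GetMaxTotalWalSize() { return 1 << 20; }
bool WriteBufferShouldFlush() { return true; }   // stand-in for write_buffer_manager_->ShouldFlush()
void MaybeFlushColumnFamilies() { std::puts("flush CFs holding the oldest WAL alive"); }
void HandleWriteBufferPressure() { std::puts("flush to get under db_write_buffer_size"); }

// Before the fix: one else-if chain, so the db_write_buffer_size check is
// skipped on every write while total_log_size_ stays above the WAL limit.
void ChecksBeforeFix() {
  if (total_log_size_ > GetMaxTotalWalSize()) {
    MaybeFlushColumnFamilies();
  } else if (WriteBufferShouldFlush()) {
    HandleWriteBufferPressure();
  }
}

// After the fix: the two conditions are evaluated independently, so memtable
// pressure is acted on even when the WAL-size check keeps firing.
void ChecksAfterFix() {
  if (total_log_size_ > GetMaxTotalWalSize()) {
    MaybeFlushColumnFamilies();
  }
  if (WriteBufferShouldFlush()) {
    HandleWriteBufferPressure();
  }
}

int main() {
  ChecksBeforeFix();  // prints only the WAL-side flush
  ChecksAfterFix();   // prints both
}
```

Checking the two conditions independently means WAL-size pressure can no longer mask memtable pressure, which is exactly the interference described in (3).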

@facebook-github-bot
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yiwu-arbug
Contributor

Thanks for the fix! Just trying to fully understand the bug: why didn't (4) ultimately flush one of the large memtables, which should have resolved the situation?

@al13n321
Contributor Author

why didn't (4) ultimately flush one of the large memtables, which should have resolved the situation?

At the time of sending the PR I thought it was because flushes couldn't keep up (i.e., (4) kept flushing memtables one by one, but new writes kept coming, creating new memtables faster than (4) could flush them). This seems like a plausible scenario, and worth fixing.

But apparently that's not what was happening there. Instead, a MaybeFlushColumnFamilies() call set alive_log_files_.begin()->getting_flushed to true and requested some flushes. Completion of the requested flushes was supposed to make the oldest file obsolete and remove it from the alive_log_files_ list, but that never happened; I have no idea why and am still investigating. A few hours later we got an OOM.

@yiwu-arbug
Contributor

Makes sense. When alive_log_files_.begin()->getting_flushed is true, MaybeFlushColumnFamilies() is short-circuited as well: https://fburl.com/ykmm3z3s
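For readers without access to that link, the short-circuit being described has roughly this shape. The struct and container below are hypothetical stand-ins, just enough to make the control flow compilable; they are not the actual RocksDB types.

```cpp
#include <cstdio>
#include <deque>

// Hypothetical stand-ins for the alive-WAL bookkeeping.
struct AliveLogFile { bool getting_flushed = false; };
std::deque<AliveLogFile> alive_log_files_(1);

// Sketch of the early return: once the oldest alive WAL is already marked as
// getting flushed, the call does nothing, so progress depends entirely on the
// previously requested flushes finishing and retiring that WAL.
void MaybeFlushColumnFamiliesSketch() {
  if (alive_log_files_.begin()->getting_flushed) {
    return;  // already requested; nothing more happens on this call
  }
  alive_log_files_.begin()->getting_flushed = true;
  std::puts("request flushes for CFs still referencing the oldest WAL");
}

int main() {
  MaybeFlushColumnFamiliesSketch();  // requests flushes, sets getting_flushed
  MaybeFlushColumnFamiliesSketch();  // short-circuits: returns immediately
}
```

Once getting_flushed is set, every subsequent call bails out immediately, so if the requested flushes never retire the oldest WAL, nothing in this path relieves memory pressure, matching the situation described above.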

@al13n321
Copy link
Contributor Author

But apparently that's not what was happening there.

I take that back after looking a bit more at the logs. Both this and #1903 were happening; most of the OOMs followed the scenario from this PR.

@al13n321 deleted the fl branch on February 23, 2017 at 20:49.