TSTabletManager has a global memory monitor that triggers flushes on tablets when RocksDB memory usage exceeds global_memstore_size_mb_max (or global_memstore_size_percentage if global_memstore_size_mb_max is not set).
The tablet to flush is selected based on the oldest write to the memstore. To support this, each tablet has a TabletFlushStats instance that monitors writes and flushes and tracks oldest_write_in_memstore. TabletFlushStats::OnFlushScheduled resets oldest_write_in_memstore to the max value, which was supposed to prevent empty tablets from participating in the flush selection. In reality, however, TabletFlushStats::OnFlushScheduled is called from DBImpl::SchedulePendingFlush even when there is no pending flush to schedule. DBImpl::SchedulePendingFlush, in turn, can be called after a compaction, after opening RocksDB, or after deleting obsolete SST files. Any of these calls resets oldest_write_in_memstore to the max value for the tablet, and if there are no further writes to the tablet, it will never be picked for flush by the tserver memory monitor.
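The failure mode described above can be reproduced in a small standalone model. This is only a sketch under stated assumptions: `FlushStatsSketch`, `TabletSketch`, `PickTabletToFlush`, and `kNoWrite` are hypothetical names invented for illustration and are not the actual YugabyteDB classes; timestamps are plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel meaning "no unflushed write recorded" (the "max value" in the report).
constexpr uint64_t kNoWrite = std::numeric_limits<uint64_t>::max();

// Hypothetical stand-in for the per-tablet flush statistics.
struct FlushStatsSketch {
  uint64_t oldest_write_in_memstore = kNoWrite;

  // Remember only the first write since the last reset.
  void OnWrite(uint64_t now) {
    assert(now != kNoWrite);
    if (oldest_write_in_memstore == kNoWrite) oldest_write_in_memstore = now;
  }

  // Resets unconditionally, even if nothing is actually going to be flushed.
  void OnFlushScheduled() { oldest_write_in_memstore = kNoWrite; }
};

struct TabletSketch {
  FlushStatsSketch stats;
};

// Returns the index of the tablet with the oldest recorded write,
// or -1 if every tablet looks empty.
int PickTabletToFlush(const std::vector<TabletSketch>& tablets) {
  int best = -1;
  uint64_t oldest = kNoWrite;
  for (std::size_t i = 0; i < tablets.size(); ++i) {
    const uint64_t t = tablets[i].stats.oldest_write_in_memstore;
    if (t < oldest) {
      oldest = t;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

In this model, a tablet that received a write and then sees a spurious `OnFlushScheduled()` call (for example after a compaction) becomes indistinguishable from an empty tablet, so the selection skips it even though its data is still in memory.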
Summary:
`TSTabletManager` has a global memory monitor that triggers flushes on tablets when RocksDB memory usage exceeds `global_memstore_size_mb_max` (or `global_memstore_size_percentage` if `global_memstore_size_mb_max` is not set).
The tablet to flush is selected based on the oldest write to the memstore. To support this, each tablet has a `TabletFlushStats` instance that monitors writes and flushes and tracks `oldest_write_in_memstore`. `TabletFlushStats::OnFlushScheduled` resets `oldest_write_in_memstore` to the max value, which was supposed to prevent empty tablets from participating in the flush selection. In reality, however, `TabletFlushStats::OnFlushScheduled` is called from `DBImpl::SchedulePendingFlush` even when there is no pending flush to schedule. `DBImpl::SchedulePendingFlush`, in turn, can be called after a compaction, after opening RocksDB, or after deleting obsolete SST files. Any of these calls resets `oldest_write_in_memstore` to the max value for the tablet, and if there are no further writes to the tablet, it will never be picked for flush by the tserver memory monitor.
The fix is to use memtable frontiers to get the oldest hybrid time written to the memtable and to select the tablet to flush based on that.
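One way to picture the idea behind the fix is to derive the "oldest write" directly from what the memtable currently holds, so that scheduling callbacks cannot erase it. The following is a minimal sketch, not the actual change: `MemTableSketch`, `FrontierTabletSketch`, `PickOldestByFrontier`, and `kNoFrontier` are hypothetical names, and hybrid times are modeled as plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel meaning "the memtable holds no writes".
constexpr uint64_t kNoFrontier = std::numeric_limits<uint64_t>::max();

// Hypothetical memtable that tracks the smallest hybrid time written into it.
struct MemTableSketch {
  uint64_t smallest_hybrid_time = kNoFrontier;

  void Write(uint64_t hybrid_time) {
    assert(hybrid_time != kNoFrontier);
    if (hybrid_time < smallest_hybrid_time) smallest_hybrid_time = hybrid_time;
  }

  // Only a real flush empties the memtable and clears its frontier.
  void Flush() { smallest_hybrid_time = kNoFrontier; }
};

struct FrontierTabletSketch {
  MemTableSketch memtable;

  // Scheduling a (possibly empty) flush no longer touches the selection state.
  void OnFlushScheduled() {}
};

// Returns the index of the tablet whose memtable holds the oldest write,
// or -1 if all memtables are empty.
int PickOldestByFrontier(const std::vector<FrontierTabletSketch>& tablets) {
  int best = -1;
  uint64_t oldest = kNoFrontier;
  for (std::size_t i = 0; i < tablets.size(); ++i) {
    const uint64_t t = tablets[i].memtable.smallest_hybrid_time;
    if (t < oldest) {
      oldest = t;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

Because the selection key lives with the data it describes, a tablet stays eligible for as long as its memtable is non-empty, regardless of how often a flush-scheduling hook fires.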
Test Plan: Jenkins
Reviewers: amitanand, mikhail, sergei
Reviewed By: sergei
Subscribers: kannan, bogdan, ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D6846