TServer memory monitor could never pick up old inactive tablet to flush #1672

Closed
ttyusupov opened this issue Jul 2, 2019 · 0 comments

ttyusupov commented Jul 2, 2019

TSTabletManager has a global memory monitor that flushes tablets when RocksDB memory usage exceeds global_memstore_size_mb_max (or global_memstore_size_percentage if global_memstore_size_mb_max is not set).
The tablet to flush is selected by the oldest write in its memstore. To support this, each tablet has a TabletFlushStats instance that monitors writes and flushes and tracks oldest_write_in_memstore. TabletFlushStats::OnFlushScheduled resets oldest_write_in_memstore to the max value, which was supposed to keep empty tablets out of the flush selection. In reality, however, TabletFlushStats::OnFlushScheduled is called from DBImpl::SchedulePendingFlush even when there is no pending flush to schedule. DBImpl::SchedulePendingFlush, in turn, can be called after a compaction, on RocksDB open, or after deleting obsolete SST files. Any of these calls resets oldest_write_in_memstore to the max value for the tablet, and if there are no further writes to the tablet, the tserver memory monitor will never pick it for a flush.
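
To make the failure mode concrete, here is a minimal, self-contained sketch of the selection logic as described above. It is not the actual YugabyteDB code: `FlushStatsSketch`, `TabletSketch`, `PickTabletToFlush` and `kNoWrite` are hypothetical names, and timestamps are modeled as plain integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Sentinel meaning "no unflushed write is being tracked" (assumed for this sketch).
constexpr uint64_t kNoWrite = std::numeric_limits<uint64_t>::max();

// Hypothetical, simplified stand-in for TabletFlushStats.
struct FlushStatsSketch {
  uint64_t oldest_write_in_memstore = kNoWrite;

  // Remember only the first write since the last reset.
  void OnWrite(uint64_t now) {
    if (oldest_write_in_memstore == kNoWrite) oldest_write_in_memstore = now;
  }

  // Resets the tracked time. Per the issue, this can run even when there is
  // no pending flush to schedule, so unflushed data may still be in memory.
  void OnFlushScheduled() { oldest_write_in_memstore = kNoWrite; }
};

struct TabletSketch {
  std::string id;
  FlushStatsSketch stats;
};

// Returns the index of the tablet with the oldest tracked write,
// or -1 if every tablet looks empty to the monitor.
int PickTabletToFlush(const std::vector<TabletSketch>& tablets) {
  int best = -1;
  uint64_t oldest = kNoWrite;
  for (std::size_t i = 0; i < tablets.size(); ++i) {
    const uint64_t t = tablets[i].stats.oldest_write_in_memstore;
    if (t < oldest) {
      oldest = t;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

In this model, a tablet that was written once and then received a spurious `OnFlushScheduled()` call (for example, after a compaction) is indistinguishable from an empty tablet, so the selection skips it until it is written again.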

@ttyusupov ttyusupov self-assigned this Jul 2, 2019
@ttyusupov ttyusupov added this to To Do in YBase features via automation Jul 2, 2019
@rkarthik007 rkarthik007 added area/docdb YugabyteDB core features kind/bug This issue is a bug labels Jul 2, 2019
yugabyte-ci pushed a commit that referenced this issue Jul 4, 2019
Summary:
`TSTabletManager` has a global memory monitor that flushes tablets when RocksDB memory usage exceeds `global_memstore_size_mb_max` (or `global_memstore_size_percentage` if `global_memstore_size_mb_max` is not set).
The tablet to flush is selected by the oldest write in its memstore. To support this, each tablet has a `TabletFlushStats` instance that monitors writes and flushes and tracks `oldest_write_in_memstore`. `TabletFlushStats::OnFlushScheduled` resets `oldest_write_in_memstore` to the max value, which was supposed to keep empty tablets out of the flush selection. In reality, however, `TabletFlushStats::OnFlushScheduled` is called from `DBImpl::SchedulePendingFlush` even when there is no pending flush to schedule. `DBImpl::SchedulePendingFlush`, in turn, can be called after a compaction, on RocksDB open, or after deleting obsolete SST files. Any of these calls resets `oldest_write_in_memstore` to the max value for the tablet, and if there are no further writes to the tablet, the tserver memory monitor will never pick it for a flush.

The fix is to use the memtable frontiers to get the oldest hybrid time written to the memtable and to select the tablet to flush based on that.
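
As a rough illustration of that idea (not the code from the actual change), the sketch below derives the oldest time from what is currently held in the memtables instead of from a separately maintained counter. All names here (`MemTableSketch`, `TabletDbSketch`, `PickTabletByFrontier`, `kMaxHybridTime`) are hypothetical, hybrid times are modeled as plain integers, and the real frontiers are assumed to carry more than this single value.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using HybridTimeSketch = uint64_t;  // stand-in for a hybrid time value
constexpr HybridTimeSketch kMaxHybridTime =
    std::numeric_limits<HybridTimeSketch>::max();

// Hypothetical memtable that tracks the smallest hybrid time it contains.
struct MemTableSketch {
  bool empty = true;
  HybridTimeSketch smallest_hybrid_time = kMaxHybridTime;

  void Add(HybridTimeSketch ht) {
    smallest_hybrid_time = empty ? ht : std::min(smallest_hybrid_time, ht);
    empty = false;
  }
};

// Hypothetical per-tablet DB holding memtables that are not yet flushed.
struct TabletDbSketch {
  std::vector<MemTableSketch> memtables;

  // Computed from the data itself, so there is no counter that an
  // unrelated call (compaction, open, file deletion) could reset.
  HybridTimeSketch OldestHybridTimeInMemstore() const {
    HybridTimeSketch oldest = kMaxHybridTime;
    for (const auto& m : memtables) {
      if (!m.empty) oldest = std::min(oldest, m.smallest_hybrid_time);
    }
    return oldest;
  }

  // Models a completed flush: the in-memory data is gone.
  void FlushAll() { memtables.clear(); }
};

// Returns the index of the tablet holding the oldest unflushed data,
// or -1 if no tablet has anything in memory.
int PickTabletByFrontier(const std::vector<TabletDbSketch>& tablets) {
  int best = -1;
  HybridTimeSketch oldest = kMaxHybridTime;
  for (std::size_t i = 0; i < tablets.size(); ++i) {
    const HybridTimeSketch t = tablets[i].OldestHybridTimeInMemstore();
    if (t < oldest) {
      oldest = t;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

With this shape, an idle tablet stays eligible for as long as it still holds unflushed data, and a genuinely empty tablet reports the max value and is skipped.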

Test Plan: Jenkins

Reviewers: amitanand, mikhail, sergei

Reviewed By: sergei

Subscribers: kannan, bogdan, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D6846
YBase features automation moved this from To Do to Done Jul 4, 2019