Break stalls when no bg work is happening #1884

Closed
wants to merge 5 commits into from

IslamAbdelRahman
Contributor

The current stall keeps sleeping even when there are no flushes or compactions to wait for. I changed the logic to break out of the stall when we are not flushing or compacting.
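
As a rough illustration of the idea (not the code in this PR), the stall can be modelled as a timed wait that also ends as soon as no background work is scheduled. The sketch below uses standard C++ primitives; `StallState`, `DelayWrite`, and `FlushFinished` are hypothetical names, and the two counters only mimic the `bg_flush_scheduled_` and `bg_compaction_scheduled_` fields quoted later in the review.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the DB state that the DB mutex protects.
struct StallState {
  std::mutex mu;
  std::condition_variable bg_cv;  // notified whenever background work finishes
  int bg_flush_scheduled = 0;
  int bg_compaction_scheduled = 0;
};

// Stalls the caller for at most `delay`.
// Returns true if the stall ended early because no flush or compaction was
// scheduled, false if the full delay elapsed while work was still pending.
bool DelayWrite(StallState& s, std::chrono::microseconds delay) {
  assert(delay.count() >= 0);
  const auto stall_end = std::chrono::steady_clock::now() + delay;
  std::unique_lock<std::mutex> lock(s.mu);
  // wait_until re-checks the predicate after every wakeup and returns its
  // final value, so an idle DB returns immediately without sleeping.
  return s.bg_cv.wait_until(lock, stall_end, [&s] {
    return s.bg_flush_scheduled == 0 && s.bg_compaction_scheduled == 0;
  });
}

// Called by a (hypothetical) background thread when a flush completes.
void FlushFinished(StallState& s) {
  {
    std::lock_guard<std::mutex> lock(s.mu);
    --s.bg_flush_scheduled;
  }
  s.bg_cv.notify_all();
}
```

With no background work pending, `DelayWrite(state, std::chrono::seconds(5))` returns `true` right away instead of sleeping for five seconds, which is the behavior change described above.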

db_bench command used

# fillrandom
# memtable size = 10MB
# value size = 1 MB
# num = 1000
# use /dev/shm
./db_bench --benchmarks="fillrandom,stats" --value_size=1048576 --write_buffer_size=10485760 --num=1000 --delayed_write_rate=XXXXX  --db="/dev/shm/new_stall" | grep "Cumulative stall"
Current results

# delayed_write_rate = 1000 Kb/sec
Cumulative stall: 00:00:9.031 H:M:S

# delayed_write_rate = 200 Kb/sec
Cumulative stall: 00:00:22.314 H:M:S

# delayed_write_rate = 100 Kb/sec
Cumulative stall: 00:00:42.784 H:M:S

# delayed_write_rate = 50 Kb/sec
Cumulative stall: 00:01:23.785 H:M:S

# delayed_write_rate = 25 Kb/sec
Cumulative stall: 00:02:45.702 H:M:S
New results

# delayed_write_rate = 1000 Kb/sec
Cumulative stall: 00:00:9.017 H:M:S

# delayed_write_rate = 200 Kb/sec
Cumulative stall: 00:00:11.530 H:M:S

# delayed_write_rate = 100 Kb/sec
Cumulative stall: 00:00:11.977 H:M:S

# delayed_write_rate = 50 Kb/sec
Cumulative stall: 00:00:11.380 H:M:S

# delayed_write_rate = 25 Kb/sec
Cumulative stall: 00:00:12.182 H:M:S

@facebook-github-bot
Contributor

@IslamAbdelRahman has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@IslamAbdelRahman updated the pull request - view changes - changes since last import

Contributor

@yiwu-arbug yiwu-arbug left a comment

LGTM!

@facebook-github-bot
Contributor

@IslamAbdelRahman has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

db/db_impl.cc Outdated
}

delayed = true;
bg_cv_.TimedWait(static_cast<int>(stall_end - now));
Contributor

Can we still sleep for a small amount of time instead? One reason I'm wary is that this adds another use of TimedWait(). I recently realized there may be a corner case where a clock change makes the wait inaccurate. I think a shorter sleep would also be easier to read.

Contributor Author

@siying, I think you mean there is a problem with NowMicros(), not TimedWait(), correct?
NowMicros() is the one that will be affected by a system time change, correct?

Contributor

@IslamAbdelRahman no, I mean TimedWait(). I suspect that pthread's timed-wait call uses a non-monotonic clock internally to determine how long to wait.
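
For context on this concern: POSIX specifies that `pthread_cond_timedwait` measures its absolute deadline against the clock stored in the condition variable's attributes, and that clock defaults to the system (realtime) clock. Where `pthread_condattr_setclock` is available (Linux, for example), a condition variable can be bound to `CLOCK_MONOTONIC` instead. The sketch below shows that technique only; `InitMonotonicCond` and `TimedWaitMonotonic` are hypothetical helpers, and nothing in this thread confirms how RocksDB's own `TimedWait()` is implemented.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <time.h>

// Initialises `cond` so that pthread_cond_timedwait compares its deadline
// against CLOCK_MONOTONIC rather than the default realtime clock.
// Returns 0 on success or a pthread error code.
int InitMonotonicCond(pthread_cond_t* cond) {
  pthread_condattr_t attr;
  int rc = pthread_condattr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
  if (rc == 0) rc = pthread_cond_init(cond, &attr);
  pthread_condattr_destroy(&attr);
  return rc;
}

// Waits on `cond` for at most `timeout_us` microseconds. The caller must hold
// `mu`. Returns 0 if woken (possibly spuriously) or ETIMEDOUT on timeout.
int TimedWaitMonotonic(pthread_cond_t* cond, pthread_mutex_t* mu,
                       long timeout_us) {
  assert(timeout_us >= 0);
  timespec deadline;
  clock_gettime(CLOCK_MONOTONIC, &deadline);  // same clock as the cond var
  deadline.tv_sec += timeout_us / 1000000;
  deadline.tv_nsec += (timeout_us % 1000000) * 1000;
  if (deadline.tv_nsec >= 1000000000L) {  // normalise the nanosecond field
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }
  return pthread_cond_timedwait(cond, mu, &deadline);
}
```

Because the deadline and the wait use the same monotonic clock, adjusting the system time during the wait should not lengthen or shorten it.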

db/db_impl.cc Outdated
// We will delay the write until the wall clock reaches stall_end or
// we don't have any flushes or compactions running in the bg
uint64_t stall_end = sw.start_time() + delay;
while (bg_flush_scheduled_ > 0 || bg_compaction_scheduled_ > 0) {
Contributor

What's the reason for not calling write_controller_.NeedsDelay()?

@IslamAbdelRahman
Contributor Author

Hey @siying, I think you missed my last comment here

@facebook-github-bot
Contributor

@IslamAbdelRahman updated the pull request - view changes - changes since last import

@IslamAbdelRahman
Contributor Author

Addressed @siying's comments:

  • Use NeedsDelay() and make it atomic to avoid locking/unlocking the mutex during the delay
  • Use monotonic clock NowNanos()
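
A minimal sketch of what these two points could look like together, assuming a simplified controller: `DelayController`, its token methods, the free function `NowNanos()`, and the 1 ms sleep interval are all hypothetical stand-ins, not RocksDB's actual `WriteController` or `Env` code.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical simplified controller: NeedsDelay() reads an atomic counter,
// so the stall loop can poll it without taking the DB mutex.
class DelayController {
 public:
  void AddDelayToken() { total_delayed_.fetch_add(1, std::memory_order_relaxed); }
  void ReleaseDelayToken() {
    int prev = total_delayed_.fetch_sub(1, std::memory_order_relaxed);
    assert(prev > 0);
    (void)prev;
  }
  bool NeedsDelay() const {
    return total_delayed_.load(std::memory_order_relaxed) > 0;
  }

 private:
  std::atomic<int> total_delayed_{0};
};

// Monotonic time in nanoseconds (stand-in for a monotonic Env clock).
uint64_t NowNanos() {
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(
          std::chrono::steady_clock::now().time_since_epoch())
          .count());
}

// Sleeps in short intervals until `delay_micros` has elapsed or the controller
// no longer asks for a delay. Returns the time actually stalled, in nanoseconds.
uint64_t StallWrite(const DelayController& controller, uint64_t delay_micros) {
  const uint64_t kSleepMicros = 1000;  // hypothetical polling interval
  const uint64_t stall_start = NowNanos();
  const uint64_t stall_end = stall_start + delay_micros * 1000;
  while (controller.NeedsDelay()) {
    const uint64_t now = NowNanos();
    if (now >= stall_end) break;
    // Never sleep past the deadline; round the remainder up to 1 us.
    const uint64_t remaining_micros = (stall_end - now) / 1000 + 1;
    std::this_thread::sleep_for(std::chrono::microseconds(
        static_cast<int64_t>(std::min(kSleepMicros, remaining_micros))));
  }
  return NowNanos() - stall_start;
}
```

Polling with short sleeps trades a little wakeup latency for avoiding a timed condition-variable wait, and the atomic counter keeps the check cheap enough to repeat on every iteration.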

db/db_impl.cc Outdated
const uint64_t kDelayInterval = 10000;
uint64_t stall_end = stall_start + (delay * std::milli::den);
while (write_controller_.NeedsDelay()) {
if (env_->NowNanos() >= stall_end) {
Contributor

I didn't mean we need to call a monotonic clock here. I just want to avoid TimedWait() if possible. But what you are doing here is still correct.

@facebook-github-bot
Contributor

@IslamAbdelRahman has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@IslamAbdelRahman
Contributor Author

Trying to figure out why the test is failing now.

@facebook-github-bot
Contributor

@IslamAbdelRahman updated the pull request - view changes - changes since last import

@facebook-github-bot
Contributor

@IslamAbdelRahman has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

4 participants