Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move dump stats to a separate thread #4382

Closed

Conversation

miasantreble
Copy link
Contributor

@miasantreble miasantreble commented Sep 17, 2018

Currently statistics are supposed to be dumped to info log at intervals of options.stats_dump_period_sec. However the implementation choice was to bind it with compaction thread, meaning if the database has been serving very light traffic, the stats may not get dumped at all.
We decided to separate stats dumping into a new timed thread using TimerQueue, which is already used in blob_db. This will allow us schedule new timed tasks with more deterministic behavior.

Tested with db_bench using --stats_dump_period_sec=20 in command line:

LOG:2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:05.643286 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:25.691325 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:45.740989 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------

LOG content:

2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
2018/09/17-14:07:45.575080 7fe99fbfe700 [WARN] [db/db_impl.cc:606]
** DB Stats **
Uptime(secs): 20.0 total, 20.0 interval
Cumulative writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5.57 GB, 285.01 MB/s
Cumulative WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 GB, 285.01 MB/s
Cumulative stall: 00:00:0.012 H:M:S, 0.1 percent
Interval writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5700.71 MB, 285.01 MB/s
Interval WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 MB, 285.01 MB/s
Interval stall: 00:00:0.012 H:M:S, 0.1 percent
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop

@miasantreble miasantreble changed the title move dump stats in a separate thread move dump stats to a separate thread Sep 17, 2018
db/db_impl.cc Outdated
}
PrintStatistics();

return std::make_pair(true, -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to re-enqueue into the timer queue after printing? Or does the timer queue take care of automatically calling the function the next time timer expires?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by returning make_pair(true, -1) it will automatically re-enqueue, if I understood this comment correctly: https://github.com/facebook/rocksdb/blob/master/utilities/blob_db/blob_db_impl.cc#L1185
I will double check the documentation to confirm this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is where requeueing happens: https://github.com/facebook/rocksdb/blob/master/util/timer_queue.h#L186
Basically if the boolean is set to true it will be requeued, otherwise no

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @miasantreble for adding this. I've left a few comments.

db/db_impl.cc Outdated
const uint64_t now_micros = env_->NowMicros();
void DBImpl::StartTimedTasks() {
tqueue_.add(static_cast<int64_t>(
mutable_db_options_.stats_dump_period_sec * 1000),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of mutable_db_options_.stats_dump_period_sec is unsigned int. Multiplying an unsigned int with 1000 (32-bit int) may cause overflow before static_cast<int64_t> comes into play. Therefore, I suggest that we use static_cast<int64_t>(stats_dump_period_sec) * 1000 or something similar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if stats_dump_period_sec is 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, thanks

db/db_impl.h Outdated
@@ -1509,6 +1513,9 @@ class DBImpl : public DB {
return Env::WLTH_SHORT;
}

// timer based queue to execute tasks
TimerQueue tqueue_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can we use a more meaningful name?

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yiwu-arbug
Copy link
Contributor

In fact, I don't suggest we use TimerQueue. The code didn't go through good code review and has weird interface. Unless we are going to run one more round of code review on TimerQueeu, I'm wondering if it is an option if we submit the stat dump job to compaction thread pool. Sorry for making a late request like this.

@sagar0
Copy link
Contributor

sagar0 commented Sep 21, 2018

We can definitely move the dumping to a separate low-priority thread (similar to the ones in compaction thread pool); but still, that doesn't have offer the functionality of running that thread every x seconds. I believe it might be good to re-visit Timers implementation and improve them if we know of any major issues in there (improve the in interface, maybe?).

@yiwu-arbug
Copy link
Contributor

@sagar0 I plan to send a PR to reimplement RepeatableThread from fbcode, which uses CondVar::Wait() to run callback periodically. This one will be simpler to understand because it doesn't have queueing.

Regarding TimerQueue, I doubt if we had even go through legal to check if it is okay to use the code (though from the license statement it looks okay). I didn't go through the code thoroughly to inspect if there's any issue. One thing I want to improve on it is to have it use time methods from Env instead of std::chrono so we can mock the time in tests.

@miasantreble
Copy link
Contributor Author

@yiwu-arbug I will hold this PR and wait for your RepeatableThread PR. Thanks

db/db_impl.cc Outdated

const uint64_t now_micros = env_->NowMicros();
void DBImpl::StartTimedTasks() {
if (mutable_db_options_.stats_dump_period_sec > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hold db mutex when access mutable_db_options_.

db/db_impl.cc Outdated

const uint64_t now_micros = env_->NowMicros();
void DBImpl::StartTimedTasks() {
if (mutable_db_options_.stats_dump_period_sec > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stats_dump_period_sec is dynamic changeable. How do you handle that?

Copy link
Contributor Author

@miasantreble miasantreble Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added logic in SetOptions to cancel and relaunch thread

facebook-github-bot pushed a commit that referenced this pull request Sep 27, 2018
Summary:
Introduce `RepeatableThread` utility to run task periodically in a separate thread. It is basically the same as the the same class in fbcode, and in addition provide a helper method to let tests mock time and trigger execution one at a time.

We can use this class to replace `TimerQueue` in #4382 and `BlobDB`.
Pull Request resolved: #4423

Differential Revision: D10020932

Pulled By: yiwu-arbug

fbshipit-source-id: 3616bef108c39a33c92eedb1256de424b7c04087
@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

db/db_impl.cc Outdated
}
thread_dump_stats_ = new rocksdb::RepeatableThread(
[this]() { DBImpl::MaybeDumpStats(); }, "dump_st", env_,
new_options.stats_dump_period_sec * 1000000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check new_options.stats_dump_period_sec > 0.

db/db_impl.cc Outdated
// atomically. We could see more than one dump during one dump
// period in rare cases.
last_stats_dump_time_microsec_ = now_micros;
void DBImpl::MaybeDumpStats() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/MaybeDumpStats/DumpStats ?

assert(db_property_info != nullptr);

std::string stats;
if (shutdown_initiated_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shutdown_initiated_ is not an atomic, and I believe accessing it also require holding db mutex. Maybe you want to turn it into an atomic?

db/db_impl.h Outdated
@@ -1468,6 +1472,8 @@ class DBImpl : public DB {
// Only to be set during initialization
std::unique_ptr<PreReleaseCallback> recoverable_state_pre_release_callback_;

rocksdb::RepeatableThread* thread_dump_stats_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it a unique_ptr, and always hold db mutex when setting/unsetting it.

@yiwu-arbug
Copy link
Contributor

Do you want to include a test with the change? :)

@miasantreble
Copy link
Contributor Author

miasantreble commented Oct 7, 2018

@yiwu-arbug I'm wondering if DB::Open() covers all cases where stats dumping is needed, in other words, is there other code path that doesn't go through DB::Open() but RocksDB may still be collecting stats?
About tests, I didn't find any existing test coverage for stats dumping so it's reasonable to ask for new test coverage. However I'm not sure if we have test infrastructure to verify what's being written to info logs. I did run db_bench to manually verify though

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yiwu-arbug
Copy link
Contributor

@miasantreble I believe it is safe to assume DB has to go through Open() before collecting stats.

For test, how about just a sync point inside DumpStats and check it is triggered? If you have extra bandwidth, mocking Logger should not be difficult either.

Copy link
Contributor

@yiwu-arbug yiwu-arbug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Thanks!

db/db_impl.cc Outdated
@@ -762,6 +771,18 @@ Status DBImpl::SetDBOptions(
new_options.max_background_compactions, Env::Priority::LOW);
MaybeScheduleFlushOrCompaction();
}
if (new_options.stats_dump_period_sec > 0 &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cancel thread_dump_stats_ if new_options.stats_dump_period == 0?

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

db/db_impl.cc Outdated
@@ -771,7 +773,15 @@ Status DBImpl::SetDBOptions(
new_options.max_background_compactions, Env::Priority::LOW);
MaybeScheduleFlushOrCompaction();
}
if (new_options.stats_dump_period_sec > 0 &&
if (new_options.stats_dump_period_sec == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The logic can be simplified as

if (new_options.xxx != existing_options.xxx) {
  if (thread_dump_stats_ != nullptr) {
    cancel_thread();
  }
  if (new_options.xxx > 0) {
    start_thread();
  }
}

rocksdb::SyncPoint::GetInstance()->EnableProcessing();
Reopen(options);
ASSERT_EQ(5, dbfull()->GetDBOptions().stats_dump_period_sec);
env_->SleepForMicroseconds(6000000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be flaky to rely on sleep. Do you consider either use a cond var like the RepeatableThreadTest::TimedTest, or use MockTimeEnv like RepeatableThreadTest::MockEnvTest to make the test more deterministic? :)
https://github.com/facebook/rocksdb/blob/master/util/repeatable_thread_test.cc#L22

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@miasantreble has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@miasantreble miasantreble deleted the dump-stats-separate-thread branch October 9, 2018 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants