Sample number of reads per SST file #2417
@siying updated the pull request.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
A few nits. Looks good to me otherwise.
db/version_set.cc
Outdated
       : icmp_(icmp),
         flevel_(flevel),
         index_(static_cast<uint32_t>(flevel->num_files)),
-        current_value_(0, 0, 0) { // Marks as invalid
+        current_value_(0, 0, 0),
+        should_sample_(should_sample) { // Marks as invalid
The comment "Marks as invalid" should sit next to current_value_ (i.e. the FileDescriptor instance), not should_sample_.
db/version_set.cc
Outdated
@@ -122,7 +121,7 @@ class FilePicker {
     }
   }
 
-  int GetCurrentLevel() { return returned_file_level_; }
+  int curr_level() const { return curr_level_; }
The function name should still be GetCurrentLevel, following the convention of the other function names in this class, shouldn't it?
Sure. I'll revert it to avoid confusion. In the next TechDebt Week, maybe we should clean up these functions to follow the convention in the Google C++ Style Guide: https://google.github.io/styleguide/cppguide.html#Function_Names . Right now the code base has both styles.
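For readers unfamiliar with the two styles being discussed: the linked guide says ordinary functions use mixed case starting with a capital letter, while accessors and mutators may be named like variables. A minimal sketch of both side by side follows. The class `LevelCursor` and its members are hypothetical and exist only for illustration; they are not part of this PR or of RocksDB.

```cpp
// Hypothetical class, used only to contrast the two naming styles.
class LevelCursor {
 public:
  // Accessor named like a variable (snake_case), which the guide permits.
  int curr_level() const { return curr_level_; }

  // Ordinary function: mixed case with a leading capital letter.
  void AdvanceToNextLevel() { ++curr_level_; }

 private:
  int curr_level_ = 0;
};
```

Under that reading, both `GetCurrentLevel()` and `curr_level()` are defensible names for a trivial getter, which may be why both appear in the code base.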
#include "util/random.h"

namespace rocksdb {
static const uint32_t kFileReadSampleRate = 1024;
Just wondering out loud whether there are any cases in which we would want to make the sampling rate configurable.
I have no idea. We can make it configurable once there is such a case. So far I would keep it hard-coded to avoid yet another option.
Summary: We estimate the number of reads per SST file by updating a per-file counter on sampled read requests. This information can later be used to trigger compactions to improve read performance.
Test Plan:
Load the DB with
./db_bench --benchmarks=fillrandom,sstables --write_buffer_size=2000000 --num=3000000
And then observe stats in outputs of:
./db_bench --benchmarks=readrandom,sstables --write_buffer_size=2000000 --num=3000000 --threads=8
and
./db_bench --benchmarks=seekrandom,sstables --write_buffer_size=2000000 --num=3000000 --threads=8
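
The sampling idea in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: only `kFileReadSampleRate = 1024` comes from the diff, while `FileReadStats`, `ShouldSampleFileRead`, `RecordSampledRead`, and `SimulateReads` are hypothetical names, and `std::mt19937` stands in for whatever random source the real change uses. Each sampled read credits the file with a full `kFileReadSampleRate` reads, so the counter can be read directly as an estimate of total reads.

```cpp
#include <atomic>
#include <cstdint>
#include <random>

// Value taken from the diff under review.
static const uint32_t kFileReadSampleRate = 1024;

// Hypothetical stand-in for per-file metadata; not the RocksDB type.
struct FileReadStats {
  std::atomic<uint64_t> num_reads_sampled{0};
};

// Returns true for roughly one call in kFileReadSampleRate.
// A thread-local generator avoids contention between reader threads.
inline bool ShouldSampleFileRead() {
  thread_local std::mt19937 rng{std::random_device{}()};
  return rng() % kFileReadSampleRate == 0;
}

// Credits the file with kFileReadSampleRate reads per sampled request,
// so the counter approximates the true number of reads.
inline void RecordSampledRead(FileReadStats* stats) {
  stats->num_reads_sampled.fetch_add(kFileReadSampleRate,
                                     std::memory_order_relaxed);
}

// Simulates num_reads reads against one file and returns the estimate.
inline uint64_t SimulateReads(FileReadStats* stats, uint64_t num_reads) {
  for (uint64_t i = 0; i < num_reads; ++i) {
    if (ShouldSampleFileRead()) {
      RecordSampledRead(stats);
    }
  }
  return stats->num_reads_sampled.load(std::memory_order_relaxed);
}
```

With a scheme like this, the hot read path pays for an atomic update only about once per 1024 reads, at the cost of a noisy estimate for files that are read rarely.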