-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid unnecessary big for-loop when reporting ticker stats stored in GetContext #3490
Avoid unnecessary big for-loop when reporting ticker stats stored in GetContext #3490
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Can you include the benchmark command and amount of CPU savings in the PR description? |
db/version_set.cc
Outdated
for (uint32_t t = 0; t < Tickers::TICKER_ENUM_MAX; t++) { | ||
if (get_context.tickers_value[t] > 0) { | ||
RecordTick(db_statistics_, t, get_context.tickers_value[t]); | ||
for (auto ticker_pair : get_context.tickers_pairs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap this with a if (db_statistics_ != nullptr)
db/version_set.cc
Outdated
for (uint32_t t = 0; t < Tickers::TICKER_ENUM_MAX; t++) { | ||
if (get_context.tickers_value[t] > 0) { | ||
RecordTick(db_statistics_, t, get_context.tickers_value[t]); | ||
for (auto ticker_pair : get_context.tickers_pairs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. No need to do this for loop if db_statistics_
is null.
table/get_context.h
Outdated
@@ -27,7 +27,7 @@ class GetContext { | |||
kMerge, // saver contains the current merge result (the operands) | |||
kBlobIndex, | |||
}; | |||
uint64_t tickers_value[Tickers::TICKER_ENUM_MAX] = {0}; | |||
autovector<std::pair<Tickers, uint64_t>> tickers_pairs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you figure out how many entries we need and set it up in autovector?
@miasantreble has updated the pull request. View: changes, changes since last import |
table/get_context.cc
Outdated
} | ||
|
||
void GetContext::InitTickers() { | ||
ticker_pairs_.push_back(std::make_pair<Tickers, uint64_t>(BLOCK_CACHE_HIT, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks kind of confusing and hard to maintain. Does it give any benefit?
table/get_context.cc
Outdated
@@ -88,10 +89,33 @@ void GetContext::SaveValue(const Slice& value, SequenceNumber seq) { | |||
} | |||
|
|||
void GetContext::RecordCounters(Tickers ticker, size_t val) { | |||
if (ticker == Tickers::TICKER_ENUM_MAX) { | |||
return; | |||
for (auto it = ticker_pairs_.begin(); it != ticker_pairs_.end(); ++it) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have intuition why this PR is faster overall -- before we had an array lookup here, and now it's a for-loop. Have you measured it? Make sure to use a fairly large DB as then a single Get() will have to execute this for-loop many times.
table/get_context.cc
Outdated
ticker_pairs_.push_back(std::make_pair<Tickers, uint64_t>(BLOCK_CACHE_DATA_ADD, 0)); | ||
ticker_pairs_.push_back(std::make_pair<Tickers, uint64_t>(BLOCK_CACHE_DATA_BYTES_INSERT, 0)); | ||
ticker_pairs_.push_back(std::make_pair<Tickers, uint64_t>(BLOCK_CACHE_FILTER_ADD, 0)); | ||
ticker_pairs_.push_back(std::make_pair<Tickers, uint64_t>(BLOCK_CACHE_FILTER_BYTES_INSERT, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I didn't realize the list is that long. How about we make a GetStats object and make them all variables? We already have things like this: https://github.com/facebook/rocksdb/blob/5.11.fb/db/compaction_iteration_stats.h and https://github.com/facebook/rocksdb/blob/5.11.fb/db/internal_stats.h#L133-L161 which don't feel very hard to maintain.
|
0ddb9bd
to
f1ec4ba
Compare
@miasantreble has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Is the correct conclusion from the numbers that ReportCounters (this branch) is faster than RecordCounters (master) in the case where the database is small; otherwise it's slower? |
f1ec4ba
to
3fd4f3e
Compare
@miasantreble has updated the pull request. Re-import the pull request |
db_bench command used:
So looks like the new approach is faster for 1M keys + 1M block size case. Tried to run 5M keys + 1M block size but the perf.data becomes too large to open (7GB) |
The approach is good but where is autovector? |
@miasantreble has updated the pull request. Re-import the pull request |
Can you improve your title and summary? It's not clear what "counters" you mean. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miasantreble is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…GetContext (facebook#3490) Summary: Currently in `Version::Get` when reporting ticker stats stored in `GetContext`, there is a big for-loop through all `Ticker` which adds unnecessary cost to overall CPU usage. We can optimize by storing only ticker values that are used in `Get()` calls in a new struct `GetContextStats` since only a small fraction of all tickers are used in `Get()` calls. For comparison, with the new approach we only need to visit 17 values while old approach will require visiting 100+ `Ticker` Pull Request resolved: facebook#3490 Differential Revision: D6969154 Pulled By: miasantreble fbshipit-source-id: fc27072965a3a94125a3e6883d20dafcf5b84029
Currently in
Version::Get
when reporting ticker stats stored inGetContext
, there is a big for-loop through allTicker
which adds unnecessary cost to overall CPU usage. We can optimize by storing only ticker values that are used inGet()
calls in a new structGetContextStats
since only a small fraction of all tickers are used inGet()
calls. For comparison, with the new approach we only need to visit 17 values while old approach will require visiting 100+Ticker