-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add PerfContextByLevel to provide per level perf context information #4226
Conversation
db/db_impl_open.cc
Outdated
} | ||
} | ||
// TODO(Zhongyi): need to initialize this in all threads that might need | ||
// access to PerfContextByLevel | ||
perf_context.EnablePerLevelPerfContext(max_num_levels); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong place to do the thing. Open() may not be done in the same thread as query.
And we cannot initialize for all threads. It's bad. People need to enable it their for their thread.
Also be aware that perf_context is not per DB or per CF. It's shared across all DBs and all CFs. Getting max level for one CF and allocate based on it is not a good idea. Maybe we should consider to allocate a fixed number, say 16. All levels 15 or up fall into level 15. Or maybe we can even think of making it std::map from level to struct.
monitoring/perf_context.cc
Outdated
return ss.str(); | ||
#endif | ||
} | ||
|
||
void PerfContext::EnablePerLevelPerfContext(int levels){ | ||
num_levels = levels; | ||
perf_context_by_level = new PerfContextByLevel[levels]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should support that people call EnablePerLevelPerfContext() multiple times. So if perf_context_by_level is not null, we shouldn't allocate again.
monitoring/perf_context.cc
Outdated
|
||
void PerfContext::DisablePerLevelPerfContext(){ | ||
per_level_perf_context_enabled = false; | ||
delete perf_context_by_level; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This maybe expensive. Consider to have disable and clear up as two separate functions.
db/compaction_job.cc
Outdated
@@ -811,6 +811,7 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) { | |||
if (measure_io_stats_) { | |||
prev_perf_level = GetPerfLevel(); | |||
SetPerfLevel(PerfLevel::kEnableTime); | |||
perf_context.EnablePerLevelPerfContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need it. Is there per-level perf context counters we report for compactions.
db/flush_job.cc
Outdated
@@ -207,6 +207,7 @@ Status FlushJob::Run(LogsWithPrepTracker* prep_tracker, | |||
if (measure_io_stats_) { | |||
prev_perf_level = GetPerfLevel(); | |||
SetPerfLevel(PerfLevel::kEnableTime); | |||
perf_context.EnablePerLevelPerfContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Why do we need this?
include/rocksdb/perf_context.h
Outdated
@@ -5,6 +5,7 @@ | |||
|
|||
#ifndef STORAGE_ROCKSDB_INCLUDE_PERF_CONTEXT_H | |||
#define STORAGE_ROCKSDB_INCLUDE_PERF_CONTEXT_H | |||
#define MAX_PERF_CONTEXT_LEVELS 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is public API. We should avoid define like this.
include/rocksdb/perf_context.h
Outdated
@@ -169,6 +189,8 @@ struct PerfContext { | |||
uint64_t env_lock_file_nanos; | |||
uint64_t env_unlock_file_nanos; | |||
uint64_t env_new_logger_nanos; | |||
PerfContextByLevel* perf_context_by_level; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We mentioned that we can consider to use std::map to support all levels. What's your thoughts on this approach?
monitoring/perf_context.cc
Outdated
@@ -186,8 +204,30 @@ std::string PerfContext::ToString(bool exclude_zero_counters) const { | |||
PERF_CONTEXT_OUTPUT(env_lock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_unlock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_new_logger_nanos); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT(bloom_filter_useful); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT(bloom_filter_full_positive); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT(bloom_filter_full_true_positive); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The printing format feels better if all the counters for one level is printed together.
6325c0e
to
09442d0
Compare
monitoring/perf_context.cc
Outdated
@@ -104,6 +104,14 @@ void PerfContext::Reset() { | |||
env_lock_file_nanos = 0; | |||
env_unlock_file_nanos = 0; | |||
env_new_logger_nanos = 0; | |||
if (per_level_perf_context_enabled && level_to_perf_context) { | |||
for (auto iter = level_to_perf_context->begin(); iter != level_to_perf_context->end(); ++iter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is over 80-char.
monitoring/perf_context.cc
Outdated
@@ -104,6 +104,14 @@ void PerfContext::Reset() { | |||
env_lock_file_nanos = 0; | |||
env_unlock_file_nanos = 0; | |||
env_new_logger_nanos = 0; | |||
if (per_level_perf_context_enabled && level_to_perf_context) { | |||
for (auto iter = level_to_perf_context->begin(); iter != level_to_perf_context->end(); ++iter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do for (auto iter : level_to_perf_context)
?
monitoring/perf_context.cc
Outdated
level_to_perf_context) { \ | ||
ss << #counter << " = "; \ | ||
for (auto iter = level_to_perf_context->begin(); \ | ||
iter != level_to_perf_context->end(); ++iter) { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question, does for (auto iter : level_to_perf_context)
work?
monitoring/perf_context.cc
Outdated
for (auto iter = level_to_perf_context->begin(); \ | ||
iter != level_to_perf_context->end(); ++iter) { \ | ||
if (iter->second) { \ | ||
ss << iter->second->counter << ", "; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be gap for level numbers. I think we should create level number too.
monitoring/perf_context.cc
Outdated
<< "bloom_filter_full_positive = " \ | ||
<< (*level_to_perf_context)[level]->bloom_filter_full_positive << "," \ | ||
<< "bloom_filter_full_true_positive = " \ | ||
<< (*level_to_perf_context)[level]->bloom_filter_full_true_positive << ",";\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add more counters per level, this macro will grow. It doesn't seem to be a scalable solution to me.
I don't see the necessarily of this macro. Can you use a helper function instead?
include/rocksdb/perf_context.h
Outdated
@@ -169,6 +189,8 @@ struct PerfContext { | |||
uint64_t env_lock_file_nanos; | |||
uint64_t env_unlock_file_nanos; | |||
uint64_t env_new_logger_nanos; | |||
std::map<uint32_t, PerfContextByLevel*>* level_to_perf_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You chose PerfContextByLevel*
instead of PerfContextByLevel
. What's the reason behind it? Is it to save memory waste? One clear downside is that, there will be memory leak after a thread is terminated.
monitoring/perf_context_imp.h
Outdated
if (perf_level >= PerfLevel::kEnableCount && \ | ||
perf_context.per_level_perf_context_enabled && \ | ||
perf_context.level_to_perf_context) { \ | ||
if ((*(perf_context.level_to_perf_context))[level]) { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read std::map doc, but determining non existing element to be null and set it if it is still confusing to me: https://en.cppreference.com/w/cpp/container/map/operator_at
I suggest we write it in a more explicit way: find() and if we can't find it, set a new one.
table/block_based_table_reader.cc
Outdated
@@ -2307,6 +2307,7 @@ bool BlockBasedTable::FullFilterKeyMayMatch( | |||
} | |||
if (may_match) { | |||
RecordTick(rep_->ioptions.statistics, BLOOM_FILTER_FULL_POSITIVE); | |||
PERF_COUNTER_BY_LEVEL_ADD(bloom_filter_full_positive, 1, 0/*level*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why level 0?
table/block_based_table_reader.cc
Outdated
@@ -2331,6 +2332,8 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key, | |||
if (!FullFilterKeyMayMatch(read_options, filter, key, no_io, | |||
prefix_extractor)) { | |||
RecordTick(rep_->ioptions.statistics, BLOOM_FILTER_USEFUL); | |||
// TODO(Zhongyi): use the correct level here | |||
PERF_COUNTER_BY_LEVEL_ADD(bloom_filter_useful, 1, 0/*level*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why level 0?
table/block_based_table_reader.cc
Outdated
@@ -2415,6 +2420,7 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key, | |||
} | |||
if (matched && filter != nullptr && !filter->IsBlockBased()) { | |||
RecordTick(rep_->ioptions.statistics, BLOOM_FILTER_FULL_TRUE_POSITIVE); | |||
PERF_COUNTER_BY_LEVEL_ADD(bloom_filter_full_true_positive, 1, 0/*level*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the same question. Why level 0?
…e more insight into query performance
09442d0
to
2373d26
Compare
thanks @siying for reviewing the PR, all comments have been addressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@@ -168,6 +188,8 @@ struct PerfContext { | |||
uint64_t env_lock_file_nanos; | |||
uint64_t env_unlock_file_nanos; | |||
uint64_t env_new_logger_nanos; | |||
std::map<uint32_t, PerfContextByLevel>* level_to_perf_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think about it more. Not just PerfContextByLevel, but the map itself is leaked after the thread exits. Does C++ allow a constructor of PerfContext? If it allows, we should destruct the map. If not, we should clearly comment in the EnablePerLevelPerfContext() that the user is responsible to call ClearPerLevelPerfContext() before the thread exits. Otherwise, something will be leaked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add a destructor for PerfContext class but got compilation error:
monitoring/perf_context.cc:18:22: error: non-local variable ‘rocksdb::perf_context’ declared ‘__thread’ has a non-trivial destructor
__thread PerfContext perf_context;
monitoring/perf_context.cc:18:22: note: C++11 ‘thread_local’ allows dynamic initialization and destruction
I will add comments to advise users to always pair EnablePerLevelPerfContext()
with ClearPerLevelPerfContext()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it will make the diff's scope large, but we didn't use "thread_local" key-word because we wanted to support GCC 4.7. Now 4.7 doesn't work anyway, and 4.8 is the minimum version we support. How about change to use thread_local keyword instead of __thread now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested with thread_local
and make check passed, I will replace __thread with thread_local in a separate PR
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@@ -171,20 +172,30 @@ TEST_F(DBBloomFilterTest, GetFilterByPrefixBloomCustomPrefixExtractor) { | |||
|
|||
ASSERT_EQ("foo", Get("barbarbar")); | |||
ASSERT_EQ(TestGetTickerCount(options, BLOOM_FILTER_USEFUL), 0); | |||
// TODO (Zhongyi): uncomment the asserts involving level_to_perf_context when per | |||
// level perf context is enabled in block based table reader | |||
// ASSERT_EQ(0, (*(get_perf_context()->level_to_perf_context))[0].bloom_filter_useful); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With all those commented out, the unit tests won't be able to catch any regression bug of per level perf context. I don't think it is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the per level perf context is not used in this PR anyway (because level info is missing in current implementation of block based table reader), so there is really nothing much to test at this point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can manually call the MACRO to set the counters and verify them.
Also, it's not a good idea to have counters that are never set. It confuses people. I suggest the follow-up PR that correctly set those counters be committed as soon as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added test coverage in PerfContextTest.PerfContextByLevelGetSet
to do set and get of per level perf context
@@ -186,8 +205,29 @@ std::string PerfContext::ToString(bool exclude_zero_counters) const { | |||
PERF_CONTEXT_OUTPUT(env_lock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_unlock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_new_logger_nanos); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_useful); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_full_positive); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_full_true_positive); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious how did you choose between printing all counters for a level, and under every counter, show every level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added PerfContextByLevel::ToString
to print all counters at a particular level, user can call by (*(get_perf_context()->level_to_perf_context))[level].ToString(true);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid to add one extra ToString() unless we are convinced that it is needed. My initial question is that, how did you choose between this format:
bloom_filter_useful: 5@0, 4@1, 0@2
bloom_filter_full_positive: 0@0, 0:1, 0:2
and
Level 0:
bloom_filter_useful: 5
bloom_filter_full_positive: 0
Level 1:
bloom_filter_useful: 4
bloom_filter_full_positive: 0
Level 2:
bloom_filter_useful: 0
bloom_filter_full_positive: 0
I'm OK with either way. I'm just curious on how it was decided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose the former as default since I think in most cases we want to know the distribution of a particular counter at different levels, but also added ToString() to support the latter because the latter might also be useful when user want to browse through all counters and get a bigger picture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we remove this ToString() for now. I'm not sure that people will use it.
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
include/rocksdb/perf_context.h
Outdated
@@ -7,6 +7,7 @@ | |||
|
|||
#include <stdint.h> | |||
#include <string> | |||
#include <map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m is before s.
@@ -186,8 +205,29 @@ std::string PerfContext::ToString(bool exclude_zero_counters) const { | |||
PERF_CONTEXT_OUTPUT(env_lock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_unlock_file_nanos); | |||
PERF_CONTEXT_OUTPUT(env_new_logger_nanos); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_useful); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_full_positive); | |||
PERF_CONTEXT_BY_LEVEL_OUTPUT_ONE_COUNTER(bloom_filter_full_true_positive); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid to add one extra ToString() unless we are convinced that it is needed. My initial question is that, how did you choose between this format:
bloom_filter_useful: 5@0, 4@1, 0@2
bloom_filter_full_positive: 0@0, 0:1, 0:2
and
Level 0:
bloom_filter_useful: 5
bloom_filter_full_positive: 0
Level 1:
bloom_filter_useful: 4
bloom_filter_full_positive: 0
Level 2:
bloom_filter_useful: 0
bloom_filter_full_positive: 0
I'm OK with either way. I'm just curious on how it was decided.
@miasantreble has updated the pull request. Re-import the pull request |
0de04ba
to
45a1562
Compare
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
monitoring/perf_context.cc
Outdated
@@ -13,9 +13,9 @@ namespace rocksdb { | |||
PerfContext perf_context; | |||
#else | |||
#if defined(OS_SOLARIS) | |||
__thread PerfContext perf_context_; | |||
thread_local PerfContext perf_context_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it supported in OS_SOLARIS?
include/rocksdb/perf_context.h
Outdated
struct PerfContext { | ||
|
||
void Reset(); // reset all performance counters to zero | ||
|
||
std::string ToString(bool exclude_zero_counters = false) const; | ||
|
||
// enable per level perf context and allocate storage for PerfContextByLevel | ||
// Always pair calls to `EnablePerLevelPerfContext()` with | ||
// `ClearPerLevelPerfContext`, otherwise there will be memory leak |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the destructor that can prevent this leak?
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mention this feature and the thread_local change in HISTORY.md.
monitoring/perf_context.cc
Outdated
|
||
void PerfContext::ClearPerLevelPerfContext(){ | ||
if (level_to_perf_context) { | ||
level_to_perf_context->clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to clear before delete it?
@miasantreble has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miasantreble has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: PR #4226 introduced per-level perf context which allows breaking down perf context by levels. This PR takes advantage of the feature to populate a few counters related to bloom filters Pull Request resolved: #4581 Differential Revision: D10518010 Pulled By: miasantreble fbshipit-source-id: 011244561783ec860d32d5b0fa6bce6e78d70ef8
Current implementation of perf context is level agnostic. Making it hard to do performance evaluation for the LSM tree. This PR adds
PerfContextByLevel
to decompose the counters by level.This will be helpful when analyzing point and range query performance as well as tuning bloom filter
Also replaced __thread with thread_local keyword for perf_context