Predict size of hash table for GROUP BY #33439
Conversation
It looks very low-level; maybe we can avoid adding this command.
Just check that nothing is broken - using our existing tests.
src/Core/Settings.h
Outdated
@@ -483,6 +483,10 @@ class IColumn;
M(Bool, optimize_rewrite_sum_if_to_count_if, true, "Rewrite sumIf() and sum(if()) function countIf() function when logically equivalent", 0) \
M(UInt64, insert_shard_id, 0, "If non zero, when insert into a distributed table, the data will be inserted into the shard `insert_shard_id` synchronously. Possible values range from 1 to `shards_number` of corresponding distributed table", 0) \
\
M(Bool, collect_hash_table_stats_during_aggregation, true, "Enable collecting hash table statistics to optimize memory allocation", 0) \
M(UInt64, max_entries_for_hash_table_stats, 10'000, "How much entries hash table statistics collected during aggregation is allowed to have", 0) \
much -> many.
done
$CLICKHOUSE_CLIENT -q "SYSTEM FLUSH LOGS"
$CLICKHOUSE_CLIENT \
    --param_query_id="$query_id" \
    -q "WITH ( \
It should work without newline escaping (inside string literal).
done
BTW, what performance improvement do you observe with manual testing?
}

std::mutex mutex;
CachePtr hash_table_stats;
Worth exporting this LRUCache in the same manner as the others, in AsynchronousMetrics.cpp:
ClickHouse/src/Interpreters/AsynchronousMetrics.cpp
Lines 545 to 551 in b1d2bdf
{
    if (auto uncompressed_cache = getContext()->getUncompressedCache())
    {
        new_values["UncompressedCacheBytes"] = uncompressed_cache->weight();
        new_values["UncompressedCacheCells"] = uncompressed_cache->count();
    }
}
Thx, I replaced the part of the description about SYSTEM DROP HASH TABLE STATISTICS with what you suggested.
done
Here are some numbers.
Since the setting is enabled by default, all existing perf tests actually exercise this logic too, so I didn't spend much time exploring different queries. The tests showed some improvements (one, two), but also some regressions, so there is something to look at. I'll do that at the first chance. UPD: check the comment below.
In other DBMSs like PostgreSQL, there are commands that allow altering table data statistics:
But this statistic is not about tables' data; it is something more ephemeral.
Some queries from the "group_by_fixed_keys" test have slowed down. It happened because previously we used a hash table of 1024 elements (finally), and now this query will use an HT of 512 elements (preallocated at the beginning). This is true for every second power of two, but it seems to make things worse only for fairly small sizes (~4096 on my machine); for bigger tables it actually gives us perf benefits (PLEASE check the "Experiment results" section below). So I just left all this logic untouched.

Experiment results

The following query from the "group_by_fixed_keys" test was used:

WITH number % SIZE AS k, toUInt64(k) AS k1, k1 + 1 AS k2
SELECT k1, k2, count()
FROM numbers(100000000)
GROUP BY k1, k2
FORMAT Null
SETTINGS collect_hash_table_stats_during_aggregation = CS

I encourage you not to compare the values from different columns (but only from adjacent rows), since they are from different runs.

Without preallocation we start with a hash table of … For bigger sizes it also looks OK (here the benefit of not doing reallocations also kicks in):

Other slowdowns look irrelevant.
Seems like something happened with perf tests in CI; I see the same picture in other recent PRs.
There are errors in perf tests in master, @Avogar will fix.
@alexey-milovidov Could you please suggest whom I can ask to review?
Maybe @vdimir can take it.
CI spotted a couple of small problems in tests. I will fix them soon (it shouldn't affect the implementation).
src/Interpreters/Aggregator.cpp
Outdated
}

private:
CachePtr getHashTableStatsCache(const Params & params, [[maybe_unused]] std::lock_guard<std::mutex> & cache_lock)
What's cache_lock passed for? Just not to forget to lock the mutex before calling it?
Yep, and it makes all operations with the cache sequential, which is also handy.
src/Interpreters/Aggregator.cpp
Outdated
, max_size_to_preallocate_for_aggregation(max_size_to_preallocate_for_aggregation_)
{
    if (!select_query_)
        throw Exception(ErrorCodes::LOGICAL_ERROR, "query ptr cannot be null");
Also we can check that it is actually DB::ASTSelectQuery, just in case.
throw Exception(ErrorCodes::LOGICAL_ERROR, "query ptr cannot be null");
throw Exception(ErrorCodes::LOGICAL_ERROR, "Query ptr cannot be null");
It is being checked in select_query->as<DB::ASTSelectQuery &>().
src/Interpreters/Aggregator.cpp
Outdated
@@ -237,8 +469,7 @@ class CompiledAggregateFunctionsHolder final : public CompiledExpressionCacheEnt
#endif

Aggregator::Aggregator(const Params & params_)
    : params(params_)
Aggregator::Aggregator(const Params & params_) : params(params_), stats_updater(std::make_unique<StatisticsUpdater>(*this))
Passing this from the ctor looks dangerous because the object is not fully constructed yet, but it seems fine in this case: we will just store a reference to this somewhere.
Also, maybe don't construct stats_updater at all if statistics is disabled, just leave it nullptr?
Also, maybe don't construct stats_updater at all if statistics is disabled, just leave it nullptr?
done
src/Interpreters/Aggregator.cpp
Outdated
const auto cache = getHashTableStatsCache(params, lock);
const auto hint = cache->get(params.key);
// We'll maintain the maximum among all the observed values until the next prediction turns out to be too wrong.
if (!hint || observed_size > *hint || observed_size < *hint / 2)
It's a really minor thing, but I suppose it's easier to read it like this, because we immediately see the range where the condition does not apply: [hint/2, hint].
if (!hint || observed_size > *hint || observed_size < *hint / 2)
if (!hint || observed_size < *hint / 2 || *hint < observed_size)
done
src/Interpreters/Aggregator.cpp
Outdated
    }
}

CachePtr getCache()
It's used only for getStats, so maybe let's expose a getStats method here and return HashTablesCacheStatistics.
done
@@ -912,29 +913,60 @@ class Aggregator final
bool compile_aggregate_expressions;
size_t min_count_to_compile_aggregate_expression;

struct StatsCollectingParams
{
    StatsCollectingParams();
What's the default constructor used for? I suppose we can provide more guarantees that we won't get inconsistent StatsCollectingParams if we either:
- remove the default ctor,
- use key != 0 instead of collect_hash_table_stats_during_aggregation (it can't be collect_hash_table_stats_during_aggregation = true and key = 0 at the same time, can it? In this case collect_hash_table_stats_during_aggregation may be redundant),
- make fields const to be sure that we construct only valid params,
- or even create StatsCollectingParams only when collect_hash_table_stats_during_aggregation = true, otherwise leave it nullptr.
The default ctor is used in those calls where we don't pass any statistics-collection-related params.
use key != 0 instead of collect_hash_table_stats_during_aggregation
done
FunctionalStatelessTestFlakyCheck failed because of a timeout - 300 long tests didn't fit in the timeout, but there were no fails, so I think we can ignore that.
I want to work on it a little.
LGTM, but maybe @kitaisreal wants to take another look.
if (!params.isCollectionAndUseEnabled())
    throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Collection and use of the statistics should be enabled.");
Maybe just return null and don't check isCollectionAndUseEnabled on the caller side?
I prefer straightforward code )
for (size_t i = 0; i < data_variants.size(); ++i)
    sizes[i] = data_variants[i]->size();
const auto median_size = sizes.begin() + sizes.size() / 2; // not precisely though...
std::nth_element(sizes.begin(), median_size, sizes.end());
Why do we use the median (and not, let's say, the max)? Actually I'm not familiar with AggregatedDataVariants and didn't get why we have many data_variants; could you explain it to me?
There is usually a separate variant for each thread of aggregation; this is a relevant place in the code.
The median seems to be the correct choice from statistical considerations, and experiments also agree with that.
Looks good. Need minor clarifications and a check of the performance tests, then it's ready to be merged.
src/Common/ProfileEvents.cpp
Outdated
@@ -284,6 +284,9 @@
\
M(MainConfigLoads, "Number of times the main configuration was reloaded.") \
\
M(HashTablesPreallocatedElements, "How many elements were preallocated in hash tables for aggregation.") \
M(HashTablesInitedAsTwoLevel, "How many hash tables were inited as two-level for aggregation.") \
Need to rename the setting into something related to aggregation; also, Inited looks strange, maybe AggregationHashTablesInitializedAsTwoLevel, although I am not sure this event makes sense.
renamed
src/Common/ProfileEvents.cpp
Outdated
@@ -284,6 +284,9 @@
\
M(MainConfigLoads, "Number of times the main configuration was reloaded.") \
\
M(HashTablesPreallocatedElements, "How many elements were preallocated in hash tables for aggregation.") \
Same, we need to change the name to make it clear this is related to aggregation. And also, could you explain how this setting could help the client?
It is mostly for testing and could be useful in debugging, since all the added traces are of the 'debug' level.
    throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Collection and use of the statistics should be enabled.");

std::lock_guard lock(mutex);
const auto cache = getHashTableStatsCache(params, lock);
The mutex could be used only to protect getting or initializing the cache; after that, LRUCache guarantees thread safety.
That's true, but it is nice to have operations with the cache sequential.
Performance tests look good. We can additionally investigate why preallocation does not work well for StringHashTable in a separate pull request.
My understanding is that it is because StringHT is 4 HTs under the hood for different size classes, and unless the distribution of input strings between the size classes is uniform, preallocation will not make much difference.
Measurements done on my dev-machine
For example, for the … Also an interesting side-effect.
This reverts commit 6cf6f75.
Performance numbers are really great 👍
@@ -0,0 +1,96 @@
#!/usr/bin/env bash
@nickitat need to fix this test.
Failures look irrelevant, but let's rerun just in case.
@Mergifyio update
✅ Branch has been successfully updated
if (worthConvertToTwoLevel(
    params.group_by_two_level_threshold,
    hint->sum_of_sizes,
    /*group_by_two_level_threshold_bytes*/ 0,
Does it mean two-level aggregation will never be used? Why?
It will be activated if sum_of_sizes >= params.group_by_two_level_threshold.
Hi guys @nickitat, have you thought about applying a similar strategy for the hash join? I am wondering what improvement it could bring to the hash join.
@Nathaniel-Han it should help and is worth trying.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Sizes of hash tables used during aggregation are now collected and used in later queries to avoid hash table resizes.
Detailed description / Documentation draft:
...
I see the following options to improve the current implementation:
Tl;dr results
Issue #24317