Disable pure parallel replicas if trivial count optimization is possible #50594
&& !settings.allow_experimental_query_deduplication
&& !settings.empty_result_for_aggregation_by_empty_set
&& storage
&& storage->getName() != "MaterializedMySQL"
&& !storage->hasLightweightDeletedMask()
&& query_info.filter_asts.empty()
&& processing_stage == QueryProcessingStage::FetchColumns
Will queries that read from Distributed tables break or possibly return wrong results?
AFAICS no, for 2 reasons:
1. The check is kept in its original place (`InterpreterSelectQuery::executeFetchColumns`) before calling `getTrivialCount` during execution. If `processing_stage != QueryProcessingStage::FetchColumns`, then it won't use trivial count, as before.
2. StorageDistributed can't use trivial count, which means that `getTrivialCount` during analysis will return an empty value, so it won't change any setting.

I might be missing other scenarios though.
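The two reasons above can be sketched in isolation. This is a minimal model, not the actual ClickHouse code: `Stage`, `Storage`, `Settings`, `supports_trivial_count`, `allow_parallel_replicas`, and `adjustSettings` are hypothetical stand-ins, and this `getTrivialCount` only mimics the "returns an empty value" behaviour described in the comment.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-ins; none of these are ClickHouse's real types.
enum class Stage { FetchColumns, WithMergeableState, Complete };

struct Storage
{
    std::string name;
    bool supports_trivial_count;   // assumed false for a Distributed-like storage
    uint64_t rows_from_metadata;   // row count known without reading columns
};

struct Settings
{
    bool allow_parallel_replicas = true;
};

// Returns a row count only when every precondition holds, otherwise an empty value.
std::optional<uint64_t> getTrivialCount(const Storage & storage, Stage stage)
{
    if (stage != Stage::FetchColumns)
        return std::nullopt;       // reason 1: any other stage never uses trivial count
    if (!storage.supports_trivial_count)
        return std::nullopt;       // reason 2: Distributed-like storage has no trivial count
    return storage.rows_from_metadata;
}

// Analysis-time step: parallel replicas are turned off only when a trivial count exists.
void adjustSettings(Settings & settings, const Storage & storage, Stage stage)
{
    if (getTrivialCount(storage, stage).has_value())
        settings.allow_parallel_replicas = false;
}
```

In this model a Distributed-like storage always yields an empty optional, so `adjustSettings` leaves its settings untouched, which is the argument for why such queries should behave as before.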
Using parallel replicas for count() when trivial count is possible is extremely disadvantageous (in all cases, I'd say): it means initializing parallel replicas, organizing the work, and then reading the smallest column, instead of just checking the part metadata. So in those cases we disable parallel replicas and do all the work on the initiator.
I've only attempted to implement this for pure parallel replicas (in both the old interpreter and the new analyzer), because custom key replicas aren't working right now and dealing with cluster() vs. normal storage was more complex (though the first reason is the main one).
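To illustrate why checking part metadata is so much cheaper than coordinating replicas, here is a rough sketch of a trivial count. It assumes each data part stores its row count as metadata; `PartMetadata` and `trivialCount` are hypothetical names for illustration, not the real implementation.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical model of a data part: the row count is stored as metadata,
// so it is known without opening any column file.
struct PartMetadata
{
    uint64_t rows;
};

// Trivial count: sum the per-part row counts; no column data is read.
uint64_t trivialCount(const std::vector<PartMetadata> & parts)
{
    return std::accumulate(parts.begin(), parts.end(), uint64_t{0},
        [](uint64_t total, const PartMetadata & part) { return total + part.rows; });
}
```

The work here is proportional to the number of parts, not the number of rows, which is why doing it locally on the initiator beats setting up parallel replicas just to read a column.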