Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite countDistinctIf with count_distinct_implementation configuration. #42008

Closed
wants to merge 5 commits into from
Closed

Conversation

Eridanus117
Copy link
Contributor

@Eridanus117 Eridanus117 commented Oct 2, 2022

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Rewrite countDistinctIf with count_distinct_implementation configuration.

@CLAassistant
Copy link

CLAassistant commented Oct 2, 2022

CLA assistant check
All committers have signed the CLA.

@robot-ch-test-poll robot-ch-test-poll added the pr-performance Pull request with some performance improvements label Oct 2, 2022
@Eridanus117
Copy link
Contributor Author

Eridanus117 commented Oct 2, 2022

This should fix #30642.

@alexey-milovidov
Hi, should I make it a config?

@sundy-li
Hi, may I ask why there is a performance difference between countDistinctIf and uniqExactIf?

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Oct 2, 2022
@alexey-milovidov
Copy link
Member

@Eridanus117 yes, let's introduce a setting to enable or disable rewriting.
See also compatibility (the setting should be disabled if compatibility is less than 22.10).

@Eridanus117
Copy link
Contributor Author

@Eridanus117 yes, let's introduce a setting to enable or disable rewriting. See also compatibility (the setting should be disabled if compatibility is less than 22.10).

@alexey-milovidov
Done. Please let me know if you have any suggestions or comments, especially regarding naming and where to place the config since I'm not familiar with it.

@kitaisreal kitaisreal self-assigned this Oct 3, 2022
src/Core/Settings.h Outdated Show resolved Hide resolved
tests/queries/0_stateless/01259_combinator_distinct.sql Outdated Show resolved Hide resolved
@ucasfl
Copy link
Collaborator

ucasfl commented Jan 27, 2023

@Eridanus117 Need fix conflict.

@Eridanus117 Eridanus117 reopened this Jan 28, 2023
@Eridanus117 Eridanus117 reopened this Jan 29, 2023
@ucasfl
Copy link
Collaborator

ucasfl commented Jan 29, 2023

LGTM. @kitaisreal Hi, what anything else need to do about this PR?

@alexey-milovidov
Copy link
Member

alexey-milovidov commented Jan 29, 2023

@ucasfl the only remaining thing is to check if it works with Analyzer. If it works, let's merge.
Also, one minor concern that I had when reviewing - the setting is named "optimize" while it is not exactly an optimization... I was thinking about that, and looks like it's ok.

@alexey-milovidov alexey-milovidov self-assigned this Jan 29, 2023
@ucasfl
Copy link
Collaborator

ucasfl commented Feb 5, 2023

Continued in #46051.

@ucasfl ucasfl closed this Feb 5, 2023
alexey-milovidov added a commit that referenced this pull request Aug 6, 2023
Continue of #42008, rewrite countDistinctIf with count_distinct_implementation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants