Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

always try to push block filter for 1.1 #15409

Merged
merged 6 commits into from Apr 10, 2024

Conversation

badboynt1
Copy link
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #https://github.com/matrixorigin/MO-Cloud/issues/2946

What this PR does / why we need it:

1.1版本的stats计算不够准确,导致经常有该下推block filter的时候没有下推。
对于所有这类问题,在1.1里总是下推。从1.2版本开始计算更加准确,应该可以彻底解决问题

@matrix-meow matrix-meow added the size/XS Denotes a PR that changes [1, 9] lines label Apr 9, 2024
@mergify mergify bot requested a review from sukki37 April 9, 2024 08:03
@mergify mergify bot added the kind/bug Something isn't working label Apr 9, 2024
@matrix-meow
Copy link
Contributor

@badboynt1 Thanks for your contributions!

Pull Request Review:

Title:

The title of the pull request is clear and indicates the purpose of the changes made.

Body:

The body of the pull request provides information about the type of PR, the specific issue it fixes, and the reason for the changes. It also mentions that the stats calculation in version 1.1 is not accurate, leading to issues with pushing block filters. The solution proposed is to always push block filters for such cases in version 1.1.

Changes:

  1. The changes in the stats.go file involve removing certain conditions related to the selectivity of expressions in the estimateFilterBlockSelectivity function.
  2. The removal of the conditions that check for low selectivity and low NDV (Number of Distinct Values) thresholds suggests a change in the logic to always return a selectivity of 0.5.

Feedback and Suggestions for Improvement:

  1. Loss of Selectivity Check:

    • The removal of the condition if expr.Selectivity < 0.01 might lead to inaccurate selectivity estimates in certain cases. It's important to ensure that selectivity is calculated accurately to optimize query plans.
    • Suggestion: Instead of removing this condition entirely, consider adjusting the threshold or finding a more accurate way to handle selectivity calculations.
  2. NDV Threshold Removal:

    • Eliminating the condition if getExprNdv(expr, builder) < blockNDVThreshHold may impact the accuracy of block filter selectivity estimates based on the number of distinct values.
    • Suggestion: Revisit the logic for handling NDV thresholds to maintain accurate estimations while ensuring efficient query processing.
  3. Code Optimization:

    • The changes made seem to simplify the logic by always returning a selectivity of 0.5. While this may address the immediate issue, it's essential to ensure that this approach does not introduce new inaccuracies.
    • Suggestion: Consider refining the logic to provide a more nuanced approach to selectivity estimation, balancing accuracy and performance.
  4. Security Concerns:

    • It's crucial to consider any security implications of altering selectivity calculations, as inaccurate estimations could potentially lead to inefficient query plans or data leakage.
    • Suggestion: Conduct thorough testing to validate the changes and ensure that they do not introduce vulnerabilities or compromise data security.
  5. Documentation Update:

    • It would be beneficial to update the relevant documentation to reflect the changes made in the estimateFilterBlockSelectivity function for better code maintenance and understanding.
    • Suggestion: Include comments or documentation within the code to explain the rationale behind the modifications and their impact on query optimization.

By addressing the mentioned points and considering the suggestions provided, the pull request can be enhanced to improve the accuracy and efficiency of block filter selectivity calculations while maintaining code integrity and security.

@matrix-meow matrix-meow added size/S Denotes a PR that changes [10,99] lines and removed size/XS Denotes a PR that changes [1, 9] lines labels Apr 10, 2024
@sukki37 sukki37 merged commit 9addf43 into matrixorigin:1.1-dev Apr 10, 2024
16 checks passed
@badboynt1 badboynt1 deleted the blockfilter1.1 branch April 10, 2024 05:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working size/S Denotes a PR that changes [10,99] lines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants