opt: make stats consistent between lookup join and hash join #56441

Open
rytaft opened this issue Nov 9, 2020 · 0 comments
Labels
A-sql-optimizer: SQL logical planning and optimizations.
A-sql-table-stats: Table statistics (and their automatic refresh).
C-cleanup: Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
T-sql-queries: SQL Queries Team

Comments

Collaborator

rytaft commented Nov 9, 2020

The statisticsBuilder often does a poor job of estimating the statistics of lookup joins, so a lookup join and the equivalent hash join can end up with different estimates. So far we have avoided the problems this could cause by adding the lookup join to the same memo group as the original join, so that it gets the same statistics. However, if the lookup join must be wrapped in a Project operator, it ends up with different statistics. This has turned out to be a problem in #56393, where anti and semi paired-joins must be wrapped in a Project operator.
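To make the mechanism concrete, here is a minimal toy sketch in Python. It is not the optimizer's actual code: `Memo`, `Group`, both estimator functions, and all of the numbers are hypothetical and only model the idea that a memo group's statistics are computed once and shared by every expression in the group.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Hypothetical estimators standing in for the two code paths in the
# statistics builder. The formulas and numbers are made up for illustration.
def estimate_hash_join(left_rows: float, right_rows: float, distinct_keys: float) -> float:
    return left_rows * right_rows / distinct_keys


def estimate_lookup_join(left_rows: float, rows_per_lookup: float) -> float:
    return left_rows * rows_per_lookup


@dataclass
class Group:
    """Toy memo group: equivalent expressions sharing one row-count estimate."""
    row_count: float
    exprs: List[str] = field(default_factory=list)


class Memo:
    def __init__(self) -> None:
        self.groups: List[Group] = []

    def new_group(self, expr: str, row_count: float) -> Group:
        # Statistics are derived once, from the expression that creates the group.
        group = Group(row_count=row_count, exprs=[expr])
        self.groups.append(group)
        return group

    def add_to_group(self, group: Group, expr: str) -> None:
        # A new member inherits the group's existing statistics unchanged.
        group.exprs.append(expr)


def demo() -> Tuple[float, float]:
    memo = Memo()

    # Case 1: the lookup join joins the original join's group and shares its estimate.
    join_group = memo.new_group("hash-join", estimate_hash_join(1000, 100, 100))
    memo.add_to_group(join_group, "lookup-join")
    shared = join_group.row_count

    # Case 2: a Project wrapper takes the slot in the original group, so the
    # lookup join underneath needs its own group and its own estimate.
    inner = memo.new_group("lookup-join", estimate_lookup_join(1000, 3))
    memo.add_to_group(join_group, "project(lookup-join)")
    wrapped = inner.row_count

    return shared, wrapped
```

In this sketch `demo()` returns one estimate for the lookup join that shares the original join's group and a different one for the lookup join that sits under a Project, which is the inconsistency this issue asks to remove.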

Epic CRDB-16930

Jira issue: CRDB-2941

Jira issue: CRDB-13903

rytaft added the C-cleanup, C-performance, and A-sql-optimizer labels Nov 9, 2020
rytaft self-assigned this Nov 9, 2020
rytaft added this to Triage in BACKLOG, NO NEW ISSUES: SQL Optimizer via automation Nov 9, 2020
RaduBerinde moved this from Triage to Infrastructure & performance in BACKLOG, NO NEW ISSUES: SQL Optimizer Jan 11, 2021
jlinder added the T-sql-queries label Jun 16, 2021
michae2 added the A-sql-table-stats label Nov 22, 2022
Projects
BACKLOG, NO NEW ISSUES: SQL Optimizer
Infrastructure & performance
Status: Backlog
Development

No branches or pull requests

3 participants