feat(storage): analyze table noscan #18254

Merged on Jun 30, 2025 (8 commits)

@zhyass zhyass commented Jun 26, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR introduces a lightweight ANALYZE TABLE mode that avoids scanning all rows.
Instead of reading the entire table, it traverses all segments and blocks, incrementally merging NDV (number of distinct values) from the block level to the segment level, and finally to the snapshot level.
This reduces memory and CPU usage significantly by eliminating full table decoding and recomputation.
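
The block-to-segment-to-snapshot merge can be pictured with a small sketch. It is illustrative only and not the code from this PR: `NdvSummary`, `merge_all`, and `snapshot_ndv` are hypothetical names, and an exact `HashSet` stands in for whatever mergeable distinct-value summary is actually stored per block. The point it shows is that summaries are combined level by level without reading rows, and that per-block NDV counts cannot simply be added together, because blocks may share values.

```rust
use std::collections::HashSet;

/// Stand-in for a mergeable distinct-value summary of one column.
/// An exact set is used here only for clarity; a real engine would
/// typically keep a compact sketch instead (assumption, not from the PR).
#[derive(Clone, Debug, Default)]
struct NdvSummary {
    values: HashSet<u64>,
}

impl NdvSummary {
    fn from_values(values: &[u64]) -> Self {
        NdvSummary {
            values: values.iter().copied().collect(),
        }
    }

    /// Fold another summary into this one without touching row data.
    fn merge(&mut self, other: &NdvSummary) {
        self.values.extend(other.values.iter().copied());
    }

    fn ndv(&self) -> usize {
        self.values.len()
    }
}

/// Merge a list of child summaries into one parent summary.
fn merge_all(children: &[NdvSummary]) -> NdvSummary {
    let mut merged = NdvSummary::default();
    for child in children {
        merged.merge(child);
    }
    merged
}

/// Hypothetical layout: a snapshot holds segments, and each segment
/// holds one summary per block.
fn snapshot_ndv(segments: &[Vec<NdvSummary>]) -> usize {
    // Block level -> segment level.
    let segment_level: Vec<NdvSummary> =
        segments.iter().map(|blocks| merge_all(blocks)).collect();
    // Segment level -> snapshot level.
    merge_all(&segment_level).ndv()
}

fn main() {
    let segments = vec![
        vec![
            NdvSummary::from_values(&[1, 2, 3]),
            NdvSummary::from_values(&[3, 4]),
        ],
        vec![NdvSummary::from_values(&[4, 5])],
    ];
    // Block NDVs are 3, 2 and 2 (sum 7), but the merged NDV is 5.
    println!("snapshot NDV = {}", snapshot_ndv(&segments));
}
```

In this toy input the three blocks overlap on the values 3 and 4, so the merged result is 5 rather than the naive sum of 7.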

Syntax:

The NOSCAN syntax is inspired by implementations in Spark and DB2.

ANALYZE TABLE <table_name> NOSCAN;

Behavior:

  • Newly created tables do not need ANALYZE TABLE ... NOSCAN: their snapshots automatically merge NDV information as they are created.

  • Existing tables can be updated with a one-time execution of:

    ANALYZE TABLE <table_name> [NOSCAN];

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@zhyass zhyass marked this pull request as draft June 26, 2025 05:05
@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Jun 26, 2025
@zhyass zhyass added the ci-cloud label (Build docker image for cloud test) Jun 28, 2025

Docker Image for PR

  • tag: pr-18254-2c7869e-1751078650

note: this image tag is only available for internal use.

@zhyass zhyass marked this pull request as ready for review June 28, 2025 11:55
@dantengsky dantengsky merged commit e486e3f into databendlabs:main Jun 30, 2025
164 of 166 checks passed