fix: replace old storage format with the 1.5.2 to enable advanced compression methods #5332
Merged
Pin DuckDB storage format to v1.5.2 to enable advanced compression
What
Newly created DuckDB databases used the conservative default storage
format, which gates out modern column compression (e.g. DICT_FSST) and
forces the legacy separate Dictionary/FSST encodings. This inflated the
on-disk database considerably. Pinning the storage version to v1.5.2
lets DuckDB pick the compact compression schemes.
Key changes
- Set config.options.serialization_compatibility to SerializationCompatibility::FromString("v1.5.2") in DuckdbManager::Initialize().
- Rename runtime/fiber_context.c → fiber_context.cpp and update runtime/CMakeLists.txt accordingly.
How to test
Create a fresh database and load ClickBench data, then:
    SELECT run_in_duckdb('SELECT column_name, compression FROM pragma_storage_info(''test.hits'') GROUP BY 1,2');
Expect DICT_FSST on string columns and a smaller PRAGMA database_size.
Note: the storage version applies only to newly created files; existing DBs must be reloaded.
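
For reviewers who want to try the setting outside this project, the following is a minimal standalone sketch using DuckDB's embedded C++ API. It is not the code in DuckdbManager::Initialize(). The file name `pinned_demo.db`, the table `demo`, and the generated data are hypothetical, and it assumes a DuckDB build that accepts `v1.5.2` as a storage version. It queries DuckDB directly rather than going through run_in_duckdb.

```cpp
#include "duckdb.hpp"

#include <iostream>

int main() {
    // Pin the storage format before the database file is created.
    duckdb::DBConfig config;
    config.options.serialization_compatibility =
        duckdb::SerializationCompatibility::FromString("v1.5.2");

    // Hypothetical path; the file must not exist yet, because the storage
    // version only applies to newly created files.
    duckdb::DuckDB db("pinned_demo.db", &config);
    duckdb::Connection con(db);

    // Hypothetical string-heavy table, then a checkpoint so the data is
    // written to disk and compressed.
    con.Query("CREATE TABLE demo AS "
              "SELECT 'value_' || (i % 1000) AS s FROM range(200000) t(i)");
    con.Query("CHECKPOINT");

    // Same inspection query as in the test instructions, against the demo table.
    auto result = con.Query(
        "SELECT column_name, compression "
        "FROM pragma_storage_info('demo') GROUP BY 1, 2");
    if (result->HasError()) {
        std::cerr << result->GetError() << "\n";
        return 1;
    }
    result->Print();
    return 0;
}
```

Which compression DuckDB actually picks depends on the data, so the printed schemes for this synthetic table may differ from what ClickBench columns show.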