data files: add version into meta-header #18226

Merged
AskAlexSharov merged 4 commits into main from eastorski/upgraded-file-compression-format
Dec 14, 2025
Conversation

Member

@eastorski eastorski commented Dec 9, 2025

This PR changes the internal format of snapshot files, specifically how the files are written to disk:

  1. The first byte of the file has been redefined: it now carries the snapshot format version. Before this PR, this byte was always 0x00, which effectively meant version 0. For version 0, compression and decompression behavior is unchanged.
  2. For version 0x01, two additional bytes are added at the beginning of the file:
    the first indicates the version, the second is a feature bitmask describing how the file was compressed.
  3. At the moment, three bits are reserved: page-level compression, keys compression, and values compression.
    If the page-level compression flag is set, a third additional byte follows the first two bytes in the file. It holds the compression level, that is, the number of elements per page.
  4. The rest of the file has the same format as in version 0.
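
The layout in the list above can be sketched on the writer side roughly as follows. This is only an illustration under assumptions: the function name EncodeMetaHeader, the version constants, and the exact bit positions of the three flags are hypothetical and not taken from the PR, and how the prefix is joined to the rest of the file is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical constant names; the PR's real identifiers may differ.
const (
	FileVersion0 byte = 0x00
	FileVersion1 byte = 0x01
)

// The PR says three bits are reserved; the bit positions here are assumed.
const (
	PageLevelCompressionEnabled byte = 1 << iota
	KeysCompressionEnabled
	ValuesCompressionEnabled
)

// EncodeMetaHeader returns the version-specific prefix bytes for a file.
func EncodeMetaHeader(version, flags, valuesPerPage byte) ([]byte, error) {
	switch version {
	case FileVersion0:
		// Version 0: nothing is added; the legacy layout already starts with 0x00.
		return nil, nil
	case FileVersion1:
		// Byte 1: version, byte 2: feature bitmask.
		hdr := []byte{FileVersion1, flags}
		// Byte 3 is present only when page-level compression is enabled.
		if flags&PageLevelCompressionEnabled != 0 {
			hdr = append(hdr, valuesPerPage)
		}
		return hdr, nil
	default:
		return nil, errors.New("unsupported file version")
	}
}

func main() {
	hdr, err := EncodeMetaHeader(FileVersion1, PageLevelCompressionEnabled|KeysCompressionEnabled, 64)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", hdr) // hex dump of the prefix bytes
}
```

Keeping the third byte conditional on the flag means files without page-level compression pay only two bytes of header.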

@eastorski eastorski linked an issue Dec 9, 2025 that may be closed by this pull request
3 tasks
@AskAlexSharov AskAlexSharov changed the title from "Eastorski/upgraded file compression format" to "data files: add version into meta-header" on Dec 10, 2025
Collaborator

@AskAlexSharov AskAlexSharov left a comment

Enabling page-level compression for AccountsDomain.History belongs in another PR.

return err
}

if c.featureFlagBitmask&PageLevelCompressionEnabled == PageLevelCompressionEnabled {

Collaborator

  • Add a type for featureFlagBitmask and give that type a .Has(flag) method.
  • featureFlagBitmask is uint8, so at most 8 feature flags. Is that not too limiting?
  • compPageValuesCount is uint8, so at most 255 keys per page. Is that not too limiting?
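
The first suggestion could look something like the sketch below. The type name FeatureFlagBitmask, the Set helper, and the bit positions are assumptions made for illustration; only the flag name PageLevelCompressionEnabled and the idea of a .Has(flag) method come from the review.

```go
package main

import "fmt"

// FeatureFlagBitmask is a hypothetical named type for the bitmask byte.
// uint8 leaves room for 8 flags in total.
type FeatureFlagBitmask uint8

// Assumed bit positions for the three reserved flags.
const (
	PageLevelCompressionEnabled FeatureFlagBitmask = 1 << iota
	KeysCompressionEnabled
	ValuesCompressionEnabled
)

// Has reports whether every bit of flag is set in m.
func (m FeatureFlagBitmask) Has(flag FeatureFlagBitmask) bool {
	return m&flag == flag
}

// Set turns on the bits of flag in m (illustrative helper, not from the PR).
func (m *FeatureFlagBitmask) Set(flag FeatureFlagBitmask) {
	*m |= flag
}

func main() {
	var m FeatureFlagBitmask
	m.Set(PageLevelCompressionEnabled)
	fmt.Println(m.Has(PageLevelCompressionEnabled), m.Has(KeysCompressionEnabled))
}
```

With such a type, the check in the diff above would shrink to a single call along the lines of c.featureFlagBitmask.Has(PageLevelCompressionEnabled).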

Member Author

At the moment only 3 bits are reserved in the bitmask, and we use at most 64 values per page, which is why I decided to use uint8 for both.

We can always widen them later if needed, because we still control the version byte.

d.version = d.data[0]

// 1st byte: version,
// 2nd byte: defines how exactly the file is compressed

Collaborator

This comment is not valid for all versions; better to move it inside the if block.
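
A reader-side sketch of that suggestion, with the layout comment placed inside the version-specific branch, might look like this. The names header and readHeader, the size field, and the bit position of the page-level flag are all hypothetical; the real decompressor code in the PR is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// Assumed bit position of the page-level compression flag.
const pageLevelCompressionEnabled byte = 0b001

// header is a hypothetical holder for the parsed meta-header.
type header struct {
	version       byte
	flags         byte
	valuesPerPage byte
	size          int // number of version-specific prefix bytes consumed
}

func readHeader(data []byte) (header, error) {
	if len(data) == 0 {
		return header{}, errors.New("empty file")
	}
	h := header{version: data[0]}
	switch h.version {
	case 0:
		// Version 0: legacy layout, no extra bytes to read.
		return h, nil
	case 1:
		// Version 1 only:
		//   1st byte: version
		//   2nd byte: feature bitmask (how the file is compressed)
		//   3rd byte: values per page, present only with page-level compression
		if len(data) < 2 {
			return header{}, errors.New("truncated v1 header")
		}
		h.flags = data[1]
		h.size = 2
		if h.flags&pageLevelCompressionEnabled != 0 {
			if len(data) < 3 {
				return header{}, errors.New("missing values-per-page byte")
			}
			h.valuesPerPage = data[2]
			h.size = 3
		}
		return h, nil
	default:
		return header{}, fmt.Errorf("unsupported file version %d", h.version)
	}
}

func main() {
	h, err := readHeader([]byte{0x01, 0x01, 0x40, 0xAA})
	if err != nil {
		panic(err)
	}
	fmt.Printf("version=%d flags=%03b valuesPerPage=%d size=%d\n",
		h.version, h.flags, h.valuesPerPage, h.size)
}
```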

Hist: HistCfg{
ValuesTable: kv.TblAccountHistoryVals,
- CompressorCfg: seg.DefaultCfg, Compression: seg.CompressNone,
+ CompressorCfg: seg.DefaultCfg.WithValuesOnCompressedPage(16), Compression: seg.CompressNone,

Collaborator

Enabling page-level compression for AccountsDomain.History is for another PR.

Accessors: AccessorHashMap,

HistoryLargeValues: false,
HistoryValuesOnCompressedPage: 64,

Collaborator

It seems this field can be removed now that HistoryCompressCfg.WithValuesOnCompressedPage has been added (up to you).
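
For context, a With-style method such as WithValuesOnCompressedPage is commonly written with a value receiver that returns a modified copy, which is what makes a separate per-domain field redundant. The sketch below assumes that pattern; the Cfg type, its Workers field, and DefaultCfg are hypothetical stand-ins and not the actual seg package definitions.

```go
package main

import "fmt"

// Cfg is a hypothetical stand-in for a compressor config.
type Cfg struct {
	Workers                int // placeholder field, for illustration only
	ValuesOnCompressedPage int // 0 means page-level compression is off (assumed)
}

// DefaultCfg is a shared default that callers must not mutate.
var DefaultCfg = Cfg{Workers: 1}

// WithValuesOnCompressedPage returns a copy of c with the page size set.
// The value receiver guarantees that the original config is left untouched.
func (c Cfg) WithValuesOnCompressedPage(n int) Cfg {
	c.ValuesOnCompressedPage = n
	return c
}

func main() {
	histCfg := DefaultCfg.WithValuesOnCompressedPage(16)
	fmt.Println(histCfg.ValuesOnCompressedPage, DefaultCfg.ValuesOnCompressedPage)
}
```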

@eastorski eastorski force-pushed the eastorski/upgraded-file-compression-format branch 3 times, most recently from 8c964af to 6f6f616, on December 11, 2025 17:26
@eastorski eastorski force-pushed the eastorski/upgraded-file-compression-format branch from 6f6f616 to a4cbce4 on December 12, 2025 12:41
@AskAlexSharov AskAlexSharov merged commit 7cbe547 into main Dec 14, 2025
18 checks passed
@AskAlexSharov AskAlexSharov deleted the eastorski/upgraded-file-compression-format branch December 14, 2025 08:04
AskAlexSharov pushed a commit that referenced this pull request Jan 22, 2026
Cherry-picked this PR (#18226) and also changed the snapshot format version from v1 to v0 in the compressor settings, because the new format will become the default only starting with 3.4.
Development

Successfully merging this pull request may close these issues.

Snapshots: to store read-configs in meta-header

2 participants