Background
The miniblock encoder picks an inline bitpacking width by walking {8, 16, 32, 64} and choosing the narrowest one that fits the column's value range. Anything wider (in practice, decimal128) falls through to raw 128-bit storage today, even when the actual values comfortably fit in 32 or 64 bits. Real-world decimal128 columns almost never use the full 128-bit range: money, prices, taxes, and accounting figures all sit in narrow ranges, so a non-trivial fraction of every dataset that uses decimal128 is stored at a wider bit-width than its values need.
Impact
I measured this on TPC-DS SF=100 store_sales (288 M rows, 23 columns including 12 × decimal128(7,2)):
| | Without u128 bitpacking | With u128 bitpacking |
|---|---|---|
| On-disk size | 34 GiB | 15.873 GiB |
| Bytes per row | ~127 | ~59 |
A ~53 % reduction entirely on the decimal128 columns, with schema, row count, and file format version (v2.1) all unchanged. This isn't a TPC-DS quirk — decimal(7,2) only needs ~24 bits of actual range, and that pattern repeats across most real decimal columns I've seen.
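The ~24-bit figure can be checked with a little arithmetic. The sketch below is illustrative only: `bits_needed` and `max_unscaled` are hypothetical helper names, not functions from the codebase, and it assumes the stored value is the non-negative unscaled integer (sign handling is ignored).

```rust
/// Number of bits needed to represent `max_value` as an unsigned integer.
/// Returns 0 when `max_value` is 0.
fn bits_needed(max_value: u128) -> u32 {
    u128::BITS - max_value.leading_zeros()
}

/// Largest unscaled magnitude a decimal with `precision` digits can hold,
/// i.e. 10^precision - 1. The scale only moves the decimal point, so it
/// does not change the range of the stored integer.
/// Valid for precision <= 38 (the decimal128 maximum).
fn max_unscaled(precision: u32) -> u128 {
    10u128.pow(precision) - 1
}
```

For decimal(7,2) the largest unscaled magnitude is 9,999,999, which lies between 2^23 and 2^24, so `bits_needed(max_unscaled(7))` evaluates to 24.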
Proposal
Extend the miniblock bitpacking chooser to also consider bits = 128 and route that case through a scalar BitPacking kernel for u128. The picked width is still chosen the same way (narrowest that fits), so a pathological column that genuinely uses the full 128-bit range stays at 128 bits and pays nothing extra.
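A minimal sketch of the narrowest-fit rule with 128 added to the candidate set. This is not the actual chooser in compression.rs; `CANDIDATE_WIDTHS` and `choose_width` are hypothetical names, and the sketch assumes the chooser is driven by the column's maximum unsigned value.

```rust
/// Candidate widths in ascending order. The proposal adds 128 to the
/// existing {8, 16, 32, 64} set.
const CANDIDATE_WIDTHS: [u32; 5] = [8, 16, 32, 64, 128];

/// Hypothetical helper: return the narrowest candidate width that can
/// hold `max_value`.
fn choose_width(max_value: u128) -> u32 {
    // Bits actually required by the largest value in the column.
    let needed = u128::BITS - max_value.leading_zeros();
    CANDIDATE_WIDTHS
        .iter()
        .copied()
        .find(|&w| w >= needed)
        // `needed` never exceeds 128, so this fallback is only defensive.
        .unwrap_or(128)
}
```

Under these assumptions a decimal(7,2) column with a maximum of 9,999,999 selects 32, while a column that reaches `u128::MAX` still selects 128, matching the "pays nothing extra" behaviour described above.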
Scope of the change is narrow:
- rust/compression/bitpacking/ — add the scalar u128 kernel.
- rust/lance-encoding/src/encodings/physical/bitpacking.rs — wire u128 through the miniblock encode/decode path.
- rust/lance-encoding/src/compression.rs — extend the chooser to match bits ∈ {8, 16, 32, 64, 128}.
- rust/lance-encoding/src/statistics.rs — minor stat-related plumbing.
No new public API and no on-wire format change beyond a bit-width that v2.1 readers already accept (the encoder just didn't previously emit it). There is also no FastLanes-transposed kernel for u128: this change is scalar only, and the transposed kernel is a natural follow-up.
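To make "scalar kernel" concrete, here is a deliberately simple bit-at-a-time sketch of packing and unpacking u128 values. It is not the kernel proposed for rust/compression/bitpacking/: `pack_u128` and `unpack_u128` are hypothetical names, and the LSB-first byte layout is an assumption chosen for clarity rather than the on-disk layout.

```rust
/// Pack the low `width` bits of each value, LSB-first, into a byte buffer.
/// Bits above `width` are silently dropped, so the caller must pick a
/// width that fits every value.
fn pack_u128(values: &[u128], width: u32) -> Vec<u8> {
    assert!((1..=128).contains(&width));
    let total_bits = values.len() * width as usize;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &v in values {
        for i in 0..width {
            // Copy bit `i` of the value to the next free output bit.
            if (v >> i) & 1 == 1 {
                out[bit_pos / 8] |= 1 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    out
}

/// Inverse of `pack_u128`: read `count` values of `width` bits each.
fn unpack_u128(packed: &[u8], width: u32, count: usize) -> Vec<u128> {
    assert!((1..=128).contains(&width));
    let mut out = Vec::with_capacity(count);
    let mut bit_pos = 0usize;
    for _ in 0..count {
        let mut v = 0u128;
        for i in 0..width {
            let bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1;
            v |= (bit as u128) << i;
            bit_pos += 1;
        }
        out.push(v);
    }
    out
}
```

Packing four values at 24 bits each yields 12 bytes instead of the 64 bytes raw u128 storage would take. A real kernel would process whole words rather than single bits; this version only shows that the round trip is lossless when the width fits.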
Limitations
- Decimal128-only. Other types are unchanged.
- Range-dependent. Columns whose values genuinely span the full 128-bit range won't compress and fall through to raw 128 just like today.