Conversation
Merging this PR will not alter performance
Comparing Footnotes
|
Polar Signals Profiling ResultsLatest Run
Previous Runs (2)
Powered by Polar Signals Cloud |
Benchmarks: PolarSignals ProfilingVortex (geomean): 0.921x ➖ datafusion / vortex-file-compressed (0.921x ➖, 1↑ 0↓)
|
File Sizes: PolarSignals ProfilingFile Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Benchmarks: TPC-H SF=1 on NVMEVerdict: No clear signal (environment too noisy confidence) datafusion / vortex-file-compressed (1.050x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.024x ➖, 0↑ 1↓)
datafusion / parquet (1.072x ➖, 1↑ 6↓)
datafusion / arrow (1.038x ➖, 1↑ 4↓)
duckdb / vortex-file-compressed (1.081x ➖, 0↑ 4↓)
duckdb / vortex-compact (1.067x ➖, 0↑ 4↓)
duckdb / parquet (1.036x ➖, 3↑ 5↓)
duckdb / duckdb (1.003x ➖, 1↑ 1↓)
Full attributed analysis
|
File Sizes: TPC-H SF=1 on NVMEFile Size Changes (11 files changed, +0.0% overall, 2↑ 9↓)
Totals:
|
Benchmarks: FineWeb NVMeVerdict: No clear signal (low confidence) datafusion / vortex-file-compressed (0.998x ➖, 1↑ 1↓)
datafusion / vortex-compact (0.971x ➖, 1↑ 0↓)
datafusion / parquet (0.985x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.998x ➖, 2↑ 1↓)
duckdb / vortex-compact (1.008x ➖, 0↑ 1↓)
duckdb / parquet (0.942x ➖, 3↑ 0↓)
Full attributed analysis
|
File Sizes: FineWeb NVMeFile Size Changes (2 files changed, +0.0% overall, 1↑ 1↓)
Totals:
|
Benchmarks: TPC-DS SF=1 on NVMEVerdict: No clear signal (low confidence) datafusion / vortex-file-compressed (0.893x ✅, 59↑ 0↓)
datafusion / vortex-compact (0.902x ➖, 48↑ 0↓)
datafusion / parquet (0.909x ➖, 39↑ 0↓)
duckdb / vortex-file-compressed (0.913x ➖, 32↑ 0↓)
duckdb / vortex-compact (0.925x ➖, 19↑ 0↓)
duckdb / parquet (0.927x ➖, 25↑ 0↓)
duckdb / duckdb (0.933x ➖, 19↑ 0↓)
Full attributed analysis
|
File Sizes: TPC-DS SF=1 on NVMEFile Size Changes (27 files changed, +0.5% overall, 9↑ 18↓)
Totals:
|
Benchmarks: TPC-H SF=10 on NVMEVerdict: No clear signal (low confidence) datafusion / vortex-file-compressed (1.000x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.965x ➖, 2↑ 0↓)
datafusion / parquet (0.983x ➖, 0↑ 0↓)
datafusion / arrow (0.976x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.971x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.971x ➖, 0↑ 0↓)
duckdb / parquet (0.934x ➖, 2↑ 0↓)
duckdb / duckdb (0.999x ➖, 0↑ 0↓)
Full attributed analysis
|
File Sizes: TPC-H SF=10 on NVMEFile Size Changes (38 files changed, +0.0% overall, 14↑ 24↓)
Totals:
|
Benchmarks: TPC-H SF=1 on S3Verdict: No clear signal (environment too noisy confidence) datafusion / vortex-file-compressed (1.089x ➖, 1↑ 3↓)
datafusion / vortex-compact (1.052x ➖, 0↑ 0↓)
datafusion / parquet (0.914x ➖, 4↑ 2↓)
duckdb / vortex-file-compressed (1.006x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.082x ➖, 0↑ 1↓)
duckdb / parquet (1.011x ➖, 0↑ 0↓)
Full attributed analysis
|
Benchmarks: FineWeb S3Verdict: No clear signal (low confidence) datafusion / vortex-file-compressed (1.046x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.086x ➖, 0↑ 2↓)
datafusion / parquet (1.061x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.967x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.996x ➖, 0↑ 0↓)
duckdb / parquet (1.023x ➖, 0↑ 0↓)
Full attributed analysis
|
Benchmarks: Statistical and Population GeneticsVerdict: No clear signal (low confidence) duckdb / vortex-file-compressed (1.053x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.053x ➖, 0↑ 1↓)
duckdb / parquet (1.018x ➖, 0↑ 0↓)
Full attributed analysis
|
File Sizes: Statistical and Population GeneticsFile Size Changes (2 files changed, +5.0% overall, 1↑ 1↓)
Totals:
|
Benchmarks: Random AccessVortex (geomean): 0.900x ✅ unknown / unknown (0.990x ➖, 6↑ 0↓)
|
Benchmarks: Clickbench on NVMEVerdict: No clear signal (low confidence) datafusion / vortex-file-compressed (0.999x ➖, 0↑ 1↓)
datafusion / parquet (1.003x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.962x ➖, 4↑ 0↓)
duckdb / parquet (0.997x ➖, 0↑ 0↓)
duckdb / duckdb (0.977x ➖, 3↑ 0↓)
Full attributed analysis
|
File Sizes: Clickbench on NVMEFile Size Changes (201 files changed, +0.5% overall, 130↑ 71↓)
Totals:
|
Benchmarks: TPC-H SF=10 on S3Verdict: No clear signal (environment too noisy confidence) datafusion / vortex-file-compressed (0.988x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.056x ➖, 0↑ 1↓)
datafusion / parquet (1.094x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.076x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.011x ➖, 0↑ 2↓)
duckdb / parquet (1.087x ➖, 0↑ 3↓)
Full attributed analysis
|
Benchmarks: CompressionVortex (geomean): 1.003x ➖ unknown / unknown (1.007x ➖, 0↑ 0↓)
|
58b212d to
b03e192
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces an experimental “patched array” pathway for FastLanes BitPacked data, enabling legacy BitPacked-with-interior-patches payloads to deserialize as a Patched wrapper, and wiring the same mode into compression + file encoding allowlists.
Changes:
- Add a BitPacked deserialization plugin that rewrites interior patches into a
Patchedwrapper whenVORTEX_EXPERIMENTAL_PATCHED_ARRAYis set. - Update integer bitpacking compression to emit
Patched(instead of interior patches) in experimental mode and update file allowed-encodings accordingly. - Enable the experimental mode in benchmark GitHub Actions workflows via env var.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
vortex-file/src/strategy.rs |
Conditionally allows Patched encoding for file writing when experimental mode is enabled. |
vortex-btrblocks/src/schemes/integer.rs |
Switches BitPacking scheme output to Patched in experimental mode; keeps legacy interior-patches mode otherwise. |
encodings/fastlanes/src/lib.rs |
Adds USE_EXPERIMENTAL_PATCHES flag and conditionally registers a custom BitPacked plugin. |
encodings/fastlanes/src/bitpacking/plugin.rs |
New plugin to deserialize legacy BitPacked-with-patches as a Patched wrapper around a patchless BitPacked array. |
encodings/fastlanes/src/bitpacking/mod.rs |
Wires the new plugin module into the bitpacking module. |
encodings/fastlanes/public-api.lock |
Records new public static USE_EXPERIMENTAL_PATCHES. |
.github/workflows/sql-benchmarks.yml |
Forces experimental patched-array mode during SQL benchmark data generation. |
.github/workflows/bench.yml |
Forces experimental patched-array mode during benchmark runs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
d37a0c6 to
422848b
Compare
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
0c5c34a to
c47b70a
Compare
Summary
Part of roll out of Patched array. This PR adds a new environment variable,
VORTEX_EXPERIMENTAL_PATCHED_ARRAY.When the env var is set
BitPackedarray with interior patches will be decoded as aPatchedarray wrapping aBitPackedwith no interior patchesThis allows us to do some more soak testing of the encoding before we switch it to be the default.
Testing
On a previous commit in this PR, I've setup the benchmarks to run with the experimental Patched encoding enabled, the benchmark results all seem within the margin of error. The file sizes are generally the same but in the worst case, 2-3% larger.