Slice list/list_view elements in duckdb exporter #8020
Benchmarks: PolarSignals Profiling
Vortex (geomean): 0.932x ➖
datafusion / vortex-file-compressed (0.932x ➖, 2↑ 0↓)
File Sizes: PolarSignals Profiling
No file size changes detected.

Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.955x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.961x ➖, 0↑ 0↓)
datafusion / parquet (0.926x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.944x ➖, 3↑ 0↓)
duckdb / vortex-compact (0.974x ➖, 0↑ 0↓)
duckdb / parquet (0.952x ➖, 1↑ 0↓)
File Sizes: FineWeb NVMe
No file size changes detected.

Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.941x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.950x ➖, 0↑ 0↓)
datafusion / parquet (0.967x ➖, 0↑ 0↓)
datafusion / arrow (0.930x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.967x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.958x ➖, 1↑ 0↓)
duckdb / parquet (0.973x ➖, 2↑ 0↓)
duckdb / duckdb (0.979x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.

Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.997x ➖, 0↑ 1↓)
datafusion / vortex-compact (0.989x ➖, 2↑ 0↓)
datafusion / parquet (1.003x ➖, 1↑ 2↓)
duckdb / vortex-file-compressed (1.001x ➖, 1↑ 1↓)
duckdb / vortex-compact (0.995x ➖, 2↑ 0↓)
duckdb / parquet (1.003x ➖, 0↑ 2↓)
duckdb / duckdb (0.998x ➖, 0↑ 2↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.

Benchmarks: Statistical and Population Genetics
Verdict: Likely improvement (low confidence)
duckdb / vortex-file-compressed (0.751x ✅, 4↑ 0↓)
duckdb / vortex-compact (0.787x ✅, 4↑ 0↓)
duckdb / parquet (0.990x ➖, 0↑ 0↓)

Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.135x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.053x ➖, 0↑ 1↓)
datafusion / parquet (1.204x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.104x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.107x ➖, 0↑ 1↓)
duckdb / parquet (1.041x ➖, 0↑ 0↓)

Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.940x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.934x ➖, 0↑ 0↓)
datafusion / parquet (0.939x ➖, 0↑ 0↓)
datafusion / arrow (0.887x ✅, 15↑ 0↓)
duckdb / vortex-file-compressed (0.934x ➖, 3↑ 0↓)
duckdb / vortex-compact (0.957x ➖, 0↑ 0↓)
duckdb / parquet (0.957x ➖, 0↑ 0↓)
duckdb / duckdb (0.971x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.

Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.008x ➖, 2↑ 2↓)
datafusion / parquet (1.009x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.006x ➖, 1↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 0↓)
duckdb / duckdb (0.995x ➖, 1↑ 4↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)

Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.080x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.115x ➖, 0↑ 4↓)
datafusion / parquet (1.245x ➖, 0↑ 7↓)
duckdb / vortex-file-compressed (1.051x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.064x ➖, 0↑ 0↓)
duckdb / parquet (1.077x ➖, 0↑ 0↓)

Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.173x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.065x ➖, 0↑ 1↓)
datafusion / parquet (1.158x ➖, 0↑ 5↓)
duckdb / vortex-file-compressed (1.058x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.038x ➖, 0↑ 0↓)
duckdb / parquet (1.102x ➖, 0↑ 0↓)

  let offset = offsets[i]
      .to_u64()
-     .ok_or_else(|| vortex_err!("somehow unable to convert an offset to u64"))?;
+     .ok_or_else(|| vortex_err!("conversion failed)"))?;
These errors will be slower, and you really don't need them twice (or even once?).
File Sizes: Statistical and Population Genetics
No file size changes detected.
Merging this PR will improve performance by 16.48%

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] | 308 µs | 272.9 µs | +12.86% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 197.8 µs | 162 µs | +22.09% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 358.5 µs | 323.5 µs | +10.82% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 211.9 µs | 175.8 µs | +20.55% |
Comparing myrrc/duckdb-list-exporter-slice (54f5efc) with develop (8aaaab8)
Currently, every list/list_view chunk export references the entire underlying
elements vector. This means that, for a 1M-element vector, exporting only the
first element still references all 1M elements.
This PR slices the underlying vector in the list and list_view exporters so
that it contains only the elements requested by the offsets.
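
To illustrate the idea, here is a minimal sketch of slicing by offsets. It is not the code from this PR: it uses plain Rust slices instead of Vortex arrays or DuckDB vectors, and the names `covering_range` and `slice_for_export` are hypothetical. It assumes a list_view-style layout in which each list is described by an offset and a size.

```rust
use std::ops::Range;

/// Hypothetical helper: the smallest range of the elements buffer that
/// covers every non-empty list described by `offsets` and `sizes`.
fn covering_range(offsets: &[u64], sizes: &[u64]) -> Range<usize> {
    let mut start = usize::MAX;
    let mut end = 0usize;
    for (&offset, &size) in offsets.iter().zip(sizes) {
        if size == 0 {
            continue; // empty lists reference no elements
        }
        let lo = offset as usize;
        let hi = lo + size as usize;
        start = start.min(lo);
        end = end.max(hi);
    }
    // If every list was empty, nothing needs to be exported.
    if start > end { 0..0 } else { start..end }
}

/// Hypothetical helper: slice `elements` down to the covering range and
/// rebase the offsets so they index into the returned slice.
fn slice_for_export<'a, T>(
    elements: &'a [T],
    offsets: &[u64],
    sizes: &[u64],
) -> (&'a [T], Vec<u64>) {
    let range = covering_range(offsets, sizes);
    let base = range.start as u64;
    let rebased: Vec<u64> = offsets
        .iter()
        .zip(sizes)
        // Empty lists get offset 0; their offset is never dereferenced.
        .map(|(&offset, &size)| if size == 0 { 0 } else { offset - base })
        .collect();
    (&elements[range], rebased)
}
```

With a 1M-element buffer and a chunk whose lists live at offsets 10 and 12 with sizes 2 and 3, `slice_for_export` returns only the five elements at positions 10 through 14, plus the rebased offsets 0 and 2, rather than the whole buffer.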