don't redownload duckdb for every branch #7747

Draft
myrrc wants to merge 1 commit into develop from myrrc/duckdb-no-reload
@myrrc myrrc commented May 1, 2026

OUT_DIR behaviour changed, so the duckdb/ folder is now relative to vortex-duckdb.
Also add LTO builds for the C++ part of the duckdb extension (won't harm).
Clarify why we don't have LTO for our main crate.
Add -Werror for duckdb builds.

Signed-off-by: Mikhail Kot <to@myrrc.dev>
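For readers unfamiliar with the OUT_DIR issue: Cargo picks OUT_DIR itself (a hashed directory under target/), so it is not a stable place to cache a download. Below is a minimal sketch of the idea described above, anchoring the duckdb/ folder to the crate directory instead. The function names, the per-version subdirectory, and the `.complete` marker file are hypothetical and not taken from this PR.

```rust
use std::path::{Path, PathBuf};

/// Builds a download directory relative to the crate directory rather than OUT_DIR.
/// `crate_dir` is expected to be the value of CARGO_MANIFEST_DIR (set by Cargo for build scripts).
/// The per-version subdirectory is an assumption made for this sketch.
fn duckdb_dir(crate_dir: &Path, version: &str) -> PathBuf {
    crate_dir.join("duckdb").join(version)
}

/// Reports whether a download is still needed.
/// The `.complete` marker file is hypothetical: it would be written after a successful download.
fn needs_download(dir: &Path) -> bool {
    !dir.join(".complete").is_file()
}
```

In a build script, `crate_dir` would typically come from `std::env::var("CARGO_MANIFEST_DIR")`, and the download step would run only when `needs_download` returns true.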
@myrrc myrrc force-pushed the myrrc/duckdb-no-reload branch from f3bef2b to 849b787 Compare May 1, 2026 13:15
@myrrc myrrc added the changelog/chore A trivial change label May 1, 2026
@myrrc myrrc requested a review from 0ax1 May 1, 2026 13:15
@myrrc myrrc enabled auto-merge (squash) May 1, 2026 13:16
@myrrc myrrc marked this pull request as draft May 1, 2026 13:20
auto-merge was automatically disabled May 1, 2026 13:20

Pull request was converted to draft

codspeed-hq Bot commented May 1, 2026

Merging this PR will degrade performance by 33.33%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 6 improved benchmarks
❌ 5 regressed benchmarks
✅ 1187 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | dynamic_dispatch_u32[10M] | 106.5 µs | 159.7 µs | -33.33% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 90.2 µs | 82 µs | +10% |
| Simulation | decompress_rd[f64, (100000, 0.01)] | 842.6 µs | 1,020.5 µs | -17.44% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 138.7 µs | 122 µs | +13.63% |
| Simulation | decompress_rd[f64, (10000, 0.01)] | 138.6 µs | 121.9 µs | +13.7% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 138.5 µs | 122.1 µs | +13.46% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 583.5 µs | 495.7 µs | +17.7% |
| Simulation | decompress_rd[f32, (100000, 0.01)] | 495.1 µs | 582.7 µs | -15.04% |
| Simulation | decompress_rd[f64, (100000, 0.1)] | 842.5 µs | 1,020.7 µs | -17.46% |
| Simulation | decompress_rd[f32, (100000, 0.1)] | 495.1 µs | 582.7 µs | -15.04% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 275.3 ns | 246.1 ns | +11.85% |

Comparing myrrc/duckdb-no-reload (849b787) with develop (c4feed7)

  # Enable compiler warnings (matching build.rs flags).
- set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -Wpedantic")
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -Wpedantic -Werror")
+ set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -Wall -Wextra -Wpedantic -Werror -O3")

Why change the release flags in this PR? Also, should the release flags just extend the default flags? -O0 will be overridden by -O3 etc.
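To illustrate the override point: GCC and Clang apply the last -O option on the command line. Assuming the per-configuration flags (CMAKE_CXX_FLAGS_RELEASE) are placed after the base flags (CMAKE_CXX_FLAGS), a trailing -O3 takes effect over an earlier -O0. The sketch below only models that rule; `effective_opt_level` is a hypothetical helper and not part of this PR.

```rust
/// Returns the optimization flag that takes effect under the
/// "last -O option wins" rule used by GCC and Clang.
/// Returns None when no -O flag is present.
fn effective_opt_level<'a>(flags: &[&'a str]) -> Option<&'a str> {
    // Scan from the end so the first match is the last -O flag given.
    flags.iter().rev().copied().find(|f| f.starts_with("-O"))
}
```

Under that assumption, repeating the warning flags in the release variable adds nothing beyond what the base flags already provide; only the optimization level differs.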

Comment thread vortex-duckdb/build.rs
fn cpp(duckdb_include_dir: &Path, build_type: &str) {
    let mut flags = vec!["-Wall", "-Wextra", "-Wpedantic", "-Werror"];
    if build_type == "release" {
        flags.extend_from_slice(&["-flto=auto"]);

Can we split out the LTO changes from this PR? Does enabling LTO for the C++ code in vortex-duckdb even have an effect on perf? Other than that, my idea was that we enable LTO in the ext repo but not in the vortex repo itself.

Comment thread vortex-duckdb/build.rs
fn cpp(duckdb_include_dir: &Path, build_type: &str) {
    let mut flags = vec!["-Wall", "-Wextra", "-Wpedantic", "-Werror"];
    if build_type == "release" {
        flags.extend_from_slice(&["-flto=auto"]);

`auto` means full LTO for clang, which is very slow. We should try to use thin for both gcc and clang.
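One way the flag selection could look, as a rough sketch rather than the PR's actual code: pick the LTO flag per compiler instead of hard-coding `-flto=auto`. The `Compiler` enum, `lto_flag`, and `cpp_flags` are hypothetical names, and how the compiler is detected is left out. Note one deviation from the comment above: as far as I know, GCC does not accept `-flto=thin`, so this sketch keeps `-flto=auto` for GCC (which lets GCC choose the number of parallel LTO jobs); that assumption should be verified against the toolchains in use.

```rust
/// Hypothetical compiler family; detection is out of scope for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compiler {
    Gcc,
    Clang,
}

/// Picks an LTO flag that avoids Clang's slow full LTO.
fn lto_flag(compiler: Compiler) -> &'static str {
    match compiler {
        // Clang supports ThinLTO via -flto=thin.
        Compiler::Clang => "-flto=thin",
        // Assumption: GCC has no ThinLTO mode, so keep -flto=auto there.
        Compiler::Gcc => "-flto=auto",
    }
}

/// Builds the C++ flag list: warnings always, LTO only for release builds.
fn cpp_flags(compiler: Compiler, build_type: &str) -> Vec<&'static str> {
    let mut flags = vec!["-Wall", "-Wextra", "-Wpedantic", "-Werror"];
    if build_type == "release" {
        flags.push(lto_flag(compiler));
    }
    flags
}
```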

Labels

changelog/chore A trivial change

2 participants