cudarc use 12.5 instead of 12.8 #6660

Merged
a10y merged 1 commit into develop from aduffy/cuda-downgrade-feat on Feb 24, 2026

@a10y a10y commented Feb 24, 2026

The lambda.ai boxes we run on have CUDA Toolkit 12.8 installed. The Nsight Compute version they ship injects a library (libcuda-injection.so) that does not seem to include several of the symbols cudarc marks as available in 12.8. For example:

(base) ubuntu@132-145-211-165:~/vortex$ RUST_LOG=vortex_cuda=trace,info FLAT_LAYOUT_INLINE_ARRAY_NODE=true ncu ./target/samply/gpu-scan-cli ./vortex-bench/data/tpch/1.0/vortex-file-compressed/lineitem_0.vortex

thread 'main' panicked at /home/ubuntu/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.18.2/src/driver/sys/mod.rs:24528:18:
Expected symbol in library: DlSym { desc: "/usr/lib/x86_64-linux-gnu/nsight-compute/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so: undefined symbol: cuTensorMapEncodeIm2colWide" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==ERROR== The application returned an error code (101).

cudarc eagerly loads all symbols it believes should be available at startup, even if they aren't used by the program.
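
To illustrate the failure mode, here is a minimal sketch of eager versus lazy symbol resolution. It is not cudarc's actual loader: `FakeLibrary`, `load_eager`, and `resolve_lazy` are hypothetical names, and a `HashMap` stands in for the shared library that a real loader would query through `dlsym`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a shared library: maps exported symbol
/// names to fake addresses. A real loader would call dlsym instead.
pub struct FakeLibrary {
    exports: HashMap<&'static str, usize>,
}

impl FakeLibrary {
    pub fn new(exports: &[(&'static str, usize)]) -> Self {
        Self {
            exports: exports.iter().copied().collect(),
        }
    }

    /// Look up a single symbol, mimicking a dlsym-style error on a miss.
    pub fn symbol(&self, name: &str) -> Result<usize, String> {
        self.exports
            .get(name)
            .copied()
            .ok_or_else(|| format!("undefined symbol: {name}"))
    }
}

/// Eager loading: resolve every expected symbol up front.
/// One missing symbol fails the whole load, even if the program
/// would never have called it.
pub fn load_eager(
    lib: &FakeLibrary,
    expected: &[&'static str],
) -> Result<HashMap<&'static str, usize>, String> {
    expected
        .iter()
        .map(|&name| lib.symbol(name).map(|addr| (name, addr)))
        .collect()
}

/// Lazy loading: resolve a symbol only at the point of use, so
/// symbols that are never used are never looked up.
pub fn resolve_lazy(lib: &FakeLibrary, name: &str) -> Result<usize, String> {
    lib.symbol(name)
}
```

With a library that exports only `cuInit`, `load_eager(&lib, &["cuInit", "cuTensorMapEncodeIm2colWide"])` returns an error naming the missing symbol, while `resolve_lazy(&lib, "cuInit")` succeeds. Building against a smaller expected-symbol list, as this PR does, shrinks the set that the eager path has to find.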

We don't use any recently added CUDA symbols, so to stay compatible with running under Nsight Compute, we should downgrade cudarc to build against only the CUDA 12.5 symbols.

With this change, the run above completes without issue.

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y requested a review from 0ax1 February 24, 2026 16:01
codspeed-hq bot commented Feb 24, 2026

Merging this PR will improve performance by 11.36%

⚡ 1 improved benchmark
✅ 953 untouched benchmarks
⏩ 1466 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | map_each[BufferMut<i32>, 128] | 858.1 ns | 770.6 ns | +11.36% |

Comparing aduffy/cuda-downgrade-feat (51fa515) with develop (c018251)

Footnotes

  1. 1466 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

@0ax1 0ax1 added the changelog/chore A trivial change label Feb 24, 2026
@a10y a10y merged commit e7b81e9 into develop Feb 24, 2026
52 of 53 checks passed
@a10y a10y deleted the aduffy/cuda-downgrade-feat branch February 24, 2026 16:12