
Add cuda-oxide single-item DWT97 transcode kernels #39

Merged: jcwal1516 merged 1 commit into main from impl/cuda-oxide-transcode-dwt97 on Jun 22, 2026

@jcwal1516

Member

Summary

  • extend the cuda-oxide transcode PTX project with single-item irreversible 9/7 IDCT, row lift, and column lift kernels
  • route only reversible 5/3 and single-item irreversible 9/7 transcode through cuda-oxide when the feature/env gate is enabled
  • keep batched and fused 9/7 transcode kernels on the existing CUDA C PTX path for a later focused slice
  • update env var and unsafe audit docs plus strict GPU parity coverage
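For readers unfamiliar with the row and column lift kernels mentioned above, here is a minimal CPU-side sketch of a 1-D irreversible 9/7 lifting pass of the kind a parity test might compare against. It is not the code in this PR: the function names (`lift_step`, `forward_97_row`, `inverse_97_row`), the interleaved in-place layout, the mirrored boundary handling, and the scaling convention (low-pass divided by K, high-pass multiplied by K) are all assumptions made for illustration. Only the lifting constants are the standard CDF 9/7 values.

```rust
// Standard CDF 9/7 lifting constants (rounded to f32 precision).
const ALPHA: f32 = -1.586_134_3;
const BETA: f32 = -0.052_980_12;
const GAMMA: f32 = 0.882_911_1;
const DELTA: f32 = 0.443_506_85;
const K: f32 = 1.230_174_1;

/// Hypothetical helper: one lifting step over an interleaved row.
/// Updates samples at `start, start + 2, ...` by adding
/// `coeff * (left + right)`, mirroring neighbors at the edges.
fn lift_step(row: &mut [f32], start: usize, coeff: f32) {
    let n = row.len();
    let mut i = start;
    while i < n {
        let left = if i == 0 { row[1] } else { row[i - 1] };
        let right = if i + 1 == n { row[n - 2] } else { row[i + 1] };
        row[i] += coeff * (left + right);
        i += 2;
    }
}

/// Forward 9/7 pass: even indices end up holding low-pass samples,
/// odd indices high-pass samples. The scaling convention is an assumption.
pub fn forward_97_row(row: &mut [f32]) {
    if row.len() < 2 {
        return;
    }
    lift_step(row, 1, ALPHA);
    lift_step(row, 0, BETA);
    lift_step(row, 1, GAMMA);
    lift_step(row, 0, DELTA);
    for (i, v) in row.iter_mut().enumerate() {
        if i % 2 == 0 { *v /= K } else { *v *= K }
    }
}

/// Inverse pass: undoes the scaling, then the lifting steps in reverse order.
pub fn inverse_97_row(row: &mut [f32]) {
    if row.len() < 2 {
        return;
    }
    for (i, v) in row.iter_mut().enumerate() {
        if i % 2 == 0 { *v *= K } else { *v /= K }
    }
    lift_step(row, 0, -DELTA);
    lift_step(row, 1, -GAMMA);
    lift_step(row, 0, -BETA);
    lift_step(row, 1, -ALPHA);
}
```

A column lift would apply the same steps along the other axis with a stride. A parity check in this style could run a reference like this on the host and compare it against the GPU output within a small floating-point tolerance.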

Validation

Local:

  • cargo fmt --check
  • cargo fmt --check --manifest-path crates/j2k-cuda-runtime/src/cuda_oxide_transcode/simt/Cargo.toml
  • git diff --check
  • cargo test -p j2k-cuda-runtime --all-targets --features cuda-oxide-transcode
  • cargo clippy -p j2k-cuda-runtime --all-targets --features cuda-oxide-transcode -- -D warnings
  • cargo test -p xtask --test repo_lint
  • cargo xtask unsafe-audit

CUDA host jcwal@100.75.125.59:

  • cargo oxide build --arch sm_89
  • strict transcode runtime with J2K_REQUIRE_CUDA_OXIDE_TRANSCODE=1 and J2K_CUDA_USE_OXIDE_TRANSCODE=1
  • strict combined cuda-oxide runtime for copy, encode, decode-store, dequantize, idwt, and transcode
  • cargo +nightly clippy --manifest-path crates/j2k-cuda-runtime/src/cuda_oxide_transcode/simt/Cargo.toml -- -D warnings
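The routing described in the summary, combined with the two environment variables used in the strict runs, can be sketched as follows. This is an illustrative guess at the semantics, not the crate's actual code: `TranscodeKernel`, `Backend`, `select_backend`, and `flag_set` are hypothetical names, and the sketch assumes that `J2K_CUDA_USE_OXIDE_TRANSCODE=1` opts in to the cuda-oxide path while `J2K_REQUIRE_CUDA_OXIDE_TRANSCODE=1` turns a silent fallback into an error for kernels that cuda-oxide is expected to cover.

```rust
/// Hypothetical kernel categories, following the PR summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TranscodeKernel {
    Reversible53,
    Irreversible97Single,
    Irreversible97Batched,
    Irreversible97Fused,
}

/// Hypothetical backend choice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    CudaOxide,
    CudaC,
}

/// Treats a variable as enabled only when it is exactly "1" (an assumption).
fn flag_set(lookup: &impl Fn(&str) -> Option<String>, name: &str) -> bool {
    lookup(name).as_deref() == Some("1")
}

/// Picks a backend for one kernel. `oxide_compiled_in` stands in for the
/// `cuda-oxide-transcode` cargo feature; `lookup` abstracts the environment
/// so the logic can be tested without mutating process state.
pub fn select_backend(
    kernel: TranscodeKernel,
    oxide_compiled_in: bool,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<Backend, String> {
    let use_oxide = flag_set(&lookup, "J2K_CUDA_USE_OXIDE_TRANSCODE");
    let require = flag_set(&lookup, "J2K_REQUIRE_CUDA_OXIDE_TRANSCODE");
    // Only reversible 5/3 and single-item 9/7 are routed through cuda-oxide;
    // batched and fused 9/7 stay on the CUDA C PTX path.
    let supported = matches!(
        kernel,
        TranscodeKernel::Reversible53 | TranscodeKernel::Irreversible97Single
    );
    if use_oxide && oxide_compiled_in && supported {
        return Ok(Backend::CudaOxide);
    }
    if require && supported {
        // Assumed strict-mode behavior: fail instead of falling back.
        return Err(format!("cuda-oxide required but unavailable for {kernel:?}"));
    }
    Ok(Backend::CudaC)
}
```

In a real caller the lookup would typically be `|name| std::env::var(name).ok()`. Passing it in as a closure keeps the decision logic pure, which makes the strict and non-strict cases easy to cover in unit tests.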

jcwal1516 merged commit 4b32b8b into main on Jun 22, 2026
61 of 62 checks passed
jcwal1516 deleted the impl/cuda-oxide-transcode-dwt97 branch on June 22, 2026 06:19
jcwal1516 added a commit that referenced this pull request on Jun 29, 2026: Add cuda-oxide single-item DWT97 transcode kernels