
Add cuda-oxide J2K IDWT kernels #36

Merged
jcwal1516 merged 1 commit into main from impl/cuda-oxide-j2k-idwt
Jun 22, 2026

@jcwal1516

Member

Summary

  • add opt-in cuda-oxide J2K generic IDWT PTX build and runtime dispatch
  • port generic inverse-DWT entrypoints to cuda-oxide: interleave, horizontal/vertical 5/3 and 9/7, generic batched interleave+horizontal, generic batched vertical, and the legacy single-entry kernel
  • leave cooperative shared-memory IDWT kernels on built-in CUDA C PTX for a separate focused PR
  • document J2K_CUDA_USE_OXIDE_J2K_IDWT and J2K_REQUIRE_CUDA_OXIDE_J2K_IDWT
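
For readers unfamiliar with opt-in dispatch, the two environment variables above could plausibly drive backend selection along the following lines. This is a minimal sketch, not the code in this PR: the `IdwtBackend` enum, the `flag` and `select_idwt_backend` helpers, the "non-empty and not `0`" truthiness rule, and the fallback behaviour are all assumptions made for illustration. Only the two variable names come from the summary.

```rust
// Hypothetical sketch of opt-in backend selection. None of these names
// are taken from the j2k-cuda-runtime crate.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IdwtBackend {
    /// The built-in CUDA C PTX path (the default).
    BuiltinCudaC,
    /// The opt-in cuda-oxide PTX path.
    CudaOxide,
}

/// Assumed rule: a variable counts as enabled when it is present and is
/// neither empty nor "0". The real parsing rule may differ.
fn flag(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty() && v != "0")
}

/// Pick a backend from the values of J2K_CUDA_USE_OXIDE_J2K_IDWT
/// (`use_oxide`) and J2K_REQUIRE_CUDA_OXIDE_J2K_IDWT (`require_oxide`).
///
/// Assumed semantics: "use" silently falls back to the built-in path
/// when the cuda-oxide PTX is unavailable, while "require" turns that
/// situation into an error so strict tests cannot pass by accident.
fn select_idwt_backend(
    use_oxide: Option<&str>,
    require_oxide: Option<&str>,
    oxide_ptx_available: bool,
) -> Result<IdwtBackend, String> {
    let required = flag(require_oxide);
    let requested = required || flag(use_oxide);

    if requested && oxide_ptx_available {
        return Ok(IdwtBackend::CudaOxide);
    }
    if required {
        return Err(
            "J2K_REQUIRE_CUDA_OXIDE_J2K_IDWT is set but the cuda-oxide IDWT PTX is unavailable"
                .to_string(),
        );
    }
    Ok(IdwtBackend::BuiltinCudaC)
}
```

A caller would read both variables with `std::env::var(...).ok()` and pass `as_deref()` of each to `select_idwt_backend`. Keeping the environment lookup outside the function makes the selection rule testable without mutating process state.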

Validation

  • local: cargo test -p j2k-cuda-runtime --all-targets --features cuda-oxide-j2k-idwt
  • local: cargo clippy -p j2k-cuda-runtime --all-targets --features cuda-oxide-j2k-idwt -- -D warnings
  • local: cargo test -p xtask --test repo_lint
  • local: cargo xtask unsafe-audit
  • local: cargo fmt --check
  • local: cargo fmt --check --manifest-path crates/j2k-cuda-runtime/src/cuda_oxide_j2k_idwt/simt/Cargo.toml
  • local: git diff --check
  • CUDA host: standalone cargo oxide build --arch sm_89
  • CUDA host: strict runtime test with J2K_REQUIRE_CUDA_OXIDE_J2K_IDWT=1, 95 passed
  • CUDA host: combined strict runtime test with copy-u8, J2K encode, decode-store, dequantize, and IDWT cuda-oxide features required, 100 passed
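
The runtime tests themselves are not shown in this conversation. One common way to check a GPU inverse-DWT kernel is to compare its output against a CPU reference, so the following is a minimal sketch of a 1-D reversible 5/3 lifting pair that such a comparison could use. It assumes an interleaved layout (low-pass coefficients at even indices, high-pass at odd indices), a signal that starts at an even index, and whole-sample symmetric boundary extension. The function names `mirror`, `dwt53_1d`, and `idwt53_1d` are hypothetical and are not taken from this repository.

```rust
// Hypothetical CPU reference for the reversible 5/3 transform, 1-D,
// operating in place on an interleaved buffer.

/// Reflect an out-of-range index back into 0..len using whole-sample
/// symmetric extension. Only valid for len >= 2 and i in -1..=len.
fn mirror(i: isize, len: usize) -> usize {
    let n = len as isize;
    let m = if i < 0 {
        -i
    } else if i >= n {
        2 * (n - 1) - i
    } else {
        i
    };
    m as usize
}

/// Forward 5/3 lifting: predict the odd samples, then update the even ones.
fn dwt53_1d(samples: &[i32]) -> Vec<i32> {
    let len = samples.len();
    let mut y = samples.to_vec();
    if len < 2 {
        return y; // a single sample passes through unchanged
    }
    for i in (1..len).step_by(2) {
        let l = y[mirror(i as isize - 1, len)];
        let r = y[mirror(i as isize + 1, len)];
        y[i] -= (l + r) >> 1; // arithmetic shift gives floor division
    }
    for i in (0..len).step_by(2) {
        let l = y[mirror(i as isize - 1, len)];
        let r = y[mirror(i as isize + 1, len)];
        y[i] += (l + r + 2) >> 2;
    }
    y
}

/// Inverse 5/3 lifting: undo the update on even samples, then undo the
/// prediction on odd samples (the forward steps in reverse order).
fn idwt53_1d(interleaved: &[i32]) -> Vec<i32> {
    let len = interleaved.len();
    let mut x = interleaved.to_vec();
    if len < 2 {
        return x;
    }
    for i in (0..len).step_by(2) {
        let l = x[mirror(i as isize - 1, len)];
        let r = x[mirror(i as isize + 1, len)];
        x[i] -= (l + r + 2) >> 2;
    }
    for i in (1..len).step_by(2) {
        let l = x[mirror(i as isize - 1, len)];
        let r = x[mirror(i as isize + 1, len)];
        x[i] += (l + r) >> 1;
    }
    x
}
```

Because each inverse step subtracts exactly what the matching forward step added, `idwt53_1d(&dwt53_1d(&v))` returns `v` for any integer input that does not overflow `i32`. The 9/7 path is irreversible and would need a tolerance-based comparison instead.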

@jcwal1516 jcwal1516 merged commit 869bc33 into main Jun 22, 2026
61 of 62 checks passed
@jcwal1516 jcwal1516 deleted the impl/cuda-oxide-j2k-idwt branch June 22, 2026 05:04
jcwal1516 added a commit that referenced this pull request Jun 29, 2026
