Missing CUDA error checks after kernel launch + potential underflow in sub_seq_len boundary #30

@consigcody94

Summary

Found via code audit of the CUDA kernel code.

Bug 1: Missing CUDA error check after kernel launch (MEDIUM)

Files:

  • `csrc/kda/sm100/kda_fwd_intra_kernel_sm100.hpp`, line 372
  • `csrc/kda/sm100/kda_fwd_recomp_w_u_kernel_sm100.hpp`, line 420

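Both SM100 launchers wrap `cudaFuncSetAttribute(...)` in `CHECK_CUDA` before the launch but perform no error check after `kernel_fn<<<...>>>()`, so a failed launch (for example, an invalid launch configuration) goes unreported at the launch site. The repo defines `CHECK_CUDA_KERNEL_LAUNCH()` in `kerutils/common/common.h` but never uses it.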

Fix: Add `CHECK_CUDA_KERNEL_LAUNCH();` after each kernel launch.

Bug 2: Potential underflow when `sub_seq_len == 0` (MEDIUM)

File: `csrc/kda/sm100/fwd_helpers.hpp`, lines 92-95 and throughout

When `sub_seq_len == 0`, `sub_seq_len - 1` evaluates to -1 for a signed type (or wraps to `UINT_MAX` for an unsigned one), so `min(X, sub_seq_len - 1)` no longer yields a valid index and can cause OOB SMEM access.

Fix: Add early return guard when `sub_seq_len <= 0`.
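A minimal host-side sketch of the failure mode and the proposed guard, in plain C++ so it can be compiled without CUDA. The names `unguarded_signed`, `unguarded_unsigned`, and `clamp_row` are hypothetical and do not come from `fwd_helpers.hpp`; the real helpers may have different signatures and types.

```cpp
#include <algorithm>

// Hypothetical reproduction of the unguarded pattern with a signed length.
// For sub_seq_len == 0 the upper bound is -1, so the result is -1.
inline int unguarded_signed(int row, int sub_seq_len) {
    return std::min(row, sub_seq_len - 1);
}

// Same pattern with an unsigned length. For sub_seq_len == 0 the bound wraps
// to UINT_MAX, so min() returns `row` unchanged even though no row is valid.
inline unsigned unguarded_unsigned(unsigned row, unsigned sub_seq_len) {
    return std::min(row, sub_seq_len - 1u);
}

// Hypothetical guarded variant: reports failure instead of producing an index
// when the sub-sequence is empty, leaving `out` untouched in that case.
inline bool clamp_row(int row, int sub_seq_len, int& out) {
    if (sub_seq_len <= 0) {
        return false;  // nothing to index; the caller should return early
    }
    out = std::min(row, sub_seq_len - 1);
    return true;
}
```

In a kernel, the equivalent of the `false` branch would be an early return before any shared-memory access that depends on the clamped index.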

Bug 3: `exit(1)` in CHECK_CUDA macro kills Python process (LOW)

File: `csrc/kerutils/include/kerutils/common/common.h`, line 29

The macro should throw `KUException` instead of calling `exit(1)`, so that a CUDA failure surfaces as a Python exception rather than terminating the interpreter.
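A rough sketch of a throwing check macro, written in plain C++ so it compiles without CUDA. Everything here is a stand-in: `DemoStatus` takes the place of `cudaError_t`, `DemoException` takes the place of `KUException` (whose definition is not shown in this issue), and `DEMO_CHECK` is a hypothetical name, not the repository's `CHECK_CUDA`. It also assumes the extension is exposed through a binding layer that translates C++ exceptions into Python ones, as pybind11 does for `std::exception` subclasses.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for a CUDA status code (assumption; real code would use cudaError_t).
enum class DemoStatus { Success = 0, Failure = 1 };

// Stand-in for the repository's KUException (assumption).
struct DemoException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Throws on failure instead of calling exit(1), so callers can recover.
#define DEMO_CHECK(call) \
    do { \
        DemoStatus demo_status_ = (call); \
        if (demo_status_ != DemoStatus::Success) { \
            throw DemoException(std::string(#call) + " failed at " + \
                                __FILE__ + ":" + std::to_string(__LINE__)); \
        } \
    } while (0)

// Two trivial operations used to exercise the macro.
inline DemoStatus always_fails() { return DemoStatus::Failure; }
inline DemoStatus always_ok() { return DemoStatus::Success; }

// Returns true when the checked call raised DemoException.
inline bool raises(DemoStatus (*fn)()) {
    try {
        DEMO_CHECK(fn());
    } catch (const DemoException&) {
        return true;
    }
    return false;
}
```

Because the failure is an ordinary exception, the process keeps running after a failed check, which is the behavior the fix asks for.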

Bug 4: Hardcoded include path in setup.py (LOW)

File: `setup.py`, line 200

The include path `/usr/local/cuda/include/cccl` is hardcoded; it should be derived from `CUDA_HOME`.
