## Problem

`native/ops/nn/nn.cu` is 2673 lines with mixed concerns:
- Activations (GELU, SiLU, Sigmoid, Tanh)
- Normalization (LayerNorm, RMSNorm)
- Attention (SDPA)
- Position encoding (RoPE)
- Tensor ops (Transpose, Bias Add)
- Linear layer
This should be split to match the binding structure in #131.
## Current Structure

```
native/ops/nn/
├── nn.cu (2673 lines - everything)
├── attention_kernels.cuh
├── elementwise_kernels.cuh
├── flash_attention.cuh
├── memory_kernels.cuh
└── norm_kernels.cuh
```
## Proposed Structure

```
native/ops/nn/
├── activation/
│   ├── gelu.cu
│   ├── silu.cu
│   ├── sigmoid.cu
│   ├── tanh.cu
│   └── relu.cu
├── norm/
│   ├── layernorm.cu
│   └── rmsnorm.cu
├── attention/
│   ├── sdpa_causal.cu
│   ├── sdpa_fixed_cache.cu
│   └── flash_attention.cuh
├── rope/
│   └── rope_inplace.cu
├── linear/
│   └── linear_bias.cu
└── common/
    ├── attention_kernels.cuh
    ├── elementwise_kernels.cuh
    ├── memory_kernels.cuh
    └── norm_kernels.cuh
```
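To make the target shape concrete, here is a hypothetical sketch of what one split file such as `activation/gelu.cu` could look like after the refactor. The kernel body, launcher name, and include path are illustrative assumptions, not the actual code being moved:

```cuda
// Hypothetical sketch of activation/gelu.cu after the split.
// Function names and the include path are assumptions for illustration.
#include <cuda_runtime.h>
#include "../common/elementwise_kernels.cuh"  // shared elementwise helpers

// tanh-approximation GELU, applied elementwise
__global__ void gelu_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        out[i] = 0.5f * x *
                 (1.0f + tanhf(0.7978845608f * (x + 0.044715f * x * x * x)));
    }
}

// host-side launcher exposed to the bindings layer (#131)
void gelu_forward(const float* in, float* out, int n, cudaStream_t stream) {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    gelu_kernel<<<blocks, threads, 0, stream>>>(in, out, n);
}
```

Each split file would follow the same pattern: one kernel (or a small family), one launcher, and includes from `common/` for anything shared, which keeps every file comfortably under the 300-line target.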
## Contents Mapping

| Current Section | Lines (est.) | New Location |
|---|---|---|
| GELU | ~50 | `activation/gelu.cu` |
| SiLU | ~60 | `activation/silu.cu` |
| Sigmoid | ~60 | `activation/sigmoid.cu` |
| Tanh | ~60 | `activation/tanh.cu` |
| LayerNorm | ~150 | `norm/layernorm.cu` |
| RMSNorm | ~100 | `norm/rmsnorm.cu` |
| SDPA Causal | ~400 | `attention/sdpa_causal.cu` |
| SDPA Fixed Cache | ~300 | `attention/sdpa_fixed_cache.cu` |
| RoPE | ~200 | `rope/rope_inplace.cu` |
| Linear + Bias | ~100 | `linear/linear_bias.cu` |
| Transpose (3D/4D) | ~200 | Move to `tensor/` |
| Split QKV | ~100 | Move to `tensor/` |
## Notes
- Transpose operations should move to `native/ops/tensor/`
- Split QKV should move to `native/ops/tensor/`
- Each file should be <300 lines
- Shared kernels stay in `common/`
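On the last note: one way the `common/` headers can keep the per-op files small is a generic elementwise launcher that each activation file instantiates with its own functor. This is a sketch of an assumed helper, not the existing contents of `elementwise_kernels.cuh`:

```cuda
// Hypothetical sketch of a shared helper in common/elementwise_kernels.cuh.
// Template and function names are illustrative assumptions.
#pragma once
#include <cuda_runtime.h>

// Generic elementwise kernel: Op must be a __device__-callable functor.
// gelu.cu, silu.cu, etc. supply only the math; the launch logic lives here.
template <typename Op>
__global__ void elementwise_kernel(const float* in, float* out, int n, Op op) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = op(in[i]);
}

template <typename Op>
void launch_elementwise(const float* in, float* out, int n, Op op,
                        cudaStream_t stream) {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    elementwise_kernel<<<blocks, threads, 0, stream>>>(in, out, n, op);
}
```

With a helper like this, a file such as `activation/silu.cu` would reduce to a one-line functor plus a call to `launch_elementwise`, which supports the <300-line budget per file.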
## Benefits
## Related