refactor(native): Split nn.cu into modular files (aligned with #131) #133

@m96-chan

Problem

native/ops/nn/nn.cu is 2673 lines with mixed concerns:

  • Activations (GELU, SiLU, Sigmoid, Tanh)
  • Normalization (LayerNorm, RMSNorm)
  • Attention (SDPA)
  • Position encoding (RoPE)
  • Tensor ops (Transpose, Bias Add)
  • Linear layer

This should be split to match the binding structure in #131.

Current Structure

native/ops/nn/
├── nn.cu                    (2673 lines - everything)
├── attention_kernels.cuh
├── elementwise_kernels.cuh
├── flash_attention.cuh
├── memory_kernels.cuh
└── norm_kernels.cuh

Proposed Structure

native/ops/nn/
├── activation/
│   ├── gelu.cu
│   ├── silu.cu
│   ├── sigmoid.cu
│   ├── tanh.cu
│   └── relu.cu
├── norm/
│   ├── layernorm.cu
│   └── rmsnorm.cu
├── attention/
│   ├── sdpa_causal.cu
│   ├── sdpa_fixed_cache.cu
│   └── flash_attention.cuh
├── rope/
│   └── rope_inplace.cu
├── linear/
│   └── linear_bias.cu
└── common/
    ├── attention_kernels.cuh
    ├── elementwise_kernels.cuh
    ├── memory_kernels.cuh
    └── norm_kernels.cuh

Contents Mapping

Current Section     Lines (est.)   New Location
GELU                ~50            activation/gelu.cu
SiLU                ~60            activation/silu.cu
Sigmoid             ~60            activation/sigmoid.cu
Tanh                ~60            activation/tanh.cu
LayerNorm           ~150           norm/layernorm.cu
RMSNorm             ~100           norm/rmsnorm.cu
SDPA Causal         ~400           attention/sdpa_causal.cu
SDPA Fixed Cache    ~300           attention/sdpa_fixed_cache.cu
RoPE                ~200           rope/rope_inplace.cu
Linear + Bias       ~100           linear/linear_bias.cu
Transpose (3D/4D)   ~200           Move to tensor/
Split QKV           ~100           Move to tensor/

Notes

  • Transpose operations should move to native/ops/tensor/
  • Split QKV should move to native/ops/tensor/
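  • Each file should be under 300 lines; at the estimated ~400 lines, the SDPA causal section would need further splitting to meet this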
  • Shared kernels stay in common/
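
To make the "shared kernels stay in common/" idea concrete, here is a minimal sketch of how a per-op file could lean on a shared header after the split. It is shown as one translation unit with comments marking the intended file boundaries. The names `unary_elementwise_kernel`, `SiluOp`, and `launch_silu_f32` are hypothetical and are not taken from the existing nn.cu or elementwise_kernels.cuh; the float32-only signature and the block size of 256 are also assumptions.

```cuda
// ---- would live in common/elementwise_kernels.cuh (hypothetical contents) ----
#include <cuda_runtime.h>
#include <cmath>
#include <cstddef>

// Generic unary elementwise kernel. Op is any functor exposing a
// device-callable float operator()(float).
template <typename Op>
__global__ void unary_elementwise_kernel(const float* __restrict__ in,
                                         float* __restrict__ out,
                                         size_t n, Op op) {
    // One thread per element; promote to size_t before multiplying
    // so large tensors do not overflow 32-bit index arithmetic.
    size_t idx = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (idx < n) {
        out[idx] = op(in[idx]);
    }
}

// ---- would live in activation/silu.cu (hypothetical contents) ----

// SiLU(x) = x * sigmoid(x) = x / (1 + exp(-x)).
// Marked __host__ __device__ so it can also be checked on the CPU.
struct SiluOp {
    __host__ __device__ float operator()(float x) const {
        return x / (1.0f + expf(-x));
    }
};

// Host-side launcher (hypothetical name). d_in and d_out are device
// pointers to n floats each. Returns the launch status.
cudaError_t launch_silu_f32(const float* d_in, float* d_out, size_t n,
                            cudaStream_t stream = 0) {
    if (n == 0) return cudaSuccess;  // a zero-sized grid is an invalid launch
    constexpr int kThreads = 256;    // assumed block size, not tuned
    const unsigned int blocks =
        static_cast<unsigned int>((n + kThreads - 1) / kThreads);
    unary_elementwise_kernel<<<blocks, kThreads, 0, stream>>>(
        d_in, d_out, n, SiluOp{});
    return cudaGetLastError();
}
```

Under this layout, each activation file would only need its own functor and launcher, which is one way to keep the per-file line count low. Whether the existing shared headers are templated like this is not stated in the issue.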

Benefits

Related
