Paran0idy/Kernel

Kernel

CUDA

SGEMM

✅ Naive

✅ Thread Tiling

✅ Thread Tiling (Bank-Conflict Free)

SGEMV

✅ Naive

✅ Warp Reduce

Reduce

Transpose

Sort

✅ MergeSort

Softmax

✅ Naive

✅ Warp Reduce
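For readers unfamiliar with the "Warp Reduce" variants listed above (SGEMV and Softmax), the following is a minimal host-side sketch of the usual idea: a tree reduction in which each lane repeatedly adds the value held by the lane `offset` positions above it, halving `offset` each step. It is a plain Python simulation, not code from this repository; the function name `warp_reduce_sum` and the constant `WARP_SIZE` are hypothetical, and the real kernels may be structured differently.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warp_reduce_sum(lanes):
    """Simulate a shuffle-down style tree reduction over one warp's values.

    Hypothetical helper for illustration only; it mimics the access pattern
    of a warp-level sum, not the repository's actual kernel.
    """
    assert len(lanes) == WARP_SIZE
    vals = list(lanes)
    offset = WARP_SIZE // 2
    while offset > 0:
        # Snapshot the values so every lane reads its partner's value from
        # before this step, as if all lanes ran the exchange in lockstep.
        prev = list(vals)
        for lane in range(WARP_SIZE):
            partner = lane + offset
            # Lanes whose partner is out of range are left unchanged here;
            # only lane 0's final value is used.
            if partner < WARP_SIZE:
                vals[lane] = prev[lane] + prev[partner]
        offset //= 2
    return vals[0]  # after log2(32) = 5 steps, lane 0 holds the full sum


print(warp_reduce_sum(list(range(32))))  # 496
```

The same pattern applies to a max-reduction (as a softmax typically needs before exponentiation) by swapping the addition for `max`.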

Triton

HGEMM

✅ Block Tiling

GEMV

Reduce

Transpose

Sort

Softmax

Build

python3 ./script.py {kernelName}

Dependencies

  • NVIDIA GPU
  • OpenAI Triton >= 2.0
