Pull requests: NVIDIA/cutlass
Pull requests list
[CuteDSL][examples] Extend hopper dense_gemm for bf16
#2469
opened Jul 15, 2025 by
pashu-cohere
Fix: Dangerous Code Execution Function Could Allow External Attacks in python/CuTeDSL/base_dsl/typing.py
#2465
opened Jul 14, 2025 by
kira-offgrid
Mixed Precision Grouped Gemm with zero points and GPT-Q semantics closes #2261
#2457
opened Jul 11, 2025 by
ankutalev
Fix epilogue::thread::Convert cannot be used with DefaultEpilogue
inactive-30d
#2333
opened May 26, 2025 by
solrex
Add SM80/89 blockwise scaling kernel, support FP8 block/groupwise on Ada, INT8 on Ampere
#2328
opened May 24, 2025 by
solrex
Ensure compatibility with "-Wimplicit-fallthrough" when compiling with Clang
#2324
opened May 22, 2025 by
wenxin0319
More generic interface for Group GEMM problem size
inactive-30d
#2318
opened May 20, 2025 by
nandor
Support N={48, 80, 96, 112, ...} for SM100 EpilogueTileAuto
inactive-30d
#2269
opened Apr 29, 2025 by
Algy
Limit the number of SMs (sm_count) to user-provided value during profiling.
inactive-30d
#2257
opened Apr 22, 2025 by
manishucsd
Adding Blackwell support for distributed GEMM.
inactive-30d
#2179
opened Mar 16, 2025 by
whatdhack
Fix CUTE_DEVICE for cast_smem_ptr_to_unit
inactive-30d
#2171
opened Mar 13, 2025 by
monellz
Fix sm100 gemm wrong static constexpr that breaks compilation on Windows
inactive-30d
#2167
opened Mar 13, 2025 by
SystemPanic