NVIDIA / cutlass Public

Pull requests: NVIDIA/cutlass

53 Open 636 Closed

Pull requests list

[CuteDSL][examples] Extend hopper dense_gemm for bf16

#2469 opened Jul 15, 2025 by pashu-cohere

Example 77 add blackwell fmha bwd for MLA shape

#2466 opened Jul 14, 2025 by uchihatmtkinu

Fix: Dangerous Code Execution Function Could Allow External Attacks in python/CuTeDSL/base_dsl/typing.py

#2465 opened Jul 14, 2025 by kira-offgrid

Fix/redundant stages param

#2462 opened Jul 13, 2025 by Mirza-Samad-Ahmed-Baig

Mixed Precision Grouped Gemm with zero points and GPT-Q semantics closes #2261

#2457 opened Jul 11, 2025 by ankutalev

Fix tutorial comment in sgemm_1.cu: use tCrC instead of tCsA in axpby…

#2448 opened Jul 8, 2025 by kernyan • Draft

Corrected minor nit in mma_traits.hpp

#2447 opened Jul 8, 2025 by AdityaKane2001

Fix example in CuTe tutorials

#2416 opened Jun 24, 2025 by lw

Make unittest compilation faster

#2402 opened Jun 14, 2025 by andralex

Update 02_layout_algebra.md (label: inactive-30d)

#2385 opened Jun 9, 2025 by JimpleM

support fp16 accmulator for sm89 fp8 mma

#2378 opened Jun 7, 2025 by kf-zhang

Fix sgemm_sm80 example bug (label: inactive-30d)

#2351 opened May 30, 2025 by botbw

Fix epilogue::thread::Convert cannot be used with DefaultEpilogue (label: inactive-30d)

#2333 opened May 26, 2025 by solrex

Fix typos in multiple files (label: inactive-30d)

#2330 opened May 24, 2025 by co63oc

Fix typos in multiple files (label: inactive-30d)

#2329 opened May 24, 2025 by co63oc

Add SM80/89 blockwise scaling kernel, support FP8 block/groupwise on Ada, INT8 on Ampere

#2328 opened May 24, 2025 by solrex

Ensure compatibility with "-Wimplicit-fallthrough" when compiling with Clang

#2324 opened May 22, 2025 by wenxin0319

More generic interface for Group GEMM problem size (label: inactive-30d)

#2318 opened May 20, 2025 by nandor

bwl1289/fix/cmake-build-fixes

#2305 opened May 15, 2025 by BwL1289

Support N={48, 80, 96, 112, ...} for SM100 EpilogueTileAuto (label: inactive-30d)

#2269 opened Apr 29, 2025 by Algy

Limit the number of SMs (sm_count) to user-provided value during profiling. (label: inactive-30d)

#2257 opened Apr 22, 2025 by manishucsd

MLA example page cache fix (label: inactive-30d)

#2221 opened Apr 4, 2025 by divchenko

Adding Blackwell support for distributed GEMM. (label: inactive-30d)

#2179 opened Mar 16, 2025 by whatdhack

Fix CUTE_DEVICE for cast_smem_ptr_to_unit (label: inactive-30d)

#2171 opened Mar 13, 2025 by monellz

Fix sm100 gemm wrong static constexpr that breaks compilation on Windows (label: inactive-30d)

#2167 opened Mar 13, 2025 by SystemPanic
