Split-K reduction kernel cleanup #21

Merged
xjmxyt merged 3 commits into NVIDIA:main from lessw2020:lessw2020/splitk_reduction_cleanup on Dec 30, 2025

lessw2020 (Contributor) commented Dec 28, 2025

Description

This kernel currently extracts a number of strides and passes them in as arguments (Triton style), but never uses them. The kernel uses TMA and therefore does not need them.
This PR removes the stride extraction and the unused arguments to showcase a cleaner CuTile kernel. The removed arguments are:

    stride_att_b: ConstInt,
    stride_att_m: ConstInt,
    stride_att_s: ConstInt,
    stride_lse_rb: ConstInt,
    stride_lse_rm: ConstInt,
    stride_ob: ConstInt,
    stride_om: ConstInt,
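
For reviewers who want to check this kind of cleanup mechanically, here is a minimal sketch of one way to list function parameters that are never read in the body, using Python's standard `ast` module. It is not how this PR found the unused strides. The `unused_params` helper, the `splitk_reduce` name, and the toy kernel body are all hypothetical stand-ins and are not taken from the repository.

```python
import ast
import textwrap


def unused_params(source: str, func_name: str) -> list[str]:
    """Return the parameters of `func_name` that are never read in its body."""
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            a = node.args
            params = [p.arg for p in a.posonlyargs + a.args + a.kwonlyargs]
            # Collect every name that is loaded (read) anywhere in the body.
            used = {
                n.id
                for stmt in node.body
                for n in ast.walk(stmt)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
            }
            return [p for p in params if p not in used]
    raise ValueError(f"function {func_name!r} not found")


# Hypothetical stand-in for a kernel signature; the body is illustrative only.
KERNEL_SRC = '''
def splitk_reduce(att, lse, out, stride_att_b, stride_att_m, stride_ob, num_splits):
    for s in range(num_splits):
        out[0] += att[s] * lse[s]
'''

print(unused_params(KERNEL_SRC, "splitk_reduce"))
# → ['stride_att_b', 'stride_att_m', 'stride_ob']
```

A check like this only sees plain name reads, so a parameter reached through `locals()` or forwarded via `**kwargs` would be reported as unused even though it is not.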

Testing:
All 32 unit tests pass, same as before.

CI Configuration

config:
  build: true
  # valid options are "ops" and "benchmark"
  test: ["ops"]

Checklist

  • Code formatted and imports sorted via repo specifications (./format.sh)
  • Documentation updated (if needed)
  • CI configuration reviewed

copy-pr-bot (Bot) commented Dec 28, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

xjmxyt (Collaborator) commented Dec 30, 2025

/ok to test e5c7ed9

xjmxyt merged commit 722ebb5 into NVIDIA:main on Dec 30, 2025
15 checks passed
lessw2020 deleted the lessw2020/splitk_reduction_cleanup branch on December 30, 2025 03:17