
Add attention sink support for FMHA FWD#3368

Merged
LJ-underdog merged 20 commits into develop from lj_attn_sink on Dec 15, 2025
Conversation

@LJ-underdog (Contributor) commented Dec 8, 2025

Proposed changes

This is the second attempt to introduce attention sink support. The first attempt (#2892) was reverted in #3250.

Key changes:

  • Added kHasSink boolean parameter throughout the trait hierarchy and pipeline implementations
  • Updated masking logic to support sink-aware bounds checking via GetSinkTileRangeAlongX and IsOutOfSinkBound methods
  • Modified loop calculations in pipelines to handle sink regions separately from regular attention regions
  • Updated code generation scripts to emit sink-enabled kernel variants

Example mask: window_size = [2, 0], sink_size = 2

```
 x=1/y=3
 1 * * * * * * *           1 * * * * * * *
 1 1 * * * * * *           1 1 * * * * * *
 1 1 1 * * * * *   ---->   1 1 1 * * * * *
 * 1 1 1 * * * *           1 1 1 1 * * * *
 * * 1 1 1 * * *           1 1 1 1 1 * * *
 * * * 1 1 1 * *           1 1 * 1 1 1 * *
 * * * * 1 1 1 *           1 1 * * 1 1 1 *
 * * * * * 1 1 1           1 1 * * * 1 1 1
 l=2/r=0(tl)               l=2/r=0/s=2(tl)
```
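The diagram above can be reproduced with a short sketch: a query row i may attend to key column j either when j falls inside the sliding window, or when j is one of the first sink_size columns (still clipped by the causal bound j <= i). Function and parameter names here are illustrative only; the real logic lives in block_masking.hpp.

```python
def window_sink_mask(seqlen_q, seqlen_k, left, right, sink_size):
    # Illustrative sketch of window + sink masking (top-left aligned).
    # left/right are the window sizes; sink_size is the number of always-
    # visible leading key columns. Not the actual CK API.
    mask = [[False] * seqlen_k for _ in range(seqlen_q)]
    for i in range(seqlen_q):
        for j in range(seqlen_k):
            in_window = (i - left) <= j <= (i + right)
            in_sink = j < sink_size and j <= i  # sink never looks ahead
            mask[i][j] = in_window or in_sink
    return mask
```

With seqlen_q = seqlen_k = 8, left = 2, right = 0, sink_size = 2 this yields exactly the right-hand pattern shown above (e.g. row 5 attends to columns 0, 1, 3, 4, 5).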

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
  • I have added inline documentation which enables the maintainers to understand the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Updated the pipeline creation logic to include 'sink' parameter in product combinations and adjusted the FmhaFwdPipeline calls accordingly.
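The commit above mentions adding a 'sink' axis to the product combinations in the codegen scripts. A hedged sketch of that pattern (trait names and the naming scheme here are hypothetical; the real scripts are under example/ck_tile/01_fmha/codegen/ops/):

```python
from itertools import product

def make_kernel_names():
    # Illustrative only: extend the existing cartesian product of kernel
    # traits with a sink on/off axis so sink-enabled variants get emitted.
    names = []
    for mode, mask, sink in product(("batch", "group"), ("no", "causal"), ("f", "t")):
        names.append(f"fmha_fwd_{mode}_mask_{mask}_sink_{sink}")
    return names
```

Doubling one trait axis doubles the emitted instance count, which is why the generator (rather than hand-written lists) carries this change.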

Copilot AI left a comment


Pull request overview

This PR reverts a previous revert, re-introducing the "attention sink" feature originally added in PR #2892. The attention sink mechanism allows certain tokens at the beginning of the sequence to always be attended to, regardless of the attention mask pattern.

Key changes:

  • Added kHasSink boolean parameter throughout the trait hierarchy and pipeline implementations
  • Updated masking logic to support sink-aware bounds checking via GetSinkTileRangeAlongX and IsOutOfSinkBound methods
  • Modified loop calculations in pipelines to handle sink regions separately from regular attention regions
  • Updated code generation scripts to emit sink-enabled kernel variants
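Since the sink region occupies key columns [0, sink_size), the extra tiles a pipeline must visit along the key axis amount to a ceiling division by the tile width. A sketch mirroring what GetSinkTileRangeAlongX presumably computes (names and return shape are assumptions; the real method is in block_masking.hpp):

```python
def sink_tile_range_along_x(sink_size, tile_n):
    # The sink covers key columns [0, sink_size); along the X (key) axis
    # this spans ceil(sink_size / tile_n) tiles starting at tile 0.
    # Illustrative only, not the actual CK signature.
    num_tiles = (sink_size + tile_n - 1) // tile_n
    return (0, num_tiles)
```

For a typical tile_n of 64 and a small sink_size such as 2, only one extra tile is processed, so the sink loop adds little work on top of the regular attention loop.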

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| include/ck_tile/ops/fmha/pipeline/tile_fmha_traits.hpp | Added kHasSink parameter to trait templates |
| include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_*.hpp | Updated pipelines to compute sink loop counts and adjust window offsets |
| include/ck_tile/ops/fmha/kernel/fmha_*_kernel.hpp | Added sink_size to kernel argument structures and kernel name generation |
| include/ck_tile/ops/fmha/block/block_masking.hpp | Implemented sink-aware masking methods (GetSinkTileRangeAlongX, IsOutOfSinkBound) |
| include/ck_tile/ops/fmha/block/variants.hpp | Added LogitsSinkMask methods to attention variants |
| example/ck_tile/01_fmha/fmha_fwd*.hpp | Added sink parameters to trait structures and argument passing |
| example/ck_tile/01_fmha/codegen/ops/*.py | Updated code generation to produce sink-enabled kernel instances |
| example/ck_tile/01_fmha/script/*.sh | Added new test scripts for sink functionality |


@illsilin (Collaborator) commented Dec 9, 2025

Are the relevant changes ready on the AITER side?
Will this also require changes in Tri Dao's flash-attention repo?

@LJ-underdog (Contributor, Author) commented Dec 10, 2025

> Are the relevant changes ready on the AITER side? Will this also require changes in Tri Dao's flash-attention repo?

AITER PR: ROCm/aiter#1272

@LJ-underdog changed the title from Revert "Revert "Add attn sink (#2892)" (#3250)" to Add attention sink support for FMHA FWD on Dec 10, 2025
LJ-underdog and others added 3 commits December 10, 2025 10:20
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
@LJ-underdog LJ-underdog requested review from a team and ddembeckAMD as code owners December 10, 2025 02:44
poyenc
poyenc previously approved these changes Dec 12, 2025
@LJ-underdog LJ-underdog merged commit f5573f5 into develop Dec 15, 2025
31 of 37 checks passed
@LJ-underdog LJ-underdog deleted the lj_attn_sink branch December 15, 2025 04:22