kareemshaik80

  • Add support for a single sink logit in flash attention decode
  • Add the sink to softmax
  • Add a command-line flag to enable the attention sink
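
For reviewers, here is a minimal sketch of what "adding a sink to softmax" could mean. It assumes the common attention-sink formulation in which one extra logit joins the softmax denominator but has no value vector, so the attention weights sum to less than 1. The PR text does not confirm this is the exact formulation used; the function name `softmax_with_sink` and its parameters are hypothetical and are not taken from the PR's code.

```python
import math
from typing import List


def softmax_with_sink(scores: List[float], sink_logit: float) -> List[float]:
    """Softmax over `scores` with one extra sink logit in the denominator only.

    Hypothetical helper for illustration; not the PR's kernel code.
    """
    # Include the sink in the max so the exponentials stay numerically stable.
    m = max(max(scores), sink_logit)
    exp_scores = [math.exp(s - m) for s in scores]
    # The sink contributes to the normalizer but has no value vector,
    # so its share of the probability mass is dropped from the output.
    denom = sum(exp_scores) + math.exp(sink_logit - m)
    return [e / denom for e in exp_scores]


probs = softmax_with_sink([1.0, 2.0, 3.0], sink_logit=0.0)
print(sum(probs))  # less than 1: the sink absorbed the remainder
```

Under this assumption, passing `float("-inf")` as the sink logit recovers the ordinary softmax, which makes for an easy sanity check against the no-sink path.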

Signed-off-by: kareem <kshaik@habana.ai>
@kareemshaik80 kareemshaik80 marked this pull request as draft September 25, 2025 07:38
@kareemshaik80 kareemshaik80 marked this pull request as ready for review September 25, 2025 07:39

@yuankuns yuankuns left a comment

We also need a paper or code reference to confirm that this PR does what it is intended to do.

@yuankuns yuankuns left a comment

not changed

2 participants