
Conversation

@tianleiwu (Contributor)

Description:

Provide flexibility for testing different attention masks (like the sliding window mask in Longformer).

Also move some kernel APIs from attention_impl.cu to attention_impl.h so that they can be used by the Longformer attention operator in the future.
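
As an illustration of the kind of mask this change is meant to make easier to test, the sketch below builds a sliding-window attention mask with NumPy. The helper name sliding_window_mask, the 3D shape convention (batch_size, sequence_length, sequence_length), and the 0/1 encoding are assumptions for illustration only, not code from this PR.

```python
# Minimal sketch (assumptions noted above): build a 3D sliding-window
# attention mask where 1 means "may attend" and 0 means "masked out".
import numpy as np

def sliding_window_mask(batch_size: int, seq_len: int, window: int) -> np.ndarray:
    """Return a (batch_size, seq_len, seq_len) mask where each token
    attends only to tokens within `window` positions of itself."""
    positions = np.arange(seq_len)
    # |i - j| <= window keeps a band around the diagonal.
    band = (np.abs(positions[:, None] - positions[None, :]) <= window).astype(np.int32)
    return np.broadcast_to(band, (batch_size, seq_len, seq_len)).copy()

if __name__ == "__main__":
    mask = sliding_window_mask(batch_size=2, seq_len=8, window=2)
    print(mask[0])
```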

Motivation and Context

  • Why is this change required? What problem does it solve?
  • If it fixes an open issue, please link to the issue here.

@tianleiwu requested a review from a team as a code owner on November 21, 2020 00:28
wangyems previously approved these changes Nov 21, 2020
@tianleiwu merged commit 910bbfe into master on Nov 21, 2020
@tianleiwu deleted the tlwu/3d_attention_mask branch on November 21, 2020 06:48