
fix: attention bug with large values #158

Merged: 1 commit into main on Nov 5, 2022

Conversation

@gaetansnl (Contributor) commented on Nov 4, 2022

Reproduced against PyTorch nightly.
Needs in-depth tests.
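
For context (the description above is terse), the classic failure mode with large attention values is softmax overflow: exp() of large scores becomes inf and the attention weights turn into NaN unless the row max is subtracted first. A minimal PyTorch illustration of that general issue, not necessarily the exact bug patched here:

    import torch

    # Large scores: exp() overflows to inf, and inf / inf gives NaN.
    scores = torch.tensor([[1000.0, 1001.0, 999.0]])

    naive = torch.exp(scores) / torch.exp(scores).sum(dim=-1, keepdim=True)
    print(naive)   # tensor([[nan, nan, nan]])

    # Subtracting the row max (which torch.softmax does internally) keeps exp() bounded.
    stable = torch.softmax(scores, dim=-1)
    print(stable)  # tensor([[0.2447, 0.6652, 0.0900]])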

@github-actions github-actions bot added fix hurrah, bug fixed! and removed fix hurrah, bug fixed! labels Nov 4, 2022
@pommedeterresautee (Member) left a comment:


LGTM.
It seems to me we can simplify the code, but that needs some checks first.
Priority to safety and precision, let's merge!

@pommedeterresautee pommedeterresautee merged commit 4677f6c into main Nov 5, 2022
@pommedeterresautee pommedeterresautee deleted the fix/attention-large-bug branch November 5, 2022 18:03
@pommedeterresautee (Member) commented:
I am a bit surprised that we can't remove this (and related lines):

            if NEED_LOAD_MASK_SIZE_N:
                attention_load_mask = (n_row_offset + offs_n)[None, :] < size_n

as it seems redundant with the newly added check (I ran many tests; removing it breaks precision). Triton is mysterious sometimes.
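
For readers outside the codebase, a minimal sketch (not the repository's kernel) of why such a load mask matters in Triton: when size_n is not a multiple of the block size, offsets past size_n point outside the tensor, and an unmasked tl.load reads whatever memory happens to hold there; values computed from that garbage can overflow or become NaN even if the corresponding scores are masked out later.

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def masked_load_kernel(x_ptr, out_ptr, size_n, BLOCK_N: tl.constexpr):
        offs_n = tl.arange(0, BLOCK_N)
        # Same role as attention_load_mask above: only lanes below size_n are valid.
        load_mask = offs_n < size_n
        x = tl.load(x_ptr + offs_n, mask=load_mask, other=0.0)
        tl.store(out_ptr + offs_n, x, mask=load_mask)

    size_n = 100                            # deliberately not a multiple of BLOCK_N
    x = torch.randn(size_n, device="cuda")
    out = torch.empty_like(x)
    masked_load_kernel[(1,)](x, out, size_n, BLOCK_N=128)
    assert torch.equal(x, out)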
