question about Instance-Masked Attention #6

Closed
jsg921019 opened this issue Feb 26, 2024 · 2 comments

Comments

@jsg921019

Thank you for sharing this interesting work.

I have a question about Instance-Masked Attention.
The current code does not seem to apply Instance-Masked Attention (return_att_masks = False).
Is this because generation quality is better without Instance-Masked Attention?

Secondly, is Instance-Masked Attention applied during training, or only at inference?

Thank you in advance.

@frank-xwang
Owner

Hi, thank you for your interest. Currently, we have return_att_masks set to False because Flash Attention does not yet support attention masks (check it here). However, if speed and memory usage are not primary concerns for your application, you may set return_att_masks to True. It's worth noting that we had this option enabled during training. Hope it helps!
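
For reference, here is a minimal sketch (not the repository's actual code) of why an instance attention mask conflicts with Flash Attention: PyTorch's fused Flash / memory-efficient SDPA kernels do not accept an arbitrary attn_mask, so supplying one forces a slower, more memory-hungry fallback. The function name, tensor shapes, and toy mask below are illustrative assumptions, not the implementation behind return_att_masks.

```python
import torch
import torch.nn.functional as F

def cross_attention(q, k, v, instance_mask=None):
    # q, k, v: (batch, heads, queries, dim)
    # instance_mask: (batch, 1, queries, keys) bool, True where a query
    # token is allowed to attend to a key token.
    if instance_mask is None:
        # No mask: eligible for the fused Flash / memory-efficient kernels.
        return F.scaled_dot_product_attention(q, k, v)
    # An arbitrary attn_mask rules out the Flash backend, so this path is
    # slower and uses more memory -- the trade-off mentioned above.
    return F.scaled_dot_product_attention(q, k, v, attn_mask=instance_mask)

# Toy usage: restrict each group of image tokens to its own instance's text tokens.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 4, 64)
v = torch.randn(1, 8, 4, 64)
mask = torch.zeros(1, 1, 16, 4, dtype=torch.bool)
mask[..., :8, :2] = True   # first half of image tokens -> first instance's tokens
mask[..., 8:, 2:] = True   # second half -> second instance's tokens
out = cross_attention(q, k, v, instance_mask=mask)
```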

@jsg921019
Author

Thank you for the precise and fast feedback!
