Save memory from interleaved attention #1151

Closed
wants to merge 3 commits into from

Conversation

Ying1123
Member

No description provided.

@Ying1123 Ying1123 marked this pull request as draft August 19, 2024 10:26
@Ying1123 Ying1123 mentioned this pull request Aug 19, 2024
29 tasks
@Ying1123 Ying1123 changed the title Save memory from window attention Save memory from interleaved attention Aug 19, 2024
@Ying1123 Ying1123 force-pushed the ying-window-save-mem branch 2 times, most recently from feda9b9 to 1404d94 Compare August 19, 2024 22:01
@Ying1123 Ying1123 force-pushed the ying-window-save-mem branch 4 times, most recently from bc787b2 to 2d440b3 Compare August 20, 2024 19:00
@yzh119
Collaborator

yzh119 commented Aug 21, 2024

Is interleaved attention the gemma-2 style attention (one layer full-attention followed by another layer of sliding window attention)?

@merrymercy
Contributor

@yzh119 Yes, this is for Gemma.

@Ying1123
Member Author

Ying1123 commented Sep 14, 2024

Closing this for now. It should come back after the memory pool refactor.

@Ying1123 Ying1123 closed this Sep 14, 2024
@Ying1123 Ying1123 deleted the ying-window-save-mem branch September 14, 2024 17:07
3 participants