[Paddle-Inference]support GQA in variable_length_memory_efficient_attention #58836

Merged

@zhoutianzi666 zhoutianzi666 commented Nov 9, 2023

PR types

Others

PR changes

Others

Description

Pcard-71500

  • variable_length_memory_efficient_attention now supports GQA (grouped-query attention); the API is unchanged.
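
For readers unfamiliar with GQA, here is a minimal pure-Python reference sketch of grouped-query attention over a single variable-length sequence. It is not the kernel code from this PR. The function name `gqa_attention`, the nested-list layout, and the `seq_len` / `kv_seq_len` arguments are assumptions made for illustration only. The point it shows is that key/value tensors may carry fewer heads than the query, and each group of query heads reads the same KV head.

```python
import math


def gqa_attention(q, k, v, seq_len, kv_seq_len):
    """Reference grouped-query attention for one sequence (illustrative only).

    q:    [num_q_heads][max_q_len][head_dim]
    k, v: [num_kv_heads][max_kv_len][head_dim]
    Only the first seq_len query rows and the first kv_seq_len key/value
    rows are used, mimicking padded variable-length inputs.
    """
    num_q_heads, num_kv_heads = len(q), len(k)
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group = num_q_heads // num_kv_heads  # query heads per shared KV head
    head_dim = len(q[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    out = []
    for h in range(num_q_heads):
        kv_h = h // group  # heads in the same group share one KV head
        head_out = []
        for i in range(seq_len):
            # Scaled dot-product scores against the valid keys only.
            scores = [
                scale * sum(a * b for a, b in zip(q[h][i], k[kv_h][j]))
                for j in range(kv_seq_len)
            ]
            # Numerically stable softmax.
            m = max(scores)
            w = [math.exp(s - m) for s in scores]
            z = sum(w)
            # Weighted sum of the valid value rows.
            row = [
                sum(w[j] * v[kv_h][j][d] for j in range(kv_seq_len)) / z
                for d in range(head_dim)
            ]
            head_out.append(row)
        out.append(head_out)
    return out
```

With `num_q_heads == num_kv_heads` this reduces to ordinary multi-head attention, which is consistent with the description's note that the API is not affected: under this reading, only the head dimension of the key/value inputs differs.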

@zhoutianzi666 zhoutianzi666 changed the title from "support GQA in" to "[Paddle-Inference]support GQA in variable_length_memory_efficient_attention" Nov 9, 2023

@sunzhongkai588 sunzhongkai588 left a comment

no docs changes. LGTM

@yuanlehome yuanlehome left a comment

LGTM

@zhoutianzi666 zhoutianzi666 merged commit f8137fb into PaddlePaddle:develop Nov 9, 2023
27 of 28 checks passed
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
[Paddle-Inference]support GQA in variable_length_memory_efficient_attention (PaddlePaddle#58836)
SecretXV pushed a commit to SecretXV/Paddle that referenced this pull request Nov 28, 2023
[Paddle-Inference]support GQA in variable_length_memory_efficient_attention (PaddlePaddle#58836)

4 participants