
[Arm] Fix fuse_attention support old_quant_format #10027

Conversation

Collaborator

@sprouteer sprouteer commented Feb 24, 2023

PR devices

Arm

PR types

Bug fixes

PR changes

OP

Description

  1. Fix fuse_attention to support old_quant_format: under the old quantization format, after the attention structure is fused, the pass that inserts a calib op in front of it must look up the scale in the fused_attention op's attributes (see the sketch after this list).
  2. Fix an infer-shape bug.
  3. Support reshape and transpose without the XShape output, dropout without the mask output, and either matmul or matmul_v2.
  4. Fix a compilation error when the SVE switch is enabled.
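
A minimal, self-contained sketch of the scale lookup from item 1, for readers unfamiliar with the old quantization format. The `OpDesc` type and the attribute name `Input0_scale` are illustrative stand-ins, not the actual Paddle-Lite pass API:

```cpp
// Sketch only: OpDesc and the attribute name are hypothetical stand-ins.
#include <iostream>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for an op descriptor with float attributes.
struct OpDesc {
  std::string type;
  std::map<std::string, float> float_attrs;

  std::optional<float> GetFloatAttr(const std::string& name) const {
    auto it = float_attrs.find(name);
    return it == float_attrs.end() ? std::nullopt
                                   : std::optional<float>(it->second);
  }
};

// Under the old quantization format, per-tensor scales live on the fused op
// itself rather than on separate quant/dequant ops, so the pass inserting a
// calib op in front of fused_attention reads the scale from its attributes.
std::optional<float> FindCalibScale(const OpDesc& fused_op,
                                    const std::string& input_scale_attr) {
  if (fused_op.type != "fused_attention") return std::nullopt;
  return fused_op.GetFloatAttr(input_scale_attr);
}

int main() {
  OpDesc fused{"fused_attention", {{"Input0_scale", 0.017f}}};
  if (auto scale = FindCalibScale(fused, "Input0_scale")) {
    // A real pass would create the calib op with this scale and rewire the
    // graph; here we just print it.
    std::cout << "calib scale = " << *scale << "\n";
  }
  return 0;
}
```

The point of the fix is only where the scale comes from: in the old format there is no standalone quant op upstream to copy it from, so the fused op's own attributes are the source of truth.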

@sprouteer sprouteer force-pushed the fix_fuse_attention_support_old_quant_format branch from 2cd5f5a to 7bc2da2 on March 7, 2023 06:37
Collaborator

@zhupengyang zhupengyang left a comment


LGTM

@sprouteer sprouteer merged commit f3e367f into PaddlePaddle:develop Mar 10, 2023