[xpu] add ep custom ops #3911
Thanks for your contribution!
48c26b4 to 61ba786
1cd3111 to e754168
hong19860320 left a comment
LGTM
custom_ops/setup_ops.py
Outdated
     )
 elif paddle.is_compiled_with_xpu():
-    assert False, "In XPU, we should use setup_ops.py in xpu_ops/src, not this."
+    assert False, "In XPU, we should use setup_ops.py in xpu_ops/, not this."
For XPU, please use setup_ops.py in the xpu_ops directory to compile custom ops.
done
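For readers following this thread, here is a minimal sketch of the kind of guard being discussed. It is an illustration only, not the code in this PR: the helper name `check_build_backend` and its boolean parameter are hypothetical, standing in for the result of `paddle.is_compiled_with_xpu()` so that the snippet runs without Paddle installed. It raises an exception rather than using `assert False`, because Python drops `assert` statements when run with `-O`.

```python
def check_build_backend(is_xpu: bool) -> None:
    """Hypothetical helper: stop the generic build script on an XPU build.

    `is_xpu` stands in for paddle.is_compiled_with_xpu(); passing it in
    keeps this sketch runnable without Paddle.
    """
    if is_xpu:
        # An explicit exception survives `python -O`, unlike `assert False`.
        raise RuntimeError(
            "For XPU, please use setup_ops.py in the xpu_ops directory "
            "to compile custom ops."
        )


if __name__ == "__main__":
    check_build_backend(False)  # non-XPU build: returns silently
    try:
        check_build_backend(True)
    except RuntimeError as err:
        print(f"build aborted: {err}")
```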
01cfebf
e754168 to 01cfebf
hong19860320 left a comment
LGTM
XiaoguangHu01 left a comment
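The operator interfaces differ from the GPU version and from the interfaces used by industry implementations; an operator's definition should be device-independent. This can be merged for now, but it needs to be improved in a follow-up.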
  return {x_dtype, x_dtype};
}

PD_BUILD_STATIC_OP(fused_rms_norm_xpu)
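Why does XPU need its own separate fused_rms_norm operator? How does it differ from the GPU version of the operator?
The kernel implementation can be hardware-specific, but the operator definition should be hardware-independent.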
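To make the point above concrete, here is a small sketch of one operator definition backed by per-device kernels. Everything in it is an assumption for illustration: the registry `_KERNELS`, the `register_kernel` decorator, and the simplified signature are hypothetical and are not Paddle's or this repository's API, and a plain RMS normalization stands in for the real fused kernels.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: one operator name, many device-specific kernels.
_KERNELS: Dict[Tuple[str, str], Callable[..., List[float]]] = {}


def register_kernel(op_name: str, device: str):
    """Register `fn` as the kernel for (op_name, device)."""
    def decorator(fn):
        _KERNELS[(op_name, device)] = fn
        return fn
    return decorator


def fused_rms_norm(x: List[float], weight: List[float],
                   eps: float = 1e-6, device: str = "gpu") -> List[float]:
    """Device-independent definition: the same signature on every backend."""
    kernel = _KERNELS.get(("fused_rms_norm", device))
    if kernel is None:
        raise NotImplementedError(f"no fused_rms_norm kernel for {device!r}")
    return kernel(x, weight, eps)


def _reference_rms_norm(x, weight, eps):
    # y_i = x_i / sqrt(mean(x^2) + eps) * w_i
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]


@register_kernel("fused_rms_norm", "gpu")
def _rms_norm_gpu(x, weight, eps):
    return _reference_rms_norm(x, weight, eps)  # stand-in for a GPU kernel


@register_kernel("fused_rms_norm", "xpu")
def _rms_norm_xpu(x, weight, eps):
    return _reference_rms_norm(x, weight, eps)  # stand-in for an XPU kernel


if __name__ == "__main__":
    # Callers use one name; only the device argument selects the kernel.
    print(fused_rms_norm([3.0, 4.0], [1.0, 1.0], device="xpu"))
```

With this layout, callers never see a name like `fused_rms_norm_xpu`; supporting a new device means registering one more kernel under the existing operator name.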
No description provided.