@zhupengyang
Collaborator

No description provided.

@paddle-bot

paddle-bot bot commented Sep 5, 2025

Thanks for your contribution!

@zhupengyang force-pushed the custom_ops branch 3 times, most recently from 48c26b4 to 61ba786, on September 5, 2025 05:45
@zhupengyang changed the title from "[xpu] change custom ops dir" to "[xpu] add ep custom ops; change custom ops dir" on Sep 5, 2025
@zhupengyang force-pushed the custom_ops branch 2 times, most recently from 1cd3111 to e754168, on September 5, 2025 10:52
@zhupengyang changed the title from "[xpu] add ep custom ops; change custom ops dir" to "[xpu] add ep custom ops" on Sep 8, 2025
hong19860320 previously approved these changes Sep 8, 2025
Collaborator

@hong19860320 hong19860320 left a comment

LGTM

Jiang-Jia-Jun previously approved these changes Sep 8, 2025
     )
 elif paddle.is_compiled_with_xpu():
-    assert False, "In XPU, we should use setup_ops.py in xpu_ops/src, not this."
+    assert False, "In XPU, we should use setup_ops.py in xpu_ops/, not this."
Collaborator

For XPU, please use setup_ops.py in the xpu_ops directory to compile custom ops.

Collaborator Author

done

Collaborator

@hong19860320 hong19860320 left a comment

LGTM

@XiaoguangHu01 XiaoguangHu01 left a comment

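The operator interface differs from the GPU version and from the interfaces used by industry implementations. Operator definitions should be device-independent. This can be merged for now, but it needs to be improved later.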

return {x_dtype, x_dtype};
}

PD_BUILD_STATIC_OP(fused_rms_norm_xpu)

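Why does XPU need its own fused_rms_norm operator? How does it differ from the GPU version of the operator?
The kernel implementation can be hardware-specific, but the operator definition should be hardware-independent.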
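
To illustrate the point above, here is a minimal sketch of one device-independent operator definition backed by per-device kernels. It is plain Python and does not use Paddle's custom-op API. The registry, the `register_kernel` decorator, and the `(x, weight, eps)` signature are all hypothetical stand-ins chosen for illustration, not the actual fused_rms_norm interface in this PR or in the GPU implementation.

```python
import math
from typing import Callable, Dict, List

# Hypothetical registry: op name -> device name -> kernel function.
_KERNELS: Dict[str, Dict[str, Callable]] = {}


def register_kernel(op_name: str, device: str):
    """Register a device-specific kernel under a shared op name."""
    def decorator(fn: Callable) -> Callable:
        _KERNELS.setdefault(op_name, {})[device] = fn
        return fn
    return decorator


def _rms_norm_reference(x: List[float], weight: List[float], eps: float) -> List[float]:
    # Reference math: y_i = x_i / sqrt(mean(x^2) + eps) * w_i
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


@register_kernel("fused_rms_norm", "gpu")
def _rms_norm_gpu(x, weight, eps):
    # Stand-in for a GPU kernel; a real one would be hardware-specific.
    return _rms_norm_reference(x, weight, eps)


@register_kernel("fused_rms_norm", "xpu")
def _rms_norm_xpu(x, weight, eps):
    # Stand-in for an XPU kernel with the same inputs and outputs.
    return _rms_norm_reference(x, weight, eps)


def fused_rms_norm(x, weight, eps=1e-6, device="gpu"):
    """Single op definition; only the kernel lookup depends on the device."""
    try:
        kernel = _KERNELS["fused_rms_norm"][device]
    except KeyError:
        raise NotImplementedError(f"no fused_rms_norm kernel for device {device!r}")
    return kernel(x, weight, eps)
```

With this layout, callers use `fused_rms_norm(...)` on every device, and supporting new hardware means registering another kernel rather than defining a second operator such as `fused_rms_norm_xpu`.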

@zhupengyang zhupengyang merged commit 9d0074a into PaddlePaddle:develop Sep 10, 2025
25 of 30 checks passed
@zhupengyang zhupengyang deleted the custom_ops branch September 10, 2025 04:22

7 participants