sycl: add RMS_NORM_BACK operation support #16808
…epoint before further changes
Implement RMS_NORM_BACK for the SYCL backend using FP32 compensated parallel reduction. Minimal docs updates (ops.md / SYCL.csv).
This PR is ready for review.
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
It's a good job!
Thank you!
@YaelLogic
Hi @NeoZhangJianyu,
Summary
Add SYCL backend support for RMS_NORM_BACK using a single FP32 compensated parallel reduction path.
No changes to the public API. Default numerical accuracy is preserved; a fast opt-in macro is also available.
Implementation
Algorithm (consistent with existing backend behavior)
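For reference, the standard RMS-norm gradient can be written per row as below. This is a sketch of the textbook formula, not a transcription of the kernel: D is the row length, ε the epsilon parameter, and the assumption is that inv_r corresponds to r and coeff to the negated fraction multiplying x_i.

```latex
\begin{aligned}
y_i  &= x_i \, r, \qquad r = \left(\frac{1}{D}\sum_{j=1}^{D} x_j^2 + \varepsilon\right)^{-1/2} \\
dx_i &= r \left( dz_i - x_i \, \frac{\sum_{j=1}^{D} x_j \, dz_j}{\sum_{j=1}^{D} x_j^2 + D\,\varepsilon} \right)
\end{aligned}
```

Only two row-wide sums are needed, Σ x² and Σ x·dz, which is why a single reduction pass over the row suffices.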
What was implemented
- Σ x² and Σ x·dz with Kahan-style compensation.
- (sub_group) reduction via warp_reduce_sum.
- group_broadcast used to distribute inv_r and coeff across the work-group.
- WARP_SIZE, capped by device limit (≤256), not larger than D.
Optional fast path
- GGML_SYCL_RMS_BACK_FAST to disable compensated summation and use plain FP32 accumulation.
Validation
Focused tests executed locally:
Build is warning-free for this code path.
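A host-side reference of the kind such focused checks could compare against might look like the following. It is a minimal single-threaded sketch, not the SYCL kernel from this PR: `KahanSum`, `rms_norm_back_row`, and the `compensated` flag are hypothetical names introduced here, and the definitions of `inv_r` and `coeff` are assumed from the standard RMS-norm gradient. The `compensated == false` branch only mimics what a plain FP32 accumulation path would do.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Kahan (compensated) accumulator: carries the rounding error of each add
// forward so it can be corrected on the next one.
struct KahanSum {
    float sum  = 0.0f;
    float comp = 0.0f;  // running compensation for lost low-order bits
    void add(float v) {
        const float y = v - comp;
        const float t = sum + y;
        comp = (t - sum) - y;
        sum  = t;
    }
};

// Hypothetical reference for one row of RMS_NORM_BACK (illustration only).
//   x  : forward input row, dz : incoming gradient, dx : output gradient
//   n  : row length (D), eps : epsilon used in the forward pass
void rms_norm_back_row(const float* x, const float* dz, float* dx,
                       std::size_t n, float eps, bool compensated = true) {
    assert(n > 0);
    KahanSum k_xx, k_xdz;              // compensated sums
    float p_xx = 0.0f, p_xdz = 0.0f;   // plain FP32 sums
    for (std::size_t i = 0; i < n; ++i) {
        if (compensated) {
            k_xx.add(x[i] * x[i]);
            k_xdz.add(x[i] * dz[i]);
        } else {
            p_xx  += x[i] * x[i];
            p_xdz += x[i] * dz[i];
        }
    }
    const float sum_xx  = compensated ? k_xx.sum  : p_xx;
    const float sum_xdz = compensated ? k_xdz.sum : p_xdz;
    const float nf      = static_cast<float>(n);

    // Assumed meaning of the two broadcast scalars.
    const float inv_r = 1.0f / std::sqrt(sum_xx / nf + eps);
    const float coeff = -sum_xdz / (sum_xx + eps * nf);

    for (std::size_t i = 0; i < n; ++i) {
        dx[i] = inv_r * (dz[i] + coeff * x[i]);
    }
}
```

Note that aggressive floating-point flags such as `-ffast-math` may let the compiler simplify the compensation term away, so a reference like this is best built without them.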
Reproduce (build + test)
Files Changed (minimal scope only)
- ggml/src/ggml-sycl/norm.cpp: ggml_sycl_op_rms_norm_back
- ggml/src/ggml-sycl/ggml-sycl.cpp
- ggml/src/ggml-sycl/norm.hpp
- docs/ops.md
- docs/ops/SYCL.csv
No unrelated files or personal data included.
Notes & Risks
Reviewers
cc @CISC @NeoZhangJianyu
Looking forward to your feedback. Thanks in advance!