[NVPTX] Add intrinsic range to nvvm_read_ptx_sreg_laneid #153099
@llvm/pr-subscribers-backend-nvptx
Author: Vedant Paranjape (VedantParanjape)
Changes
nvvm_read_ptx_sreg_laneid is used to represent the laneid of a thread running in a warp. Add a known range to the laneid intrinsic.
Full diff: https://github.com/llvm/llvm-project/pull/153099.diff
2 Files Affected:
diff --git a/llvm/lib/Target/NVPTX/NVVMIntrRange.cpp b/llvm/lib/Target/NVPTX/NVVMIntrRange.cpp
index 2c81989932a97..3934c7afadafb 100644
--- a/llvm/lib/Target/NVPTX/NVVMIntrRange.cpp
+++ b/llvm/lib/Target/NVPTX/NVVMIntrRange.cpp
@@ -130,6 +130,10 @@ static bool runNVVMIntrRange(Function &F) {
if (OverallClusterRank)
return addRangeAttr(1, FunctionClusterRank + 1, II);
break;
+
+ // Lane ID
+ case Intrinsic::nvvm_read_ptx_sreg_laneid:
+ return addRangeAttr(0, 32, II);
default:
return false;
}
diff --git a/llvm/test/CodeGen/NVPTX/intr-range.ll b/llvm/test/CodeGen/NVPTX/intr-range.ll
index 48fa3e06629b4..fb9488e7e5ab8 100644
--- a/llvm/test/CodeGen/NVPTX/intr-range.ll
+++ b/llvm/test/CodeGen/NVPTX/intr-range.ll
@@ -135,6 +135,16 @@ define ptx_kernel i32 @test_cluster_dim() "nvvm.cluster_dim"="4,4,1" {
ret i32 %11
}
+define ptx_kernel i32 @test_laneid() "nvvm.cluster_dim"="4,4,1" {
+; CHECK-LABEL: define ptx_kernel i32 @test_laneid(
+; CHECK-SAME: ) #[[ATTR4]] {
+; CHECK-NEXT: [[TMP1:%.*]] = call range(i32 0, 32) i32 @llvm.nvvm.read.ptx.sreg.laneid()
+; CHECK-NEXT: ret i32 [[TMP1]]
+;
+ %1 = call i32 @llvm.nvvm.read.ptx.sreg.laneid()
+ ret i32 %1
+}
+
; DEFAULT-DAG: declare noundef range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.x()
; DEFAULT-DAG: declare noundef range(i32 0, 1024) i32 @llvm.nvvm.read.ptx.sreg.tid.y()
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 7becef7 to 11d8605.
// Lane ID
case Intrinsic::nvvm_read_ptx_sreg_laneid:
  return addRangeAttr(0, 32, II);
Looks like we already attach this range info to the declaration of the intrinsic via TableGen attrs. Is there any reason we need to attach it to each call as well?
llvm-project/llvm/include/llvm/IR/IntrinsicsNVVM.td
Lines 1759 to 1760 in ef50227
def int_nvvm_read_ptx_sreg_laneid
    : PTXReadSRegIntrinsic_r32<[Range<RetIndex, 0, WARP_SIZE>]>;
I added this for completeness of the pass. Won't the range info from the TableGen attrs be available only during instruction selection, and not before?
As far as I know, these attributes are part of the IR and are basically indistinguishable from attributes added to the call itself. Here is a quick demo:
https://cuda.godbolt.org/z/b6fEsKddc
Unless we find a use case for adding it to the call as well, I don't think it makes sense to include it here for completeness.
That makes sense!