-
Notifications
You must be signed in to change notification settings - Fork 411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add gpu set blockSize
primitive
#20248
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Andy, this looks pretty good to me! I want to ask since you say the primitive applies only to the loop it is put inside, does the block size reset after each loop to either the default or the global configuration set by the environment variable controlling block size? I'm assuming it does but I could not see that in the code itself.
Yes, any loop that doesn't have the primitive will get set to the default (that you get now in absence of this PR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Andy! I would like to see that primitive is defined with PRIMITIVE_R and we don't have it at codegen time. Another good improvement would be better verbose GPU debugging. See my inline comments.
@@ -648,14 +658,18 @@ static VarSymbol* generateNumThreads(BlockStmt* gpuLaunchBlock, | |||
} | |||
|
|||
static CallExpr* generateGPUCall(GpuKernel& info, VarSymbol* numThreads) { | |||
CallExpr* call = new CallExpr(PRIM_GPU_KERNEL_LAUNCH_FLAT); | |||
CallExpr *call = new CallExpr(PRIM_GPU_KERNEL_LAUNCH_FLAT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) You know how to get me going with style matters.
I had been using int *foo
, but I realized (maybe Vass pointed out?) that we use int* foo
more widely in the compiler. So, I've been trying to switch, but old habits die hard. I think this file is mostly int* foo
, but I see unfortunate exceptions. This'll be one more :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll keep it as is for now.
The argument for putting it next to the symbol is that habit it makes it less likely you'll accidentally do something like this:
int* ptrA, ptrB
Where you wanted ptrB
to be a pointer too.
I'd be fine standardizing on some kind of coding style if that's what we want to do, but if so we really should be using clang-tidy
or some linter tool or something to enforce it. Trying to enforce it through code reviews is a losing battle.
--- Signed-off-by: Andy Stone <stonea@users.noreply.github.com>
--- Signed-off-by: Andy Stone <stonea@users.noreply.github.com>
To use this primitive put it in a
foreach
(orforall
) loop that is going to be Gpuized. The symbol passed to it will be used to set the block size on kernel launch. For example, this loop will launch with a block size of 64:The primitive will only be applied to the GPUIzed loop it is embedded inside. The primitive is applied indiscriminately and is not sensitive to control flow. I don't do any sort of error or bounds checking or anything like that as long-term this intent is that gets inserted by the compiler.
To verify that this indeed launches a kernel with a specific blocksize run with
--verbose
.