Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gpu set blockSize primitive #20248

Merged
merged 2 commits into from Jul 21, 2022
Merged

Conversation

stonea
Copy link
Contributor

@stonea stonea commented Jul 18, 2022

To use this primitive put it in a foreach (or forall) loop that is going to be Gpuized. The symbol passed to it will be used to set the block size on kernel launch. For example, this loop will launch with a block size of 64:

on here.gpus[0] {
 foreach i in 0..127 {
   __primitive("gpu set blockSize", 64);
 }
}

The primitive will only be applied to the GPUIzed loop it is embedded inside. The primitive is applied indiscriminately and is not sensitive to control flow. I don't do any sort of error or bounds checking or anything like that as long-term this intent is that gets inserted by the compiler.

To verify that this indeed launches a kernel with a specific blocksize run with --verbose.

Copy link
Contributor

@ShreyasKhandekar ShreyasKhandekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Andy, this looks pretty good to me! I want to ask since you say the primitive applies only to the loop it is put inside, does the block size reset after each loop to either the default or the global configuration set by the environment variable controlling block size? I'm assuming it does but I could not see that in the code itself.

@stonea
Copy link
Contributor Author

stonea commented Jul 18, 2022

the block size reset after each loop to either the default or the global configuration set by the environment variable controlling block size?

Yes, any loop that doesn't have the primitive will get set to the default (that you get now in absence of this PR)

Copy link
Contributor

@e-kayrakli e-kayrakli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Andy! I would like to see that primitive is defined with PRIMITIVE_R and we don't have it at codegen time. Another good improvement would be better verbose GPU debugging. See my inline comments.

compiler/dyno/include/chpl/uast/prim-ops-list.h Outdated Show resolved Hide resolved
compiler/codegen/cg-expr.cpp Outdated Show resolved Hide resolved
compiler/optimizations/gpuTransforms.cpp Show resolved Hide resolved
compiler/optimizations/gpuTransforms.cpp Outdated Show resolved Hide resolved
@@ -648,14 +658,18 @@ static VarSymbol* generateNumThreads(BlockStmt* gpuLaunchBlock,
}

static CallExpr* generateGPUCall(GpuKernel& info, VarSymbol* numThreads) {
CallExpr* call = new CallExpr(PRIM_GPU_KERNEL_LAUNCH_FLAT);
CallExpr *call = new CallExpr(PRIM_GPU_KERNEL_LAUNCH_FLAT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) You know how to get me going with style matters.

I had been using int *foo, but I realized (maybe Vass pointed out?) that we use int* foo more widely in the compiler. So, I've been trying to switch, but old habits die hard. I think this file is mostly int* foo, but I see unfortunate exceptions. This'll be one more :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll keep it as is for now.

The argument for putting it next to the symbol is that habit it makes it less likely you'll accidentally do something like this:

int* ptrA, ptrB

Where you wanted ptrB to be a pointer too.

I'd be fine standardizing on some kind of coding style if that's what we want to do, but if so we really should be using clang-tidy or some linter tool or something to enforce it. Trying to enforce it through code reviews is a losing battle.

test/gpu/native/setBlockSizePrimitive.execopts Outdated Show resolved Hide resolved
---
Signed-off-by: Andy Stone <stonea@users.noreply.github.com>
---
Signed-off-by: Andy Stone <stonea@users.noreply.github.com>
@stonea stonea merged commit 8986fc4 into chapel-lang:main Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants