Add shared mem-space for SLM allocs #397
Signed-off-by: dchigarev <dmitry.chigarev@intel.com>
while (parentOp) {
  // Check if the current parent is a gpu.launch operation
  if (llvm::isa<mlir::gpu::LaunchOp>(parentOp)) {
    return true;
  }
  // Move to the parent operation
  parentOp = parentOp->getParentOp();
}
You're probably looking for getParentOfType here. Another option would be to just target the pass at gpu module instead. The latter is probably also useful if you want to check for the total allocated memory size without additional analysis.
> Another option would be to just target the pass at gpu module instead
Targeting the pass at a GPU module would be nice, but it won't work with the existing pipeline. Some passes must run after allocs-to-slm but before the conversion to a GPU module (linalg-to-xegpu, linalg-to-loops, insert-gpu-allocs).
Some of these passes (linalg-to-xegpu and linalg-to-loops) could be adjusted by retargeting them from funcOp to gpuFuncOp, but insert-gpu-allocs won't work that way: its logic expects to run on a func.func containing a gpu.launch block, not on a GPU module.
So for now I only replaced the while-loop with getParentOfType.
Sorry, I was thinking about something else while typing. I meant the launch gpu op, not gpu module. Anyway, the change is trivial.
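For readers following the thread, here is a minimal standalone sketch of why the hand-written loop and a getParentOfType-style lookup are interchangeable. It does not use MLIR: `ToyOp`, `findParentNamed`, `isInsideLaunchLoop` and `isInsideLaunch` are hypothetical names made up for illustration, and ops are matched by a plain string name rather than by C++ op class. In the actual pass the equivalent would presumably be a null check on `op->getParentOfType<mlir::gpu::LaunchOp>()`.

```cpp
#include <string>

// Toy stand-in for an IR operation (hypothetical, not the MLIR API):
// a name plus a link to the enclosing operation.
struct ToyOp {
  std::string name;
  const ToyOp *parent;
  explicit ToyOp(std::string n, const ToyOp *p = nullptr)
      : name(std::move(n)), parent(p) {}
};

// Hand-written walk up the parent chain, shaped like the loop in the diff.
inline bool isInsideLaunchLoop(const ToyOp &op) {
  const ToyOp *parentOp = op.parent;
  while (parentOp) {
    if (parentOp->name == "gpu.launch")
      return true;
    parentOp = parentOp->parent;
  }
  return false;
}

// Generic helper playing the role of getParentOfType: returns the closest
// enclosing op with the given name, or nullptr if there is none.
inline const ToyOp *findParentNamed(const ToyOp &op, const std::string &name) {
  for (const ToyOp *p = op.parent; p; p = p->parent)
    if (p->name == name)
      return p;
  return nullptr;
}

// The same check as isInsideLaunchLoop, expressed through the helper.
inline bool isInsideLaunch(const ToyOp &op) {
  return findParentNamed(op, "gpu.launch") != nullptr;
}
```

Both predicates start from the enclosing op rather than the op itself, so they agree on every chain of parents.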
Fixes #393
The new pass goes through each `memref.alloc()` inside of a `gpu.launch` block and sets the shared mem-space attribute (integer value 3) on the resulting memref. It ignores memrefs that already have an explicitly specified mem-space.
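The rule described above can be sketched as a small standalone model. This is not the pass implementation and does not use MLIR: `ToyAlloc`, `kSharedMemSpace` and `assignSharedMemSpace` are hypothetical names, and each alloc is reduced to the two facts the description depends on (whether it sits inside a gpu.launch, and whether it already carries a mem-space).

```cpp
#include <vector>

// Toy model of a memref.alloc result (hypothetical, not the MLIR API).
struct ToyAlloc {
  bool insideGpuLaunch;  // is the alloc nested inside a gpu.launch body?
  bool hasMemSpace;      // does the memref type already name a mem-space?
  int memSpace;          // mem-space value; meaningful only if hasMemSpace
};

// Shared mem-space value quoted in the PR description.
const int kSharedMemSpace = 3;

// Applies the rule to every alloc and returns how many were rewritten.
inline int assignSharedMemSpace(std::vector<ToyAlloc> &allocs) {
  int changed = 0;
  for (ToyAlloc &a : allocs) {
    // Skip allocs outside gpu.launch and those with an explicit mem-space.
    if (!a.insideGpuLaunch || a.hasMemSpace)
      continue;
    a.hasMemSpace = true;
    a.memSpace = kSharedMemSpace;
    ++changed;
  }
  return changed;
}
```

Only allocs that are inside a launch and have no mem-space are touched; an alloc that already names a different mem-space keeps its value.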