
RFC: [AMDGPU] Select CONVERGENCECTRL_GLUE generically #87509

Draft · jayfoad wants to merge 1 commit into main

Conversation

@jayfoad (Contributor) commented Apr 3, 2024

Teach SelectionDAGISel::SelectCodeCommon how to select CONVERGENCECTRL_GLUE instead of doing it in the AMDGPU backend.
@jayfoad (Contributor, Author) commented Apr 3, 2024

@ssahasra this patch reverts all the changes that you made to AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN in #71785. The reason is that I'd like to introduce new custom AMDGPUISD:: nodes for some convergent operations, and I'd like to be able to select them just by using tablegen patterns, without having to add the kind of C++ code that you added to SelectINTRINSIC_WO_CHAIN.

I added SDNPOptInGlue to int_amdgcn_readfirstlane so that TableGen knows to copy any glue input when selecting it. This is enough to pass the existing tests in test/CodeGen/AMDGPU/convergence-tokens.ll. I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?
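
In sketch form, the generic selection being proposed amounts to morphing the ISD node into its target opcode in place, keeping the convergence-token operand. The helper below is a minimal, hypothetical C++ sketch of that behavior (selectConvergenceCtrlGlue is an illustrative name, not code from the patch; the real change lives among the pre-matcher cases in SelectionDAGISel::SelectCodeCommon):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/CodeGen/TargetOpcodes.h"

using namespace llvm;

// Sketch only: what a generic case for CONVERGENCECTRL_GLUE amounts to.
// The node is morphed into its target opcode in place, and its single
// operand (the convergence token) is carried over unchanged.
static void selectConvergenceCtrlGlue(SelectionDAG &DAG, SDNode *N) {
  DAG.SelectNodeTo(N, TargetOpcode::CONVERGENCECTRL_GLUE,
                   N->getValueType(0), N->getOperand(0));
}
```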

@jayfoad requested review from ssahasra and arsenm · April 3, 2024 15:39

@arsenm (Contributor) left a comment

This should be generically selectable. I didn't quite understand the special-case morphing here.

@ssahasra (Collaborator) commented Apr 4, 2024

> I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?

Exactly. This change merely handles one convergent builtin. Also, it seems to me that glue is used as a hack for various different purposes. I don't even know whether a convergent intrinsic might want additional glue of a different kind, which is why I tried to first check whether it is a CONVERGENCECTRL_GLUE node in SelectINTRINSIC_WO_CHAIN().

@jayfoad (Contributor, Author) commented Apr 4, 2024

> I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?

> Exactly. This change merely handles one convergent builtin.

I appreciate that you've made it Just Work for all intrinsics, but your approach "merely" handles intrinsics and I'd like to make it work for other convergent SDNodes too, while keeping friction low.

> Also, it seems to me that glue is used as a hack for various different purposes. I don't even know whether a convergent intrinsic might want additional glue of a different kind, which is why I tried to first check whether it is a CONVERGENCECTRL_GLUE node in SelectINTRINSIC_WO_CHAIN().

I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

This patch does two things, so perhaps it would be helpful to discuss them separately.

The first thing is adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually create the TargetOpcode::CONVERGENCECTRL_GLUE node. I really don't think this is controversial, based on the argument that all types of glue are handled the same way.

The second thing is adding an explicit SDNPOptInGlue to convergent intrinsic definitions so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually copy the glue operand. Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?
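
To make the glue-copy rule concrete: glue, when present, is always the last operand of a node, so a selector that simply carries the full operand list over to the machine node gets the copy for free. The sketch below is illustrative C++ under that assumption; selectWithCopiedGlue and MachineOpc are hypothetical names, not code from this patch or from #71785:

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"

using namespace llvm;

// Sketch: select N to a target opcode while preserving a trailing glue
// operand. Copying the whole operand list is the hand-rolled equivalent
// of what the TableGen-generated matcher does when the source node of a
// pattern is marked SDNPInGlue/SDNPOptInGlue.
static SDNode *selectWithCopiedGlue(SelectionDAG &DAG, SDNode *N,
                                    unsigned MachineOpc) {
  SmallVector<SDValue, 8> Ops(N->op_begin(), N->op_end());
  return DAG.getMachineNode(MachineOpc, SDLoc(N), N->getVTList(), Ops);
}
```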

@ssahasra (Collaborator) commented Apr 8, 2024

> I appreciate that you've made it Just Work for all intrinsics, but your approach "merely" handles intrinsics and I'd like to make it work for other convergent SDNodes too, while keeping friction low.

Yeah, that's a good point. Are these two sets overlapping: the set of intrinsics in MIR and the set of equivalent instructions? Is there a point in the lowering where both can be present and convergence tokens have not yet disappeared?

> I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

> The first thing is adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually create the TargetOpcode::CONVERGENCECTRL_GLUE node. I really don't think this is controversial, based on the argument that all types of glue are handled the same way.

Agreed.

> The second thing is adding an explicit SDNPOptInGlue to convergent intrinsic definitions so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually copy the glue operand. Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?

I am very much in favour of not having special C++ code for something that can happen automatically. The added assert is a bonus. I would first go for a TableGen static assert. At this point, the idea of gluing tokens is pretty much target-independent, so the assert should not be limited to AMDGPU either.

@ssahasra (Collaborator) commented

> I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

> Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

The counter-example would be the glue operand on the SI_CALL/SI_TCRETURN family of instructions. Currently, the glue operand is a chain of outgoing arguments at the callsite. When I tried to append the token to that chain, it produced errors in the selection process. But then I appended a second glue operand just for the token, and that seemed to work. I also put a comment in there explaining my assumptions. Maybe that was the wrong way to go about it?

@arsenm (Contributor) commented Apr 25, 2024

> Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?

This sounds good to me.

> Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

This was always my understanding of glue. It's a chain of glue.

> The counter-example would be the glue operand on the SI_CALL/SI_TCRETURN family of instructions. Currently, the glue operand is a chain of outgoing arguments at the callsite. When I tried to append the token to that chain, it produced errors in the selection process. But then I appended a second glue operand just for the token, and that seemed to work. I also put a comment in there explaining my assumptions. Maybe that was the wrong way to go about it?

The call and tcreturn cases are quite different. The call is essentially a regular instruction, but the return is the end of the DAG. What kind of errors did you see?
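
The "chain of glue" model being agreed on here can be illustrated with the usual argument-copy loop from a LowerCall-style lowering. This is a minimal sketch under assumptions: glueArgumentCopies is a hypothetical helper, and RegsToPass/Ops stand in for the usual call-lowering state; it is not the SI_CALL code under discussion:

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/Register.h"
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Each CopyToReg consumes the previous glue value and produces the next,
// so the copies form one glued sequence; only the final glue value is
// appended as the single (always last) glue operand of the call node.
static void glueArgumentCopies(SelectionDAG &DAG, const SDLoc &DL,
                               SDValue &Chain,
                               ArrayRef<std::pair<Register, SDValue>> RegsToPass,
                               SmallVectorImpl<SDValue> &Ops) {
  SDValue Glue; // empty: the first copy is not glued to anything
  for (const auto &[Reg, Val] : RegsToPass) {
    Chain = DAG.getCopyToReg(Chain, DL, Reg, Val, Glue);
    Glue = Chain.getValue(1); // next link in the chain of glue
  }
  if (Glue.getNode())
    Ops.push_back(Glue); // at most one glue operand per node
}
```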

@ssahasra (Collaborator) commented

@jayfoad, what's the plan for this change? Do you intend to put SDNPOptInGlue on all convergent ops in the TD files?

@jayfoad (Contributor, Author) commented May 14, 2024

> @jayfoad, what's the plan for this change? Do you intend to put SDNPOptInGlue on all convergent ops in the TD files?

Yes, if both reviewers are in favour of doing that, but it is not very high on my priority list.

I also tried splitting "the first thing [...] adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE" into a separate patch, but I ran into some technical problem that I never got round to debugging.
