
RFC: [AMDGPU] Select CONVERGENCECTRL_GLUE generically #87509

Draft · jayfoad wants to merge 1 commit into main

Conversation

@jayfoad (Contributor) commented Apr 3, 2024

Teach SelectionDAGISel::SelectCodeCommon how to select CONVERGENCECTRL_GLUE instead of doing it in the AMDGPU backend.
@jayfoad (Contributor, Author) commented Apr 3, 2024

@ssahasra this patch reverts all the changes that you made to AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN in #71785. The reason is that I'd like to introduce new custom AMDGPUISD:: nodes for some convergent operations, and I'd like to be able to select them just by using tablegen patterns, without having to add the kind of C++ code that you added to SelectINTRINSIC_WO_CHAIN.

I added SDNPOptInGlue to int_amdgcn_readfirstlane so that TableGen knows to copy any glue input when selecting it. This is enough to pass the existing tests in test/CodeGen/AMDGPU/convergence-tokens.ll. I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?
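
In sketch form, the generic selection being proposed amounts to morphing the ISD node into its target opcode in place, keeping the convergence-token operand. The helper below is a minimal, hypothetical C++ sketch of that behavior (selectConvergenceCtrlGlue is an illustrative name, not code from the patch; the real change lives among the pre-matcher cases in SelectionDAGISel::SelectCodeCommon):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/CodeGen/TargetOpcodes.h"

using namespace llvm;

// Sketch only: what a generic case for CONVERGENCECTRL_GLUE amounts to.
// The node is morphed into its target opcode in place, and its single
// operand (the convergence token) is carried over unchanged.
static void selectConvergenceCtrlGlue(SelectionDAG &DAG, SDNode *N) {
  DAG.SelectNodeTo(N, TargetOpcode::CONVERGENCECTRL_GLUE,
                   N->getValueType(0), N->getOperand(0));
}
```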

@jayfoad requested review from ssahasra and arsenm · April 3, 2024 15:39

@arsenm (Contributor) left a comment

This should be generically selectable. I didn't quite understand the special-case morphing here.

@ssahasra (Collaborator) commented Apr 4, 2024

> I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?

Exactly. This change merely handles one convergent builtin. Also, it seems to me that glue is used as a hack for various different purposes. I don't even know whether a convergent intrinsic might want additional glue of a different kind, which is why I tried to first check whether it is a CONVERGENCECTRL_GLUE node in SelectINTRINSIC_WO_CHAIN().

@jayfoad (Contributor, Author) commented Apr 4, 2024

> I guess the downside is that we potentially have to add SDNPOptInGlue to every convergent intrinsic -- is that something you were trying to avoid?

> Exactly. This change merely handles one convergent builtin.

I appreciate that you've made it Just Work for all intrinsics, but your approach "merely" handles intrinsics and I'd like to make it work for other convergent SDNodes too, while keeping friction low.

> Also, it seems to me that glue is used as a hack for various different purposes. I don't even know whether a convergent intrinsic might want additional glue of a different kind, which is why I tried to first check whether it is a CONVERGENCECTRL_GLUE node in SelectINTRINSIC_WO_CHAIN().

I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

This patch does two things, so perhaps it would be helpful to discuss them separately.

The first thing is adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually create the TargetOpcode::CONVERGENCECTRL_GLUE node. I really don't think this is controversial, based on the argument that all types of glue are handled the same way.

The second thing is adding an explicit SDNPOptInGlue to convergent intrinsic definitions so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually copy the glue operand. Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?
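
To make the glue-copy rule concrete: glue, when present, is always the last operand of a node, so a selector that simply carries the full operand list over to the machine node gets the copy for free. The sketch below is illustrative C++ under that assumption; selectWithCopiedGlue and MachineOpc are hypothetical names, not code from this patch or from #71785:

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"

using namespace llvm;

// Sketch: select N to a target opcode while preserving a trailing glue
// operand. Copying the whole operand list is the hand-rolled equivalent
// of what the TableGen-generated matcher does when the source node of a
// pattern is marked SDNPInGlue/SDNPOptInGlue.
static SDNode *selectWithCopiedGlue(SelectionDAG &DAG, SDNode *N,
                                    unsigned MachineOpc) {
  SmallVector<SDValue, 8> Ops(N->op_begin(), N->op_end());
  return DAG.getMachineNode(MachineOpc, SDLoc(N), N->getVTList(), Ops);
}
```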

@ssahasra (Collaborator) commented Apr 8, 2024

> I appreciate that you've made it Just Work for all intrinsics, but your approach "merely" handles intrinsics and I'd like to make it work for other convergent SDNodes too, while keeping friction low.

Yeah, that's a good point. Are these two sets overlapping: the set of intrinsics in MIR and the set of equivalent instructions? Is there a point in the lowering where both can be present and convergence tokens have not yet disappeared?

> I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

> The first thing is adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually create the TargetOpcode::CONVERGENCECTRL_GLUE node. I really don't think this is controversial, based on the argument that all types of glue are handled the same way.

Agreed.

> The second thing is adding an explicit SDNPOptInGlue to convergent intrinsic definitions so that AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN doesn't have to manually copy the glue operand. Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?

I am very much in favour of not having special C++ code for something that can happen automatically. The added assert is a bonus. I would first go for a TableGen static assert. At this point, the idea of gluing tokens is pretty much target-independent, so the assert should not be limited to AMDGPU either.

@ssahasra (Collaborator) commented

> I don't really buy this. Yes, glue may be used for different reasons, but the way the instruction selector handles it is the same in all cases: it copies the glue operand from the input node to the output node. This is exactly what SDNPInGlue/SDNPOptInGlue enables in the TableGen-generated selector.

> Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

The counter-example would be the glue operand on the SI_CALL/SI_TCRETURN family of instructions. Currently, the glue operand is a chain of outgoing arguments at the callsite. When I tried to append the token to that chain, it produced errors in the selection process. But then I appended a second glue operand just for the token, and that seemed to work. I also put a comment in there explaining my assumptions. Maybe that was the wrong way to go about it?

@arsenm (Contributor) commented Apr 25, 2024

> Would you be happier with this if there were some assertion (in TableGen, or generic SelectionDAG, or the AMDGPU backend) that all convergent SDNodes are marked with SDNPInGlue/SDNPOptInGlue?

This sounds good to me.

> Ok. What I am inferring here is that there is at most one glue operand on an instruction, and if multiple nodes need to be glued, they need to be chained (with more glue) to the same operand. This one operand will be preserved with the right properties, and then those glued nodes will be translated by the usual rules. Is that correct?

This was always my understanding of glue. It's a chain of glue.

> The counter-example would be the glue operand on the SI_CALL/SI_TCRETURN family of instructions. Currently, the glue operand is a chain of outgoing arguments at the callsite. When I tried to append the token to that chain, it produced errors in the selection process. But then I appended a second glue operand just for the token, and that seemed to work. I also put a comment in there explaining my assumptions. Maybe that was the wrong way to go about it?

The call and tcreturn cases are quite different. The call is essentially a regular instruction, but the return is the end of the DAG. What kind of errors did you see?
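
The "chain of glue" model being agreed on here can be illustrated with the usual argument-copy loop from a LowerCall-style lowering. This is a minimal sketch under assumptions: glueArgumentCopies is a hypothetical helper, and RegsToPass/Ops stand in for the usual call-lowering state; it is not the SI_CALL code under discussion:

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/Register.h"
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Each CopyToReg consumes the previous glue value and produces the next,
// so the copies form one glued sequence; only the final glue value is
// appended as the single (always last) glue operand of the call node.
static void glueArgumentCopies(SelectionDAG &DAG, const SDLoc &DL,
                               SDValue &Chain,
                               ArrayRef<std::pair<Register, SDValue>> RegsToPass,
                               SmallVectorImpl<SDValue> &Ops) {
  SDValue Glue; // empty: the first copy is not glued to anything
  for (const auto &[Reg, Val] : RegsToPass) {
    Chain = DAG.getCopyToReg(Chain, DL, Reg, Val, Glue);
    Glue = Chain.getValue(1); // next link in the chain of glue
  }
  if (Glue.getNode())
    Ops.push_back(Glue); // at most one glue operand per node
}
```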

@ssahasra (Collaborator) commented

@jayfoad, what's the plan for this change? Do you intend to put SDNPOptInGlue on all convergent ops in the TD files?

@jayfoad (Contributor, Author) commented May 14, 2024

> @jayfoad, what's the plan for this change? Do you intend to put SDNPOptInGlue on all convergent ops in the TD files?

Yes, if both reviewers are in favour of doing that, but it is not very high on my priority list.

I also tried splitting "the first thing [...] adding generic selection from ISD::CONVERGENCECTRL_GLUE -> TargetOpcode::CONVERGENCECTRL_GLUE" into a separate patch, but I ran into some technical problem that I never got round to debugging.
