[0009] A proposal for DXIL Function Scalarization #62

farzonl · 2024-09-05T16:26:27Z

This covers the motivations, the background on how scalarization works in DXC,
and the approach to tasking we should take.
This is also an updated proposal that reflects what was decided in the team meeting.

It does not cover Data scalarization.

damyanp

Some comments mostly calling out typos and bits I had trouble following.

damyanp · 2024-09-06T19:20:34Z

proposals/NNNN-DXIL-Scalarization.md

+
+`DXILOpLowering` is also the last place for a functional reason. The scalarizer
+pass only operates on llvm intrinsics that are `TriviallyVectorizable`. Further
+it only converts the scalarized llvm intrinsics meaning there is no way for it


Suggested change

-it only converts the scalarized llvm intrinsics meaning there is no way for it
+it only converts to(?) scalarized llvm intrinsics meaning there is no way for it

No, "the" is correct here.

Further it only converts the scalarized llvm intrinsics

If this is correct, then I'm afraid I'm not sure I understand what it means. I would have thought that the scalarizer converts the vectorized intrinsics to scalar ones. So I guess I've got a gap in my understanding here.

When I say LLVM intrinsics, I mean the intrinsics defined in `Intrinsics.td`. I'm trying to draw a distinction between the intrinsics that are exposed to all backends in `Intrinsics.td` and the DirectX intrinsics. The distinction matters because, if we want this to work with our backend, we need a way of exposing the DirectX intrinsics to the scalarizer pass.
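To make the distinction concrete, here is a toy Python model of the idea. It is not LLVM's Scalarizer API: the allowlist, the `scalarize_call` helper, and the `llvm.dx.example` intrinsic name are all hypothetical and exist only for illustration. The point is that a pass which only knows about generic intrinsics leaves a target intrinsic as a vector call until that intrinsic is exposed to it.

```python
# Toy model only; this is NOT LLVM's Scalarizer API.
# Hypothetical allowlist standing in for "intrinsics the pass knows it can split".
KNOWN_SCALARIZABLE = frozenset({"llvm.sin", "llvm.sqrt"})


def scalarize_call(name, lanes, known=KNOWN_SCALARIZABLE):
    """Split a vector call into one call per lane if `name` is known.

    Unknown intrinsics are returned unchanged as a single vector call.
    """
    if name not in known:
        return [(name, list(lanes))]
    return [(name, [lane]) for lane in lanes]


lanes = ["x0", "x1", "x2"]

# A generic intrinsic is split into three scalar calls.
generic = scalarize_call("llvm.sin", lanes)

# A target intrinsic (hypothetical name) is left as one vector call...
target = scalarize_call("llvm.dx.example", lanes)

# ...until it is exposed to the pass.
exposed = scalarize_call(
    "llvm.dx.example", lanes, known=KNOWN_SCALARIZABLE | {"llvm.dx.example"}
)
```

In this model, "exposing" an intrinsic is just adding it to the set; how the real pass would learn about DirectX intrinsics is what the proposal discusses.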

python3kgae · 2024-09-06T19:45:10Z

proposals/NNNN-DXIL-Scalarization.md

+ makes the most sense to use the Scalarizer pass.
+- [Scalarizer.cpp](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/Scalarizer.cpp)
+
+The `scalarizer` pass with the `-scalarize-load-store` and 


What is the plan for static global variables with vector or vector array type and groupshared variables with vector array type?

Those are handled by the same scalarizer pass with the `-scalarize-load-store` flag. Here is a groupshared example:

groupshared float3 sharedData[2];
export void fn2() {
  sharedData[0] = float3(1.0f, 2.0f, 3.0f);
  sharedData[1] = float3(2.0f, 4.0f, 6.0f);
}

generates the following IR

@"sharedData" = local_unnamed_addr addrspace(3) global [2 x <3 x float>] zeroinitializer, align 16

define void @fn2 () local_unnamed_addr {
  store <3 x float> <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00>, ptr addrspace(3) @"sharedData", align 16
  store <3 x float> <float 2.000000e+00, float 4.000000e+00, float 6.000000e+00>, ptr addrspace(3) getelementptr inbounds (i8, ptr addrspace(3) @"sharedData", i32 16), align 16
  ret void
}

opt.exe -S -passes=scalarizer -scalarize-load-store scalarize_store.ll

@"sharedData" = local_unnamed_addr addrspace(3) global [2 x <3 x float>] zeroinitializer, align 16

define void @"?fn2@@YAXXZ"() local_unnamed_addr {
  store float 1.000000e+00, ptr addrspace(3) @"sharedData", align 16
  store float 2.000000e+00, ptr addrspace(3) getelementptr (float, ptr addrspace(3) @"sharedData", i32 1), align 4
  store float 3.000000e+00, ptr addrspace(3) getelementptr (float, ptr addrspace(3) @"sharedData", i32 2), align 8
  store float 2.000000e+00, ptr addrspace(3) getelementptr inbounds (i8, ptr addrspace(3) @"sharedData", i32 16), align 16
  store float 4.000000e+00, ptr addrspace(3) getelementptr (float, ptr addrspace(3) getelementptr inbounds (i8, ptr addrspace(3) @"sharedData", i32 16), i32 1), align 4
  store float 6.000000e+00, ptr addrspace(3) getelementptr (float, ptr addrspace(3) getelementptr inbounds (i8, ptr addrspace(3) @"sharedData", i32 16), i32 2), align 8
  ret void
}
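The addresses and alignments in the scalarized stores can be checked with a little arithmetic. This is only a sketch: it assumes a 16-byte stride per `<3 x float>` element (read off the `i32 16` byte offset in the IR) and 4-byte floats, and the helper names are hypothetical.

```python
VECTOR_STRIDE_BYTES = 16  # assumption: taken from the `i8 ... i32 16` GEP in the IR
FLOAT_BYTES = 4
BASE_ALIGN = 16           # `align 16` on the global


def lane_byte_offset(element, lane):
    """Byte offset of lane `lane` of array element `element` from the global's base."""
    return element * VECTOR_STRIDE_BYTES + lane * FLOAT_BYTES


def store_alignment(offset, base_align=BASE_ALIGN):
    """Largest power of two dividing both the base alignment and the offset."""
    if offset == 0:
        return base_align
    return min(base_align, offset & -offset)


offsets = [lane_byte_offset(e, lane) for e in range(2) for lane in range(3)]
aligns = [store_alignment(o) for o in offsets]

print(offsets)  # [0, 4, 8, 16, 20, 24]
print(aligns)   # [16, 4, 8, 16, 4, 8]
```

The alignments match the `align` values on the six scalar stores. Bytes 12 to 15 and 28 to 31 are never written, which is the padding left by keeping the `[2 x <3 x float>]` type.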

The type of @"sharedData" is still [2 x <3 x float>] after opt.
We need it to be [2 x [3 x float]] for groupshared.

For a static global, we could split it into three [2 x float] globals if possible.

My previous proposal covered global scope, but this one doesn't. I'll make that more explicit. This won't change the layout of global-scope data, just how we access that data. The last piece that DXC does and this pass doesn't is to encode sharedData as a 6-element 1D array. All the store operations are correct.

!20 = !DIGlobalVariable(name: "sharedData", linkageName: "\01?sharedData@@3PAV?$vector@M$02@@A.v.1dim", scope: !0, file: !1, line: 1, type: !21, isLocal: false, isDefinition: true, variable: [6 x float] addrspace(3)* @"\01?sharedData@@3PAV?$vector@M$02@@A.v.1dim")

So we'll have a module pass to cover global variables?
Flattening a multi-dimensional array into a 1D array should be a separate pass, not part of scalarization.

There will have to be another pass to handle data layout like this. That's already something I'm considering because of memory intrinsics.
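As a rough illustration of what such a flattening pass has to compute, here is a sketch of the index mapping from `sharedData[element][lane]` to a 6-element 1D array. It assumes row-major order and is not DXC's implementation; `flatten_index` is a hypothetical helper name.

```python
def flatten_index(indices, dims):
    """Row-major linear index for `indices` into an array of shape `dims`."""
    if len(indices) != len(dims):
        raise ValueError("indices and dims must have the same length")
    linear = 0
    for idx, dim in zip(indices, dims):
        if not 0 <= idx < dim:
            raise IndexError(f"index {idx} out of range for dimension {dim}")
        linear = linear * dim + idx
    return linear


DIMS = (2, 3)  # two array elements, three lanes each, as in sharedData
flat = [0.0] * (DIMS[0] * DIMS[1])

# Replay the two vector stores from the example as scalar stores into the 1D array.
vector_stores = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0)]
for element, values in enumerate(vector_stores):
    for lane, value in enumerate(values):
        flat[flatten_index((element, lane), DIMS)] = value

print(flat)  # [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
```

Under this mapping the second element starts at index 3 of the 1D array, which is why the type change is a layout question separate from splitting the stores.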

python3kgae · 2024-09-09T18:28:57Z

proposals/NNNN-DXIL-Scalarization.md

+* Author(s): [Farzon Lotfi](https://github.com/farzonl)
+* Sponsor: [Farzon Lotfi](https://github.com/farzonl)
+* Status: **Under Consideration**
+* Impacted Projects: Clang


In our past proposals in the HLSL specs, we have used Clang to indicate the upstream LLVM project, DXC to indicate DXC, and Clang and DXC to indicate both. I'm keeping that same distinction, but I'm open to changing it.

I think we should probably just remove the Impacted Projects line from the template in this repo. Nothing goes into this repo that isn't targeting LLVM/Clang.

A proposal for DXIL Scalarization

eb1b366

This covers the motivations, the background on how scalarization works in DXC, and the approach to tasking we should take. This is also an updated proposal that reflects what was decided in the team meeting.

damyanp reviewed Sep 6, 2024

python3kgae reviewed Sep 6, 2024

python3kgae reviewed Sep 9, 2024

address pr comments

3169aef

farzonl force-pushed the DXIL-Scalarization-Proposal branch from 48aff81 to 3169aef on September 9, 2024 20:40

run remark md formatter

22fefa0

farzonl changed the title ~~A proposal for DXIL Scalarization~~ A proposal for DXIL Function Scalarization Sep 10, 2024

make language clearer around what the scalarizer pass solves and does not solve

732baf2

python3kgae approved these changes Sep 10, 2024

damyanp approved these changes Sep 10, 2024

Add proposal number

4112bcc

farzonl force-pushed the DXIL-Scalarization-Proposal branch from a9e708e to 4112bcc on September 11, 2024 16:46

llvm-beanz approved these changes Sep 11, 2024

farzonl merged commit 9fea720 into llvm:main Sep 11, 2024

This was referenced Sep 12, 2024

[workstream] DXIL Legalization and Lowering #27

Open

[milestone] Compile particle_life.hlsl #20

Open

farzonl deleted the DXIL-Scalarization-Proposal branch September 22, 2024 19:30

farzonl changed the title ~~A proposal for DXIL Function Scalarization~~ [0009] A proposal for DXIL Function Scalarization Sep 23, 2024
