[DirectX] Mismatched data layout causing validation errors about exceeding TGSM storage #163882

@Icohedron

99 DML shaders are failing to validate after #163587 with the error:

error: Total Thread Group Shared Memory storage is 43688, exceeded 32768.
Validation failed.

All 99 failing DML shaders have names of the form QuantizedGemm* (e.g., QuantizedGemm_20480_16_0_uint4_packed32_float16_native_accum32_0).

Minimal reproducible test case

// compile args: -T cs_6_7 -E CSMain -enable-16bit-types -Fo output.dat
groupshared float16_t smem[10240];
[numthreads(1, 1, 1)] 
void CSMain() {
  smem[0] = 0;
}

Comparing the DXIL output before and after the PR commit c87e0e8 shows that, apart from the compiler version string, the only difference is the data layout: the alignments of i1, i16, and f16 changed.

1c1
< target datalayout = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
---
> target datalayout = "e-m:e-p:32:32-i1:8-i8:8-i16:32-i32:32-i64:64-f16:32-f32:32-f64:64-n8:16:32:64"
270c270
< !1 = !{!"clang version 22.0.0git (git@github.com:Icohedron/llvm-project.git 72c6e4b230ddb5ca85361e145e177245319b271e)"}
---
> !1 = !{!"clang version 22.0.0git (git@github.com:Icohedron/llvm-project.git c87e0e8fe0ea14dcd84e835c0f7b02c5b0edca70)"}
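To make the layout change easier to read, here is a small Python sketch that splits the two datalayout strings from the diff into their specs and lists the ones that differ. The helper name `parse_datalayout` is made up for illustration and is not part of Clang or DXC.

```python
# Data layout strings copied from the diff above.
BEFORE = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
AFTER = "e-m:e-p:32:32-i1:8-i8:8-i16:32-i32:32-i64:64-f16:32-f32:32-f64:64-n8:16:32:64"


def parse_datalayout(layout: str) -> dict[str, str]:
    """Map each spec's leading key (e.g. 'i16', 'f16', 'p') to the rest of the spec.

    Hypothetical helper for this comparison only; it does not interpret the specs.
    """
    specs = {}
    for spec in layout.split("-"):
        key, _, value = spec.partition(":")
        specs[key] = value
    return specs


before = parse_datalayout(BEFORE)
after = parse_datalayout(AFTER)

# Keep only the specs whose value changed between the two layouts.
changed = {k: (before[k], after.get(k)) for k in before if before[k] != after.get(k)}

for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
# prints:
# i1: 32 -> 8
# i16: 16 -> 32
# f16: 16 -> 32
```

The i16 and f16 entries are the ones relevant to 16-bit groupshared arrays, since both go from 16-bit to 32-bit alignment.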

When the same DML shader is compiled with DXC, the resulting data layout is

target datalayout = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"

which matches the data layout that Clang emitted before the PR.
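A rough sketch of why the alignment change could push the minimal test case over the limit, assuming the validator sizes groupshared arrays the way LLVM's DataLayout computes alloc size (store size rounded up to the ABI alignment, multiplied by the element count). This is an assumption about the validator, not something confirmed here; `alloc_size_bytes` is an illustrative helper.

```python
TGSM_LIMIT = 32768   # limit quoted in the validation error
ELEMENTS = 10240     # groupshared float16_t smem[10240]
F16_STORE_BITS = 16


def alloc_size_bytes(store_bits: int, abi_align_bits: int) -> int:
    """Store size rounded up to the ABI alignment, in bytes (assumed sizing rule)."""
    store_bytes = (store_bits + 7) // 8
    align_bytes = abi_align_bits // 8
    # Ceiling division, then scale back up to a multiple of the alignment.
    return -(-store_bytes // align_bytes) * align_bytes


totals = {}
for label, align_bits in (("f16:16", 16), ("f16:32", 32)):
    totals[label] = ELEMENTS * alloc_size_bytes(F16_STORE_BITS, align_bits)
    verdict = "exceeds" if totals[label] > TGSM_LIMIT else "fits within"
    print(f"{label}: {totals[label]} bytes, {verdict} {TGSM_LIMIT}")
# prints:
# f16:16: 20480 bytes, fits within 32768
# f16:32: 40960 bytes, exceeds 32768
```

Under this assumption, each float16_t element occupies 4 bytes instead of 2 with the new layout, doubling the array's footprint. The 43688 figure in the error above comes from the real QuantizedGemm shaders, not from this test case.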