Skip to content

Fix GVN and SROA miscompilation of min precision vector element access#8269

Open
alsepkow wants to merge 6 commits intomicrosoft:mainfrom
alsepkow:user/alsepkow/fix-min-precision-opt-bugs
Open

Fix GVN and SROA miscompilation of min precision vector element access#8269
alsepkow wants to merge 6 commits intomicrosoft:mainfrom
alsepkow:user/alsepkow/fix-min-precision-opt-bugs

Conversation

@alsepkow
Copy link
Contributor

Summary

Fixes three related optimizer bugs that cause miscompilation of min precision vector element access ([] operator) on min16float, min16int, and min16uint types at optimization levels O1+.

Resolves #8268

Root Cause

DXC's data layout pads min precision types (i16:32, f16:32). The HLSL change in DataLayout::getTypeSizeInBits makes vector sizes use alloc size per element (e.g., <3 x i16> = 96 bits), but scalar getTypeSizeInBits(i16) returns the primitive width (16 bits). This inconsistency causes three bugs:

  1. GVN ICE: CanCoerceMustAliasedValueToLoad creates a padded-width integer (i96) then attempts a bitcast from the 48-bit LLVM vector type — assert fires.
  2. GVN incorrect store forwarding: processLoad forwards a zeroinitializer store past partial element stores because MemoryDependence uses padded sizes for aliasing.
  3. SROA element misindexing (primary bug): getNaturalGEPRecursively uses getTypeSizeInBits(i16)/8 = 2 for element offsets, while GEP uses getTypeAllocSize(i16) = 4. Byte offset 4 (element 1) maps to index 4/2 = 2 instead of 4/4 = 1, causing SROA to misplace or eliminate element stores.

Changes

lib/Transforms/Scalar/GVN.cpp

  • CanCoerceMustAliasedValueToLoad: Reject coercion when type sizes include padding (getTypeSizeInBits != getPrimitiveSizeInBits)
  • processLoad StoreInst handler: Skip store-to-load forwarding for padded types

lib/Transforms/Scalar/SROA.cpp

  • getNaturalGEPRecursively: Use getTypeAllocSizeInBits for vector element size to match GEP offset stride
  • isVectorPromotionViable: Same fix for element size calculation
  • AllocaSliceRewriter constructor: Same fix for ElementSize

Testing

  • All 6 min precision ArrayOperator tests (StaticAccess + DynamicAccess × 3 types) pass with the fix
  • Verified optimized DXIL output retains all 3 element stores with correct indices

Co-authored

This fix was investigated and implemented with the assistance of GitHub Copilot (AI pair programming). The root cause analysis — tracing the bug through -print-after-all pass dumps, identifying SROA as the culprit, and understanding the getTypeSizeInBits vs getTypeAllocSize mismatch — was a collaborative effort.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Multiple optimization passes mishandle min precision vector types due to
DXC's padded data layout (i16:32, f16:32), where getTypeSizeInBits
returns padded sizes for vectors but primitive sizes for scalars.

Bug 1 - GVN ICE: CanCoerceMustAliasedValueToLoad computes a padded
integer type (e.g., i96 for <3 x half>) then attempts to bitcast from
the 48-bit LLVM type, triggering an assert. Fix: reject coercion when
type sizes include padding.

Bug 2 - GVN incorrect store forwarding: processLoad forwards a 'store
<3 x i16> zeroinitializer' past partial element stores because
MemoryDependence uses padded sizes for aliasing. Fix: skip store-to-load
forwarding for padded types.

Bug 3 - SROA element misindexing: getNaturalGEPRecursively uses
getTypeSizeInBits (2 bytes for i16) for element offsets while GEP uses
getTypeAllocSize (4 bytes with i16:32). Byte offset 4 (element 1) maps
to index 4/2=2 instead of 4/4=1, causing SROA to misplace or eliminate
element stores. Fix: use getTypeAllocSizeInBits consistently for vector
element sizes throughout SROA.

Fixes microsoft#8268

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 14, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions bot and others added 4 commits March 14, 2026 06:53
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add lit tests that verify GVN and SROA handle DXC's padded min precision
types (i16:32, f16:32) correctly.

GVN tests (test/Transforms/GVN/min-precision-padding.ll):
- Verify store-to-load forwarding is blocked for padded vector types
- Verify type coercion is rejected for padded types
- Cover <3 x i16>, <3 x half>, <5 x i16>, <8 x half>

SROA tests (test/Transforms/SROA/min-precision-padding.ll):
- Verify GEP element indices use alloc size (4 bytes) not prim size (2 bytes)
- Verify all element stores survive with correct indices
- Cover <3 x i16>, <3 x half>, <5 x i16>, <8 x half>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The processLoad padding guard was unconditionally blocking store-to-load
forwarding for all padded vector types, including trivially correct same-type
forwarding. Narrow the check to only fire when stored and loaded types differ,
preserving defense-in-depth for cross-type forwarding while allowing valid
same-type optimizations for i16/half vectors.

Also fix the comment to accurately describe the root cause: BasicAA returns
MustAlias at offset 0 regardless of access sizes, causing partial element
stores to appear as Defs to MemoryDependence.

Add test cases verifying same-type forwarding still works for padded types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: New

Development

Successfully merging this pull request may close these issues.

GVN and SROA miscompile min precision vector element access

1 participant