Skip to content

Conversation

weiweichen
Copy link
Contributor

@weiweichen weiweichen commented Oct 8, 2025

  • sort needs the comparator with strictly weak ordering, however current logic doesn't meet the Antisymmetry requirement with
sort 0x561ecd3d3db0,0x561eaba91d10  25
  weight 0.000000e+00,0.000000e+00
  size 650370,662754
  slot 732,733

Make the comparator logic strict weak order.

Fixes #162490

@weiweichen weiweichen changed the title [AMDGPU] Make sort ordering in AMDGPURewriteAGPRCopyMFMAImpl::eliminateSpillsOfReassignedVGPRs() strictly weak. [AMDGPU] Make sort ordering in AMDGPURewriteAGPRCopyMFMAImpl::eliminateSpillsOfReassignedVGPRs() strict weak. Oct 8, 2025
@weiweichen weiweichen requested a review from Copilot October 8, 2025 14:43
@weiweichen weiweichen requested review from arsenm and qcolombet and removed request for arsenm October 8, 2025 14:43
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a crash in the AMDGPU compiler by making the sort comparator in eliminateSpillsOfReassignedVGPRs() satisfy strict weak ordering requirements. The original comparator violated the antisymmetry requirement needed by C++ STL sort functions.

  • Fixed comparator logic to ensure strict weak ordering compliance
  • Added additional conditions to the tie-breaker comparison

Comment on lines 494 to 495
return (A->reg().stackSlotIndex() < B->reg().stackSlotIndex()) &&
((A->weight() <= B->weight()) && (A->getSize() <= B->getSize()));
Copy link

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comparator logic is incorrect and still violates strict weak ordering. The current implementation can return false for both comp(a,b) and comp(b,a) when elements have the same stackSlotIndex, which breaks antisymmetry. For strict weak ordering, when primary criteria are equal, you should use a single tie-breaker. Consider using: return A->reg().stackSlotIndex() < B->reg().stackSlotIndex(); as the sole comparator, or if multiple criteria are needed, use lexicographic ordering like: if (A->weight() != B->weight()) return A->weight() < B->weight(); if (A->getSize() != B->getSize()) return A->getSize() < B->getSize(); return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using: return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();

As mentioned in the issue, this is not the right solution for making strict weak ordering.

But it's also true that my change has the issue with when comp(a,b) and comp(b,a) can both be false 🤦 . Updating the logic to use the lexicographic ordering suggestion.

@llvmbot
Copy link
Member

llvmbot commented Oct 8, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: weiwei chen (weiweichen)

Changes

To fix the crash mentioned in
#162490

  • sort needs the comparator with strictly weak ordering, however current logic doesn't meet the Antisymmetry requirement with
sort 0x561ecd3d3db0,0x561eaba91d10  25
  weight 0.000000e+00,0.000000e+00
  size 650370,662754
  slot 732,733

Make the comparator logic strict weak order.


Full diff: https://github.com/llvm/llvm-project/pull/162493.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp (+3-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
index fedb694bfcc2a..caaa92999fd1a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
@@ -490,7 +490,9 @@ void AMDGPURewriteAGPRCopyMFMAImpl::eliminateSpillsOfReassignedVGPRs() const {
       return true;
 
     // Tie breaker by number to avoid need for stable sort
-    return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();
+    // The ordering have to be strictly weak.
+    return (A->reg().stackSlotIndex() < B->reg().stackSlotIndex()) &&
+           ((A->weight() <= B->weight()) && (A->getSize() <= B->getSize()));
   });
 
   // FIXME: The APIs for dealing with the LiveInterval of a frame index are

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testcase would be good


// Tie breaker by number to avoid need for stable sort
return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();
// The ordering have to be strictly weak.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The ordering have to be strictly weak.
// The ordering has to be strictly weak.


// Tie breaker by number to avoid need for stable sort
return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();
return (A->reg().stackSlotIndex() < B->reg().stackSlotIndex());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return (A->reg().stackSlotIndex() < B->reg().stackSlotIndex());
return A->reg().stackSlotIndex() < B->reg().stackSlotIndex();

@weiweichen
Copy link
Contributor Author

Testcase would be good

sure, any suggestion on how to add a test for this @arsenm ? 🙏

@weiweichen
Copy link
Contributor Author

Merge for now. Will add a test if @arsenm has suggestions on how to add test for this (sorry, I need some help with adding AMDGPU unit test)

@weiweichen weiweichen merged commit 497d648 into llvm:main Oct 8, 2025
9 checks passed
@arsenm
Copy link
Contributor

arsenm commented Oct 9, 2025

Merge for now. Will add a test if @arsenm has suggestions on how to add test for this (sorry, I need some help with adding AMDGPU unit test)

I'd probably just reduce from whatever you had. llvm-reduce the IR, then switch to llvm-reduce mir running the RA+post-RA pipeline. This case is probably quite difficult, but maybe using -stress-regalloc when reducing will help

svkeerthy pushed a commit that referenced this pull request Oct 9, 2025
…ateSpillsOfReassignedVGPRs()` strict weak. (#162493)

- [x] `sort` needs the comparator with strictly weak ordering, however
current logic doesn't meet the
[**Antisymmetry**](https://tanjim131.github.io/2020-05-22-strict-weak-ordering/#:~:text=Almost%20all%20C++%20STL%20containers,the%20person%20with%20greater%20height.)
requirement with

```
sort 0x561ecd3d3db0,0x561eaba91d10  25
  weight 0.000000e+00,0.000000e+00
  size 650370,662754
  slot 732,733
```

Make the comparator logic strict weak order.

Fixes #162490
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AMDGPU] sort in AMDGPURewriteAGPRCopyMFMAImpl::eliminateSpillsOfReassignedVGPRs() needs to have strict weak ordering
3 participants