Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir][arith] fix wrong floordivsi fold (#83079) #83248

Merged
merged 2 commits into from
Mar 22, 2024

Conversation

lipracer
Copy link
Member

Fixs #83079

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 28, 2024

@llvm/pr-subscribers-llvm-adt
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-arith

Author: long.chen (lipracer)

Changes

Fixs #83079


Full diff: https://github.com/llvm/llvm-project/pull/83248.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Arith/IR/ArithOps.cpp (+9-3)
  • (modified) mlir/test/Transforms/canonicalize.mlir (+9)
diff --git a/mlir/lib/Dialect/Arith/IR/ArithOps.cpp b/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
index 0f71c19c23b654..d370b7d04dea7e 100644
--- a/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+++ b/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
@@ -709,19 +709,25 @@ OpFoldResult arith::FloorDivSIOp::fold(FoldAdaptor adaptor) {
         }
         if (!aGtZero && !bGtZero) {
           // Both negative, return -a / -b.
-          APInt posA = zero.ssub_ov(a, overflowOrDiv0);
-          APInt posB = zero.ssub_ov(b, overflowOrDiv0);
-          return posA.sdiv_ov(posB, overflowOrDiv0);
+          return a.sdiv_ov(b, overflowOrDiv0);
         }
         if (!aGtZero && bGtZero) {
           // A is negative, b is positive, return - ceil(-a, b).
           APInt posA = zero.ssub_ov(a, overflowOrDiv0);
+          if (overflowOrDiv0)
+            return a;
           APInt ceil = signedCeilNonnegInputs(posA, b, overflowOrDiv0);
+          if (overflowOrDiv0)
+            return a;
           return zero.ssub_ov(ceil, overflowOrDiv0);
         }
         // A is positive, b is negative, return - ceil(a, -b).
         APInt posB = zero.ssub_ov(b, overflowOrDiv0);
+        if (overflowOrDiv0)
+          return a;
         APInt ceil = signedCeilNonnegInputs(a, posB, overflowOrDiv0);
+        if (overflowOrDiv0)
+          return a;
         return zero.ssub_ov(ceil, overflowOrDiv0);
       });
 
diff --git a/mlir/test/Transforms/canonicalize.mlir b/mlir/test/Transforms/canonicalize.mlir
index 2cf86b50d432f6..d2c2c12d323892 100644
--- a/mlir/test/Transforms/canonicalize.mlir
+++ b/mlir/test/Transforms/canonicalize.mlir
@@ -989,6 +989,15 @@ func.func @tensor_arith.floordivsi_by_one(%arg0: tensor<4x5xi32>) -> tensor<4x5x
   return %res : tensor<4x5xi32>
 }
 
+// CHECK-LABEL: func @arith.floordivsi_by_one_overflow
+func.func @arith.floordivsi_by_one_overflow() -> i64 {
+  %neg_one = arith.constant -1 : i64
+  %min_int = arith.constant -9223372036854775808 : i64
+  // CHECK: arith.floordivsi
+  %poision = arith.floordivsi %min_int, %neg_one : i64
+  return %poision : i64
+}
+
 // -----
 
 // CHECK-LABEL: func @arith.ceildivsi_by_one

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine upon a quick scan of the patch. Given the number or error code paths, I think it could be made more readable with something like constFoldBinaryOpConditional IIRC.

@pingshiyu
Copy link
Contributor

Not sure this is the right patch, please see this comment

@lipracer
Copy link
Member Author

lipracer commented Mar 3, 2024

@kuhar There are indeed many overflow states that need to be checked here, but I haven't come up with a good method yet. Can you give me a specific suggestion? Thank you.

@kuhar
Copy link
Member

kuhar commented Mar 3, 2024

@kuhar There are indeed many overflow states that need to be checked here, but I haven't come up with a good method yet. Can you give me a specific suggestion? Thank you.

Without knowing the exact details of the floordivsi algorithm, the way I'd think about it is that we only need to detect if the final overflow happened or not. And of course fold correctly when there is no overflow. If there are intermediate overflows, I'd think that they can fold into one of the two buckets: (a) those that indicate that the floordivsi overflows for these arguments, and (b) implementation bugs.

It might be helpful to extract this fold code to a unit test (say one of the APInt test files) and go from there. Pick a few inputs that are known to overflow and trace the values and intermediate overflows. Maybe write a separate check to decide if overflow happens for the given input instead of relying on some intermediate checks.

@kuhar
Copy link
Member

kuhar commented Mar 3, 2024

@lipracer Another idea: we could perform the division on APInt with more bits than the bitwidth of the folded integer type. Maybe this way it'd be easier to detect overflow. And then once we are confident there's no overflow, truncate it back to the desired bitwidth.

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lipracer could you open a separate PR for the APInt change? This will make it easier to review and land properly.

@lipracer
Copy link
Member Author

Ok.

@lipracer
Copy link
Member Author

submit a APInt change PR.

@lipracer
Copy link
Member Author

@lipracer could you open a separate PR for the APInt change? This will make it easier to review and land properly.

APInt change has already merged.

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine upon a quick scan but I won't have the time to understand the details of the expansion in near future. Please get a second approval before landing.

@lipracer
Copy link
Member Author

Thank you very much for your valuable suggestion.

@lipracer lipracer requested a review from Mogball March 19, 2024 02:17
@lipracer
Copy link
Member Author

@joker-eph I'd like someone to review my code, could you help me with that?

Copy link
Contributor

@Mogball Mogball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change almost certainly has performance implications. Do we have some way to measure or even guess at that? If this is fixing a correctness bug, I am also not the right person to verify that, but I trust the author in this

@joker-eph
Copy link
Collaborator

@pingshiyu would you be able to review the correctness of this change?

@lipracer
Copy link
Member Author

lipracer commented Mar 20, 2024

This change almost certainly has performance implications. Do we have some way to measure or even guess at that? If this is fixing a correctness bug, I am also not the right person to verify that, but I trust the author in this

Yes, this may indeed affect performance. This implementation is based on tensorflow xla expansion. For performance considerations, I think we can have a compilation option similar to ‘fast-expand’ for expansion. This change is only consistent with the behavior of llvm.

@pingshiyu
Copy link
Contributor

pingshiyu commented Mar 21, 2024

@pingshiyu would you be able to review the correctness of this change?

sure, I've just taken some time to check through the cases, lgtm re correctness.

thanks @lipracer!

maybe some extra tests would be nice for the edge cases, e.g. floordivsi(min_val, 1), and floordivsi(max_val, -1).

@lipracer
Copy link
Member Author

@pingshiyu Thanks, I have added more corner tests.

@lipracer
Copy link
Member Author

The CI error seems unrelated to this change. If the performance drops after merging, I will submit another change to enable "fast expand".

@joker-eph
Copy link
Collaborator

The premerge is green at HEAD: can you rebase? https://lab.llvm.org/buildbot/#/builders/271

@joker-eph
Copy link
Collaborator

Nevermind I misread the log, it's failing on building flang by running out of memory:

�_bk;t=1711039712398�C:\ws\src\flang\include\flang\Evaluate\tools.h(236): fatal error C1060: compiler is out of heap space

I concur that it is unrelated, and all the MLIR tests are passing.

@lipracer
Copy link
Member Author

Thanks, I will rebase on head.

1) fix floordivsi error expand logic
2) fix floordivsi fold did't check overflow stat

Fixs llvm#83079
@lipracer lipracer merged commit 631e54a into llvm:main Mar 22, 2024
4 checks passed
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
@lipracer lipracer deleted the fix-arith-fold branch March 24, 2024 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants