Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Eliminate workaround for optimizing maxval Fortran intrinsic #65814

Closed
wants to merge 1 commit into from

Conversation

unterumarmung
Copy link
Contributor

@unterumarmung unterumarmung commented Sep 8, 2023

Following #65213, the lowering of the arith.maxf operation has transitioned from using the maxnum LLVM intrinsic to maximum. This modification renders the statement in the deleted here comment obsolete and the associated workaround unnecessary. Consequently, this commit removes the workaround and rectifies related test cases.

Following llvm#65213, the optimization of the `arith.maxf` operation has transitioned from using the `maxnum` LLVM intrinsic to `maximum`. This modification renders the statement in the previous commit obsolete and the associated workaround unnecessary. Consequently, this commit removes the workaround and rectifies related test cases.
@unterumarmung unterumarmung requested a review from a team as a code owner September 8, 2023 21:44
@github-actions github-actions bot added the flang Flang issues not falling into any other category label Sep 8, 2023
@unterumarmung
Copy link
Contributor Author

@vzakhari It seems like you've written this workaround. Could you please review the PR and tell if this change is ok?

@unterumarmung
Copy link
Contributor Author

Also, if this PR is going to be merged, I would do it after #65800 is landed.

@vzakhari
Copy link
Contributor

vzakhari commented Sep 8, 2023

Thank you! Would you be able to check what happens with F128 maxval case right now? Does LLVM call fmaxl (with and w/o -ffast-math)?

@unterumarmung
Copy link
Contributor Author

Thank you! Would you be able to check what happens with F128 maxval case right now? Does LLVM call fmaxl (with and w/o -ffast-math)?

I'll try. Could you share some Fortran program reproducer, please?

@unterumarmung
Copy link
Contributor Author

So I used this program to reproduce:

program quad_precision_maxval
  use, intrinsic :: iso_fortran_env, only: real128

  real(real128) :: quad_array(5)
  quad_array = [1.0_16, 2.0_16, 3.0_16, 4.0_16, 5.0_16]

  max_value = maxval(quad_array)
  print *, "Maximum value in the array:", max_value
end program quad_precision_maxval
  1. No fast math:
$ ./build/bin/flang-new f.f90
$ ./a.out
Maximum value in the array: 5
$ objdump -d a.out | grep fmaxl

Works fine and no fmaxl mentions

  1. Fast math enabled:
$ ./build/bin/flang-new -ffast-math f.f90
$ ./a.out
Maximum value in the array: 5
$ objdump -d a.out | grep fmaxl

Works fine and no fmaxl mentions


So, is this enough or have I missed something?

@vzakhari
Copy link
Contributor

vzakhari commented Sep 9, 2023

Yes, it might be a bit complex to properly check this.

Please try this:

subroutine test(x, t)
  real(16) :: x(100), t
  t = maxval(x)
end subroutine test

flang-new -O2 -ffast-math -c maxval.f90:

LLVM ERROR: Cannot select: t27: f128 = fmaximum nnan ninf nsz arcp contract afn reassoc t26, t25
  t26: f128,ch = load<(load (s128) from %ir.scevgep, !tbaa !3)> t0, t5, undef:i64
    t5: i64 = add t2, t4
      t2: i64,ch = CopyFromReg t0, Register:i64 %4
        t1: i64 = Register %4
      t4: i64,ch = CopyFromReg t0, Register:i64 %0
        t3: i64 = Register %0
    t9: i64 = undef
  t25: f128 = fmaximum nnan ninf nsz arcp contract afn reassoc t24, t21
    t24: f128,ch = load<(load (s128) from %ir.scevgep2, !tbaa !3)> t0, t23, undef:i64
      t23: i64 = add t5, Constant:i64<-16>
        t5: i64 = add t2, t4
          t2: i64,ch = CopyFromReg t0, Register:i64 %4
            t1: i64 = Register %4
          t4: i64,ch = CopyFromReg t0, Register:i64 %0
            t3: i64 = Register %0
        t22: i64 = Constant<-16>
      t9: i64 = undef
    t21: f128 = fmaximum nnan ninf nsz arcp contract afn reassoc t20, t17
      t20: f128,ch = load<(load (s128) from %ir.scevgep4, !tbaa !3)> t0, t19, undef:i64
        t19: i64 = add t5, Constant:i64<-32>
          t5: i64 = add t2, t4
            t2: i64,ch = CopyFromReg t0, Register:i64 %4
              t1: i64 = Register %4
            t4: i64,ch = CopyFromReg t0, Register:i64 %0
              t3: i64 = Register %0
          t18: i64 = Constant<-32>
        t9: i64 = undef
      t17: f128 = fmaximum nnan ninf nsz arcp contract afn reassoc t16, t13
        t16: f128,ch = load<(load (s128) from %ir.scevgep6, !tbaa !3)> t0, t15, undef:i64
          t15: i64 = add t5, Constant:i64<-48>
            t5: i64 = add t2, t4
              t2: i64,ch = CopyFromReg t0, Register:i64 %4
                t1: i64 = Register %4
              t4: i64,ch = CopyFromReg t0, Register:i64 %0
                t3: i64 = Register %0
            t14: i64 = Constant<-48>
          t9: i64 = undef
        t13: f128 = fmaximum nnan ninf nsz arcp contract afn reassoc t10, t12
          t10: f128,ch = load<(load (s128) from %ir.scevgep8, !tbaa !3)> t0, t7, undef:i64
            t7: i64 = add t5, Constant:i64<-64>
              t5: i64 = add t2, t4
                t2: i64,ch = CopyFromReg t0, Register:i64 %4
                  t1: i64 = Register %4
                t4: i64,ch = CopyFromReg t0, Register:i64 %0
                  t3: i64 = Register %0
              t6: i64 = Constant<-64>
            t9: i64 = undef
          t12: f128,ch = CopyFromReg t0, Register:f128 %1
            t11: f128 = Register %1
In function: test_

@unterumarmung
Copy link
Contributor Author

Oh, so I forgot about optimizations flags in Flang..
Should we file an issue for the LLVM codegen and close this for now? I guess it should not fail with ICE

@vzakhari
Copy link
Contributor

vzakhari commented Sep 9, 2023

Yes, I am reluctant to merge this change if it is going to cause tests failures. Filing an issue for LLVM would be great.

@unterumarmung
Copy link
Contributor Author

@vzakhari please take a look #65886

@unterumarmung
Copy link
Contributor Author

unterumarmung commented Sep 12, 2023

@vzakhari
Maybe we should implement this change anyways, but keep the workaround for the 128-bit numbers? I guess they are not as common as any other. And I think it is better to lower to the llvm intrinsics as much as possible?

@vzakhari
Copy link
Contributor

I am not sure that mlir::arith::MaxFOp has a lot of benefits over the select here. Most code generators will recognize the max idiom and optimize it accordingly based on the fast math flags provided. Can you please share the use case that you care about?

@unterumarmung
Copy link
Contributor Author

There is no actual usecase. I just thought that narrowing the workaround down to the cases it is actually needed for would be better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants