Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

Closed

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Jan 8, 2024

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.

When unstrucuted control flow is used inside a collapsed wsloop, the
inner-most (collapsed) loop statements are skipped (loop control is
handled by the wsloop operation). However, blocks are still created
without knowledge of the loop collapse. Without any loop control code
generated for the collapsed loop, no branch would be added at the end of
the block for the end-loop statement, resulting in invalid IR (an
unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next
out-most end-loop statement. This block is already successfully
recognised as the successor block.

This patch adds an additional condition to add a branch to this
successor block. I believe this should only match in cases which
previously would have produced an unterminated block and so this should
not effect any already working code construct.

One alternative solution that was considered was to handle this in the
previous else if case where successor == eval.constrolSuccessor. This
condition does not match here because the control successor is the
beginning of the loop construct, whereas the successor is the next
outer-most loop construct. Removing this condition led to numerous test
failures.

This passes llvm-testsuite and can produce correct results from spec2017
**rate** (no openmp). Please advise if other testing is necessary.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Jan 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 8, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.


Full diff: https://github.com/llvm/llvm-project/pull/77329.diff

2 Files Affected:

  • (modified) flang/lib/Lower/Bridge.cpp (+6)
  • (added) flang/test/Lower/OpenMP/wsloop-unstructured.f90 (+63)
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                successor == eval.controlSuccessor)
         // Exit from a degenerate, empty construct block.
         genBranch(eval.parentConstruct->constructExit->block);
+      else if (successor->block)
+        // Catch case with omp collapse(n) when dealing with an end-do statement
+        // which has been collapsed into an outer loop and unstructured control
+        // flow is used. Create a fall through branch to the successor (the
+        // outer loop). This is not known to affect other cases.
+        genBranch(successor->block);
     }
   }
 
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+  integer, intent(in) :: imax, jmax
+  real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+  integer :: i, j, ii
+
+  ! collapse(2) is needed to reproduce the issue
+  !$omp parallel do collapse(2)
+  do j = 1, jmax
+    do i = 1, imax
+      do  ii = 1, imax ! note that this loop is not collapsed
+        if (x(i,j) < y(ii,j)) then
+          ! exit needed to force unstructured control flow
+          exit
+        endif
+      enddo
+    enddo
+  enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL:   func.func @_QPsub(
+! CHECK-SAME:                      %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME:                      %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME:                      %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME:                      %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK:             omp.wsloop for  (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK:               cf.br ^bb1
+! CHECK:             ^bb1:
+! CHECK:               cf.br ^bb2
+! CHECK:             ^bb2:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb3:
+! [...]
+! CHECK:               %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK:               cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK:             ^bb4:
+! [...]
+! CHECK:               %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK:               cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK:             ^bb5:
+! CHECK:               cf.br ^bb7
+! CHECK:             ^bb6:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb7:
+! CHECK:               cf.br ^bb8
+! CHECK:             ^bb8:
+! CHECK:               omp.yield
+! CHECK:             }
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           cf.br ^bb1
+! CHECK:         ^bb1:
+! CHECK:           return
+! CHECK:         }

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 8, 2024

@llvm/pr-subscribers-flang-openmp

Author: Tom Eccles (tblah)

Changes

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.


Full diff: https://github.com/llvm/llvm-project/pull/77329.diff

2 Files Affected:

  • (modified) flang/lib/Lower/Bridge.cpp (+6)
  • (added) flang/test/Lower/OpenMP/wsloop-unstructured.f90 (+63)
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                successor == eval.controlSuccessor)
         // Exit from a degenerate, empty construct block.
         genBranch(eval.parentConstruct->constructExit->block);
+      else if (successor->block)
+        // Catch case with omp collapse(n) when dealing with an end-do statement
+        // which has been collapsed into an outer loop and unstructured control
+        // flow is used. Create a fall through branch to the successor (the
+        // outer loop). This is not known to affect other cases.
+        genBranch(successor->block);
     }
   }
 
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+  integer, intent(in) :: imax, jmax
+  real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+  integer :: i, j, ii
+
+  ! collapse(2) is needed to reproduce the issue
+  !$omp parallel do collapse(2)
+  do j = 1, jmax
+    do i = 1, imax
+      do  ii = 1, imax ! note that this loop is not collapsed
+        if (x(i,j) < y(ii,j)) then
+          ! exit needed to force unstructured control flow
+          exit
+        endif
+      enddo
+    enddo
+  enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL:   func.func @_QPsub(
+! CHECK-SAME:                      %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME:                      %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME:                      %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME:                      %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK:             omp.wsloop for  (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK:               cf.br ^bb1
+! CHECK:             ^bb1:
+! CHECK:               cf.br ^bb2
+! CHECK:             ^bb2:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb3:
+! [...]
+! CHECK:               %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK:               cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK:             ^bb4:
+! [...]
+! CHECK:               %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK:               cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK:             ^bb5:
+! CHECK:               cf.br ^bb7
+! CHECK:             ^bb6:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb7:
+! CHECK:               cf.br ^bb8
+! CHECK:             ^bb8:
+! CHECK:               omp.yield
+! CHECK:             }
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           cf.br ^bb1
+! CHECK:         ^bb1:
+! CHECK:           return
+! CHECK:         }

@vdonaldson
Copy link
Contributor

[I'm on vacation until 1.16; responses may be slow]

Have you tried other code variations? An exit from or to an intermediate or outer loop in addition to an exit from the innermost loop? An exit that is not at the end of the loop body? Unstructured branches other than exit, such as a simple goto?

Since this is only needed for unstructured code, please add that to the condition, as in the prior degenerate/empty construct case. That also serves to document that restriction.

Is there some simple condition that can be checked to determine that the context is a collapsed loop? If yes, please consider adding that.

Please consider writing a more focused comment. One possibility: // Exit from a collapsed loop nest.

@tblah
Copy link
Contributor Author

tblah commented Jan 9, 2024

Thank you for taking a look. After looking more closely at more code examples, I think I will need to take a different approach to support collapse(n) for n > 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants