Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Produce warning instead of error when bound checking arrays #83011

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mjklemm
Copy link
Contributor

@mjklemm mjklemm commented Feb 26, 2024

Fixes issue #82684.

While out-of-bounds accesses are not permitted by the Fortran standard, the compiler should not produce a fatal error but emit a warning. This is inline with other compilers and can easily be turned into an error via -Werror.

@mjklemm mjklemm added flang Flang issues not falling into any other category flang:semantics labels Feb 26, 2024
@mjklemm mjklemm self-assigned this Feb 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 26, 2024

@llvm/pr-subscribers-flang-semantics

Author: Michael Klemm (mjklemm)

Changes

Fixes issue #82684.

While out-of-bounds accesses are not permitted by the Fortran standard, the compiler should not produce a fatal error but emit a warning. This is inline with other compilers and can easily be turned into an error via -Werror.


Full diff: https://github.com/llvm/llvm-project/pull/83011.diff

2 Files Affected:

  • (modified) flang/lib/Semantics/expression.cpp (+5-6)
  • (modified) flang/test/Semantics/expr-errors06.f90 (+13-13)
diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp
index 8d817f077880b9..48b23f82291e42 100644
--- a/flang/lib/Semantics/expression.cpp
+++ b/flang/lib/Semantics/expression.cpp
@@ -416,12 +416,11 @@ void ExpressionAnalyzer::CheckConstantSubscripts(ArrayRef &ref) {
           msg =
               "Subscript %jd is greater than upper bound %jd for dimension %d of array"_err_en_US;
           bound = *dimUB;
-          if (dim + 1 == arraySymbol.Rank() && IsDummy(arraySymbol) &&
-              *bound == 1) {
-            // Old-school overindexing of a dummy array isn't fatal when
-            // it's on the last dimension and the extent is 1.
-            msg->set_severity(parser::Severity::Warning);
-          }
+          // Even though this constitudes a violation of the Fortran
+          // standard, we should use a warning instead of a fatal error.
+          // This is in line with other compilers and can be turned
+          // into a fatal error using -Werror.
+          msg->set_severity(parser::Severity::Warning);
         }
         if (msg) {
           AttachDeclaration(
diff --git a/flang/test/Semantics/expr-errors06.f90 b/flang/test/Semantics/expr-errors06.f90
index 84872c7fcdbc58..c0293104daf01f 100644
--- a/flang/test/Semantics/expr-errors06.f90
+++ b/flang/test/Semantics/expr-errors06.f90
@@ -7,35 +7,35 @@ subroutine subr(da)
   !ERROR: DATA statement designator 'a(0_8)' is out of range
   !ERROR: DATA statement designator 'a(11_8)' is out of range
   data a(0)/0./, a(10+1)/0./
-  !ERROR: Subscript 0 is less than lower bound 1 for dimension 1 of array
+  !WARNING: Subscript 0 is less than lower bound 1 for dimension 1 of array
   print *, a(0)
-  !ERROR: Subscript 0 is less than lower bound 1 for dimension 1 of array
+  !WARNING: Subscript 0 is less than lower bound 1 for dimension 1 of array
   print *, a(1-1)
-  !ERROR: Subscript 11 is greater than upper bound 10 for dimension 1 of array
+  !WARNING: Subscript 11 is greater than upper bound 10 for dimension 1 of array
   print *, a(11)
-  !ERROR: Subscript 11 is greater than upper bound 10 for dimension 1 of array
+  !WARNING: Subscript 11 is greater than upper bound 10 for dimension 1 of array
   print *, a(10+1)
-  !ERROR: Subscript value (0) is out of range on dimension 1 in reference to a constant array value
+  !WARNING: Subscript value (0) is out of range on dimension 1 in reference to a constant array value
   print *, n(0)
-  !ERROR: Subscript value (3) is out of range on dimension 1 in reference to a constant array value
+  !WARNING: Subscript value (3) is out of range on dimension 1 in reference to a constant array value
   print *, n(4-1)
   print *, a(1:12:3) ! ok
-  !ERROR: Subscript 13 is greater than upper bound 10 for dimension 1 of array
+  !WARNING: Subscript 13 is greater than upper bound 10 for dimension 1 of array
   print *, a(1:13:3)
   print *, a(10:-1:-3) ! ok
-  !ERROR: Subscript -2 is less than lower bound 1 for dimension 1 of array
+  !WARNING: Subscript -2 is less than lower bound 1 for dimension 1 of array
   print *, a(10:-2:-3)
   print *, a(-1:-2) ! empty section is ok
   print *, a(0:11:-1) ! empty section is ok
-  !ERROR: Subscript 0 is less than lower bound 1 for dimension 1 of array
+  !WARNING: Subscript 0 is less than lower bound 1 for dimension 1 of array
   print *, a(0:0:unknown) ! lower==upper, can ignore stride
-  !ERROR: Subscript 11 is greater than upper bound 10 for dimension 1 of array
+  !WARNING: Subscript 11 is greater than upper bound 10 for dimension 1 of array
   print *, a(11:11:unknown) ! lower==upper, can ignore stride
-  !ERROR: Subscript 0 is less than lower bound 1 for dimension 1 of array
+  !WARNING: Subscript 0 is less than lower bound 1 for dimension 1 of array
   print *, da(0,1)
-  !ERROR: Subscript 3 is greater than upper bound 2 for dimension 1 of array
+  !WARNING: Subscript 3 is greater than upper bound 2 for dimension 1 of array
   print *, da(3,1)
-  !ERROR: Subscript 0 is less than lower bound 1 for dimension 2 of array
+  !WARNING: Subscript 0 is less than lower bound 1 for dimension 2 of array
   print *, da(1,0)
   !WARNING: Subscript 2 is greater than upper bound 1 for dimension 2 of array
   print *, da(1,2)

Copy link
Contributor

@klausler klausler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What application requires this change?

@mjklemm
Copy link
Contributor Author

mjklemm commented Feb 26, 2024

We have one customer workload that we cannot disclose publicly (it's sort of a cummunity code in Europe), plus NWChem has similar patterns (even though they go undetected). GFortran also just warns and does not error out by default.

@klausler
Copy link
Contributor

We have one customer workload that we cannot disclose publicly (it's sort of a cummunity code in Europe), plus NWChem has similar patterns (even though they go undetected). GFortran also just warns and does not error out by default.

@jeffhammond is this true about nwchem?

@jeffhammond
Copy link
Member

jeffhammond commented Feb 26, 2024

if it's true in NWChem, it's a bug and we will fix it. i want to know about these errors by default. i don't recall this issue when i compiled NWChem with flang-new, FWIW.

@mjklemm
Copy link
Contributor Author

mjklemm commented Feb 26, 2024

if it's true in NWChem, it's a bug and we will fix it. i want to know about these errors by default. i don't recall this issue when i compiled NWChem with flang-new, FWIW.

It's the usage of mb_dbl and its friends. All their accesses through out the code are outside of the their definition, which is mb_dbl(2) in a common block.

Oh yes, these go undetected, as their are based on accesses via variables not constants. But still...

@jeffhammond
Copy link
Member

jeffhammond commented Feb 26, 2024

ugh, if dbl_mb triggers this then all of NWChem is broken with bounds checking.

@klausler for context, dbl_mb is used to implement a heap allocator. it's a common block member and C code increases the size of it, in spite of its declaration. the whole thing is trash but rewriting to use allocate would take years.

in theory, we can increase the size by some amount that makes it much harder for the compiler to prove OOB errors (that aren't really errors, thanks to C) but i don't know how well this will work.

@mjklemm
Copy link
Contributor Author

mjklemm commented Feb 26, 2024

ugh, if dbl_mb triggers this then all of NWChem is broken with bounds checking.

NWChem does not trigger the warning, as the compiler is not smart enough. But that does not mean the problem does not exist. So, yes, the problem is there for NWChem and it is badly UB all over the place.

@klausler for context, dbl_mb is used to implement a heap allocator. it's a common block member and C code increases the size of it, in spite of its declaration. the whole thing is trash but rewriting to use allocate would take years.

in theory, we can increase the size by some amount that makes it much harder for the compiler to prove OOB errors (that aren't really errors, thanks to C) but i don't know how well this will work.

So, even though the problem does not show up, I'd argue that a less harsh reaction of Flang on case where it can detect that seems useful, given that other compilers are either completely silent about this detectable case (ifort, ifx) or prefer to warn (gfortran).

@klausler
Copy link
Contributor

So, even though the problem does not show up, I'd argue that a less harsh reaction of Flang on case where it can detect that seems useful, given that other compilers are either completely silent about this detectable case (ifort, ifx) or prefer to warn (gfortran).

It's a fatal error by default with better compilers (NAG & XLF).

@jeffhammond
Copy link
Member

indeed, i agree it should be fatal by default because most of the uses are bad and should be trapped.

@kkwli
Copy link
Collaborator

kkwli commented Feb 26, 2024

So, even though the problem does not show up, I'd argue that a less harsh reaction of Flang on case where it can detect that seems useful, given that other compilers are either completely silent about this detectable case (ifort, ifx) or prefer to warn (gfortran).

It's a fatal error by default with better compilers (NAG & XLF).

Not really for XLF.

$ cat tt.f90

program main
  implicit none
 
  integer, parameter :: N = ONE_OR_TWO
  integer :: array(N)
 
  array(1) = 1
  if (N > 1) then
    array(2) = 2
  endif
 
  write(*,*) array
 
end program main

$ xlf90 -qpreprocess -DONE_OR_TWO=1 tt.f90
"tt.f90", line 10.5: 1516-023 (E) Subscript is out of bounds.
** main   === End of Compilation 1 ===
1501-510  Compilation successful for file tt.f90.

@mjklemm
Copy link
Contributor Author

mjklemm commented Feb 26, 2024

I just came across gcc behavior:

/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h: In member function ‘llvm::Error llvm::pdb::HashTable<ValueT>::load(llvm::BinaryStreamReader&) [with ValueT = llvm::support::detail::packed_endian_specific_integral<unsigned int, llvm::endianness::little, 1>]’:
/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h:130:15: warning: array subscript 2 is above array bounds of ‘const BitWord [2]’ {aka ‘const long unsigned int [2]’} [-Warray-bounds]
  130 |       if (Bits[i] != 0)
      |           ~~~~^
/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h:55:11: note: while referencing ‘llvm::SparseBitVectorElement<128>::Bits’
   55 |   BitWord Bits[BITWORDS_PER_ELEMENT];
      |           ^~~~

So, Flang behavior would be consistent with gcc and g++, too.

@klausler
Copy link
Contributor

I just came across Clang's behavior:

/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h: In member function ‘llvm::Error llvm::pdb::HashTable<ValueT>::load(llvm::BinaryStreamReader&) [with ValueT = llvm::support::detail::packed_endian_specific_integral<unsigned int, llvm::endianness::little, 1>]’:
/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h:130:15: warning: array subscript 2 is above array bounds of ‘const BitWord [2]’ {aka ‘const long unsigned int [2]’} [-Warray-bounds]
  130 |       if (Bits[i] != 0)
      |           ~~~~^
/home/micklemm/git/llvm/llvm-project/llvm/include/llvm/ADT/SparseBitVector.h:55:11: note: while referencing ‘llvm::SparseBitVectorElement<128>::Bits’
   55 |   BitWord Bits[BITWORDS_PER_ELEMENT];
      |           ^~~~

So, Flang behavior would be consistent with its C/C++ counterpart.

Fortran is neither C nor C++.

@mjklemm
Copy link
Contributor Author

mjklemm commented Feb 26, 2024

Fortran is neither C nor C++.

I understand that, even though sometimes I misplace ';' in the code. However, it seems to hint at a certain behavior that users may expect from other compilers and that can make it easier when when moving between compilers and languages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:semantics flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants