Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error message is not returned correctly by gfortran 9+ #24

Closed
dhnza opened this issue Feb 17, 2021 · 10 comments
Closed

Error message is not returned correctly by gfortran 9+ #24

dhnza opened this issue Feb 17, 2021 · 10 comments

Comments

@dhnza
Copy link

dhnza commented Feb 17, 2021

This is related to GNU 93762.

In short, when passing optional deferred-length strings through multiple calls, gfortran doesn't propagate the string's length up the chain. The string length at the top of the chain is uninitialized, which can lead to allocation errors like

Operating system error: Cannot allocate memory
Memory allocation failure in xrealloc

This bug is triggered whenever an errmsg is returned by one of the parameter_list procedures, for example

call params%get('foo', x, stat=stat, errmsg=errmsg)   ! 'foo' is not in the parameter list
if (stat/=0) print *, errmsg                          ! error

If integers are default initialized to 0, such as with the flag -finit-integer=0, then the above error is not catastrophic but an empty string is returned.

@dhnza
Copy link
Author

dhnza commented Feb 17, 2021

As an example, when the error subroutine is called by get_scalar_int32 in parameter_list_type, the compiler bug is triggered because the errmsg argument is optional in both subroutines.

Here's a minimal reproducer that mimics the get_scalar -> error call chain mentioned above.

module foo
contains
  subroutine set(msg)
    character(:), allocatable, intent(out), optional :: msg
    if (present(msg)) msg = 'foo'
  end subroutine set

  subroutine wrap(msg)
    character(:), allocatable, intent(out), optional :: msg
    call set(msg)
  end subroutine wrap
end module foo

program minimal
  use foo
  character(:), allocatable :: msg
  call wrap(msg)
  print *, msg
end program minimal

Note that removing the optional attribute from either subroutine solves the issue.

So one possible fix is to have two error subroutines, one with a non-optional errmsg argument, and one without the argument. One level up the chain, call the right error routine based on whether errmsg is present or not.

@nncarlson
Copy link
Owner

Thanks for the report David. I'll see what I can do.

@nncarlson
Copy link
Owner

I've been digging into this problem. For poking around I've taken your example and added another level of "pass-through" call like your wrap to give the more general situation. And yes, the error is triggered by a call where the dummy argument is an optional deferred-length allocatable character, and the corresponding actual argument is likewise an optional deferred-length character in the calling routine. The length of the allocated character is not being propagated from the dummy back to the actual. In any other situation things seems to work okay.

I had hoped to be able to perhaps fiddle with the error routine (by possibly adding calls to auxiliary procedures with non-optional args) to fix things since that's a single point in the flow. But that's not possible. The solution you propose would take care of the "wrap -> set" call, but that involves making workarounds in dozens of places, and if only fixes things for one hierarchy of calls.

Another fix (equally obnoxious) is to always pass a local deferred-length allocatable character as the actual argument, and then copy the returned temporary to the original optional dummy variable.

So I'm not sure how I'm going to handle this. I'm tempted to alter the API for gfortran to make the error handling arguments non-optional. That hack may be cleaner. The bigger issue for us is this gfortran bug affects Truchas as well; we make a lot of use of optional deferred-length allocatable character arguments.

@nncarlson
Copy link
Owner

Incidentally, in my poking around I was printing the address of the msg argument as it was passed down through the calls and back up,

      print '(z16.16)', transfer(c_loc(msg), 1_int64)

and it was curious that the address was always the same (and it was the start of the actual string). It was if the generated code was treating msg as non-allocatable, in part. When I removed the optional bit, then the address started changing as expected for an allocatable.

@nncarlson
Copy link
Owner

@dhnza I turned your example to a test case. It includes an example of the workaround I mentioned.

@dhnza
Copy link
Author

dhnza commented Mar 19, 2021

I looked into this again, but couldn't think of any better workarounds than the ones you pointed out. At least not any fully safe ones.

There's a simple fix, but it relies on undefined behavior. By removing the optional attribute from the sub2 function in your test case, everything works. The problem, of course, is that it will segfault in the case that no msg is passed.

However, this may be an ok solution for Truchas if we know that an actual errmsg is only passed if an actual stat is passed. We could then guard the write to the errmsg with a check for a present stat. So the error routine in paramter_list_type.F90 would become

  subroutine error (errmsg_, stat, errmsg)
    use,intrinsic :: iso_fortran_env, only: error_unit
    character(*), intent(in) :: errmsg_
    integer, intent(out), optional :: stat
    character(:), allocatable, intent(out) :: errmsg
    if (present(stat)) then
      stat = 1
      if (present(stat)) errmsg = errmsg_
    else
      write(error_unit,'(a)') 'ERROR: ' // errmsg_
      stop 1
    end if
  end subroutine error

It's kind of janky, but I think it should work given the above condition on stat and errmsg. The good thing about this fix is that you only need to change the error routine. I'm guessing that the standard allows passing a dummy optional argument as an actual non-optional argument, or at least gfortran compiles it.

@nncarlson
Copy link
Owner

I think this is a great solution @dhnza. The bug arises in an "A calls B" situation where errmsg is optional in both. Fortunately it looks like there is only one level of such calls in parameter_list_type.F90 (with B being the error routine and A an interface routine), so this modification side steps the bug. Moreover it's simple to explain to a gfortran user: if the optional argument stat is passed you must also pass the optional errmsg argument. If they violate that and an error occurs it will segfault. My fix would prevent that by being a compile-time check, but it was turning out to be incredibly messy and I abandoned it 3/4 of the way through. This possibility of segfaulting if the constraint isn't followed I can easily live with for gfortran.

@nncarlson
Copy link
Owner

Argh. I spoke too soon. Compiles fine, but segfaults on the call to error when it goes to setup the call frame for errmsg when it isn't present (e.g., if neither stat or errmsg is passed).

@nncarlson
Copy link
Owner

Actually it does work with a slight tweak! The segfault is apparently coming from the deallocation of errmsg that happens on entry to error due to the intent(out). Dropping the intent(out) fixes things.

nncarlson added a commit that referenced this issue Mar 31, 2021
Workaround for gfortran bug (resolves #24)
@dhnza
Copy link
Author

dhnza commented Apr 2, 2021

Great, I'm glad the fix worked out in the end!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants