Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NaN processing in adaptive_inflate_mod.f90 #176

Closed
timhoar opened this issue Mar 9, 2021 · 7 comments
Closed

NaN processing in adaptive_inflate_mod.f90 #176

timhoar opened this issue Mar 9, 2021 · 7 comments
Labels
fortran standards compiler issues and (non) standard-compliant code

Comments

@timhoar
Copy link
Contributor

timhoar commented Mar 9, 2021

We employ several different methods to test for NaNs, not sure which one is best, or if we need to be consistent.

In adaptive_inflate_mod.f90 there is code:

   ! Computational errors check (small numbers + NaNs)
   if (abs(density_1) <= TINY(0.0_r8) .OR. &
       abs(density_2) <= TINY(0.0_r8) .OR. &
       density_1 /= density_1 .OR. density_1 /= density_1 .OR. &
       density_2 /= density_2 .OR. density_2 /= density_2) then
      new_cov_inflate_sd = lambda_sd
      return
   endif

This code does the same test on density_1 - twice - and density_2 - twice - but before changing the code, I want to know what method to use. In other codes, we have employed the test
if (var < 0.0_r8 .and. var > 0.0_r8) to test for NaNs.

@nancycollins
Copy link
Collaborator

i think (var /= var) is an ok test. the duplicates aren't needed but they don't hurt anything.

that other line is not quite the right test in your note above.
var, if NaN, always fails tests, so that line as written would evaluate as:

if (.false. .and. .false.) then nan ! which will never be true.

it would have to be:

if ((.not. var <= 0.0_r8 ).and. ( .not. var > 0.0_r8)) then nan
! note <= and > because otherwise var == 0 would wrongly indicate nan

which isn't the most direct thing to read but works.

@nancycollins
Copy link
Collaborator

something about this issue was bugging me but i couldn't figure out what it was. i finally figured it out.

regular code should try very hard to never generate NaNs ever. it shouldn't then need to test for them.

if you do suspect nans are being generated, you should test for them and then error out immediately. then figure out how to avoid the case that led to the nan.

here's why. there are compile-time flags that will trap when a nan is used as an operand. this is a very helpful way to debug an unexpected nan. but if we have code that routinely generates nans and then replaces the bad values with good, the program will stop on this nan if it happens before the unexpected nan. it makes debugging very much harder.

@hkershaw-brown
Copy link
Member

@nancycollins (sanity) check this out: (NaN /= NaN) is false for pgi/nvfortran unless you compile with the flag -Kieee

hkershaw@cheyenne5:$ cat test_nan.f90 
program test_nan

implicit none

real :: a, x

x = -1
a = sqrt(x)

print*, a

print*, a /= a
print*, a == a

if (a /= a) then
  print*, "you have a NaN"
endif

end program test_nan
hkershaw@cheyenne5:$ gfortran test_nan.f90 
hkershaw@cheyenne5:$ ./a.out 
              NaN 
 T
 F
 you have a NaN 
hkershaw@cheyenne5:$ nvfortran test_nan.f90 
hkershaw@cheyenne5:$ ./a.out 
             NaN 
  F
  T
hkershaw@cheyenne5:$ nvfortran -Kieee test_nan.f90 
hkershaw@cheyenne5:$ ./a.out 
             NaN 
  T
  F
 you have a NaN 

@nancycollins
Copy link
Collaborator

my understanding was always that any kind of test that included a NaN as at least one of the operations returned false. but the test (a /= a) is like asking is "a" Not Not equal to "a"? so maybe returning true in that case is ok and i could live with gfortran's choices. (ifort is the same - see below) but nvfortran?

i really don't like nvfortran saying (a == a) is true even if a is NaN. i wonder what the performance penalty for always compiling with -Kieee would be?

for completeness, ifort on cheyenne gives:

cheyenne1> ifort -o bob bob.f90
cheyenne1> ./bob
NaN
T
F
you have a NaN

@hkershaw-brown
Copy link
Member

my understanding was always that any kind of test that included a NaN as at least one of the operations returned false.

me too, that is IEEE 754. But the nvidia default is not to use IEEE 754 ( I think other compilers you would have to set -fastmath to do this).
https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-ref-guide/#k-flag

Anyway my brain just dribbled out of my ears running this test. It makes moving to Derecho a lot more interesting.

@nancycollins
Copy link
Collaborator

i heard in irfan's talk this morning that they're trying to get derecho up and running by the end of this year, with early adopters getting access in spring 2023. maybe the compiler defaults will change by then?

@hkershaw-brown hkershaw-brown reopened this Apr 4, 2022
@hkershaw-brown hkershaw-brown added the fortran standards compiler issues and (non) standard-compliant code label Apr 5, 2022
hkershaw-brown added a commit to Benjamin-Gunn/DART that referenced this issue Apr 5, 2022
@hkershaw-brown
Copy link
Member

Added -Kieee to pgi mkmf.template in 52a7056

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fortran standards compiler issues and (non) standard-compliant code
Projects
None yet
Development

No branches or pull requests

3 participants