Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differences between model versions compiled with ifort, gfortran, and pgi #12

Open
lizaclark opened this issue Feb 18, 2017 · 6 comments

Comments

@lizaclark
Copy link
Contributor

This issue addresses differences between compilations (without explicit double precision flags) of the SHARP branch. The case with double precision flags will be addressed in a separate issue. I used three compilers on hydro-c1 at UCAR with the following flags:

  1. ifort (IFORT) 15.0.2 20150121
    sac_77: ifort -O3 -f77rtl
    snow19: ifort –O3
    UTIL: ifort -O3 -warn all -check all
  2. GNU Fortran (Debian 4.9.2-10) 4.9.2
    sac_77: gfortran -O3 -fno-align-commons -ffree-line-length-none flags77
    snow19: gfortran -O3 -fno-align-commons -ffree-line-length-none flags2
    UTIL: gfortran -O3 -fno-align-commons -ffree-line-length-none
  3. PGI 15.7-0 64-bit target on x86-64 Linux -tp sandybridge
    sac_77: pgf77 -O3
    snow19: pgf90 –O3
    UTIL: pgf90 -O3 -Kieee

Note that the Makefile in this commit does not use any flags for PGI because the FC and FC77 variables do not match pgf90 or pgf77 as expected in the if statements that turn on these flags.

These compilers produced different results, particularly in the wintertime.
swe_orig_diff
Fig. 1. SWE timeseries: The lower set of lines shows swe in mm, plotted on the left axis. The upper set of lines shows the difference in swe in mm between the simulation using PGI and the other two compilers.

@anewman89 suggested checking a change he had made to include a tiny offset in aesc19.f for when the precision of an ascii restart file caused problems in the comparison. tiny is an intrinsic function with tiny(x) - Returns the smallest positive number that can be represented on the current computer for real argument x, but in the current commit, tiny does not call an argument x; it is treated as an uninitialized variable.

When I tried to print out the values of tiny to the screen, the PGI compiled code produced different results than when PGI compiled the same code without a print statement. This suggests a memory error.. I ran the program with valgrind, which identified a leak at the line where tiny is used:

==40656== Conditional jump or move depends on uninitialised value(s)
==40656==    at 0x40421E: aesc19_ (aesc19.f:25)
==40656==    by 0x402857: pack19_ (PACK19.f:218)
==40656==    by 0x40C252: exsnow19_ (exsnow19.f:142)
==40656==    by 0x40F2ED: MAIN_ (multi_driver.f90:282)
==40656==    by 0x401CC3: main (in ~/NWS_hydro_models/Snow17SacUH/src_bin/Snow17SacUH.exe)

The values printed for tiny were inconsistent. Here is a subset of the values printed to the screen in valgrind:

 tiny=  -1.7005811E+38
 tiny=    0.000000    
 tiny=  -1.7005811E+38
 tiny=   8.5123166E-02
 tiny=  -1.7005811E+38
 tiny=   8.3787210E-02
 tiny=  -1.7005811E+38
 tiny=   0.1242791    
 tiny=  -1.7005811E+38
 tiny=    0.000000    
 tiny=  -1.7005811E+38
 tiny=   0.1852178    
 tiny=  -1.7005811E+38
 tiny=    0.000000 

Three equivalent independent solutions that successful matched the PGI compiled output to that of the other compilers (bringing the black line down to match the red and blue lines in Fig. 1, and reproducing simulated streamflow to within +/-0.01 cfs) were:

  1. Use the -Msave flag on f90 and f77: This initializes all undefined values at zero.
  2. Set tiny = 0.0 before line 15 in aesc.f.
  3. Remove tiny completely.

The consistency of these options with ifort and gfortran implies that ifort and gfortran initialize tiny to 0.0 when compiled.

@andywood noted that the precision of the ascii state files has been increased since @anewman89 ran into the issue that required tiny in the first place. I am doing frequent restarts in my application, so I tested that a version of the code without tiny still has exact restarts. I did this by running two simulations from 2005-10-01 to 2006-09-30:

  1. a continuous run for the entire period
  2. a run that stops and restarts from a state file every day.

I compared the output files from the two scenarios, and they are equivalent. This suggests that for better reproducibility, tiny can and should be removed. Doing so is conveniently also truer to the original code.

@andywood
Copy link
Collaborator

andywood commented Feb 18, 2017 via email

@anewman89
Copy link
Collaborator

I agree with Andy's comments, with this and issue #13, going back to the original code and no double precision flags is probably the best route to go.

@lizaclark
Copy link
Contributor Author

lizaclark commented Feb 18, 2017

@andywood:
I attempted to compile and run the model using FC77EXE in place of FCEXE and F77FLAGS in place of FLAGS and FLAGS2 to compile with each of PGI, ifort, and gfortranfor the version without tiny. The runs use the same namelist and input files as the previous runs described in issues #12 and #13.

  1. The PGI compiler returns this error:
/home/eclark/NWS_hydro_models/Snow17SacUH/src_bin//snow19/exsnow19.f:
PGFTN-S-0083-Vector expression used where scalar expression required (/home/eclark/NWS_hydro_models/Snow17SacUH/src_bin//snow19/exsnow19.f: 130)
PGFTN-S-0083-Vector expression used where scalar expression required (/home/eclark/NWS_hydro_models/Snow17SacUH/src_bin//snow19/exsnow19.f: 134)
  0 inform,   0 warnings,   2 severes, 0 fatal for exsnow19
Makefile:181: recipe for target 'compile_calib' failed
make: *** [compile_calib] Error 2
  1. ifort compiles but fails to run with:
 ***** ERROR: Problem reading namelist INIT_CONTROL

forrtl: severe (24): end-of-file during read, unit 30, file /home/eclark/NWS_hydro_models/Snow17SacUH/src_bin/driver/namelist.test, line 41, position 1
Image              PC                Routine            Line        Source             
Snow17SacUH.exe    000000000041BFD4  Unknown               Unknown  Unknown
Snow17SacUH.exe    000000000043D005  Unknown               Unknown  Unknown
Snow17SacUH.exe    0000000000417C17  Unknown               Unknown  Unknown
Snow17SacUH.exe    000000000040C57D  Unknown               Unknown  Unknown
Snow17SacUH.exe    000000000040263E  Unknown               Unknown  Unknown
libc.so.6          00002AFD1BD1EB45  Unknown               Unknown  Unknown
Snow17SacUH.exe    0000000000402539  Unknown               Unknown  Unknown
  1. gfortran compiles and runs to completion, but this is not surprising because the flags for F77FLAGS = FLAGS = FLAGS2 in the Makefile for this compiler option.

@andywood
Copy link
Collaborator

andywood commented Feb 19, 2017 via email

@anewman89
Copy link
Collaborator

Looks like the compile error is in the original code. For example, it allocates SMFV as an array on line 18, then on line 130 it initializes it to zero: SMFV=0 This is allowed in F90+, must not be in F77. Interesting that ifort allows it, but that may be related to why it generates a runtime error?

@andywood
Copy link
Collaborator

andywood commented Feb 19, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants