ice_boundary.F90 internal compiler error under nvhpc 23.11 compiler #23

Closed
areanddee opened this issue Jan 25, 2024 · 5 comments

@areanddee
Contributor

areanddee commented Jan 25, 2024

Background: The CICE model in CESM fails to compile under nvhpc 23.11 when building an F2000dev case. This was discovered while trying to reproduce #PR881 and #PR945 (aerosol_optics_cam.F90 compile and run issues) on Perlmutter using OpenMPI message passing.

The CICE model is used in the CESM model. It is not used in the fully coupled F2000dev compsets in the tested subset of supported EarthWorks configurations.

What goes wrong: The warnings/errors that have been noted from the build log for these runs include:

  1. An MPI warning in ice_global_reductions.F90:

NVFORTRAN-W-0189-Argument number 2 to mpi_allreduce: association of scalar actual argument to array dummy argument (/global/u1/c/cponder/PerlMutter/Applications/CAM/2024-01-11/components/cice/src/cicecore/cicedynB/infrastructure/comm/mpi/ice_global_reductions.F90: 2283)
0 inform, 2 warnings, 0 severes, 0 fatal for global_minval_scalar_int_nodist
Timing stats:
init 80 millisecs 27%
upper 90 millisecs 31%
expand 30 millisecs 10%
carry 30 millisecs 10%
schedule 50 millisecs 17%
assemble 10 millisecs 3%
Total time 290 millisecs

Analysis: This may be an issue, but it is not clear how it would relate to the second, more severe error.
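For reference, warning 0189 is emitted when a scalar is passed where the compiler expects an array buffer. A minimal sketch of one common way to avoid this class of warning is to stage the scalar through a length-1 array. This is an illustration only, not the CICE source: the program name and the variables local_val, sendbuf and recvbuf are hypothetical, and it has not been checked against nvhpc 23.11.

```fortran
program allreduce_scalar_sketch
   ! Illustration only: a global minimum of one integer per rank,
   ! staged through length-1 arrays instead of scalar buffers.
   use mpi
   implicit none

   integer :: ierr, rank
   integer :: local_val, global_min
   integer :: sendbuf(1), recvbuf(1)   ! hypothetical staging buffers

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

   local_val = 100 - rank              ! arbitrary per-rank test value

   ! Both buffers are arrays, so no scalar actual argument is
   ! associated with an array dummy argument.
   sendbuf(1) = local_val
   call MPI_Allreduce(sendbuf, recvbuf, 1, MPI_INTEGER, MPI_MIN, &
                      MPI_COMM_WORLD, ierr)
   global_min = recvbuf(1)

   if (rank == 0) print *, 'global minimum =', global_min

   call MPI_Finalize(ierr)
end program allreduce_scalar_sketch
```

Whether changing the real call in ice_global_reductions.F90 this way is worthwhile is a separate question, since it is only a warning and the build continues past it.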

  2. The second and more critical issue is an internal compiler error on line 1381 of /components/cice/src/cicecore/cicedynB/infrastructure/comm/mpi/ice_boundary.F90
Error message:

Lowering Error: bad ast optype in expression [ast=10110,asttype=12,datatype=0]
Lowering Error: bad ast optype in expression [ast=10108,asttype=12,datatype=0]
Lowering Error: bad ast optype in expression [ast=9470,asttype=38,datatype=0]
NVFORTRAN-S-0000-Internal compiler error. lower_sptr: bad sptr 1 (/global/u1/c/cponder/PerlMutter/Applications/CAM/2024-01-11/components/cice/src/cicecore/cicedynB/infrastructure/comm/mpi/ice_boundary.F90: 1381)
NVFORTRAN-F-0000-Internal compiler error. Errors in Lowering 3 (/global/u1/c/cponder/PerlMutter/Applications/CAM/2024-01-11/components/cice/src/cicecore/cicedynB/infrastructure/comm/mpi/ice_boundary.F90: 1381)
NVFORTRAN/x86-64 Linux 23.11-0: compilation aborted
gmake: *** [/pscratch/sd/c/cponder/SMS_Ln9.f19_f19_mg17.F2000dev.perlmutter_nvhpc.cam-outfrq9s.20240111_194829_l7aiqu/Tools/Makefile:978: ice_boundary.o] Error 2

According to some internet research, Lowering Errors are generally not very useful as diagnostics. FWIW, the offending code pointed at by the compiler is listed here for reference:

!>>> line 1381 follows:
do iblk = 1, halo%numLocalBlocks
   call get_block_parameter(halo%blockGlobalID(iblk), &
                            ilo=ilo, ihi=ihi, &
                            jlo=jlo, jhi=jhi)
   do j = 1,nghost
      array(1:nx_block, jlo-j,iblk) = fill
      array(1:nx_block, jhi+j,iblk) = fill
   enddo
   do i = 1,nghost
      array(ilo-i, 1:ny_block,iblk) = fill
      array(ihi+i, 1:ny_block,iblk) = fill
   enddo
enddo

The subroutine get_block_parameter is located in the ice_blocks module (ice_blocks.F90).

Reproducing: We are working on generating a procedure on either Perlmutter or Derecho for reproducing the ice_boundary.F90 issue.
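A possible starting point for a standalone reproducer is sketched below. It wraps the loop quoted above in a small program, assuming the failure depends only on that loop pattern, which has not been verified. The module blocks_stub, the type halo_t, the block sizes and the kind parameter are hypothetical stand-ins, not the CICE definitions. In the real code the loop sits inside a subroutine where array is a dummy argument, so this sketch may not trigger the internal compiler error.

```fortran
! Hypothetical stand-in for the ice_blocks module; only the call
! shape (keyword optional arguments) is meant to match.
module blocks_stub
   implicit none
   integer, parameter :: dbl_kind = selected_real_kind(13)
   integer, parameter :: nghost = 1, nx_block = 6, ny_block = 5
contains
   subroutine get_block_parameter(block_id, ilo, ihi, jlo, jhi)
      integer, intent(in) :: block_id
      integer, intent(out), optional :: ilo, ihi, jlo, jhi
      ! Every block gets the same physical-domain bounds in this stub.
      if (present(ilo)) ilo = nghost + 1
      if (present(ihi)) ihi = nx_block - nghost
      if (present(jlo)) jlo = nghost + 1
      if (present(jhi)) jhi = ny_block - nghost
   end subroutine get_block_parameter
end module blocks_stub

program halo_fill_repro
   use blocks_stub
   implicit none

   type :: halo_t                      ! reduced, hypothetical halo type
      integer :: numLocalBlocks
      integer, allocatable :: blockGlobalID(:)
   end type halo_t

   type(halo_t) :: halo
   real(dbl_kind) :: array(nx_block, ny_block, 2)
   real(dbl_kind) :: fill
   integer :: i, j, iblk, ilo, ihi, jlo, jhi

   halo%numLocalBlocks = 2
   allocate(halo%blockGlobalID(2))
   halo%blockGlobalID = [1, 2]
   array = 1.0_dbl_kind
   fill  = 0.0_dbl_kind

   ! Same loop structure as the code at line 1381 of ice_boundary.F90.
   do iblk = 1, halo%numLocalBlocks
      call get_block_parameter(halo%blockGlobalID(iblk), &
                               ilo=ilo, ihi=ihi, &
                               jlo=jlo, jhi=jhi)
      do j = 1,nghost
         array(1:nx_block, jlo-j,iblk) = fill
         array(1:nx_block, jhi+j,iblk) = fill
      enddo
      do i = 1,nghost
         array(ilo-i, 1:ny_block,iblk) = fill
         array(ihi+i, 1:ny_block,iblk) = fill
      enddo
   enddo

   ! Sanity check: interior cells untouched, ghost cells set to fill.
   if (any(array(ilo:ihi, jlo:jhi, :) /= 1.0_dbl_kind)) &
      error stop 'interior cells were modified'
   if (sum(array) /= real(2*(ihi-ilo+1)*(jhi-jlo+1), dbl_kind)) &
      error stop 'ghost cells were not filled'
   print *, 'halo fill completed'
end program halo_fill_repro
```

If this compiles cleanly with nvfortran 23.11, the next step would be to move the loop into a module subroutine with array as an assumed-shape dummy argument, closer to how it appears in ice_boundary.F90.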

More directly related to EarthWorks, we plan to run an F2000dev smoke test with patched aerosols in an EarthWorks configuration with MPAS_SeaIce. This may work under nvhpc 23.11, but we have to verify.

@dazlich
Contributor

dazlich commented Feb 20, 2024

I am implementing prescribed sea ice cover in MPASSI so that it can be used in an F2000climo run.

Currently, the F2000climo compset is short for 2000_CAM60_CLM50%SP_CICE%PRES_DOCN%DOM_MOSART_CISM2%NOEVOLVE_SWAV

When this implementation is done, one will specify --compset 2000_CAM60_CLM50%SP_MPASSI%PRES_DOCN%DOM_MOSART_CISM2%NOEVOLVE_SWAV in create_newcase to set up an F2000climo case that doesn't use CICE.

@dazlich
Contributor

dazlich commented Mar 7, 2024

The prescribed ice cover mode for MPASSI is working functionally for F2000climo. The annual cycle repeats properly. Climatology variables like surface temperature appear similar to those from CICE. One discrepancy is that the prescribed ice cover maps are not quite the same between CICE and MPASSI. I believe the MPASSI fields are slightly shifted in time compared to CICE and I'd like to get them to line up.

@areanddee - Would you like the current state of MPASSI to get into the patch and let the fix for the discrepancies get in later, or should we pass over the prescribed mode in MPASSI for the patch and do it all later?

@areanddee
Contributor Author

areanddee commented Mar 7, 2024 via email

@dazlich
Contributor

dazlich commented Mar 7, 2024 via email

@gdicker1 added the bug label Mar 7, 2024
@gdicker1
Contributor

Closed by #32
