[BUG/ISSUE] GCHP crash in dev/gchp_13.0.0 when trying to use the .grid_label field in HISTORY.rc #59

Closed
jmoch1214 opened this issue Nov 6, 2020 · 1 comment
Labels
category: Bug Something isn't working

jmoch1214 commented Nov 6, 2020

Describe the bug:

GCHP crashes when I try to use the ".grid_label" and ".conservative" fields for any collection.

Expected behavior:

GCHP runs successfully and regrids the output from the native cubed-sphere grid to the lat-lon grid.

Actual behavior:

GCHP crashes with errors pointing to MAPL (e.g. MAPL_HistoryGridComp.F90, MAPL_Generic.F90, etc.)

Steps to reproduce the bug:

I built with cmake and ifort 18, in the standard environment used by Lizzie Lundgren.

Run commands

I used the gchp.run script (single run)

Error messages

No message says "add text here", but many say "needs informative message":
 
pe=00000 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00000 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00000 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
pe=00000 FAIL at line=00559    MAPL_CapGridComp.F90                     <status=1>
pe=00001 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00001 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00001 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
pe=00001 FAIL at line=00559    MAPL_CapGridComp.F90                     <status=1>
pe=00001 FAIL at line=00849    MAPL_CapGridComp.F90                     <status=1>
pe=00001 FAIL at line=00322    MAPL_Cap.F90                             <status=1>
pe=00001 FAIL at line=00198    MAPL_Cap.F90                             <status=1>
pe=00001 FAIL at line=00157    MAPL_Cap.F90                             <status=1>
pe=00002 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00002 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00002 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
pe=00002 FAIL at line=00559    MAPL_CapGridComp.F90                     <status=1>
pe=00002 FAIL at line=00849    MAPL_CapGridComp.F90                     <status=1>
pe=00002 FAIL at line=00322    MAPL_Cap.F90                             <status=1>
pe=00002 FAIL at line=00198    MAPL_Cap.F90                             <status=1>
pe=00002 FAIL at line=00157    MAPL_Cap.F90                             <status=1>
pe=00002 FAIL at line=00131    MAPL_Cap.F90                             <status=1>
pe=00002 FAIL at line=00029    GCHPctm.F90                              <status=1>
pe=00003 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00003 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00003 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
pe=00003 FAIL at line=00559    MAPL_CapGridComp.F90                     <status=1>
pe=00003 FAIL at line=00849    MAPL_CapGridComp.F90                     <status=1>
pe=00003 FAIL at line=00322    MAPL_Cap.F90                             <status=1>
pe=00003 FAIL at line=00198    MAPL_Cap.F90                             <status=1>
pe=00003 FAIL at line=00157    MAPL_Cap.F90                             <status=1>
pe=00003 FAIL at line=00131    MAPL_Cap.F90                             <status=1>
pe=00003 FAIL at line=00029    GCHPctm.F90                              <status=1>
pe=00005 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00005 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00005 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
pe=00005 FAIL at line=00559    MAPL_CapGridComp.F90                     <status=1>
pe=00005 FAIL at line=00849    MAPL_CapGridComp.F90                     <status=1>
pe=00005 FAIL at line=00322    MAPL_Cap.F90                             <status=1>
pe=00005 FAIL at line=00198    MAPL_Cap.F90                             <status=1>

...

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 5 in communicator MPI_COMM_WORLD
with errorcode 262146.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
In: PMI_Abort(262146, N/A)
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 21 in communicator MPI_COMM_WORLD
with errorcode 262146.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
In: PMI_Abort(262146, N/A)
--------------------------------------------------------------------------
...
see more in the log file
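
For anyone reading a log like this, the FAIL lines repeat the same traceback once per process (pe), so it helps to collapse them to the unique frames. Below is a minimal sketch, assuming the line format shown above and assuming frames are printed innermost first, as the ordering here suggests. The function name `unique_frames` is hypothetical and is not part of MAPL or GCHP.

```python
import re

# Matches lines such as:
# pe=00000 FAIL at line=01064    MAPL_HistoryGridComp.F90    <needs informative message>
FAIL_RE = re.compile(r"pe=(\d+)\s+FAIL at line=(\d+)\s+(\S+)\s+<(.*)>")


def unique_frames(log_text):
    """Return unique (file, line, message) frames in first-seen order."""
    frames = []
    seen = set()
    for line in log_text.splitlines():
        match = FAIL_RE.search(line)
        if not match:
            continue  # skip MPI_ABORT banners and other non-traceback lines
        frame = (match.group(3), int(match.group(2)), match.group(4))
        if frame not in seen:
            seen.add(frame)
            frames.append(frame)
    return frames


sample_log = """\
pe=00000 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00000 FAIL at line=01829    MAPL_Generic.F90                         <needs informative message>
pe=00001 FAIL at line=01064    MAPL_HistoryGridComp.F90                 <needs informative message>
pe=00001 FAIL at line=00614    MAPL_CapGridComp.F90                     <status=1>
"""

for source_file, line_no, message in unique_frames(sample_log):
    print(f"{source_file}:{line_no}  {message}")
```

Under those assumptions, the first frame reported (MAPL_HistoryGridComp.F90, line 1064) is the place to start looking.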





Required information:

Your GCHP version and runtime environment:

  • GCHPctm version (can be last commit hash): __ dev/gchp_13.0.0
  • MPI type and version: __
  • Fortran compiler type and version: __ifort 18
  • netCDF version: __
  • Are you using GCHP "out of the box" (i.e. unmodified): __ I added full column diagnostics for RRTMG, but these work if I don't try regridding the output
    • If you have modified GCHP, please list what was changed: __ see above

Input and log files to attach

  • runConfig.sh: __
  • input.geos: __
  • HEMCO_Config.rc: __
  • ExtData.rc: __
  • HISTORY.rc: __
  • GCHP compile log file: __
  • GCHP run log file: __
  • HEMCO.log: __
  • slurm.out or any other error messages from your scheduler: __
  • Any other error messages: __

All of the above files are on Cannon in: /n/holyscratch01/jacob_lab/jmoch/geoE_rdirs/GCHP_13.0.0_geoE_off_vtest2
The relevant log file is: slurm-6797176.out

Additional context

gchp.log

@jmoch1214 jmoch1214 added the category: Bug Something isn't working label Nov 6, 2020
Contributor

lizziel commented Nov 6, 2020

Hi @jmoch1214, here is the HISTORY.rc file you are using in case anyone following this issue wants to see it:
HISTORY.rc.txt

You are trying to use the PC144x91-DC lat-lon grid for RRTMG diagnostic output:

  RRTMG.template:   '%y4%m2%d2_%h2%n2z.nc4',
  RRTMG.format:     'CFIO',
  RRTMG.frequency:  010000
  RRTMG.duration:   010000
  RRTMG.mode:       'time-averaged'
  RRTMG.grid_label:    PC144x91-DC
  RRTMG.conservative:  1

This grid is one of the example grids defined at the top of the file:

    # Example of lat-lon global grid at 1x1 resolution
    PC144x91-DC.GRID_TYPE: LatLon
    PC144x91-DC.IM_WORLD: 144
    PC144x91-DC.JM_WORLD: 91
    PC144x91-DC.POLE: PC
    PC144x91-DC.DATELINE: DC
    PC144x91-DC.LM: 72

Using this grid, however, is not going to work, because it is not listed in GRID_LABELS, the list that tells MAPL which grid definitions to read:

GRID_LABELS: PE24x144-CF
             PC360x181-DC
             REGIONAL1x1
    ::

The GRID_LABELS list is similar to the diagnostic collections list in that anything excluded or commented out within it will simply be ignored elsewhere in the file. I think adding PC144x91-DC to the list will make the issue go away.
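
To catch this kind of mismatch before submitting a run, a small pre-flight check can compare every collection's .grid_label against the GRID_LABELS list. This is only a sketch: it assumes the layout shown in the excerpts above (a GRID_LABELS block terminated by "::", and "#" starting a comment), which is a simplification of MAPL's resource file format. The function names `parse_grid_labels` and `find_unlisted_grids` are hypothetical helpers, not part of GCHP or MAPL.

```python
import re


def parse_grid_labels(text):
    """Return the labels in the GRID_LABELS block, which ends at '::'."""
    labels = []
    in_block = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not in_block:
            if not line.startswith("GRID_LABELS:"):
                continue
            in_block = True
            line = line[len("GRID_LABELS:"):].strip()
        if line == "::":
            break
        if line:
            labels.append(line.rstrip(","))
    return labels


def find_unlisted_grids(text):
    """Return (collection, grid_label) pairs whose grid is not in GRID_LABELS."""
    listed = set(parse_grid_labels(text))
    pattern = re.compile(r"^\s*(\w+)\.grid_label:\s*'?([^\s',]+)")
    problems = []
    for raw in text.splitlines():
        match = pattern.match(raw.split("#", 1)[0])
        if match and match.group(2) not in listed:
            problems.append((match.group(1), match.group(2)))
    return problems


example = """\
GRID_LABELS: PE24x144-CF
             PC360x181-DC
             REGIONAL1x1
    ::
  RRTMG.grid_label:    PC144x91-DC
  RRTMG.conservative:  1
"""

print(find_unlisted_grids(example))
```

For the excerpt above this reports the RRTMG collection with PC144x91-DC. Once that label is added to the GRID_LABELS block, the returned list is empty.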

I will update the template to either remove that example or add it to the grid label list so other users don't make this mistake as well.
