This repository has been archived by the owner on Sep 23, 2019. It is now read-only.

Run crash when sending runoff field #4

Closed
aidanheerdegen opened this issue Nov 9, 2017 · 4 comments

Comments

@aidanheerdegen
Contributor

Kial's new 1 degree run with an updated 50 level vertical grid is crashing just after initialisation.

Getting a segfault here:

matm_jra55_b1f482  00000000004ABD5F  mod_oasis_sys_mp_          40  mod_oasis_sys.F90
matm_jra55_b1f482  00000000004B73CF  mod_oasis_advance         531  mod_oasis_advance.F90
matm_jra55_b1f482  0000000000445A42  mod_oasis_getput_         535  mod_oasis_getput_interface.F90
matm_jra55_b1f482  00000000004125DF  cpl_interfaces_mp         375  cpl_interfaces.f90
matm_jra55_b1f482  000000000042C2DB  MAIN__                    464  matm.f90

The call in question is:

call prism_put_proto(il_var_id_out(6), istep1, remapped_runoff, ierror)

I have regridded ocean_temp_salt.nc to his new grid, but are there other restarts we need to update? The ice is on its own vertical grid, and the coupling fields are 2D, so they don't "see" the change in the ocean vertical grid, right?

@aidanheerdegen
Contributor Author

Meant to tag @nicjhan in the above.

@aidanheerdegen changed the title from "Run crash sending runoff field" to "Run crash when sending runoff field" on Nov 9, 2017
@russfiedler

@aidanheerdegen Are you sure that's a segfault? Looks like a standard abort and stack trace by oasis from line 531 in mod_oasis_advance.F90. There should have been an error message 'ERROR model timestep does not match coupling timestep'. Probably a mismatched initial/restart time.

@aidanheerdegen
Contributor Author

Thanks Russ, but it definitely says segfault:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
matm_jra55_b1f482  00000000006BADD1  Unknown               Unknown  Unknown
matm_jra55_b1f482  00000000006B8F0B  Unknown               Unknown  Unknown
matm_jra55_b1f482  000000000066A864  Unknown               Unknown  Unknown
matm_jra55_b1f482  000000000066A676  Unknown               Unknown  Unknown
matm_jra55_b1f482  00000000005F91D9  Unknown               Unknown  Unknown
matm_jra55_b1f482  0000000000603206  Unknown               Unknown  Unknown
libpthread-2.12.s  00002B081FA857E0  Unknown               Unknown  Unknown
matm_jra55_b1f482  00000000005FA851  Unknown               Unknown  Unknown
matm_jra55_b1f482  00000000004ABD5F  mod_oasis_sys_mp_          40  mod_oasis_sys.F90
matm_jra55_b1f482  00000000004B73CF  mod_oasis_advance         531  mod_oasis_advance.F90
matm_jra55_b1f482  0000000000445A42  mod_oasis_getput_         535  mod_oasis_getput_interface.F90
matm_jra55_b1f482  00000000004125DF  cpl_interfaces_mp         375  cpl_interfaces.f90
matm_jra55_b1f482  000000000042C2DB  MAIN__                    464  matm.f90
matm_jra55_b1f482  000000000040BBDE  Unknown               Unknown  Unknown
libc-2.12.so       00002B081FEB5D1D  __libc_start_main     Unknown  Unknown
matm_jra55_b1f482  000000000040BAE9  Unknown               Unknown  Unknown

and there is no message about coupling time steps

Kial's run is here:

/home/157/kxs157/payu/1deg_jra55_ryf9091_kds50
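
The full traceback above mixes resolved frames with many `Unknown` ones. As a rough sketch (not part of the original report), the frames that carry a source file and line number can be pulled out with a short Python filter; applied to the trace above it should leave the same five frames quoted in the first comment. The function name `known_frames` and the regular expression are illustrative assumptions based only on the column layout shown here.

```python
import re

# One traceback frame: image, program counter (hex), routine, line, source.
# The layout is assumed from the forrtl output quoted in this issue.
FRAME_RE = re.compile(
    r"^(?P<image>\S+)\s+(?P<pc>[0-9A-Fa-f]+)\s+(?P<routine>\S+)\s+"
    r"(?P<line>\d+|Unknown)\s+(?P<source>\S+)$"
)


def known_frames(trace_text):
    """Return (routine, line, source) for frames whose source line is resolved."""
    frames = []
    for raw in trace_text.splitlines():
        match = FRAME_RE.match(raw.strip())
        # Skip the header, the forrtl message, and frames reported as Unknown.
        if match and match.group("line") != "Unknown":
            frames.append(
                (match.group("routine"), int(match.group("line")), match.group("source"))
            )
    return frames


if __name__ == "__main__":
    sample = """\
matm_jra55_b1f482  00000000005FA851  Unknown               Unknown  Unknown
matm_jra55_b1f482  00000000004ABD5F  mod_oasis_sys_mp_          40  mod_oasis_sys.F90
matm_jra55_b1f482  00000000004B73CF  mod_oasis_advance         531  mod_oasis_advance.F90
"""
    for routine, line, source in known_frames(sample):
        print(f"{source}:{line} in {routine}")
```

The innermost resolved frame is in mod_oasis_sys.F90, called from line 531 of mod_oasis_advance.F90.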

@aidanheerdegen
Contributor Author

Ok, scratch that. Found this:

 oasis_init_comp OPEN debug file for root pe, unit :        1026
 oasis_io_read_avfile:av2_swfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av3_swfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av4_swfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av5_swfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av2_lwfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av3_lwfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av4_lwfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av5_lwfld_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av2_rain_ai:NetCDF: Variable not found
 oasis_io_read_avfile model :           2  proc :           0
 oasis_io_read_avfile:av3_rain_ai:NetCDF: Variable not found
 oasis_advance_run  ERROR coupling skipped at earlier time,  potential deadlock 
 oasis_advance_run  my coupler =            1  variable = swfld_ai
 oasis_advance_run  current time =         9000  mseclag =        12600
 oasis_advance_run  skipped coupler =            1  variable = swfld_ai
 oasis_advance_run  skipped coupler last time and dt =        -3600       10800
 oasis_advance_run  ERROR model timestep does not match coupling timestep
 oasis_advance_run  abort by model :           2  proc :           0

in MATM debug.root. I'll check it out.
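
For anyone hitting the same abort, a small Python sketch like the one below can summarise an OASIS debug file of this shape: which variables were reported as "Variable not found", which lines carry `ERROR`, and the time values printed by `oasis_advance_run`. The function names and regular expressions are illustrative assumptions derived only from the log text quoted above, not from the OASIS source.

```python
import re


def missing_variables(log_text):
    """Variable names that oasis_io_read_avfile reported as not found."""
    pattern = re.compile(r"oasis_io_read_avfile:(\w+):NetCDF: Variable not found")
    return pattern.findall(log_text)


def error_lines(log_text):
    """All log lines containing the token ERROR, stripped of outer whitespace."""
    return [line.strip() for line in log_text.splitlines() if " ERROR " in line]


def skipped_coupling_times(log_text):
    """Extract the integers printed by oasis_advance_run, or None if absent."""
    current = re.search(r"current time =\s*(-?\d+)\s+mseclag =\s*(-?\d+)", log_text)
    last = re.search(r"last time and dt =\s*(-?\d+)\s+(-?\d+)", log_text)
    if not (current and last):
        return None
    return {
        "current_time": int(current.group(1)),
        "mseclag": int(current.group(2)),
        "last_time": int(last.group(1)),
        "dt": int(last.group(2)),
    }


if __name__ == "__main__":
    sample = """\
 oasis_io_read_avfile:av2_swfld_ai:NetCDF: Variable not found
 oasis_advance_run  ERROR coupling skipped at earlier time,  potential deadlock
 oasis_advance_run  current time =         9000  mseclag =        12600
 oasis_advance_run  skipped coupler last time and dt =        -3600       10800
 oasis_advance_run  ERROR model timestep does not match coupling timestep
"""
    print(missing_variables(sample))
    print(error_lines(sample))
    print(skipped_coupling_times(sample))
```

With the values in the log above, last time plus dt is -3600 + 10800 = 7200, which is less than the reported mseclag of 12600. One possible reading, which is an assumption and has not been checked against the OASIS source, is that a coupling of swfld_ai was expected by 7200 and was skipped, consistent with Russ's suggestion of a mismatched initial/restart time.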
