fix the x case #397

Merged
merged 6 commits into from
Aug 2, 2023

Conversation

jedwards4b
Collaborator

Description of changes

Fixes the x case by ignoring the rof->ocn coupling in med_internalstate_mod.F90

Specific notes

Contributors other than yourself, if any:

CMEPS Issues Fixed #396

Are changes expected to change answers? (specify if bfb, different at roundoff, more substantial)

Any User Interface Changes (namelist or namelist defaults changes)?

Testing performed

Please describe the tests along with the target model and machine(s)
If possible, please also add the hashes that were used in the testing

@jedwards4b jedwards4b requested a review from mvertens July 28, 2023 15:45
@jedwards4b jedwards4b self-assigned this Jul 28, 2023
@mvertens
Collaborator

@jedwards4b - I got past the problem you encountered by doing the following:

diff --git a/mediator/esmFldsExchange_cesm_mod.F90 b/mediator/esmFldsExchange_cesm_mod.F90
index 66dc57c..b8fba33 100644
--- a/mediator/esmFldsExchange_cesm_mod.F90
+++ b/mediator/esmFldsExchange_cesm_mod.F90
@@ -2158,7 +2158,7 @@ contains
           ! liquid from river and possibly flood from river to ocean
           if (fldchk(is_local%wrap%FBImp(comprof, comprof), 'Forr_rofl' , rc=rc)) then
              if (trim(rof2ocn_liq_rmap) == 'unset') then
-                call addmap_from(comprof, 'Forr_rofl', compocn, mapconsd, 'none', 'unset')
+                call addmap_from(comprof, 'Forr_rofl', compocn, mapconsd, 'one', 'unset')
              else
                 call addmap_from(comprof, 'Forr_rofl', compocn, map_rof2ocn_liq, 'none', rof2ocn_liq_rmap)
              end if
@@ -2182,7 +2182,7 @@ contains
           ! ice from river to ocean
           if (fldchk(is_local%wrap%FBImp(comprof, comprof), 'Forr_rofi' , rc=rc)) then
              if (trim(rof2ocn_ice_rmap) == 'unset') then
-                call addmap_from(comprof, 'Forr_rofi', compocn, mapconsd, 'none', 'unset')
+                call addmap_from(comprof, 'Forr_rofi', compocn, mapconsd, 'one', 'unset')
              else
                 call addmap_from(comprof, 'Forr_rofi', compocn, map_rof2ocn_ice, 'none', rof2ocn_ice_rmap)
              end if
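
To illustrate what switching the normalization argument from 'none' to 'one' could change, here is a minimal Python sketch (not CMEPS code). It assumes that 'one' normalization means dividing the mapped field by the mapped field of ones, which is consistent with the field_normOne argument passed to med_map_field_packed but is not confirmed in this thread. The weight matrix, the source values and the names apply_map and map_field are all hypothetical.

```python
def apply_map(weights, src):
    """Apply a dense weight matrix: dst[i] = sum_j weights[i][j] * src[j]."""
    return [sum(w * s for w, s in zip(row, src)) for row in weights]


def map_field(weights, src, norm="none"):
    """Map src to the destination grid with an optional normalization.

    norm="none": return the raw weighted sums.
    norm="one":  ASSUMPTION: divide by the mapped field of ones, so that
                 partially covered destination cells are rescaled.
    """
    dst = apply_map(weights, src)
    if norm == "none":
        return dst
    if norm == "one":
        ones = apply_map(weights, [1.0] * len(src))
        # Leave cells with no source coverage at zero instead of dividing by zero.
        return [d / n if n != 0.0 else 0.0 for d, n in zip(dst, ones)]
    raise ValueError("unknown normalization: " + norm)


# Hypothetical weights: destination cell 0 is fully covered (row sums to 1),
# destination cell 1 is only half covered (row sums to 0.5).
weights = [[0.5, 0.5, 0.0],
           [0.0, 0.0, 0.5]]
src = [2.0, 4.0, 6.0]

unnormalized = map_field(weights, src, "none")
normalized = map_field(weights, src, "one")
print(unnormalized, normalized)
```

In this toy case the two options agree on the fully covered cell and differ only where the weights do not sum to one.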

It ran successfully for over 3 days, but then crashed while trying to map rof to lnd.

Model Date: 0001-01-02T00:00:00 wall clock = 2023-07-31T10:08:14 avg dt =     1.52 s/day, dt =     1.52 s/day, rate =   155.70 ypd
 memory_write: model date = 0001-01-02T00:00:00 memory =      -0.00 MB (highwater)        234.61 MB (usage)
Model Date: 0001-01-03T00:00:00 wall clock = 2023-07-31T10:08:15 avg dt =     1.57 s/day, dt =     1.63 s/day, rate =   145.41 ypd
 memory_write: model date = 0001-01-03T00:00:00 memory =      -0.00 MB (highwater)        236.16 MB (usage)
Model Date: 0001-01-04T00:00:00 wall clock = 2023-07-31T10:08:17 avg dt =     1.56 s/day, dt =     1.54 s/day, rate =   153.38 ypd
 memory_write: model date = 0001-01-04T00:00:00 memory =      -0.00 MB (highwater)        240.48 MB (usage)
(E
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=690029.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: b5202: task 0: Out Of Memory
srun: launch/slurm: _step_signal: Terminating StepId=690029.0
slurmstepd: error: *** STEP 690029.0 ON b5202 CANCELLED AT 2023-07-31T10:08:27 ***
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source
....
cesm.exe           00000000004D895C  med_map_mod_mp_me        1337  med_map_mod.F90
cesm.exe           00000000004D51E1  med_map_mod_mp_me        1064  med_map_mod.F90
cesm.exe           000000000053F575  med_phases_post_r          56  med_phases_post_rof_mod.F90
.....

The crash is here:

    ! map rof to lnd
    if (is_local%wrap%med_coupling_active(comprof,complnd)) then
       call t_startf('MED:'//trim(subname)//' map_rof2lnd')
       call med_map_field_packed( &
            FBSrc=is_local%wrap%FBImp(comprof,comprof), &
            FBDst=is_local%wrap%FBImp(comprof,complnd), &
            FBFracSrc=is_local%wrap%FBFrac(comprof), &
            field_normOne=is_local%wrap%field_normOne(comprof,complnd,:), &
            packed_data=is_local%wrap%packed_data(comprof,complnd,:), &
            routehandles=is_local%wrap%RH(comprof,complnd,:), rc=rc)
       if (ChkErr(rc,__LINE__,u_FILE_u)) return
       call t_stopf('MED:'//trim(subname)//' map_rof2lnd')
    end if
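
The guard in the snippet above can be pictured as a lookup in a table of active couplings: mapping is attempted only for (source, destination) pairs marked active. The following Python sketch is an illustration only, not the CMEPS implementation. The table contents, the choice to mark rof to ocn as inactive, and the names coupling_active and map_if_active are hypothetical.

```python
# Hypothetical table of active couplings, keyed by (source, destination).
coupling_active = {
    ("rof", "lnd"): True,
    ("rof", "ocn"): False,  # assumption: a coupling that is deliberately ignored
}


def map_if_active(src, dst, mapper, field):
    """Run mapper(field) only if the src -> dst coupling is marked active.

    Returns None when the coupling is inactive or unknown, so no mapping
    work is done for that pair.
    """
    if not coupling_active.get((src, dst), False):
        return None
    return mapper(field)


def double(field):
    """Stand-in for a real regridding operation."""
    return [2 * x for x in field]


mapped = map_if_active("rof", "lnd", double, [1.0, 2.0])
skipped = map_if_active("rof", "ocn", double, [1.0, 2.0])
print(mapped, skipped)
```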

Frankly, I am wondering why we are worrying about rof in an X compset. This is such a special case for mapping, since with either drof or any of the prognostic river components we use custom mapping files.

@billsacks
Member

Also recall the discussion in #334 about problems with the current attempts to create runoff mappings at runtime, and particularly the summary comment in that issue.

@jedwards4b
Collaborator Author

My feeling is that we either get the X case working or completely remove it from cmeps.

@jedwards4b
Collaborator Author

I tried the solution suggested by Mariana - it ran 5 days without issue.

Collaborator

@mvertens mvertens left a comment

It's interesting that you had no problem running this for 5 days. I am comfortable with this change.

mediator/med_map_mod.F90 (review thread: outdated, resolved)
@jedwards4b jedwards4b merged commit 65770e1 into ESCOMP:main Aug 2, 2023
2 checks passed
@jedwards4b jedwards4b deleted the xcase_fix branch August 2, 2023 13:08
Successfully merging this pull request may close these issues.

Case SMS.f10_f10_mg37.X fails
3 participants