Create the `provis_state` subpool at RK4 initialization to avoid memory leak #6334
- Avoids create/destroy at each timestep
@matthewhoffman and @akturner, I thought I'd tag you on this since it touches the framework, even though this is apparently the first usage of |
great, thanks @sbrus89 |
@erinethomas and @darincomeau, would you mind taking a look at the framework changes in this PR just in case you see any issues from the sea ice side? |
Repeating post from E3SM-Ocean-Discussion#87 (comment). Passes the nightly test suite and compares bfb with the master branch point on chicoma with optimized gnu and on chrysalis with optimized intel. Also passes the nightly test suite with debug gnu on chicoma. Note this includes a series of RK4 tests:
In E3SM, passes
|
Repeating post from E3SM-Ocean-Discussion#87 (comment). Note that none of these tests use LTS. Passes the nightly test suite and compares bfb with the master branch point on chicoma with optimized gnu and on chrysalis with optimized intel. Also passes the nightly test suite with debug gnu on chicoma. Note this includes a series of RK4 tests:
In E3SM, passes
|
Tested with LTS on, using gnu on chicoma and intel on chrysalis, both debug and optimized. The following tests compare bfb against the master branch point on both.
ocean/dam_break/40cm/default_lts
* step: initial_state
* step: lts_regions
* step: forward
* step: viz
test execution: SUCCESS
baseline comparison: PASS
test runtime: 05:11
ocean/dam_break/120cm/default_lts
* step: initial_state
* step: lts_regions
* step: forward
* step: viz
test execution: SUCCESS
baseline comparison: PASS
test runtime: 01:00
ocean/dam_break/120cm/ramp_lts
* step: initial_state
* step: lts_regions
* step: forward
* step: viz
test execution: SUCCESS
baseline comparison: PASS
test runtime: 01:04
The above tests show that we get the identical answer to master with standard time stepping and LTS. In addition, tests by @gcapodag show that this addition solves an out-of-memory error on perlmutter that he and @jeremy-lilly have been struggling with for the past two months. Thank you @sbrus89!
@xylar and @matthewhoffman these changes are only used by the ocean core, and only for RK4 and LTS timestepping. I think you are safe with a visual review, as I tried to be thorough in my testing of RK4 and LTS and saw BFB results.
|
In line with @mark-petersen's comment above, I was not able to see any place where this would impact MPAS-SI this morning, so I think it's ok on the sea-ice side. |
@mark-petersen, the mpas-ocean
|
I looked over the code and discussed improvements with @sbrus89, which he implemented. I'm happy with the changes now. I agree that the framework changes only affect MPAS-Ocean and only in configurations that are not used in E3SM production simulations.
@sbrus89, I'm not very familiar with this code, so what you've done to solve the problem makes sense to me, but I can't review it very critically. I have two small items, but they are not major concerns.
Create the provis_state subpool at RK4 initialization to avoid memory leak

This PR fixes a memory leak in the RK4 timestepping when running 125-day single-layer barotropic tides cases with the vr45to5 mesh on pm-cpu. Previously, it could only get through about 42 days of simulation before running out of memory. This issue is related to creating/destroying the provis_state subpool at each timestep. Since RK4 is not used in E3SM, this PR is B4B for all E3SM tests. The mpas_pool_copy_pool routine modified here is not used in MPAS-Seaice or MALI.

[BFB]
passes:
merged to next |
merged to master |
This merge updates the E3SM-Project submodule from [93e511d](https://github.com/E3SM-Project/E3SM/tree/93e511d) to [31e0924](https://github.com/E3SM-Project/E3SM/tree/31e0924).

This update includes the following MPAS-Ocean and MPAS-Frameworks PRs (check mark indicates bit-for-bit with previous PR in the list):

- [ ] (ocn) E3SM-Project/E3SM#6256
- [ ] (ocn) E3SM-Project/E3SM#6224
- [ ] (ocn) E3SM-Project/E3SM#6270
- [ ] (ocn) E3SM-Project/E3SM#6293
- [ ] (ocn) E3SM-Project/E3SM#6321
- [ ] (ocn) E3SM-Project/E3SM#6262
- [ ] (ocn) E3SM-Project/E3SM#6300
- [ ] (ocn) E3SM-Project/E3SM#6334
- [ ] (ocn) E3SM-Project/E3SM#6371
- [ ] (ocn) E3SM-Project/E3SM#6288
This PR fixes a memory leak I noticed in the RK4 timestepping when running 125-day single-layer barotropic tides cases with the vr45to5 mesh (MPAS-Dev/compass#802) on pm-cpu. Previously, I could only get through about 42 days of simulation before running out of memory.

This issue is related to creating/destroying the `provis_state` subpool at each timestep. We had a similar problem a few years back that required memory leak fixes in the `mpas_pool_destroy_pool` subroutine (MPAS-Dev/MPAS-Model#367). However, I believe there is still a memory leak in the `mpas_pool_remove_subpool` routine (which calls `pool_remove_member`) that is called following `mpas_pool_destroy_pool`. It seems like the TODO comment here:

E3SM/components/mpas-framework/src/framework/mpas_pool_routines.F
Lines 6036 to 6038 in 6b9ecaa

I'm not familiar enough with the low-level details of the pools framework to track down the memory leak itself. However, in any case, I think it makes more sense to create the `provis_state` subpool once at initialization as opposed to creating and destroying it every timestep. The main consequence of this approach is that the `mpas_pool_copy_pool` subroutine needs to have an `overrideTimeLevel` option similar to the one in `mpas_pool_clone_pool`, which was used previously.

I've tested these changes with the vr45to5 tides test case and they do allow me to run for a full 125 days and are B4B with the previous create/destroy approach. The LTS and FB-LTS timestepping schemes are also updated here in the same way to prevent the same memory leak.
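For readers unfamiliar with the pattern, here is a minimal toy sketch in Python of the "allocate once, copy every step" idea described above. It is not the MPAS Fortran pools API: the `Pool` and `Field` classes, the `clone_pool`/`copy_pool`/`rk4_init`/`rk4_step` helpers, the `override_time_levels` parameter name, and the example field name are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Toy field: levels[i] holds the values at time level i + 1."""
    levels: list


@dataclass
class Pool:
    """Toy stand-in for a pool: named fields plus named subpools."""
    fields: dict = field(default_factory=dict)
    subpools: dict = field(default_factory=dict)


def clone_pool(src, override_time_levels=None):
    """Allocate a new pool shaped like src, optionally with fewer time levels."""
    dest = Pool()
    for name, f in src.fields.items():
        n = override_time_levels or len(f.levels)
        dest.fields[name] = Field([list(level) for level in f.levels[:n]])
    return dest


def copy_pool(src, dest, override_time_levels=None):
    """Copy values from src into dest's existing storage (no new pool)."""
    for name, f in src.fields.items():
        n = override_time_levels or len(f.levels)
        for i in range(n):
            dest.fields[name].levels[i][:] = f.levels[i]


def rk4_init(structs):
    """Create the provisional subpool once, with a single time level."""
    state = structs.subpools["state"]
    structs.subpools["provis_state"] = clone_pool(state, override_time_levels=1)


def rk4_step(structs):
    """Refresh the provisional state in place instead of create/destroy."""
    state = structs.subpools["state"]
    provis = structs.subpools["provis_state"]
    copy_pool(state, provis, override_time_levels=1)
    return provis


# Toy driver: the state has two time levels, the provisional copy only one.
structs = Pool()
structs.subpools["state"] = Pool(
    fields={"layerThickness": Field([[1.0, 2.0], [0.0, 0.0]])}
)
rk4_init(structs)
provis_id = id(structs.subpools["provis_state"])
for _ in range(3):
    structs.subpools["state"].fields["layerThickness"].levels[0][0] += 1.0
    rk4_step(structs)
```

In this sketch the provisional pool object created in `rk4_init` is reused on every step, so no pool is added to or removed from `structs` inside the time loop. The time-level override is needed on the copy because the source has more time levels than the destination.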
Since RK4 is not used in E3SM, this PR is B4B for all E3SM tests. The `mpas_pool_copy_pool` routine modified here is not used in MPAS-Seaice or MALI. See the Ocean-Discussions PR for additional background and testing.

[BFB]