Modify mpas framework to match mpasa in EW v2.1 #6

Merged: 14 commits into ew-develop on Feb 20, 2024

Conversation

dazlich
Contributor

@dazlich dazlich commented Feb 19, 2024

EW v2.1 broke the coupled model because of inconsistencies between the mpas framework in the atmosphere and in the ocean.

This PR restores the consistency between the two frameworks.

This file is reverted to its previous version to undo mistaken match to mpasa framework in last commit.
Removed to match the new v2.1 mpasa framework.
Removed to match the new v2.1 mpasa framework.
Removed to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
Modified to match the new v2.1 mpasa framework.
@dazlich
Contributor Author

dazlich commented Feb 19, 2024

@gdicker1 - what's the tag that will be assigned to this? And do we want this to get into the main branch as well?

@gdicker1
Contributor

I'm thinking this change (once merged) will be tagged mpasfrwk-ew2.1.001. I agree this should be merged onto main ASAP, maybe as a patch release for EarthWorks overall. I'm running my tests now so I can approve.


@gdicker1 gdicker1 left a comment

For now though, shouldn't this PR go to the ew-develop branch? We can merge or fast-forward the change onto ew-main pretty easily.

@dazlich
Contributor Author

dazlich commented Feb 20, 2024 via email

@dazlich
Contributor Author

dazlich commented Feb 20, 2024

I'm stumped figuring out how to change this from ew-main to ew-develop

@gdicker1 gdicker1 changed the base branch from ew-main to ew-develop February 20, 2024 20:53
@gdicker1
Contributor

> I'm stumped figuring out how to change this from ew-main to ew-develop

I can change it then!

@gdicker1
Contributor

FYI: I can do this by clicking the "edit" button next to the PR title. The text below the title then changes from "... merge 14 commits to ew-main ..." to "... merge 14 commits to base:ew-main ...". Clicking base:ew-main opens a drop-down menu where I can switch to ew-develop. A "change base" pop-up then appears; clicking the green button in the pop-up finishes the base change.

Changing the other branch is a different deal! (Basically create a new PR or do some rebasing+force-pushing of that branch)

@dazlich
Contributor Author

dazlich commented Feb 20, 2024 via email

@gdicker1 gdicker1 self-requested a review February 20, 2024 21:09
@gdicker1 gdicker1 self-requested a review February 20, 2024 21:15
@dazlich dazlich merged commit 2de4b39 into ew-develop Feb 20, 2024

@gdicker1 gdicker1 left a comment

It looks like this allows things to work on the CPU side, but GPU FullyCoupled builds specifically still fail.


From "/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/cesm.bldlog.240220-113659"

nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_3d_real_acc_8350_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_3d_real_acc_8308_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_2d_real_acc_7433_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_2d_real_acc_7393_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_1d_real_acc_6519_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink error   : Multiple definition of 'mpas_dmpar_mpas_dmpar_exch_halo_1d_real_acc_6481_gpu' in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libocn.a:mpas_dmpar.o', first defined in '/glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/lib/libice.a:mpas_dmpar.o'
nvlink fatal   : merge_elf failed
pgacclnk: child process exit status 2: /glade/u/apps/common/23.04/spack/opt/spack/nvhpc/23.5/Linux_x86_64/23.5/compilers/bin/tools/nvdd
gmake: *** [/glade/work/gdicker/EarthWorks/EWRepo_PullRequests/2024Feb20_MPASfrwk6/cases/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/Tools/Makefile:978: /glade/derecho/scratch/gdicker/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc/bld/cesm.exe] Error 2

This was created with the following command:

> /glade/work/gdicker/EarthWorks/EWRepo_PullRequests/2024Feb20_MPASfrwk6/EarthWorks/cime/scripts/create_newcase --case /glade/work/gdicker/EarthWorks/EWRepo_PullRequests/2024Feb20_MPASfrwk6/cases/2024Feb20_113609_EWMTesting_FullyCoupled.mpasa120.derecho.nvhpc --project UCSU0085 --compiler nvhpc --res mpasa120_oQU120 --compset 2000_CAM60_CLM50%SP_MPASSI_MPASO_SROF_SGLC_SWAV --driver nuopc --run-unsupported -i /glade/campaign/univ/ucsu0085/inputdata --ngpus-per-node 4 --gpu-type a100 --gpu-offload openacc --pecount 128

Further record of 120km and 60km GPU FullyCoupled case creation/setup on Derecho: "/glade/work/gdicker/EarthWorks/EWRepo_PullRequests/2024Feb20_MPASfrwk6/EarthWorks/tools/Earthworks_scripts/EWv2_CreateBuildRun/log.setup.2024Feb20_113609_EWMTesting.FullyCoupled.txt"
Further record of attempt to build cases: "/glade/work/gdicker/EarthWorks/EWRepo_PullRequests/2024Feb20_MPASfrwk6/EarthWorks/tools/Earthworks_scripts/EWv2_CreateBuildRun/log.buildrun.2024Feb20_113609_EWMTesting.FullyCoupled.txt"
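
The nvlink errors above all follow one pattern, so a short script can tally which archive members define the same device symbol. This is only a minimal sketch for reading such a build log: the function names (`archive_member`, `summarize_duplicates`) and the shortened `/bld/lib/...` paths in the sample are hypothetical, and the regular expression assumes the message format shown in this log.

```python
import re
from collections import defaultdict

# One nvlink duplicate-definition line, in the format seen in the build log above.
NVLINK_DUP = re.compile(
    r"nvlink error\s*:\s*Multiple definition of '(?P<symbol>[^']+)' "
    r"in '(?P<dup>[^']+)', first defined in '(?P<first>[^']+)'"
)


def archive_member(path):
    """Reduce '/long/path/lib/libocn.a:mpas_dmpar.o' to 'libocn.a:mpas_dmpar.o'."""
    return path.rsplit("/", 1)[-1]


def summarize_duplicates(log_text):
    """Group duplicated symbols by (first definition, duplicate definition)."""
    groups = defaultdict(list)
    for match in NVLINK_DUP.finditer(log_text):
        key = (archive_member(match["first"]), archive_member(match["dup"]))
        groups[key].append(match["symbol"])
    return dict(groups)


if __name__ == "__main__":
    # Symbol names are copied from the log; the paths are shortened placeholders.
    sample = (
        "nvlink error   : Multiple definition of "
        "'mpas_dmpar_mpas_dmpar_exch_halo_3d_real_acc_8350_gpu' in "
        "'/bld/lib/libocn.a:mpas_dmpar.o', first defined in "
        "'/bld/lib/libice.a:mpas_dmpar.o'\n"
        "nvlink error   : Multiple definition of "
        "'mpas_dmpar_mpas_dmpar_exch_halo_2d_real_acc_7433_gpu' in "
        "'/bld/lib/libocn.a:mpas_dmpar.o', first defined in "
        "'/bld/lib/libice.a:mpas_dmpar.o'\n"
        "nvlink fatal   : merge_elf failed\n"
    )
    for (first, dup), symbols in summarize_duplicates(sample).items():
        print(f"{len(symbols)} symbol(s) defined in both {first} and {dup}")
```

In the log above, every duplicate pairs libice.a:mpas_dmpar.o with libocn.a:mpas_dmpar.o, which is consistent with the sea-ice and ocean libraries each carrying their own compiled copy of mpas_dmpar.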

@gdicker1
Contributor

gdicker1 commented Feb 20, 2024

Sorry @dazlich, this was a premature approval; I was trying to hit other buttons. Mind if I hit "revert"?

@dazlich
Contributor Author

dazlich commented Feb 20, 2024 via email

@gdicker1
Contributor

gdicker1 commented Feb 20, 2024

Welp now I'm stumped on reverting this without another PR. We'll keep these changes and I'll link the error in this comment back to an issue page, since it shows we still have GPU issues.

2 participants