For gcp12, update eam ne4 pelayout #6487

Merged
ndkeen merged 1 commit into master from ndk/machinefiles/gcp12-adjust-ne4-pelayout
Jul 9, 2024

Conversation

@ndkeen ndkeen commented Jun 26, 2024

For gcp12, update the ne4 EAM pelayout to use 96 tasks, which requires 2 nodes (previously 1 node).
This is not node-efficient, but it may reduce overall testing time, because extra_coverage always waits for SMS_Ly1.ne4pg2_oQU480.F2010.gcp12_gnu to finish.

Tested with e3sm_integration.

As noted in the next comment, PEM_D_Ln5.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress will fail, because this test does not appear to be BFB when the number of tasks changes.
Otherwise, results are BFB except for NML diffs from the pelayout change.
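
To illustrate the node-efficiency trade-off mentioned above, here is a minimal sketch of the arithmetic. The per-node task limit used here (`TASKS_PER_NODE = 56`) is a hypothetical placeholder, not the actual gcp12 setting; the description only implies that 96 tasks no longer fit on one node.

```python
import math

# Hypothetical per-node task limit, for illustration only.
# The real gcp12 value is not stated in this PR.
TASKS_PER_NODE = 56


def nodes_needed(ntasks: int, tasks_per_node: int) -> int:
    """Smallest number of whole nodes that can hold ntasks MPI tasks."""
    return math.ceil(ntasks / tasks_per_node)


def utilization(ntasks: int, tasks_per_node: int) -> float:
    """Fraction of the allocated task slots that are actually used."""
    allocated = nodes_needed(ntasks, tasks_per_node) * tasks_per_node
    return ntasks / allocated


if __name__ == "__main__":
    n = nodes_needed(96, TASKS_PER_NODE)
    u = utilization(96, TASKS_PER_NODE)
    print(f"96 tasks -> {n} nodes, {u:.0%} of slots used")
```

Under that assumed limit, 96 tasks spill onto a second node and leave part of it idle, which is the inefficiency accepted here in exchange for a shorter wall-clock time.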

@ndkeen ndkeen self-assigned this Jun 26, 2024
@ndkeen ndkeen added Machine Files GCP google cloud platform labels Jun 26, 2024

PR Preview Action v1.4.7
🚀 Deployed preview to https://E3SM-Project.github.io/E3SM/pr-preview/pr-6487/
on branch gh-pages at 2024-06-26 00:25 UTC

ndkeen commented Jun 27, 2024

I now see that the eam-implicit_stress SMS test in extra_coverage will fail the baseline comparison if I change the pelayout. This prompted me to check
PEM_D_Ln5.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress
with current next (i.e., without my change to the pelayout), and it does fail.

PEM_Ln5.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress also fails

ERS_D.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress passes

and
ERP_D.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress fails as it also changes NTASKS
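
The pattern in the results above (PEM and ERP fail, ERS passes) lines up with which test types change the task count during the test. A rough sketch of that classification, assuming the usual `TYPE[_MODIFIERS].GRID.COMPSET.MACHINE_COMPILER[.TESTMODS]` test-name layout; the helper names and the `CHANGES_TASK_COUNT` set are illustrative and not part of CIME:

```python
from typing import NamedTuple, Tuple

# Test types that rerun with a different PE layout (assumption based on
# the results reported above: PEM and ERP change NTASKS, ERS does not).
CHANGES_TASK_COUNT = {"PEM", "ERP"}


class ParsedName(NamedTuple):
    test_type: str
    modifiers: Tuple[str, ...]
    grid: str
    compset: str
    machine_compiler: str
    testmods: str


def parse_test_name(name: str) -> ParsedName:
    """Split a dotted test name into its fields."""
    parts = name.split(".")
    if len(parts) < 4:
        raise ValueError(f"unexpected test name: {name!r}")
    head = parts[0].split("_")
    return ParsedName(
        test_type=head[0],
        modifiers=tuple(head[1:]),
        grid=parts[1],
        compset=parts[2],
        machine_compiler=parts[3],
        testmods=".".join(parts[4:]),
    )


def changes_task_count(name: str) -> bool:
    """True if the test type is expected to vary the number of tasks."""
    return parse_test_name(name).test_type in CHANGES_TASK_COUNT


if __name__ == "__main__":
    for test in (
        "PEM_D_Ln5.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress",
        "ERS_D.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress",
        "ERP_D.ne4pg2_oQU480.F2010.gcp12_gnu.eam-implicit_stress",
    ):
        print(test.split(".")[0], changes_task_count(test))
```

This matches the suspicion that eam-implicit_stress is sensitive to the number of tasks rather than to restarts.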

@ndkeen ndkeen requested a review from rljacob June 28, 2024 04:12
ndkeen commented Jul 3, 2024

I created #6498

ndkeen added a commit that referenced this pull request Jul 8, 2024
ndkeen commented Jul 9, 2024

Merged this to next, and the extra_coverage results are as expected: several NML diffs due to the pelayout change, plus a baseline diff because the test does not behave well with different MPI task counts. Will bless after merging to master.

@ndkeen ndkeen merged commit cb73e3a into master Jul 9, 2024
21 checks passed
@ndkeen ndkeen deleted the ndk/machinefiles/gcp12-adjust-ne4-pelayout branch July 9, 2024 21:21
Labels
GCP google cloud platform Machine Files
2 participants