landice_SHMIP_hewitt test failing (failed comparisons) #988

Closed
ikalash opened this issue Sep 30, 2023 · 8 comments
Labels
LandIce Testing Stuff related to testing Albany (including nightly tests)

Comments

Collaborator

ikalash commented Sep 30, 2023

The landIce_SHMIP_hewitt test is failing due to failed comparisons:

https://sems-cdash-son.sandia.gov/cdash/test/3736650

The last time it ran correctly was 9/25. @mperego, @bartgol: can one of you please have a look?

ikalash added the LandIce and Testing (Stuff related to testing Albany, including nightly tests) labels Sep 30, 2023
Collaborator

bartgol commented Oct 2, 2023

It appears the linear solvers never converge, causing NOX to fail to converge for the first LOCA iteration. The only changes affecting attaway since the last pass were my Trilinos PR and Irina's changes to the attaway build scripts. The latter seem correct, so I don't think they are related to the failures. I went through the changes in the "find trilinos" PR, and I don't see anything wrong. Albany does build and run fine outside that test (and a few others), so it's unlikely that something is utterly wrong.

I'm going to test with the Sept 26th Trilinos, to rule out changes in Trilinos.

Collaborator Author

ikalash commented Oct 2, 2023

Thanks for looking, @bartgol. I have been using bisection to try to figure out what broke this and the Tempus tests. I suspect it's Trilinos. I've built Albany/Trilinos from 9/26, 9/27 and 9/28, and those are good so far. Your similar test would be good as a sanity check. I will report what I learn as I keep bisecting.

Collaborator

bartgol commented Oct 2, 2023

> Thanks for looking, @bartgol. I have been using bisection to try to figure out what broke this and the Tempus tests. I suspect it's Trilinos. I've built Albany/Trilinos from 9/26, 9/27 and 9/28, and those are good so far. Your similar test would be good as a sanity check. I will report what I learn as I keep bisecting.

When you say that 9/26-28 are good, are you referring just to the Tempus failure, or also to this one? If you checked both, I'm not gonna bother, but if you only did that for the Tempus failure, I will (Trilinos is still building).

Collaborator Author

ikalash commented Oct 2, 2023

I checked both.

Collaborator

bartgol commented Oct 2, 2023

Oh, ok, then it's prob a pointless exercise to do it on my end as well (also, mappy just got slammed by a round of testing, so building has almost slowed to a halt).

Collaborator Author

ikalash commented Oct 2, 2023

I agree. 9/29 at 12:00 is good too. Building 9/30 now. The issue must have happened sometime on 9/29.
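For anyone following along, the date bisection described above can be sketched roughly as below. This is only an illustration: `first_bad`, the snapshot labels, and the `known_good` set are hypothetical stand-ins for "build Albany/Trilinos at that date and run the test", and treating 9/30 as bad is an assumption based on the nightly failure, not something confirmed by a rebuild yet.

```python
from typing import Callable, Sequence


def first_bad(snapshots: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest snapshot for which is_good() is False.

    Assumes the snapshots are in chronological order, the first one is
    known good, the last one is known bad, and the test does not recover
    once it has started failing.
    """
    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(snapshots[mid]):
            lo = mid  # breakage happened after mid
        else:
            hi = mid  # breakage happened at or before mid
    return snapshots[hi]


# Hypothetical stand-in for an actual build-and-test run at each date.
snapshots = ["9/25", "9/26", "9/27", "9/28", "9/29 12:00", "9/30"]
known_good = {"9/25", "9/26", "9/27", "9/28", "9/29 12:00"}  # 9/30 assumed bad

culprit = first_bad(snapshots, lambda s: s in known_good)
print(culprit)
```

In practice each `is_good` call is a full rebuild, so halving the range matters; the same idea applies at commit granularity once the date window is narrow enough.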

Collaborator Author

ikalash commented Oct 3, 2023

Ok, the mystery has been solved. I found the problematic commit in Trilinos and it is a change to the linear solvers. Please see the messages under issue #989 for more details.

Collaborator Author

ikalash commented Oct 4, 2023

trilinos/Trilinos#12356 has fixed this problem. Thanks again, @cgcgcg!

ikalash closed this as completed Oct 4, 2023