Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fluids: Set KSP max iterations to 1000 #1256

Closed
wants to merge 1 commit into from

Conversation

jrwrigh
Copy link
Collaborator

@jrwrigh jrwrigh commented Jul 13, 2023

No description provided.

@jeremylt
Copy link
Member

Seems fine to me. Do we want a lower default for CI?

KSP ksp;
PetscCall(TSGetSNES(*ts, &snes));
PetscCall(SNESGetKSP(snes, &ksp));
PetscCall(KSPSetTolerances(ksp, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 1000));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the rationale here? To waste less time in a "bad" configuration? Should we also use PCSetType(pc, PCVPBJACOBI) or keep the default bjacobi/ilu, which is better for some problems on CPU?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To waste less time in a "bad" configuration?

Yes, exactly.

Should we also use PCSetType(pc, PCVPBJACOBI) or keep the default bjacobi/ilu, which is better for some problems on CPU?

I'd say keep the same default settings for right now, with default being the """CPU optimal""". If we do want to switch to have VPBJACOBI, I feel like we should switch over to the -amat_type shell -pmat_pbdiagonal as defaults to go along with it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could set default PC, -amat_type, etc. based on the ceed resource? Might be more confusing, but would be more convenient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, and I take it needing more than 1000 iterations should continue to be interpreted as failure, not recovered by continuing Newton (or shortening a time step)? I think the standard blasius setup with vpbjacobi needed more than 1000 with gmres (but not lgmres).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be user-friendly, but also makes debugging by switching back to the host harder. If we're going to change defaults, I'd rather make them the same for all backends.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it needing more than 1000 iterations should continue to be interpreted as failure, not recovered by continuing Newton (or shortening a time step)?

Yes, though perhaps we should wait for @KennethEJansen to confirm.

I think the standard blasius setup with vpbjacobi needed more than 1000 with gmres (but not lgmres).

Yeah, you're right. I'm getting around 2000 iterations for that configuration on CPU. But for production runs, we wouldn't be running that kind of problem. IIRC, PHASTA maxes out at 200 iterations by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to modify the blasius setup so it's not so stiff.

PHASTA maxes out at 200 iterations by default.

... and then it aborts the run or it just continues with the next Newton step? The desired semantics for "this problem is tough" are different from "maybe GPU-aware MPI is broken and we should stop wasting time on it".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It continues to the next Newton step. I was using it more in the "if we're ok with moving forward at 200, then at 1000 something must be wrong."

@jrwrigh
Copy link
Collaborator Author

jrwrigh commented Jul 13, 2023

Seems fine to me. Do we want a lower default for CI?

Nah, I think CI is fine as is. Plus the tests shouldn't approach 1000 KSP iterations (though I'll wait for Noether testing to finish to truly verify that).

@jrwrigh jrwrigh closed this Aug 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants