Crash detection #69
Need to improve the crash detection:
src/Mainloop.cu
// Stop calculation if crashed
if (XLoop.dt < XParam.dtmin)
{
    XLoop.totaltime = XParam.endtime + 1;
It would make more sense to set:
XParam.endtime = XLoop.totaltime
with the same outcome.
Otherwise, when looking at the model status after the crash, it will look like the model reached the expected end time when in reality it stopped earlier.
Good comments, I will make these changes
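The suggested change could be sketched roughly as follows. This is a minimal, host-only illustration: the `Param` and `Loop` structs are hypothetical stand-ins holding only the fields named in this thread, `StopIfCrashed` is an invented helper name, and the sketch assumes the main loop runs while `totaltime < endtime`.

```cpp
// Hypothetical stand-ins for the project's parameter and loop structures;
// only the fields mentioned in this review thread are included.
struct Param { double endtime; double dtmin; };
struct Loop  { double totaltime; double dt; };

// Returns true when the time step has collapsed below dtmin.
// On a crash, endtime is pulled back to the current model time, so a loop
// condition of the form (totaltime < endtime) becomes false (assumption),
// while totaltime still records when the run actually stopped.
bool StopIfCrashed(Param& XParam, const Loop& XLoop)
{
    if (XLoop.dt < XParam.dtmin)
    {
        XParam.endtime = XLoop.totaltime;
        return true;
    }
    return false;
}
```

Compared with setting `XLoop.totaltime = XParam.endtime + 1`, this leaves `totaltime` untouched, so the time at which the run stopped remains available afterwards.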
src/Mainloop.cu
if (XLoop.dt < XParam.dtmin)
{
    XLoop.totaltime = XParam.endtime + 1;
    log(" \n ");
Add:
XLoop.nextoutputtime = XLoop.totaltime
This will force the output request without changing the if statement below. Same effect, but cleaner I think.
I will add this too.
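Taken together, the two review suggestions might look like the sketch below. The structs are again hypothetical stand-ins, `FlagCrash` and `OutputRequested` are invented names, and the output test `totaltime >= nextoutputtime` is an assumption about how the "if statement below" decides to write output; the real condition may differ.

```cpp
// Hypothetical stand-ins for the project's structures (illustration only).
struct Param { double endtime; double dtmin; };
struct Loop  { double totaltime; double nextoutputtime; double dt; };

// On a collapsed time step: stop the run at the current model time and
// make the next scheduled output fall on that same time.
bool FlagCrash(Param& XParam, Loop& XLoop)
{
    if (XLoop.dt < XParam.dtmin)
    {
        XParam.endtime = XLoop.totaltime;        // stop the run here
        XLoop.nextoutputtime = XLoop.totaltime;  // request an output now
        return true;
    }
    return false;
}

// Assumed form of the existing output check; it needs no crash-specific
// branch because the crash handler moved nextoutputtime instead.
bool OutputRequested(const Loop& XLoop)
{
    return XLoop.totaltime >= XLoop.nextoutputtime;
}
```

The point of the design is that the output logic stays unaware of crashes: the crash handler only adjusts the schedule, and the regular check picks it up.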
A few things about this solution at the moment:
On one end the
On the other end
Possible solutions
Some crashes do not affect the model time step or the minimum time step (e.g. +INF), but we can probably look at the largest time step as well as the smallest to find crazy values. Implementing that technique would move the crash detection inside the time-step calculation, but that is a more complex operation and requires more code changes. Should that detection be moved to post v1.0? Do you agree?
I suggest we do:
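As an illustration only of the idea mentioned above (checking the largest as well as the smallest time step), and not of the actions this comment goes on to propose, a host-side check might look like the sketch below. The function name, the `dtmax` threshold and the use of a `std::vector` of candidate time steps are all assumptions; in the actual model the reduction would presumably happen inside the time-step calculation.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: returns true if the candidate time steps contain a
// value that suggests a crash: NaN, +/-INF, below dtmin, or above dtmax.
bool TimeStepLooksCrashed(const std::vector<double>& dt, double dtmin, double dtmax)
{
    double smallest = std::numeric_limits<double>::infinity();
    double largest  = -std::numeric_limits<double>::infinity();
    for (double v : dt)
    {
        // NaN compares false against everything, so test for it explicitly.
        if (std::isnan(v)) return true;
        smallest = std::min(smallest, v);
        largest  = std::max(largest, v);
    }
    // A collapsed step shows up in the minimum; +INF shows up in the maximum.
    return smallest < dtmin || largest > dtmax;
}
```

Checking only the minimum would miss a +INF entry, since an infinite value never lowers the minimum; tracking the maximum as well catches that case.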
This development is useful for detecting code crashes, locating their source and understanding their origin.
Creation of a crash detection to stop the run if the time step collapses.