Disaster detection and disabled actions #1230
In a critical situation it is helpful to limit the amount of task churn in Singularity. This PR adds the ability for an admin to globally disable certain actions. So far it is implemented for
When a disabled action is requested, Singularity will respond with a 423 (Locked) and the message given when the action was disabled (or a default message).
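To illustrate the behaviour described above, here is a minimal Python sketch of a disabled-action check. It is not the code in this PR: the `DisabledActions` class, the `DEFAULT_MESSAGE` text, and the `EXAMPLE_ACTION` name are all hypothetical, and the real action names and default message are whatever the PR defines.

```python
from typing import Dict, Optional, Tuple

# Hypothetical default text; the real default message is defined by the PR.
DEFAULT_MESSAGE = "This action is currently disabled"


class DisabledActions:
    """Hypothetical in-memory registry of globally disabled actions."""

    def __init__(self) -> None:
        # Maps an action name to the message returned while it is disabled.
        self._disabled: Dict[str, str] = {}

    def disable(self, action: str, message: Optional[str] = None) -> None:
        # Store the admin's message, falling back to the default.
        self._disabled[action] = message or DEFAULT_MESSAGE

    def enable(self, action: str) -> None:
        # Re-enabling an action that was never disabled is a no-op.
        self._disabled.pop(action, None)

    def check(self, action: str) -> Tuple[int, str]:
        # 423 (Locked) plus the stored message when disabled, else 200.
        if action in self._disabled:
            return 423, self._disabled[action]
        return 200, "ok"
```

A caller would run `check("EXAMPLE_ACTION")` before performing the action and return the status and message to the client unchanged when the status is 423.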
In this PR I am also adding an automated way of disabling actions based on things such as task lag and the frequency of lost slaves or lost tasks.
TODO for this PR:
@tpetr Updated the PR a bit to better take into account the time range over which events are occurring.
- Task lag takes into account how long the calculated value has been over the specified threshold. A disaster will trigger if it has been over the threshold for a certain amount of time (45s default).
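The sustained-threshold idea for task lag can be sketched roughly as follows. This is an assumption-laden Python illustration, not the PR's implementation: `TaskLagDetector`, its parameter names, and the 60-second lag threshold in the usage note are hypothetical. Only the 45-second default duration comes from the comment above.

```python
from typing import Optional


class TaskLagDetector:
    """Hypothetical detector: flags a disaster only after lag stays high."""

    def __init__(self, lag_threshold_s: float, sustained_for_s: float = 45.0):
        self.lag_threshold_s = lag_threshold_s
        # How long lag must stay over the threshold (45s default per the PR).
        self.sustained_for_s = sustained_for_s
        # Timestamp at which lag first went over the threshold, if it has.
        self._over_since: Optional[float] = None

    def observe(self, now_s: float, task_lag_s: float) -> bool:
        """Record one lag sample; return True if a disaster should trigger."""
        if task_lag_s <= self.lag_threshold_s:
            # Lag recovered, so the timer starts over on the next breach.
            self._over_since = None
            return False
        if self._over_since is None:
            self._over_since = now_s
        return now_s - self._over_since >= self.sustained_for_s
```

With `TaskLagDetector(lag_threshold_s=60)`, samples of 120 seconds of lag at t=0 and t=30 do not trigger, while a third such sample at t=45 does. A single sample back under the threshold resets the timer, so brief spikes are ignored. A real caller would likely pass a monotonic clock reading such as `time.monotonic()` for `now_s`.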