revert cache to original state on evict and bind errors
This change ensures that node, task and job info remain unchanged when an error occurs during SchedulerCache.Bind and SchedulerCache.Evict calls. Previously, if an error occurred during the binding phase (e.g. "Selected node NotReady"), a task could get stuck in the Binding status indefinitely while the real pod remained in the Pending status.

- SchedulerCache.Evict and SchedulerCache.Bind now revert the task status on NodeInfo.UpdateTask and NodeInfo.AddTask errors.
- Modified the behavior of NodeInfo.AddTask: it now updates the task's node name upon successful addition, similar to how JobInfo.UpdateTaskStatus updates the task's status.
- Handle a previously unchecked error in JobInfo.UpdateTaskStatus.
- Log at FATAL level in NodeInfo.UpdateTask when an impossible situation happens: failing to add a task right after removing it from the node info.

Might be related to #891
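
The revert-on-error behavior described above can be illustrated with a minimal sketch. It uses simplified stand-in types, not the actual scheduler cache code: the `bind` helper, the `Ready` field, and the reduced `TaskInfo` and `NodeInfo` structs are assumptions made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// TaskStatus is a simplified stand-in for the scheduler's task status.
type TaskStatus int

const (
	Pending TaskStatus = iota
	Binding
)

// TaskInfo and NodeInfo are reduced, hypothetical versions of the cache types.
type TaskInfo struct {
	Name     string
	NodeName string
	Status   TaskStatus
}

type NodeInfo struct {
	Name  string
	Ready bool // assumption: stands in for whatever makes AddTask fail
	Tasks map[string]*TaskInfo
}

// AddTask registers the task on the node and sets the task's node name
// only after the addition has succeeded.
func (n *NodeInfo) AddTask(t *TaskInfo) error {
	if !n.Ready {
		return errors.New("selected node NotReady")
	}
	n.Tasks[t.Name] = t
	t.NodeName = n.Name
	return nil
}

// bind moves the task to Binding and restores the original status
// if the node rejects the task, so the cache is left unchanged.
func bind(t *TaskInfo, n *NodeInfo) error {
	original := t.Status
	t.Status = Binding
	if err := n.AddTask(t); err != nil {
		t.Status = original // revert to the pre-bind state
		return fmt.Errorf("bind %s to %s: %w", t.Name, n.Name, err)
	}
	return nil
}

func main() {
	task := &TaskInfo{Name: "t1", Status: Pending}
	notReady := &NodeInfo{Name: "n1", Ready: false, Tasks: map[string]*TaskInfo{}}

	err := bind(task, notReady)
	fmt.Println(err != nil, task.Status == Pending, task.NodeName == "")
}
```

Because the node name is assigned inside `AddTask` only on success, a failed bind in this sketch leaves both the status and the node name of the task as they were before the call.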
Showing with 59 additions and 25 deletions.