Gate statemachine progress on taskmanager availability (#144)
Currently, the only condition for moving from ClusterStarting to Savepointing (at which point the job goes down) is that all JM/TM pods are up according to the deployment. However, various issues with the TM process or its configuration can prevent the TMs from actually registering with the JobManager and becoming available to run tasks. This can lead to extended downtime and can require manual intervention to fix.

I've also added a check to the SubmittingJob -> Running transition that the tasks are actually running, which gives us a chance to automatically roll back if the job never successfully starts.

This PR also adds more visibility into task-level status, so that users can tell whether the job is really running (in Flink, a job can be in the Running state even if none of its tasks are running). I've added two new fields to the JobStatus (TotalTasks and RunningTasks) and updated the JobHealth logic to take into account whether tasks are actually running.
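The checks described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this commit: TotalTasks, RunningTasks, and JobHealth are named in the message, but the field types, the `State` field, the `Green`/`Red` health values, the helper names, and the "all tasks must be running" threshold are assumptions made for the example.

```go
package main

import "fmt"

// JobStatus is a simplified stand-in for the operator's job status.
// TotalTasks and RunningTasks come from the commit message; the State
// field and all types here are assumptions for illustration.
type JobStatus struct {
	State        string // job-level state reported by Flink, e.g. "RUNNING"
	TotalTasks   int32
	RunningTasks int32
}

// Health is a hypothetical two-valued health indicator.
type Health string

const (
	Green Health = "Green"
	Red   Health = "Red"
)

// taskManagersReady gates ClusterStarting -> Savepointing: pods being up
// is not enough, the TMs must also have registered with the JobManager.
// (Assumed rule: at least the expected number of TMs are registered.)
func taskManagersReady(registered, expected int32) bool {
	return expected > 0 && registered >= expected
}

// allTasksRunning gates SubmittingJob -> Running.
// (Assumed rule: every task of the job must be running.)
func allTasksRunning(s JobStatus) bool {
	return s.TotalTasks > 0 && s.RunningTasks == s.TotalTasks
}

// jobHealth reports Green only when the job-level state is RUNNING and
// its tasks are actually running, since a Flink job can be in the
// Running state with no running tasks.
func jobHealth(s JobStatus) Health {
	if s.State == "RUNNING" && allTasksRunning(s) {
		return Green
	}
	return Red
}

func main() {
	stuck := JobStatus{State: "RUNNING", TotalTasks: 4, RunningTasks: 0}
	healthy := JobStatus{State: "RUNNING", TotalTasks: 4, RunningTasks: 4}

	fmt.Println("TMs ready (2 of 3 registered):", taskManagersReady(2, 3))
	fmt.Println("stuck job health:", jobHealth(stuck))
	fmt.Println("healthy job health:", jobHealth(healthy))
}
```

In this sketch a job whose tasks never start stays unhealthy even though Flink reports it as running, which is the condition that lets the state machine roll back automatically instead of waiting for manual intervention.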
Showing 8 changed files with 94 additions and 66 deletions.