Make the TfJob controller more event driven #314
Right now the controller relies on TrainingJob.reconcile being called frequently to check the state of the job and take any needed action.
In #308 it was suggested that we adopt a more event-driven design.
Here's the comment from @ScorpioCPH
I think it's more complicated than that, since we create other resources (e.g. services, config maps, etc.).
It's also not clear to me why the queue would fill up, since the number of items in the queue would be the same as the number of jobs in the cluster.
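To make the queue-size point concrete, here is a rough sketch of a work queue keyed by job. It is not the controller's actual code: `keyQueue`, `event`, and `enqueueOwner` are hypothetical names, and the queue is a simplified stdlib-only stand-in for a deduplicating work queue. The assumption is that every event, whether it comes from the job itself or from a resource it owns (a service, a config map), is mapped back to the owning job's key before being enqueued.

```go
package main

import "sync"

// keyQueue is a hypothetical, simplified deduplicating FIFO queue of job keys.
// A key that is already waiting is not added a second time.
type keyQueue struct {
	mu      sync.Mutex
	order   []string        // keys in arrival order
	pending map[string]bool // keys currently waiting in the queue
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: make(map[string]bool)}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *keyQueue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get removes and returns the oldest key. ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

// Len reports how many distinct keys are waiting.
func (q *keyQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.order)
}

// event is a hypothetical notification about a job or a resource it owns.
type event struct {
	Kind     string // e.g. "TfJob", "Service", "ConfigMap"
	Name     string // name of the resource that changed
	OwnerJob string // namespace/name key of the owning job
}

// enqueueOwner maps any event to the key of the job that owns the resource,
// so the reconcile step always works on whole jobs.
func enqueueOwner(q *keyQueue, e event) {
	q.Add(e.OwnerJob)
}
```

With this shape, three events for the same job (one on the job, one on a service, one on a config map) leave a single entry in the queue, so the number of waiting items never exceeds the number of jobs. A worker would then call `Get` in a loop and run the reconcile logic for each key it receives.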