This repository has been archived by the owner on Sep 19, 2022. It is now read-only.

feat: Support running although it is useless #194

Merged
merged 1 commit into kubeflow:master from gaocegege:running on Aug 6, 2019

Conversation

gaocegege
Member

We are now using common.CleanPodPolicy in PyTorchJob, and if users set the policy to Running, we delete all pods and services. I think this behaviour is not consistent with TFJob. To be user-friendly, maybe PyTorchJob could support Running by not deleting anything, just like None.
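
For reference, here is a small standalone Go sketch (illustrative only; the type and constants below are local stand-ins, not the definitions from the common package) of what each CleanPodPolicy value is generally expected to do after a job finishes:

package main

import "fmt"

// CleanPodPolicy mirrors the policy values discussed above; this is an
// illustrative local type, not the operator's own definition.
type CleanPodPolicy string

const (
	CleanPodPolicyNone    CleanPodPolicy = "None"
	CleanPodPolicyRunning CleanPodPolicy = "Running"
	CleanPodPolicyAll     CleanPodPolicy = "All"
)

// describe summarizes the expected cleanup behaviour for each policy.
func describe(p CleanPodPolicy) string {
	switch p {
	case CleanPodPolicyNone:
		return "keep all pods and services"
	case CleanPodPolicyRunning:
		return "delete only pods that are still running (tf-operator behaviour)"
	case CleanPodPolicyAll:
		return "delete all pods and services"
	default:
		return "unknown policy"
	}
}

func main() {
	for _, p := range []CleanPodPolicy{CleanPodPolicyNone, CleanPodPolicyRunning, CleanPodPolicyAll} {
		fmt.Println(p, "=>", describe(p))
	}
}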

WDYT @johnugeorge

One of our users did not know that we do not support Running and used it. He then found that the pods were deleted even though they were not running.

Signed-off-by: Ce Gao <gaoce@caicloud.io>

@coveralls

coveralls commented Jul 31, 2019

Coverage Status

Coverage remained the same at 85.281% when pulling 520a360 on gaocegege:running into e98eb3c on kubeflow:master.

@gaocegege
Member Author

/cc @johnugeorge

@johnugeorge
Copy link
Member

@gaocegege Sorry, I missed it.
What does the "Running" policy mean in the case of PyTorch, which has master + worker? I think it only makes sense with PS pods, which can be kept running or not.
The PyTorch behavior itself was a little inconsistent when I verified CleanPodPolicy set to Running in the earlier implementation phase (I didn't try with PyTorch 1.0). At the instant the job completed, I saw that some pods were Running while others were Completed, so some of these pods would get deleted while others remained. I felt this would be confusing to users. WDYT? Can you verify it?

As you say, it makes sense to have consistent behavior since CleanPodPolicy is shared now. It doesn't hurt to add it anyway. Please share your thoughts.

@gaocegege
Member Author

gaocegege commented Aug 6, 2019

@johnugeorge

Thanks for your comments. I also think the Running policy should be avoided in PyTorchJob. But we have a case like this:

Users want to stop running pods in pytorch-operator the way tf-operator does, so they set the policy to Running. But all pods are deleted, since we treat Running as All.

Then how about:

func (pc *PyTorchController) deletePodsAndServices(job *pyv1.PyTorchJob, pods []*v1.Pod) error {
	if len(pods) == 0 {
		return nil
	}
	// Delete nothing when the cleanPodPolicy is None or Running.
	if *job.Spec.CleanPodPolicy == common.CleanPodPolicyNone || *job.Spec.CleanPodPolicy == common.CleanPodPolicyRunning {
		return nil
	}

We do not delete all pods; Running follows the same behavior as CleanPodPolicyNone.
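
To make the proposed behaviour concrete, here is a small standalone sketch; the shouldSkipCleanup helper is hypothetical and only mirrors the condition in the snippet above (both None and Running skip cleanup, while All still deletes everything):

package main

import "fmt"

// shouldSkipCleanup mirrors the condition in the snippet above: cleanup is
// skipped when the cleanPodPolicy is either None or Running.
func shouldSkipCleanup(policy string) bool {
	return policy == "None" || policy == "Running"
}

func main() {
	for _, p := range []string{"None", "Running", "All"} {
		fmt.Printf("cleanPodPolicy=%s -> skip cleanup: %v\n", p, shouldSkipCleanup(p))
	}
}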

@johnugeorge
Member

Looks good to me. It would be better than deleting the pods, which adds more confusion.

@gaocegege
Member Author

Yeah, I think so. I will update it.

Signed-off-by: Ce Gao <gaoce@caicloud.io>
@gaocegege
Member Author

/assign @johnugeorge

@johnugeorge
Member

/lgtm

@johnugeorge
Member

/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit ca2c9c4 into kubeflow:master Aug 6, 2019
@gaocegege gaocegege deleted the running branch October 30, 2019 03:32