Removing Operator specific handling during a StudyJob run #387

Merged
merged 2 commits into from Feb 20, 2019

Conversation

@johnugeorge
Member

commented Feb 17, 2019

During a StudyJob run, there is no longer any operator-specific code. Unstructured objects are used to create the resource, get the resource, and get its status.

The only remaining piece is the watch handling, which has to be addressed separately in #317. The current controller-runtime code (0.1.7 and earlier) doesn't support dynamic watches. Refer to kubernetes-sigs/kubebuilder#422.

Related: #341
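
To illustrate the idea of reading status without operator-specific types, here is a minimal sketch that uses only the Go standard library. The PR itself relies on `runtime.DefaultUnstructuredConverter.FromUnstructured`; the JSON round trip below is a stand-in for that call, and `JobCondition`, `JobStatus`, and `statusFromObject` are simplified, hypothetical names rather than the real shared types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobCondition and JobStatus are simplified, hypothetical stand-ins for the
// shared operator status types; only the fields used here are modeled.
type JobCondition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

type JobStatus struct {
	Conditions []JobCondition `json:"conditions"`
}

// statusFromObject pulls the "status" field out of a generic object map and
// decodes it into JobStatus. The JSON round trip is an assumption standing in
// for a real unstructured converter.
func statusFromObject(obj map[string]interface{}) (JobStatus, error) {
	raw, ok := obj["status"]
	if !ok {
		return JobStatus{}, fmt.Errorf("object has no status field")
	}
	data, err := json.Marshal(raw)
	if err != nil {
		return JobStatus{}, err
	}
	var js JobStatus
	if err := json.Unmarshal(data, &js); err != nil {
		return JobStatus{}, err
	}
	return js, nil
}

func main() {
	// The same code path works for any kind whose status has this shape.
	obj := map[string]interface{}{
		"kind": "TFJob",
		"status": map[string]interface{}{
			"conditions": []interface{}{
				map[string]interface{}{"type": "Running", "status": "False"},
				map[string]interface{}{"type": "Succeeded", "status": "True"},
			},
		},
	}
	js, err := statusFromObject(obj)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(len(js.Conditions), js.Conditions[1].Type)
}
```

Because the caller only depends on the shape of the status map, adding another operator kind would not require a new typed client in this sketch.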



@k8s-ci-robot k8s-ci-robot requested review from jose5918 and libbyandhelen Feb 17, 2019

@k8s-ci-robot k8s-ci-robot added the size/L label Feb 17, 2019

@johnugeorge

Member Author

commented Feb 19, 2019

var state katibapi.State = katibapi.State_RUNNING
var cpTime *metav1.Time
switch w.Kind {
switch wkind.Kind {

case DefaultJobWorker:

@richardsliu

richardsliu Feb 19, 2019

Contributor

Is this case still needed?

@johnugeorge

johnugeorge Feb 19, 2019

Author Member

Yes. But the string values are different currently. 'Complete' in https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/batch/v1/types.go#L170 vs 'Succeeded' in https://github.com/kubeflow/tf-operator/blob/master/pkg/apis/common/v1beta2/common_types.go#L121

In general, since 'job' is a built-in K8s Kind, should we care whether it matches our JobCondition? I feel that the only necessary condition is that all custom job operators share the same JobStatus type.

The other option is to change the common JobConditionType string to match the batch Job's JobConditionType. But new developers might not be aware of this assumption and may break the logic.

@richardsliu

richardsliu Feb 19, 2019

Contributor

I think this is ok for this version. One thing that we have discussed in the past is moving the common types into their own repo. If we go down that route, it will be easier to change then. What do you think?

@johnugeorge

johnugeorge Feb 20, 2019

Author Member

Yup. There is a discussion in kubeflow/mpi-operator#92 (comment)
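
The mismatch discussed in this thread can be sketched as a small per-kind lookup. This is only an illustration: the strings `Complete` and `Succeeded` come from the two linked type files, while `Condition`, `successType`, and `isSucceeded` are hypothetical names and not part of the PR.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a job condition entry.
type Condition struct {
	Type   string
	Status string
}

// successType returns the condition type string that marks success for a kind.
// The built-in batch Job uses "Complete"; the shared operator types use
// "Succeeded". Treating every non-Job kind as an operator kind is an assumption.
func successType(kind string) string {
	if kind == "Job" {
		return "Complete"
	}
	return "Succeeded"
}

// isSucceeded reports whether the conditions contain a true success condition
// for the given kind.
func isSucceeded(kind string, conds []Condition) bool {
	want := successType(kind)
	for _, c := range conds {
		if c.Type == want && c.Status == "True" {
			return true
		}
	}
	return false
}

func main() {
	jobConds := []Condition{{Type: "Complete", Status: "True"}}
	tfConds := []Condition{{Type: "Succeeded", Status: "True"}}
	fmt.Println(isSucceeded("Job", jobConds), isSucceeded("TFJob", tfConds))
	// A batch Job never reports "Succeeded" as a condition type, so this is false.
	fmt.Println(isSucceeded("Job", tfConds))
}
```

Keeping the difference in one function makes the assumption visible, which addresses the concern that a silently shared string could be broken by a later change.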

if err := r.Client.Get(context.TODO(), nname, &job); err != nil {
	log.Printf("Client Get error %v for %v", err, nname)
	return WorkerStatus{}
}
if job.Status.Active == 0 && job.Status.Succeeded > 0 {

@richardsliu

richardsliu Feb 19, 2019

Contributor

It seems like we should be able to use similar logic here to check for job completion: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/batch/v1/generated.proto#L158

@johnugeorge

johnugeorge Feb 19, 2019

Author Member

Please see above comment
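
For comparison, the two ways of detecting completion of a built-in batch Job raised in this thread can be put side by side. This is a rough sketch: `BatchJobStatus` and `Condition` are simplified, hypothetical stand-ins for the real API types, `doneByCounts` mirrors the check in the diff, and `doneByCondition` is the condition-based alternative suggested in the review.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a batch Job condition.
type Condition struct {
	Type   string
	Status string
}

// BatchJobStatus models only the status fields used in this sketch.
type BatchJobStatus struct {
	Active     int32
	Succeeded  int32
	Conditions []Condition
}

// doneByCounts mirrors the count-based check shown in the diff:
// no active pods and at least one succeeded pod.
func doneByCounts(s BatchJobStatus) bool {
	return s.Active == 0 && s.Succeeded > 0
}

// doneByCondition looks for a true "Complete" condition instead.
func doneByCondition(s BatchJobStatus) bool {
	for _, c := range s.Conditions {
		if c.Type == "Complete" && c.Status == "True" {
			return true
		}
	}
	return false
}

func main() {
	finished := BatchJobStatus{
		Succeeded:  1,
		Conditions: []Condition{{Type: "Complete", Status: "True"}},
	}
	running := BatchJobStatus{Active: 1}
	fmt.Println(doneByCounts(finished), doneByCondition(finished))
	fmt.Println(doneByCounts(running), doneByCondition(running))
}
```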

@richardsliu

Contributor

commented Feb 19, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Feb 19, 2019

jobStatus := commonv1beta1.JobStatus{}
err := runtime.DefaultUnstructuredConverter.FromUnstructured(statusMap, &jobStatus)
if err != nil {
	log.Printf("Error in converting unstructured to status:%v ", err)

@hougangliu

hougangliu Feb 20, 2019

Member

s/status:%v/status: %v

@hougangliu

hougangliu Feb 20, 2019

Member

return WorkerStatus{}

@k8s-ci-robot k8s-ci-robot removed the lgtm label Feb 20, 2019

@hougangliu

Member

commented Feb 20, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Feb 20, 2019

@johnugeorge johnugeorge force-pushed the johnugeorge:generic branch from 3ccf056 to c9f87a4 Feb 20, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Feb 20, 2019

@hougangliu

Member

commented Feb 20, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm label Feb 20, 2019

@k8s-ci-robot


commented Feb 20, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hougangliu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 8a89b9e into kubeflow:master Feb 20, 2019

3 checks passed

cla/google All necessary CLAs are signed
kubeflow-katib-presubmit Job succeeded.
tide In merge pool.

@johnugeorge johnugeorge referenced this pull request Mar 4, 2019

Closed

Make Katib generic for operator support #341

1 of 1 task complete