Implement KFServing v1beta1 controller #1042

Merged
merged 52 commits into from
Sep 21, 2020

Conversation

Member

@yuzisun yuzisun commented Aug 24, 2020

What this PR does / why we need it:
This is a follow-up to the previous v1beta1 API PR #991; it adds the controller implementation.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #904

Special notes for your reviewer:

  1. The generated v1beta1 CRD is very large, which makes kubectl apply take a long time to compute diffs. In the long term we need to use the server-side apply available in Kubernetes 1.18.
  2. Conversion webhooks are implemented between v1alpha2 and v1beta1.
  3. Both v1alpha2 and v1beta1 now conform to the controller-runtime webhook interface.
  4. The PodSpec has to be copied so that the containers field can be made optional.

Release note:

- A conversion webhook is installed for automatic conversion between v1alpha2 and v1beta1.
- The minimum Kubernetes version is now 1.15, which supports conversion webhooks.
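
For readers unfamiliar with the pattern: conversion webhooks built on controller-runtime usually follow a hub-and-spoke model, where one API version is marked as the hub and every other version converts to and from it. The following is a minimal, self-contained sketch of that shape, not code from this PR. The Hub and Convertible interfaces are local stand-ins that mirror the ones in sigs.k8s.io/controller-runtime/pkg/conversion, and the toy types and field names (AlphaService, BetaService, ModelURI, StorageURI) are hypothetical. Which version this PR uses as the hub is not stated here, so the choice below is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hub marks the version that all other versions convert through.
// Local stand-in for controller-runtime's conversion.Hub.
type Hub interface {
	Hub()
}

// Convertible is implemented by every non-hub ("spoke") version.
// Local stand-in for controller-runtime's conversion.Convertible.
type Convertible interface {
	ConvertTo(dst Hub) error
	ConvertFrom(src Hub) error
}

// BetaService is a hypothetical hub version.
type BetaService struct {
	Name       string
	StorageURI string
}

// Hub marks BetaService as the hub version.
func (*BetaService) Hub() {}

// AlphaService is a hypothetical spoke version with a renamed field.
type AlphaService struct {
	Name     string
	ModelURI string
}

// Compile-time check that the spoke implements Convertible.
var _ Convertible = (*AlphaService)(nil)

// ConvertTo copies the spoke's fields into the hub representation.
func (a *AlphaService) ConvertTo(dst Hub) error {
	b, ok := dst.(*BetaService)
	if !ok {
		return errors.New("unsupported hub type")
	}
	b.Name, b.StorageURI = a.Name, a.ModelURI
	return nil
}

// ConvertFrom fills the spoke from the hub representation.
func (a *AlphaService) ConvertFrom(src Hub) error {
	b, ok := src.(*BetaService)
	if !ok {
		return errors.New("unsupported hub type")
	}
	a.Name, a.ModelURI = b.Name, b.StorageURI
	return nil
}

// roundTrip converts a spoke to the hub and back again.
func roundTrip(in *AlphaService) (*AlphaService, error) {
	hub := &BetaService{}
	if err := in.ConvertTo(hub); err != nil {
		return nil, err
	}
	out := &AlphaService{}
	if err := out.ConvertFrom(hub); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(&AlphaService{Name: "demo", ModelURI: "gs://bucket/model"})
	fmt.Println(out.Name, out.ModelURI, err)
}
```

A round trip that loses no data, as in roundTrip, is the property a conversion webhook generally needs so that objects stored in one version can be served in the other.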

@kubeflow-bot

This change is Reviewable

@yuzisun
Member Author

yuzisun commented Aug 25, 2020

/retest

Contributor

@ellistarn ellistarn left a comment

Nice work as always Dan!

hookServer.Register("/validate-inferenceservices", &webhook.Admission{Handler: &inferenceservice.Validator{}})
hookServer.Register("/mutate-inferenceservices", &webhook.Admission{Handler: &inferenceservice.Defaulter{}})

if err = ctrl.NewWebhookManagedBy(mgr).
Contributor

If you wanted to be fancy, you could wrap all this initialization logic in a for loop:

for resource, reconciler := range map[interface{}]Reconciler{
} {
// register controller
// register webhook
}
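
The loop suggested above could look roughly like the following. This is a self-contained sketch rather than the PR's code: the Reconciler interface, the Manager type, and the resource names are hypothetical stand-ins for controller-runtime's manager and the project's reconcilers. A slice is used instead of a map so that registration order is deterministic.

```go
package main

import "fmt"

// Manager is a hypothetical stand-in for ctrl.Manager; it only records
// what was registered.
type Manager struct {
	Controllers []string
	Webhooks    []string
}

// Reconciler is a hypothetical stand-in for a controller that can attach
// itself to a manager.
type Reconciler interface {
	SetupWithManager(m *Manager) error
}

// namedReconciler is a toy reconciler identified by name.
type namedReconciler struct{ name string }

// SetupWithManager records the controller on the manager.
func (r namedReconciler) SetupWithManager(m *Manager) error {
	if r.name == "" {
		return fmt.Errorf("reconciler has no name")
	}
	m.Controllers = append(m.Controllers, r.name)
	return nil
}

// registration pairs a resource with the reconciler that handles it.
type registration struct {
	resource   string
	reconciler Reconciler
}

// registerAll registers a controller and a validating webhook path for
// every entry, stopping at the first failure.
func registerAll(m *Manager, regs []registration) error {
	for _, reg := range regs {
		if err := reg.reconciler.SetupWithManager(m); err != nil {
			return fmt.Errorf("registering controller for %s: %w", reg.resource, err)
		}
		m.Webhooks = append(m.Webhooks, "/validate-"+reg.resource)
	}
	return nil
}

func main() {
	m := &Manager{}
	err := registerAll(m, []registration{
		{"inferenceservices", namedReconciler{"inferenceservice"}},
		{"trainedmodels", namedReconciler{"trainedmodel"}},
	})
	fmt.Println(m.Controllers, m.Webhooks, err)
}
```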

logger.Info("Defaulting InferenceService", "namespace", isvc.Namespace, "name", isvc.Name)
client, err := client.New(config.GetConfigOrDie(), client.Options{})
Contributor

It's weird to me that we don't have a way to surface errors from this. How does kubebuilder expect defaulting logic to fail?

@@ -27,7 +27,7 @@ import (
// +kubebuilder:printcolumn:name="URL",type="string",JSONPath=".status.url"
// +kubebuilder:printcolumn:name="Ready",type="string",JSONPath=".status.conditions[?(@.type=='Ready')].status"
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"
// +kubebuilder:resource:path=trainedmodel,shortName=tm
// +kubebuilder:resource:path=trainedmodels,shortName=tm
Contributor

Reflecting on this, I kind of wish we just used the name "models". Trained or Servable is implied by nature of being part of the serving.kubeflow.org API group.

status.GetCondition(apis.ConditionReady).Status == v1.ConditionTrue
}

func (r *InferenceServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
Contributor

Consider moving this logic up to main.go and generalizing for all controllers.

@yuzisun
Member Author

yuzisun commented Sep 4, 2020

/retest

3 similar comments
@yuzliu
Contributor

yuzliu commented Sep 5, 2020

/retest

@yuzisun
Member Author

yuzisun commented Sep 7, 2020

/retest

@yuzisun
Member Author

yuzisun commented Sep 7, 2020

/retest

@yuzisun yuzisun changed the title from "WIP: Implement KFServing v1beta1 controller" to "Implement KFServing v1beta1 controller" Sep 8, 2020
@yuzisun
Member Author

yuzisun commented Sep 10, 2020

/retest

@yuzisun
Member Author

yuzisun commented Sep 10, 2020

@ellistarn this is ready for review. Sorry for the big PR; it had to pass all the CI tests for v1beta1.

@yuzisun
Member Author

yuzisun commented Sep 13, 2020

/retest

Contributor

@ellistarn ellistarn left a comment

Good stuff!

docs/samples/v1beta1/advanced/liveness_probe.yaml (outdated, resolved)
pkg/apis/serving/v1beta1/inference_service_defaults.go (outdated, resolved)
v1 "k8s.io/api/core/v1"
)

// PodSpec is a description of a pod.
Contributor

Can you write up why we can't use it directly?

Contributor

And what changes we made.

Member Author

updated with comments
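
As background on why a copy is needed: the upstream core/v1 PodSpec tags its Containers field as json:"containers" without omitempty, and CRD generators generally treat such fields as required. A copied struct can add omitempty (and an optional marker) instead. The sketch below illustrates only the JSON side of that difference with the standard library; upstreamStyle, copiedStyle, and the trimmed-down Container are hypothetical types, not the PR's actual PodSpec copy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Container is a trimmed-down, hypothetical container type.
type Container struct {
	Name  string `json:"name,omitempty"`
	Image string `json:"image,omitempty"`
}

// upstreamStyle mimics a field tagged without omitempty, as on the
// upstream PodSpec.Containers field.
type upstreamStyle struct {
	Containers []Container `json:"containers"`
}

// copiedStyle mimics a copied struct where the field is optional.
type copiedStyle struct {
	// +optional
	Containers []Container `json:"containers,omitempty"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Without omitempty the key is always emitted, even when empty.
	fmt.Println(encode(upstreamStyle{}))
	// With omitempty an empty field is dropped from the output.
	fmt.Println(encode(copiedStyle{}))
}
```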

@@ -58,21 +58,21 @@ func (t *TritonSpec) Default(config *InferenceServicesConfig) {
// GetContainers transforms the resource into a container spec
func (t *TritonSpec) GetContainer(metadata metav1.ObjectMeta, extensions *ComponentExtensionSpec, config *InferenceServicesConfig) *v1.Container {
arguments := []string{
"trtserver",
Contributor

Shouldn't this be cmd?

Member Author

We can't use cmd here, as that would overwrite Triton's entrypoint, which does a bunch of setup:
https://github.com/triton-inference-server/server/blob/master/nvidia_entrypoint.sh
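
The distinction matters because of how Kubernetes combines a container's command and args with the image defaults: command replaces the image ENTRYPOINT, while args replaces only the image CMD. The sketch below models that rule with local types. The Image and Container structs and the effectiveArgv helper are hypothetical, and the entrypoint path and the flag value are illustrative placeholders only.

```go
package main

import "fmt"

// Image holds the defaults baked into a container image.
type Image struct {
	Entrypoint []string
	Cmd        []string
}

// Container holds the overrides set in a pod spec.
type Container struct {
	Command []string
	Args    []string
}

// effectiveArgv returns the argv the runtime would execute.
func effectiveArgv(img Image, c Container) []string {
	var argv []string
	switch {
	case len(c.Command) > 0:
		// command replaces ENTRYPOINT; the image CMD is ignored.
		argv = append(argv, c.Command...)
		argv = append(argv, c.Args...)
	case len(c.Args) > 0:
		// args replaces CMD; the image ENTRYPOINT still runs.
		argv = append(argv, img.Entrypoint...)
		argv = append(argv, c.Args...)
	default:
		// No overrides: image defaults are used unchanged.
		argv = append(argv, img.Entrypoint...)
		argv = append(argv, img.Cmd...)
	}
	return argv
}

func main() {
	// Placeholder entrypoint path, not the real Triton script location.
	img := Image{Entrypoint: []string{"/opt/entrypoint.sh"}}
	serve := []string{"trtserver", "--model-store=/mnt/models"}

	// Passing the server invocation as args keeps the entrypoint script.
	fmt.Println(effectiveArgv(img, Container{Args: serve}))
	// Passing it as command bypasses the entrypoint script entirely.
	fmt.Println(effectiveArgv(img, Container{Command: serve}))
}
```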

pkg/apis/serving/v1beta1/predictor_xgboost.go (resolved)
&podSpec, isvc.Status.Components[v1beta1.ExplainerComponent])

if err := controllerutil.SetControllerReference(isvc, r.Service, p.scheme); err != nil {
return err
Contributor

Check out errors.Wrapf()

Member Author

great! changed all places to use this.

@@ -4,7 +4,9 @@ metadata:
name: "custom-simple"
spec:
predictor:
minReplicas: 1
containers:
Member

From an end-user "spec" perspective, what does "containers" mean? Should it be something like "components"?

Member Author

@animeshsingh this is for the custom spec. If users go the custom route, they should know what "containers" means; we do not want to reinvent the wheel by coming up with another, similar concept.

containers:
- image: codait/max-object-detector
name: kfserving-container
Member

Following up on the previous comment: should this say which "component" it is, predictor or explainer?

Member Author

This is actually the container name; it should be optional.

@yuzisun
Member Author

yuzisun commented Sep 18, 2020

/retest

Contributor

@ellistarn ellistarn left a comment

This is a bit of a beast to review. Good starting point as far as I'm concerned

@yuzisun
Member Author

yuzisun commented Sep 21, 2020

/retest

1 similar comment
@yuzisun
Member Author

yuzisun commented Sep 21, 2020

/retest

@yuzisun
Member Author

yuzisun commented Sep 21, 2020

/retest

@ellistarn
Contributor

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ellistarn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit deba8df into kserve:master Sep 21, 2020
@yuzisun yuzisun deleted the v1beta1 branch January 18, 2021 01:21
Development

Successfully merging this pull request may close these issues.

Add support for multi CRD versions and conversion webhook between v1alpha2 and v1beta1
6 participants