Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for Integrating KServe with Openshift #2853

Merged
merged 3 commits into from
Jul 7, 2023

Conversation

skonto
Copy link
Contributor

@skonto skonto commented Apr 25, 2023

What this PR does / why we need it:

This adds the following fixes for running KServe on Openshift:

  • Adds the capability to inject custom annotations to the ksvc required by the Openshift Serverless (deploys Knative Serving) and Service Mesh Integration.
    The annotation for example serving.knative.openshift.io/enablePassthrough: “true” needs to be added at the ksvc level for inference services and graphs. This will not allow to add arbitrary annotations, thus should not change current semantics.
  • Adds the ability to run the storage injector with user id 1337 in order to solve the issue when Istio cni and dns proxy are enabled.
    Right now the security context is copied from the user container but we cant use the same user id there for many reasons eg. is not allowed, security etc. The latter applies if we use pod security context or some other workaround.
    The uid can be set to an arbitrary value using the annotation: serving.kserve.io/storage-initializer-uid: "...".
    Note that on Openshift that uid is not 1337 but project_id_range+1, so we need this to be configurable.
    Note: In a future PR we should be able to setup the Knative Serving init-containers feature flag and avoid the custom injection done or provide some template to allow the user to configure that part.

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Tested on OCP 4.12 with the following inference service:

oc apply -n test -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-from-uri
  annotations:
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
    serving.knative.openshift.io/enablePassthrough: “true”
spec:
  predictor:
    sklearn:
      storageUri: https://github.com/skonto/serverless-guides/blob/main/kserve/kfserving-examples/models/sklearn/1.0/model/model.joblib?raw=true
EOF

The manager deployment was started with the right env var.

Special notes for your reviewer:

No

Release note:

 Openshift integration improvements.

@skonto skonto changed the title [WIP] Fixes for Integrating KServe with Openshift Fixes for Integrating KServe with Openshift Apr 26, 2023
@skonto
Copy link
Contributor Author

skonto commented Apr 26, 2023

@yuzisun @alexagriffith gentle ping.

@skonto
Copy link
Contributor Author

skonto commented Apr 27, 2023

/assign @yuzisun

@skonto skonto changed the title Fixes for Integrating KServe with Openshift [wip] Fixes for Integrating KServe with Openshift Apr 27, 2023
@skonto
Copy link
Contributor Author

skonto commented Apr 27, 2023

TODO: support scenario where OCP SM uses a uid other than 1337 to run the Istio side car.

@ReToCode
Copy link
Contributor

ReToCode commented May 2, 2023

TODO: support scenario where OCP SM uses a uid other than 1337 to run the Istio side car.

We could make it configurable? The default in Istio is 1337, but I think users could overwrite it.

@skonto
Copy link
Contributor Author

skonto commented May 7, 2023

We could make it configurable? The default in Istio is 1337, but I think users could overwrite it.

Yes that is my idea too given that OCP requires a project based id that varies.

@skonto skonto changed the title [wip] Fixes for Integrating KServe with Openshift Fixes for Integrating KServe with Openshift May 11, 2023
@skonto
Copy link
Contributor Author

skonto commented May 11, 2023

I have updated the PR. I am using an annotation to pass the uid for the storage initializer.
This can be refactored if a more generic solution is added for adjusting that container's properties.
The annotations on Openshift can be used as follows:

kubectl apply -n test -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-from-uri
  annotations:
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
    serving.knative.openshift.io/enablePassthrough: "true"
    serving.kserve.io/storage-initializer-uid: "1000720001"
spec:
  predictor:
    sklearn:
      storageUri: https://github.com/skonto/serverless-guides/blob/main/kserve/kfserving-examples/models/sklearn/1.0/model/model.joblib?raw=true
EOF

@skonto
Copy link
Contributor Author

skonto commented May 11, 2023

@yuzisun pls review, this is meant to make the Openshift guide added lately to work out of the box with Istio.

@skonto
Copy link
Contributor Author

skonto commented May 25, 2023

@alexagriffith @yuzisun gentle ping.

@yuzisun yuzisun requested a review from ckadner May 31, 2023 14:35
@yuzisun
Copy link
Member

yuzisun commented May 31, 2023

@alexagriffith and I am not too familiar with OpenShift, @ckadner can you help review this PR?

@ckadner ckadner assigned ckadner and unassigned yuzisun Jun 5, 2023
@ckadner ckadner requested a review from rafvasq June 5, 2023 21:45
@ckadner
Copy link
Member

ckadner commented Jun 5, 2023

@skonto -- I have not deployed KServe in while, and not on OpenShift. Can you point me to how you deployed and configured your cluster? You can find me on Slack @ckadner (both on Kubeflow and the IBM workspaces), if that makes it easier.

@yuzisun
Copy link
Member

yuzisun commented Jun 12, 2023

@skonto Can you help sign off the commit following https://github.com/kserve/kserve/pull/2853/checks?check_run_id=13400347618 ?

@skonto skonto force-pushed the fixes_for_ocp branch 2 times, most recently from 169556d to 6793200 Compare June 14, 2023 14:12
@skonto
Copy link
Contributor Author

skonto commented Jun 14, 2023

@yuzisun I think its ready now.

@@ -108,6 +109,14 @@ func createKnativeService(componentMeta metav1.ObjectMeta,
}
}

// Allow custom annotations for ksvcs that start with serving.knative but not part of serving.knative.dev group name.
for aKey, _ := range annotations {
if !strings.HasPrefix(aKey, constants.KnativeServingAPIGroupName) && strings.HasPrefix(aKey, constants.KnativeServingAPIGroupNamePrefix) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to just check the full api group name serving.knative.openshift.io?

Copy link
Contributor Author

@skonto skonto Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzisun I thought about that but I didnt want to add vendor specific annotations, trying to keep it neutral for future use as well. Do you want me to change that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skonto I think this is no longer needed as we now allow component level labels and annotations, see #2925

Copy link
Contributor Author

@skonto skonto Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzisun unofrtunately that is per component. The requirement is to set this at the ksvc level, because the OCP Serverless reconciler expects that annotation (serving.knative.openshift.io/enablePassthrough: "true") to be at that level.
For example here is the diff between a kserve ksvc (does not work) and a regular ksvc that works as expected (I tried this using the new feature).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, l'd prefer using a more explicit list, we can add serving.knative.openshift.io/enablePassthrough to managedKsvcAnnotations ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is possible. I will update it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzisun done.

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
@ReToCode
Copy link
Contributor

ReToCode commented Jul 6, 2023

@yuzisun , @alexagriffith would you mind taking another look? We'd like to do some additional patching in opendatahub-io/kserve which is based on these changes.

DefaultMinReplicas = 1
ControllerLabelName = KServeName + "-controller-manager"
DefaultMinReplicas = 1
IstioSidecarUIDAnnotationKey = KServeAPIGroupName + "/storage-initializer-uid"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we planning to make this configurable here also without having to add this annotation to every inference service?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "issue" is, that this is specific to the namespace where the service is deployed. At the moment there is no better way than to specify it per InferenceService. We're working with the istio folks to hopefully get rid of the underlying issue.

@yuzisun
Copy link
Member

yuzisun commented Jul 6, 2023

@yuzisun , @alexagriffith would you mind taking another look? We'd like to do some additional patching in opendatahub-io/kserve which is based on these changes.

Looks good, do we need more documentation updates?

@ReToCode
Copy link
Contributor

ReToCode commented Jul 6, 2023

Looks good, do we need more documentation updates?

Yes we should definitely update https://github.com/kserve/kserve/blob/master/docs/OPENSHIFT_GUIDE.md#prerequisites. Is it ok if I do a followup PR for this?

@yuzisun
Copy link
Member

yuzisun commented Jul 7, 2023

Thanks @skonto @ReToCode ! We are including this to the upcoming 0.11 release

/lgtm
/approve

@kserve-oss-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: skonto, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kserve-oss-bot kserve-oss-bot merged commit ee0ff96 into kserve:master Jul 7, 2023
56 checks passed
israel-hdez pushed a commit to opendatahub-io/kserve that referenced this pull request Jul 12, 2023
* fixes for SM and OCP

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

* updates

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

* add pass through annotation support

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

---------

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
Iamlovingit pushed a commit to Iamlovingit/kserve that referenced this pull request Oct 1, 2023
* fixes for SM and OCP

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

* updates

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

* add pass through annotation support

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>

---------

Signed-off-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
Signed-off-by: iamlovingit <freecode666@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants