Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Kserve Unmanaged Installation #860

Merged

Conversation

VaishnaviHire
Copy link
Member

@VaishnaviHire VaishnaviHire commented Feb 9, 2024

Description

https://issues.redhat.com/browse/RHOAIENG-2951

How Has This Been Tested?

  1. Install operator using catalogsource quay.io/vhire/opendatahub-operator-catalog:v2.7.0
  2. Create default DSCI
  3. Create DSC with following spec

    kserve:
      managementState: Managed
      serving:
        ingressGateway:
          certificate:
            type: SelfSigned
        managementState: Unmanaged
        name: knative-serving

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@openshift-ci openshift-ci bot requested a review from etirelli February 9, 2024 22:21
@VaishnaviHire VaishnaviHire requested review from Jooho and danielezonca and removed request for etirelli February 9, 2024 22:21
Copy link
Contributor

@Jooho Jooho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main changes looks good to me

@Jooho
Copy link
Contributor

Jooho commented Feb 10, 2024

I didn't test this yet but the change looks good.

@zdtsw
Copy link
Member

zdtsw commented Feb 10, 2024

i did not see this PR but made an early comments on the sync PR red-hat-data-services#192
copy-paste it here:

this is for:
when serving is Managed + servicemesh is not-Managed, we do a check if these 2 operators are installed?

but for kserve to work, even both of these 2 dependent operator will be needed
even we do not create CR from them.

(some docs does say you do not must use servicemesh but other operator but if we exclude this option here)
my understanding is, as long as we set kserve to Managed, we will need the check on these 2.

@VaishnaviHire
Copy link
Member Author

i did not see this PR but made an early comments on the sync PR red-hat-data-services#192 copy-paste it here:

this is for:
when serving is Managed + servicemesh is not-Managed, we do a check if these 2 operators are installed?

but for kserve to work, even both of these 2 dependent operator will be needed
even we do not create CR from them.

(some docs does say you do not must use servicemesh but other operator but if we exclude this option here)
my understanding is, as long as we set kserve to Managed, we will need the check on these 2.

For Unmanaged we should not prevent kserve controller deployment. Even when the prereq. operators are expected, given that the state is Unmanged, operator should go ahead with component installation

@zdtsw
Copy link
Member

zdtsw commented Feb 12, 2024

/retest

1 similar comment
@zdtsw
Copy link
Member

zdtsw commented Feb 12, 2024

/retest

@zdtsw
Copy link
Member

zdtsw commented Feb 12, 2024

I think the whole confusion comes from:
what does "Unmanaged" mean?
if user set Unmanaged to Serving, and not install Serverless operator, is it a supported case for Kserve?

@VaishnaviHire
Copy link
Member Author

The error in e2e is caused by https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/tests/e2e/dsc_creation_test.go#L188.
I will update the PR to remove this

@israel-hdez
Copy link
Contributor

israel-hdez commented Feb 12, 2024

I think the whole confusion comes from:
what does "Unmanaged" mean?
if user set Unmanaged to Serving, and not install Serverless operator, is it a supported case for Kserve?

For the time being, this is not a supported setup: KServe needs both Serverless and Service Mesh to be present in order to work properly.

However, there are talks about reducing the footprint and this may change in the near term.

@@ -108,13 +108,8 @@ func (k *Kserve) ReconcileComponent(ctx context.Context, cli client.Client, resC
return err
}
}
// check on dependent operators if all installed in cluster
dependOpsErrors := checkDependentOperators(cli).ErrorOrNil()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I remember, is that this check was left here in order to prevent KServe from being installed if either OSSM or Serverless were missing from the cluster, regardless of the management state of OSSM and Serverless. This was simply a proactive check so that we don't end up with a broken installation of KServe.

Moving this code into configreServerless doesn't make much sense to me, because configureServerless has its own protections.

So, I think it is better to, simply, fully remove checkDependentOperators function. We may want to drop a note in the docs about that KServe installation will no longer test if there is a valid OSSM and Serverless setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we can remove the checkDependentOperators function entirely. I will update the PR.

For downstream though, we had included the function in configreServerless already for 2.7. We will remove the checkDependentOperators in 2.8. The behavior will be same though,

For `Unmanaged`, operator will not check for dependent operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I had a meeting today and now I have a much better understanding about why this is needed.

I agree we can remove the checkDependentOperators function entirely. I will update the PR.
We will remove the checkDependentOperators in 2.8. The behavior will be same

This seems to be the optimal way, at the moment. I appreciate your help @VaishnaviHire .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked more into this, I think we should keep the checkDependentOperators around since it looks for OperatorCondition resource rather than CRD. OperatorCondition helps with verifying if the operator subscription is present which is not the case with crd

@VaishnaviHire
Copy link
Member Author

/cherry-pick main

@openshift-cherrypick-robot

@VaishnaviHire: once the present PR merges, I will cherry-pick it on top of main in a new PR and assign it to you.

In response to this:

/cherry-pick main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@VaishnaviHire VaishnaviHire force-pushed the fix_kserve_manual branch 2 times, most recently from 033e05e to 43d586c Compare February 19, 2024 16:03
Copy link

openshift-ci bot commented Feb 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zdtsw
Once this PR has been reviewed and has the lgtm label, please ask for approval from vaishnavihire. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@VaishnaviHire VaishnaviHire merged commit b467ed7 into opendatahub-io:incubation Feb 19, 2024
6 of 7 checks passed
@openshift-cherrypick-robot

@VaishnaviHire: new pull request created: #874

In response to this:

/cherry-pick main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

VaishnaviHire added a commit to VaishnaviHire/opendatahub-operator that referenced this pull request Feb 22, 2024
* Fix Kserve Unmanaged Installation

* Update tests for kserve

(cherry picked from commit b467ed7)
VaishnaviHire added a commit to VaishnaviHire/opendatahub-operator that referenced this pull request Feb 23, 2024
* Fix Kserve Unmanaged Installation

* Update tests for kserve

(cherry picked from commit b467ed7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants