Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auth race condition mitigation #213

Merged

Conversation

israel-hdez
Copy link
Contributor

A race condition was found that leads to odh-model-controller starting in a state not suitable for KServe, despite KServe is installed with authorization enabled.

This mitigates such condition by:

  • Always adding AuthConfigs to the schema. Adding types to the schema doesn't seem to have any bad effects, even if such types does not exist in the cluster. This prevents a "no kind is registered" error when trying to reconcile an InferenceService if Authorino setup finished after odh-model-controller booted.
  • Invoking Owns on inferenceservice_controller.go setup based on the existence of the AuthConfig CRD in the cluster, rather than based on the existence of a specific AuthorizationPolicy.

These changes, together with opendatahub-io/opendatahub-operator#1019 should mitigate/fix the race condition and odh-model-controller should properly start in a good state suitable for KServe.

Related to https://issues.redhat.com/browse/RHOAIENG-7312

How Has This Been Tested?

Three different test were done:

  • Install only ModelMesh (no service mesh, nor knative, nor authorino operators installed), and check that odh-model-controller would start properly.
  • Install KServe but skip installation of Authorino operator. Check that odh-model-controller would start properly. Also, check that a model can be deployed and inference requests work.
  • Install KServe with all prerequisites. Check that odh-model-controller would start properly. Also, check that a model can be deployed and inference requests work.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

A race condition was found that leads to odh-model-controller starting in a state not suitable for KServe, despite KServe is installed with authorization enabled.

This mitigates such condition by:
* Always adding AuthConfigs to the schema. Adding types to the schema doesn't seem to have any bad effects, even if such types does not exist in the cluster. This prevents a "no kind is registered" error when trying to reconcile an InferenceService if Authorino setup finished after odh-model-controller booted.
* Invoking `Owns` on inferenceservice_controller.go setup based on the existence of the AuthConfig CRD in the cluster, rather than based on the existence of a specific AuthorizationPolicy.

These changes, together with opendatahub-io/opendatahub-operator#1019 should mitigate/fix the race condition and odh-model-controller should properly start in a good state suitable for KServe.

Related to https://issues.redhat.com/browse/RHOAIENG-7312

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
@openshift-ci openshift-ci bot requested review from Jooho and spolti May 30, 2024 19:52
Copy link
Contributor

@VedantMahabaleshwarkar VedantMahabaleshwarkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test the changes on a cluster, but didn't see any issues on the code.
/approve

Copy link
Contributor

openshift-ci bot commented May 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, VedantMahabaleshwarkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [VedantMahabaleshwarkar,israel-hdez]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Jooho
Copy link
Contributor

Jooho commented May 31, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label May 31, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 168f0a6 into opendatahub-io:main May 31, 2024
5 checks passed
@israel-hdez israel-hdez deleted the j7312-race-condition-fix branch May 31, 2024 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants