-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(authz): soft opt-in for authz capability #183
feat(authz): soft opt-in for authz capability #183
Conversation
Instead of failing when Authorino operator is not present, features releated to this capability won't be installed. This allows to handle previously existings RHOAI installation where KServe does not have yet authorization module bundled. Once Authorino operator is installed, there needs to be reconcile triggered so that authz-capability features can be configured. How to test - use custom images for odh-operator (and odh-model-controller) - quay.io/bmajsak/opendatahub-operator:dev-authz-soft-opt-in - quay.io/bmajsak/odh-model-controller:latest (reference updated in the odh operator image) - note: odh-operator image manifests point to `main` branch of odh-model-controller - install all prerequisites BUT Authorino Operator - notice there's no `auth-provider` ns nor authorino-related configuration for service mesh - enable KServe in DSC - test sample isvc (e.g. by using [this script](https://gist.githubusercontent.com/israel-hdez/af374562ef9e5b9d80890aa6f0bce20d/raw/5651e4d7c35be51751049c3c2a236771e9005ace/kserve-sample-model-test.sh)) - observe 200 - install Authorino operator - trigger reconcile of DSCI and DSC so changes are applied both to cluster setup and component - restart `odh-model-controller` pod - trigger sample script again - observe 200 - check Authorino logs and see anonymous access was granted - enable authorization on isvc - i.e. `oc annotate isvc sklearn-v2-iris security.opendatahub.io/enable-auth=true` - call isvc again - observe 401 - add token to isvc call's Header - observe 200 Requires opendatahub-io/odh-model-controller#183 to work end-to-end.
Interesting failure on CI: I guess you need to sign your commit :) |
@spolti That's fixed in #184 :)
About the DCO... on it. Though I'm signing commits with pgp (and they appear as |
c8509f0
to
a2eba43
Compare
a2eba43
to
a29ab33
Compare
a29ab33
to
72d65ec
Compare
/retest |
EDIT: these were all AI hallucinations, seemingly. But at least it resulted in a side-quest for cleaning up the code a little bit :) I can change how odh-model-controller/controllers/reconcilers/kserve_inferenceservice_reconciler.go Lines 28 to 41 in 5db4925
we can make sure they're implementing At this point we can conditionally add certain reconcilers to the slice and use the flag instead. WDYT? |
72d65ec
to
5fc41fb
Compare
I should have made it draft from the get-go, as I would like your input on #183 (comment) before proceeding. |
if !utils.IsGroupRegisteredInScheme(r.client, authorinov1beta2.GroupVersion) { | ||
log.V(1).Info("Skipping authorization reconciliation, authorization capability is not set up") | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wanna make sure i understand this part:
so if odh-model-controller is only used with modelserving (as no kserve) what would happen here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you concerned about the log being confusing? This reconciler is only active when KServe is enabled so I think we should be good here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we check isAuthorinoEnabled
here as well or something similar?
AFAIK we check if the Service Mesh is enabled to allow KServe raw deployments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not have an option to set authornio to Removed or Unmanaged in operator
as long as it is installed it will be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agreed with @spolti because authorino can be installed for other reasons by users. so we should double check it. VerifyIComponentIsEnabled func that @spolti implemented can verify if kerve serverless installed or not that require Authorino operator or not. Why don't you add the func IsGroupRegisteredInScheme into this func and use VerifyIComponentIsEnabled func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic, PTAL at new version
if utils.IsGroupRegisteredInScheme(r.client, authorinov1beta2.GroupVersion) { | ||
r.log.Info("Authorino is registered, enabling Authorization capability") | ||
builder.Owns(&authorinov1beta2.AuthConfig{}) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Knative/Istio are installed on the cluster but Authorino is not installed, the odh-model-controller do not enable the authConfig reconciler, as it relies on Authorino for certain authentication functionalities. While this log shows "Authorino capability disabled", it's important to inform the user the warning message when the user attempts to enable authentication but Authorino is not installed. It should clearly explain that Authorino is required for certain authentication functionalities.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by doing this, even though KServe requires it and authorino is missing, it will not require it leading to issuer later, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic, PTAL at new version
We don't need to rush with #185 - I will finalize this version with implicit authorino scheme check and then can ship the refined version when the other PR gets merged. |
Instead of failing when Authorino operator is not present, features releated to this capability won't be installed. This allows to handle previously existings RHOAI installation where KServe does not have yet authorization module bundled. Once Authorino operator is installed, there needs to be reconcile triggered so that authz-capability features can be configured. How to test - use custom images for odh-operator (and odh-model-controller) - quay.io/bmajsak/opendatahub-operator:dev-authz-soft-opt-in - quay.io/bmajsak/odh-model-controller:latest (reference updated in the odh operator image) - note: odh-operator image manifests point to `main` branch of odh-model-controller - install all prerequisites BUT Authorino Operator - notice there's no `auth-provider` ns nor authorino-related configuration for service mesh - enable KServe in DSC - test sample isvc (e.g. by using [this script](https://gist.githubusercontent.com/israel-hdez/af374562ef9e5b9d80890aa6f0bce20d/raw/5651e4d7c35be51751049c3c2a236771e9005ace/kserve-sample-model-test.sh)) - observe 200 - install Authorino operator - trigger reconcile of DSCI and DSC so changes are applied both to cluster setup and component - restart `odh-model-controller` pod - trigger sample script again - observe 200 - check Authorino logs and see anonymous access was granted - enable authorization on isvc - i.e. `oc annotate isvc sklearn-v2-iris security.opendatahub.io/enable-auth=true` - call isvc again - observe 401 - add token to isvc call's Header - observe 200 Requires opendatahub-io/odh-model-controller#183 to work end-to-end.
* feat(authz): conditionally enable authorization capability Instead of failing when Authorino operator is not present, features releated to this capability won't be installed. This allows to handle previously existings RHOAI installation where KServe does not have yet authorization module bundled. Once Authorino operator is installed, there needs to be reconcile triggered so that authz-capability features can be configured. How to test - use custom images for odh-operator (and odh-model-controller) - quay.io/bmajsak/opendatahub-operator:dev-authz-soft-opt-in - quay.io/bmajsak/odh-model-controller:latest (reference updated in the odh operator image) - note: odh-operator image manifests point to `main` branch of odh-model-controller - install all prerequisites BUT Authorino Operator - notice there's no `auth-provider` ns nor authorino-related configuration for service mesh - enable KServe in DSC - test sample isvc (e.g. by using [this script](https://gist.githubusercontent.com/israel-hdez/af374562ef9e5b9d80890aa6f0bce20d/raw/5651e4d7c35be51751049c3c2a236771e9005ace/kserve-sample-model-test.sh)) - observe 200 - install Authorino operator - trigger reconcile of DSCI and DSC so changes are applied both to cluster setup and component - restart `odh-model-controller` pod - trigger sample script again - observe 200 - check Authorino logs and see anonymous access was granted - enable authorization on isvc - i.e. `oc annotate isvc sklearn-v2-iris security.opendatahub.io/enable-auth=true` - call isvc again - observe 401 - add token to isvc call's Header - observe 200 Requires opendatahub-io/odh-model-controller#183 to work end-to-end. * feat: introduces generic status reporter struct Introducing generic struct Reporter to handle status/condition updates. It evolved from `updateStatus` funcs available in DSCI and DSC controllers (and these are also moved to a single generic func now). New reporter struct has single responsibility - it allows to define how condition should be updated. Developer defines it in `CalculateCondition` closure where optional error will be passed for inspection. This closure should return known from previous incarnation `update func(saved)` which will append target object's conditions. The only difference is access to optional error. The main goal for introducing this approach is to avoid catch-error-on-defer, where we rely on named return parameter and mutate condition directly in the function which performs actions based on the resource desired state. Example: ```golang func createReporter(condition *conditionsv1.Condition) *status.Reporter[*dsciv1.DSCInitialization] { return status.NewStatusReporter[*dsciv1.DSCInitialization]( func(err error) status.SaveStatusFunc[*dsciv1.DSCInitialization] { return func(saved *dsciv1.DSCInitialization) { if err != nil { condition.Status = corev1.ConditionFalse condition.Message = err.Error() condition.Reason = status.CapabilityFailed var missingOperatorErr *feature.MissingOperatorError if errors.As(err, &missingOperatorErr) { condition.Reason = status.MissingOperatorReason } } conditionsv1.SetStatusCondition(&saved.Status.Conditions, *condition) } }, ) } ``` Then to call it: ```golang func (r *DSCInitializationReconciler) doServiceMeshStuff(instance *dsciv1.DSCInitialization) error { reporter := createReporter(&conditionsv1.Condition{ Type: status.CapabilityServiceMesh, Status: corev1.ConditionFalse, Reason: status.ConfiguredReason, Message: "Service Mesh configured", }) serviceMeshErr := createServiceMesh(instance) _, reportError := reporter.ReportCondition(r.Client, instance, serviceMeshErr) return reportError } ``` Next step: migrate FeatureTracker to this new API Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * feat(authz): reports missing authorino operator without failure When Authorino Operator is missing the reconcile will continue without throwing an error about missing pre-requisite and asking user to install it. This fact is instead reported as DSCI.status condition: ```yaml - lastHeartbeatTime: "2024-04-04T16:32:22Z" lastTransitionTime: "2024-04-04T16:24:38Z" message: Authorino operator is not installed on the cluster, skipping authorization capability reason: MissingOperator status: "False" type: CapabilityServiceMeshAuthorization ``` This allows existing KServe installations which do not have authorization capability to migrate smoothly and continue working without authz layer - so to allow for soft opt-in for Authz layer leveraging Authorino, as there is no concensus on automated installation of this dependent operator on behalf of the user. This commit introduces a notion of capability, though not fully fleshed out as a concept - only condition type for now. It is necessery to split set of service mesh features to core and authz so that they not only can be reported separately but can be handled on their own. This is needed for so called "soft opt-in" mentioned above. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * fix(error): uses single %w wrap instead of 2 It seems that go1.19 does not support wrapping two errors in the same fmt.Errorf call, whereas go1.20 does :) Simplifying the code to have single error tree and make go1.19 compiler happy. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * chore: makes Reporter self-contained - adds client and instance to which status is attached as part of the struct - changes how HandleWithReporter is used - now it's a wrapper around Apply/Delete which calls reporter underneath instead Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * chore: uses `Println` to WARN Co-authored-by: Wen Zhou <wenzhou@redhat.com> * chore: minor godoc and err improvements Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> --------- Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> Co-authored-by: Wen Zhou <wenzhou@redhat.com>
This change handles "soft opt-in" for authorization capability handled by Authorino. The solution is based on Conditions exposed in DSCInitialization resource, as provided in opendatahub-io/opendatahub-operator#949. When the condition of type `CapabilityServiceMeshAuthorization` has a reason `MissingOperator` it will assume authorino/authorization module is not present and therefore does not handle AuthConfigs - leaving models not secured as it was for the previous versions. This approach allows to softly opt-in for Authorization without blocking the upgrade. Any other reason means that ODH Operator is in not happy state and will eventually reconcile so that Authorino will start working. The logic consists of the following steps: * during the controller bootstrap it will check separately if ServiceMesh is enabled and Authorino is present (latter by inspecting the status.condition) * when defining watches (using builder instance for OpenshiftReconciler) it will conditionally set a watch for `AuthConfig` * it guards AuthConfig reconciler from executing based on the "authorization capability". This prevents constant failures due to `AuthConfig` not being defined for the cluster. > [!NOTE] > As it is the case with original logic brought in opendatahub-io#181, it will require restart of the controller pod to set a watch on AuthConfig resource when Authorino is installed in the cluster. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com>
c5daa2a
to
48b2ef7
Compare
Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com>
Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com>
Authorization verificationSteps
SUCCESS! 🍺 🎆 |
when Authorino operator is missing in the cluster no AuthConfig should be created. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bartoszmajsak, Jooho, spolti The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…hub-io#949) * feat(authz): conditionally enable authorization capability Instead of failing when Authorino operator is not present, features releated to this capability won't be installed. This allows to handle previously existings RHOAI installation where KServe does not have yet authorization module bundled. Once Authorino operator is installed, there needs to be reconcile triggered so that authz-capability features can be configured. How to test - use custom images for odh-operator (and odh-model-controller) - quay.io/bmajsak/opendatahub-operator:dev-authz-soft-opt-in - quay.io/bmajsak/odh-model-controller:latest (reference updated in the odh operator image) - note: odh-operator image manifests point to `main` branch of odh-model-controller - install all prerequisites BUT Authorino Operator - notice there's no `auth-provider` ns nor authorino-related configuration for service mesh - enable KServe in DSC - test sample isvc (e.g. by using [this script](https://gist.githubusercontent.com/israel-hdez/af374562ef9e5b9d80890aa6f0bce20d/raw/5651e4d7c35be51751049c3c2a236771e9005ace/kserve-sample-model-test.sh)) - observe 200 - install Authorino operator - trigger reconcile of DSCI and DSC so changes are applied both to cluster setup and component - restart `odh-model-controller` pod - trigger sample script again - observe 200 - check Authorino logs and see anonymous access was granted - enable authorization on isvc - i.e. `oc annotate isvc sklearn-v2-iris security.opendatahub.io/enable-auth=true` - call isvc again - observe 401 - add token to isvc call's Header - observe 200 Requires opendatahub-io/odh-model-controller#183 to work end-to-end. * feat: introduces generic status reporter struct Introducing generic struct Reporter to handle status/condition updates. It evolved from `updateStatus` funcs available in DSCI and DSC controllers (and these are also moved to a single generic func now). New reporter struct has single responsibility - it allows to define how condition should be updated. Developer defines it in `CalculateCondition` closure where optional error will be passed for inspection. This closure should return known from previous incarnation `update func(saved)` which will append target object's conditions. The only difference is access to optional error. The main goal for introducing this approach is to avoid catch-error-on-defer, where we rely on named return parameter and mutate condition directly in the function which performs actions based on the resource desired state. Example: ```golang func createReporter(condition *conditionsv1.Condition) *status.Reporter[*dsciv1.DSCInitialization] { return status.NewStatusReporter[*dsciv1.DSCInitialization]( func(err error) status.SaveStatusFunc[*dsciv1.DSCInitialization] { return func(saved *dsciv1.DSCInitialization) { if err != nil { condition.Status = corev1.ConditionFalse condition.Message = err.Error() condition.Reason = status.CapabilityFailed var missingOperatorErr *feature.MissingOperatorError if errors.As(err, &missingOperatorErr) { condition.Reason = status.MissingOperatorReason } } conditionsv1.SetStatusCondition(&saved.Status.Conditions, *condition) } }, ) } ``` Then to call it: ```golang func (r *DSCInitializationReconciler) doServiceMeshStuff(instance *dsciv1.DSCInitialization) error { reporter := createReporter(&conditionsv1.Condition{ Type: status.CapabilityServiceMesh, Status: corev1.ConditionFalse, Reason: status.ConfiguredReason, Message: "Service Mesh configured", }) serviceMeshErr := createServiceMesh(instance) _, reportError := reporter.ReportCondition(r.Client, instance, serviceMeshErr) return reportError } ``` Next step: migrate FeatureTracker to this new API Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * feat(authz): reports missing authorino operator without failure When Authorino Operator is missing the reconcile will continue without throwing an error about missing pre-requisite and asking user to install it. This fact is instead reported as DSCI.status condition: ```yaml - lastHeartbeatTime: "2024-04-04T16:32:22Z" lastTransitionTime: "2024-04-04T16:24:38Z" message: Authorino operator is not installed on the cluster, skipping authorization capability reason: MissingOperator status: "False" type: CapabilityServiceMeshAuthorization ``` This allows existing KServe installations which do not have authorization capability to migrate smoothly and continue working without authz layer - so to allow for soft opt-in for Authz layer leveraging Authorino, as there is no concensus on automated installation of this dependent operator on behalf of the user. This commit introduces a notion of capability, though not fully fleshed out as a concept - only condition type for now. It is necessery to split set of service mesh features to core and authz so that they not only can be reported separately but can be handled on their own. This is needed for so called "soft opt-in" mentioned above. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * fix(error): uses single %w wrap instead of 2 It seems that go1.19 does not support wrapping two errors in the same fmt.Errorf call, whereas go1.20 does :) Simplifying the code to have single error tree and make go1.19 compiler happy. Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * chore: makes Reporter self-contained - adds client and instance to which status is attached as part of the struct - changes how HandleWithReporter is used - now it's a wrapper around Apply/Delete which calls reporter underneath instead Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> * chore: uses `Println` to WARN Co-authored-by: Wen Zhou <wenzhou@redhat.com> * chore: minor godoc and err improvements Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> --------- Signed-off-by: bartoszmajsak <bartosz.majsak@gmail.com> Co-authored-by: Wen Zhou <wenzhou@redhat.com> (cherry picked from commit 93e2126)
This change handles "soft opt-in" for authorization capability handled by Authorino.
The solution is based on Conditions exposed in DSCInitialization resource, as provided in opendatahub-io/opendatahub-operator#949.
When the condition of type
CapabilityServiceMeshAuthorization
has a reasonMissingOperator
it will assume authorino/authorization module is not present and therefore does not handle AuthConfigs - leaving models not secured as it was for the previous versions. This approach allows to softly opt-in for Authorization without blocking the upgrade. Any other reason means that ODH Operator is in not happy state and will eventually reconcile so that Authorino will start working.The logic consists of the following steps:
AuthConfig
AuthConfig
not being defined for the cluster.Note
As it is the case with original logic brought in #181, it will require restart of the controller pod to set a watch on AuthConfig resource when Authorino is installed in the cluster.
Custom image:
quay.io/bmajsak/odh-model-controller:authz-soft-opt-in