OCPBUGS-15365: *: use a filtered LIST + WATCH on Secrets for AWS STS #545
Conversation
Would like some guidance for what kinds of tests we want to add to this :)
pkg/aws/actuator/actuator.go

```diff
@@ -441,7 +441,7 @@ func (a *AWSActuator) syncPassthrough(ctx context.Context, cr *minterv1.Credenti
 }

 	// userPolicy param empty because in passthrough mode this doesn't really have any meaning
-	err = a.syncAccessKeySecret(cr, accessKeyID, secretAccessKey, existingSecret, "", logger)
+	err = a.syncAccessKeySecret(ctx, cr, accessKeyID, secretAccessKey, existingSecret, "", logger)
```
These changes were not strictly necessary, but they seemed like latent tech debt, and since I was touching the client calls, I figured we could fix them.
pkg/operator/controller.go

```go
// AddToManager adds all Controllers to the Manager
func AddToManager(m manager.Manager, explicitKubeconfig string) error {
	rules := clientcmd.NewDefaultClientConfigLoadingRules()
```
Controller-runtime does not have support for server-side-apply yet, and by far the best way for us to add labels to existing objects surgically, and without conflict problems, is using server-side-apply. I create a client-go client and thread it through in order to allow that.
```go
})

if _, err := r.mutatingClient.Secrets(secret.Namespace).Apply(ctx, applyConfig, metav1.ApplyOptions{
	Force: true, // we're the authoritative owner of this field and should not allow anyone to stomp it
```
@deads2k would love a gut-check that this does what I think it does, given that the applyConfig is set up the way it is - is there a single owner for all of `metadata.labels`?
Labels are divided by name, so you only own the labels that you're trying to set, not all the labels available. At least that's my memory of it. Is the reality different?
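For reference, per-key ownership is visible in the `managedFields` entry the API server records for such an apply — roughly like the following sketch (the field manager name and label key are hypothetical):

```yaml
managedFields:
- manager: example-operator        # hypothetical field manager name
  operation: Apply
  apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:metadata:
      f:labels:
        f:example.io/watched: {}   # ownership is per label key, not all of metadata.labels
```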
/hold

It is not clear to me if labeling the root credentials in
Codecov Report

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master     #545      +/-   ##
==========================================
+ Coverage   47.84%   48.43%    +0.59%
==========================================
  Files          93       93
  Lines       11488    11958      +470
==========================================
+ Hits         5496     5792      +296
- Misses       5359     5538      +179
+ Partials      633      628        -5
```
/retitle PORTENABLE-526: *: label the secrets we interact with
After some chats with @abutcher and @deads2k:

Since the STS feature flag does not exist before that PR merges, we don't have to worry about pulling in old data / upgrades / etc.
sgtm but why can't we mutate the kube-system cred (just to label it)? because it may be user provided/user-managed? alternatively, if it's not associated w/ credreq, why do we need to label it?
It's user-provided, so David says we can't touch it. And we would want to label it because controller-runtime makes it exceedingly difficult to have a consistent experience when every object you want to GET is not in your cache, and the factoring today would require us to have two caches.
@stevekuznetsov: This pull request references Jira Issue OCPBUGS-15365, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/jira refresh
@stevekuznetsov: This pull request references Jira Issue OCPBUGS-15365, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/jira refresh
@stevekuznetsov: This pull request references Jira Issue OCPBUGS-15365, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/test e2e-aws-ovn
The E2E is failing with:

Which is definitely broken because of my changes, looking into it :)
If we're using a filtered LIST + WATCH for Secrets to do our work on CredentialRequests, we also need to open a second LIST + WATCH for the root credential, as this won't have the labels we add to our own secrets. Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Was sending a label selector instead of a field selector, oops. Should be good now.

/test e2e-aws-manual-oidc
/test e2e-aws-manual-oidc
/test e2e-aws-manual-oidc

/lgtm

/hold cancel

@abutcher will need an approval as well!

/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abutcher, stevekuznetsov. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
@stevekuznetsov: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@stevekuznetsov: Jira Issue OCPBUGS-15365: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-15365 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The status quo for this controller is to LIST + WATCH all Secrets on the cluster. This consumes
more resources than necessary on clusters where users put other data in Secrets themselves, as we
hold that data in our cache and never do anything with it. The reconcilers mainly need to react to
changes in Secrets created for CredentialRequests, which they control and can label, allowing us
to filter the LIST + WATCH down and hold the minimal set of data in memory. However, two caveats:

- admin credentials live in Secrets provided by the user, and we need to watch those, but we can't label them
- previous versions of this controller did not label the Secrets they created, so on upgrade a filtered watch stream would only surface Secrets labelled by the new version

We could solve the second issue with an interim release of this controller that labels all previous
Secrets, but does not restrict the watch stream.
Due to the way that controller-runtime closes over the client/cache concepts, the first issue is
difficult to solve: we'd need two sets of clients and caches, both for Secrets, and we'd have to
ensure that we use one for client access to Secrets we're creating or mutating and the other when
we're interacting with admin credentials. Not impossible to do, but tricky and complex to implement.
Until we undertake that effort, we apply a simplification to the space: only when AWS STS mode is
enabled will we filter the LIST + WATCH. This mode is brand new, so we can be reasonably sure that
there are no previous Secrets on the cluster, and we make the filtering best-effort in order to
check whether that assumption held. Furthermore, AWS STS mode only runs in clusters without admin
credentials, so if we apply the filter, we should not see failures downstream from clients that
expect to see those objects but can't.