
Allow to create shared index informers with custom key functions #114321

Closed
wants to merge 1 commit

Conversation

vincepri
Member

@vincepri vincepri commented Dec 6, 2022

Signed-off-by: Vince Prignano vince@prigna.com

What type of PR is this?

/kind feature

What this PR does / why we need it:

In the current state of things, a shared informer can only ever be created with the default KeyFunc, DeletionHandlingMetaNamespaceKeyFunc. To allow more customizability of the KeyFunc, this PR adds a new exported function that lets users pass in a custom KeyFunc, which is then wrapped with the DeletionHandling logic.

This allows better compatibility when using custom indexers and key-function logic. For example, if I want to partition the data in the store by region or availability zone, I can set up custom indexers, but the moment a key is retrieved through GetByKey the partitioning logic no longer works, because items are keyed by namespace/name.
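To make this concrete, below is a minimal sketch (not the code in this PR) of what a region-aware key function plus the deletion-handling wrapping could look like; the `example.com/region` label and the function names are hypothetical placeholders:

```go
package informerutil

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/tools/cache"
)

// regionAwareKeyFunc keys objects as <region>/<namespace>/<name>, so a cache
// partitioned by region stays consistent with GetByKey lookups.
// The "example.com/region" label is a hypothetical placeholder.
func regionAwareKeyFunc(obj interface{}) (string, error) {
	m, err := meta.Accessor(obj)
	if err != nil {
		return "", err
	}
	nsName, err := cache.MetaNamespaceKeyFunc(obj)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s/%s", m.GetLabels()["example.com/region"], nsName), nil
}

// withDeletionHandling wraps any KeyFunc with the same tombstone handling that
// DeletionHandlingMetaNamespaceKeyFunc applies to the default key function.
func withDeletionHandling(kf cache.KeyFunc) cache.KeyFunc {
	return func(obj interface{}) (string, error) {
		if d, ok := obj.(cache.DeletedFinalStateUnknown); ok {
			return d.Key, nil
		}
		return kf(obj)
	}
}
```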

Note: This PR is a starting point, I'll happily take other paths to achieve the same, and add tests once we agree on a design.

/sig api-machinery

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

This PR is needed in Controller Runtime to better support controllers spanning multiple clusters; see this proposal, which was accepted a while back.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 6, 2022
@k8s-ci-robot
Contributor

Please note that we're already in Test Freeze for the release-1.26 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.26.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Tue Dec 6 21:54:24 UTC 2022.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Dec 6, 2022
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 6, 2022
@vincepri
Member Author

vincepri commented Dec 7, 2022

cc @alvaroaleman

@lavalamp
Member

lavalamp commented Dec 8, 2022

I don't understand how this can help non-KCP clusters? I think people need to make an informer per cluster and not mix multiple clusters' objects up in the same cache(s).

@fedebongio
Contributor

/assign @lavalamp
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 8, 2022
@vincepri
Member Author

vincepri commented Dec 8, 2022

I don't understand how this can help non-KCP clusters? I think people need to make an informer per cluster and not mix multiple clusters' objects up in the same cache(s).

@lavalamp Apart from the KCP project, in SIG Cluster Lifecycle we've been talking about better supporting controllers that have to watch and list objects across every Cluster API cluster. For Cluster API specifically, we're looking to support ways to watch and list multiple clusters that a controller has access to, for example as scoped by RBAC permissions or namespace boundaries.

In turn, in Controller Runtime we'd have to have ways to know where the data is coming from. Note that this might be per-cluster, but the key and cache might also carry other partitioning information.

Those are the main use cases; apart from those, the change in and of itself seems reasonable to avoid forks or copy-pasting these utilities around, and there are many precedents of allowing more customization and passing in options. Is there a specific reason why we'd want to disallow customizing key functions in the informers?

cc @fabriziopandini @alvaroaleman

@lavalamp
Member

lavalamp commented Dec 8, 2022

better supporting controllers that have to watch and list objects across every Cluster API cluster

OK but why would such objects get mixed in the same cache such that a better sharding function would be useful?

It sounds like a major anti-pattern to me, any bug in the cache handling could cause data from one cluster to be sent to another cluster (ugly CVE potential!). I think there needs to be a more robust separation system.

@stevekuznetsov
Contributor

@lavalamp is there any novel mechanism for a CVE there on top of what already exists with cross-namespace items in the cache? FWIW, if informers expose GetByKey() (and they do), it seems weird to have to implicitly assume the key func, especially since pretty much everywhere else in the system you get to choose what key function(s) you would like to use.
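For context on that last point: the lower-level stores in client-go already take a caller-supplied key function; per the PR description, only the shared informer constructors fix it to DeletionHandlingMetaNamespaceKeyFunc. A brief illustration using standard client-go APIs:

```go
package storeexample

import "k8s.io/client-go/tools/cache"

// Plain stores and indexers let the caller choose the key function...
var store cache.Store = cache.NewStore(cache.MetaNamespaceKeyFunc)

var indexer cache.Indexer = cache.NewIndexer(cache.MetaNamespaceKeyFunc, cache.Indexers{
	cache.NamespaceIndex: cache.MetaNamespaceIndexFunc,
})

// ...whereas shared index informers key their internal store with the default
// DeletionHandlingMetaNamespaceKeyFunc and expose no way to change it.
```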

@vincepri
Member Author

vincepri commented Dec 8, 2022

@lavalamp Allowing different options to customize the internal machinery has been a normal pattern in the community. Can you clarify why such a change doesn't benefit the community given the use cases provided?

cc @fedebongio

@lavalamp
Member

lavalamp commented Dec 8, 2022

  • I don't see how stuffing multiple clusters' objects in the same cache would benefit the Cluster API use case -- it seems like that would at least break relisting, etc. I think to work on this you should first make a cohesive design for that problem.
  • I personally don't consider namespaces a good security boundary, but cluster boundaries are. And even if you don't agree with that, you'll probably agree that cluster boundary is more important than a namespace boundary.
  • "clarify why such a change doesn't benefit the community" -> I think it both doesn't help solve the use case you mentioned, and is a footgun. You can potentially convince me at least about the former with a design & alternatives.

@vincepri
Member Author

vincepri commented Dec 8, 2022

I don't see how stuffing multiple clusters' objects in the same cache would benefit the Cluster API use case -- it seems like that would at least break relisting, etc. I think to work on this you should first make a cohesive design for that problem.

If I have the ability from Cluster API to watch and list resources from multiple clusters, with a proxy of sorts through the cluster registry, I have to be able to understand where an object is coming from.

I personally don't consider namespaces a good security boundary, but cluster boundaries are. And even if you don't agree with that, you'll probably agree that cluster boundary is more important than a namespace boundary.

Sure! Although we have multiple needs for this in Cluster API today, and we'd like applications built on top of it to achieve the same in a more unified way through Controller Runtime. These applications run on what we call a management cluster and already have access to multiple clusters through Cluster API's kubeconfigs; there is no more and no less access. This is really just a way to improve our controllers watching multiple clusters, potentially build a way to proxy multiple watch/list calls, and streamline our fleet-management story.

"clarify why such a change doesn't benefit the community" -> I think it both doesn't help solve the use case you mentioned, and is a footgun. You can potentially convince me at least about the former with a design & alternatives.

Just to give some feedback here: again, if you set aside the use cases at hand, why wouldn't we want to allow customization of a single field in the shared index informer?

This is a backward compatible change that avoids copying code and duplicating machinery.

@vincepri
Member Author

vincepri commented Dec 8, 2022

Looking through the commits in this package, a similar but far more extensive change was made to add support for transform functions:

@lavalamp
Member

lavalamp commented Dec 8, 2022

If I have the ability from Cluster API to watch and list resources from multiple clusters, with a proxy of sorts through the cluster registry, I have to be able to understand where an object is coming from.

I agree with this but I don't see how this PR is a step on the way to solving this problem.

This is really just a way to improve our controllers watching multiple clusters

I don't understand how this change helps. You have some knowledge I don't and are talking as if this is an established fact; I have been asking for an explanation but I keep hearing this repeated as an assertion without additional details.

Maybe I am not being clear; "I don't understand X" is literal and means "therefore please explain it to me, link documents, etc"

Just to give some feedback here: again, if you set aside the use cases at hand, why wouldn't we want to allow customization of a single field in the shared index informer?

I am not at all opposed to customization in the abstract. The problem is, every expansion of the surface area increases the burden on people maintaining this (testing, bugs, etc), and on people using it. So, I am obligated to make sure this ongoing cost has a corresponding benefit, to a reasonably large fraction of users.

  • #107507 stated a very clear justification (save a bunch of memory) and it was very clear how the change was necessary to achieve that goal (need a place to clear a field).
  • #104300 lacks a clear justification like the one I'm asking for and I would have said no until one was added, but I wasn't involved and presumably the reviewer knew what it would be used for

@vincepri
Member Author

vincepri commented Dec 8, 2022

I am not at all opposed to customization in the abstract. The problem is, every expansion of the surface area increases the burden on people maintaining this (testing, bugs, etc), and on people using it. So, I am obligated to make sure this ongoing cost has a corresponding benefit, to a reasonably large fraction of users.

Perfectly fine thought, we had similar issues in SIG Cluster Lifecycle. As I mentioned to @fedebongio, and as we discussed back in June/July, I'd be more than happy to step up and help in this and other areas, and mentor more people to help. If the maintenance burden is the key point, I'm happy to be added as a reviewer and eventually maintainer in this and every other area that need help. Does that help with your concerns?

@lavalamp
Member

lavalamp commented Dec 8, 2022

Maintenance help is of course greatly appreciated, but my primary concern is whether this actually helps with cluster API's stated goals or not; please give more detail. I am willing to pay the cost for features that will be broadly used and helpful!

@vincepri
Member Author

vincepri commented Dec 8, 2022

I agree with this but I don't see how this PR is a step on the way to solving this problem.
I don't understand how this change helps. You have some knowledge I don't and are talking as if this is an established fact; I have been asking for an explanation but I keep hearing this repeated as an assertion without additional details.

Preface

  • In Controller Runtime we use SharedIndexInformer; it currently partitions the data by namespace only. I can add new Indexers when creating a SharedIndexInformer, but I cannot customize the key function directly, which is how we got here.
  • Cluster API uses Controller Runtime, and we encourage every controller author to also do so.
  • Cluster API has a concept of a Management Cluster, which is a cluster that hosts the Cluster API controllers, which then create Workload Cluster(s); this can be seen as a simple tree with a root and many child leaves.
  • The Management Cluster has access to every Workload Cluster, and watches certain objects in every child.

Details

  • We're exploring the option of creating controllers that interface with many clusters through a proxy. Specifically, controllers should be able to read data from many clusters, and write back to a single one whenever needed. For example, if I want to build a management-cluster based IPAM controller, I want to have the ability to expose certain CRDs in the workload clusters to request an IP pool, and the controller running in the management cluster should act upon those.
  • In the current state of the world, to achieve something similar to the above I'd be running N caches and controllers, one for each cluster. Each controller would then have to live in a different manager (and pod) altogether; the caches wouldn't be shared and there would be N caches with duplicated data. Effectively, like the other PR linked above, sharing a cache between clusters, controllers, and managers saves memory.
  • In this extensibility model we're considering a single shared cache across many clusters, with custom partitioning of the cache and informers. Events coming from a specific cluster would enter the Controller Runtime reconciliation loop with information about where they came from; the reconciler would then act on those objects and know which cluster to send write requests to, for example to update Status or Condition fields (see the key-parsing sketch after this list).
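As a rough illustration of the reconciler side of that last point, here is a sketch of splitting a cluster-prefixed cache key back into its parts; the <cluster>/<namespace>/<name> layout is a hypothetical convention for this sketch, not something this PR defines:

```go
package clusterkeys

import (
	"fmt"
	"strings"
)

// splitClusterKey splits a cluster-prefixed cache key back into its parts so a
// reconciler knows which cluster an object came from and where to write back.
// The <cluster>/<namespace>/<name> layout is a hypothetical convention.
func splitClusterKey(key string) (cluster, namespace, name string, err error) {
	parts := strings.SplitN(key, "/", 3)
	if len(parts) != 3 {
		return "", "", "", fmt.Errorf("unexpected key format: %q", key)
	}
	return parts[0], parts[1], parts[2], nil
}
```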

@lavalamp
Member

lavalamp commented Dec 8, 2022

the caches wouldn't be shared and there would be N caches with duplicated data.

The data is not actually duplicated; each leaf cluster's objects differ at least in UID.

The change you're asking for won't help the data get de-duplicated, it will help you put it in one cache but it won't reduce memory usage as there will still be N copies of each object (because each object is actually different).

If you really intend to hook reflectors/informers from different clusters up to the same cache, I am pretty sure this is not the only code that would need to change to get that to work.

The words you're using make the design sound very hypothetical still, I think generally we ask for things to go in the other order; demonstration / proof of concept before upstreaming. Does that seem reasonable? I don't think we'd want to put this in and then have Cluster API discover a better technique and not use it after all.

@vincepri
Member Author

vincepri commented Dec 8, 2022

The change you're asking for won't help the data get de-duplicated, it will help you put it in one cache but it won't reduce memory usage as there will still be N copies of each object (because each object is actually different).

If we have a controller watching a multitude of clusters, whose data then flows into a single Controller Runtime cache/reconciler/client, we have a single manager running multiple controllers that achieve different goals, all reading the GVKs in question from a shared cache.

In the current world, to achieve a similar goal, I'd need to deploy a single manager (in a pod) for each Workload cluster, and then use hacky ways to also read data from the Management cluster due to the lack of multi-cluster support in Controller Runtime.

If you really intend to hook reflectors/informers from different clusters up to the same cache, I am pretty sure this is not the only code that would need to change to get that to work.

Probably not, but again this is a valid use case that we're exploring and one that would simplify fleet management a lot. You'll have my full support in both reviewing and maintaining any code I contribute going forward, and help with maintenance tasks as well.

The words you're using make the design sound very hypothetical still, I think generally we ask for things to go in the other order; demonstration / proof of concept before upstreaming. Does that seem reasonable? I don't think we'd want to put this in and then have Cluster API discover a better technique and not use it after all.

I'm confused about what else you'd need at this point for this specific change. Regardless of Cluster API, using shared informers with a custom key function can be beneficial for other projects in the future as well; Controller Runtime can use this now, though.

Throughout this PR I've provided plenty of reasons and explanations why such a change would be beneficial, exploratory or not, like I mentioned in the PR description:

This allows better compatibility when using custom indexers and key-function logic. For example, if I want to partition the data in the store by region or availability zone, I can set up custom indexers, but the moment a key is retrieved through GetByKey the partitioning logic no longer works, because items are keyed by namespace/name.

In the current state of things, only half of the functionality is customizable; the other half isn't.


I'd like to point out again that other PRs linked above had similar reasons and more extensive changes than the one proposed, yet they didn't have to go into as many details as are being asked for in this case; no demonstration, PoC, data, or extensive justification was demanded.

@deads2k
Contributor

deads2k commented Dec 9, 2022

We're exploring the option of creating controllers that interface with many clusters through a proxy.

As has come up multiple times over the years from sig-multi-cluster, sig-apimachinery, and sig-arch, the k/k repo (and by extension k/client-go) is focused on a single Kubernetes cluster. Manipulating core primitives to support multi-cluster use cases is an anti-goal and adds complexity to core. If new features are not useful for the mission of k/k, they do not belong in this library and can be built on top of our existing primitives in another library.

The idea submitted in this PR appears to add risk on the interface we have today, without providing a clear benefit. If another library wishes to build on top, they can certainly do so, but I don't think embedding here is the appropriate level.

@vincepri
Member Author

vincepri commented Dec 9, 2022

Manipulating core primitives to support multi-cluster use cases is an anti-goal and adds complexity to core.

We're not adding or manipulating core primitives to add multi-cluster support within this library. That was the underlying use case that was asked for, but it's beside the point of the customizability of an exposed library, which in this case is a cache for clients.

If new features are not useful for the mission of k/k, they do not belong in this library and can be built on top of our existing primitives in another library.

Agreed! And Controller Runtime and Cluster API are ever more popular k/k sub-projects that are focused on creating controllers on one side, and fleets of clusters on the other. Again, this change proposed doesn't affect k/k in any way and I'll happily provide support for the code contributed.

The idea submitted in this PR appears to add risk on the interface we have today, without providing a clear benefit.

The benefit has been clarified over and over throughout this PR's comments.

If another library wishes to build on top, they can certainly do so, but I don't think embedding here is the appropriate level.

Are you suggesting that we as a community should prefer forks of this and other parts of Kubernetes and the inherent division that comes along with it?

And to be very clear: Controller Runtime, which builds on top of client-go, cannot build anything on top to support multi-cluster use cases without the option proposed in this PR. We're not looking to add support for multi-cluster within this library, but adding opt-in options is a very well established pattern in this entire ecosystem; build the 80% use cases and make the 20% possible. That's what an inclusive community looks like.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vincepri
Once this PR has been reviewed and has the lgtm label, please ask for approval from lavalamp by writing /assign @lavalamp in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 10, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2022
@vincepri vincepri force-pushed the shared-informer-keyfn branch 2 times, most recently from 72fb76e to cc6d113 on December 12, 2022 00:59
Signed-off-by: Vince Prignano <vince@prigna.com>
@liggitt
Member

liggitt commented Dec 13, 2022

this change proposed doesn't affect k/k in any way and I'll happily provide support for the code contributed.

Things consuming informers as implemented by client-go can currently assume name/namespace are sufficient to distinguish one object in the cache from another. Making the key function indeterminate breaks that assumption. It becomes possible to pass an informer into existing controllers which will make their handling of objects retrieved from the cache incorrect. That doesn't seem like a neutral change.

We're not looking to add support for multi-cluster within this library, but adding opt-in options is a very well established pattern in this entire ecosystem; build the 80% use cases and make the 20% possible.

Before adding flexibility/customization, it's reasonable to ask what the benefits and costs are. The costs appear to be less certainty/clarity around what makes an object in the cache unique, and possibility for resulting bugs/security issues in controllers. If the primary benefit is a newly enabled use case, evaluating whether that use case is one we want to encourage is reasonable.

I don't think it's a good idea to expand this implementation to be able to emit objects that aren't sufficiently identified by name/namespace, but it's worth discussing. If that use case was desired, I'd look for an approach that protected against confused controller bugs, and also probably reconsidered some of the Store interface methods (like Replace(list []interface{}, resourceVersion string), which is pretty weird/inefficient for a cache containing objects across clusters being populated by independent list/watches).
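For readers following along, the Store interface being discussed looks roughly like this, abridged from k8s.io/client-go/tools/cache around the time of this PR:

```go
// Abridged from the cache.Store interface in k8s.io/client-go/tools/cache.
type Store interface {
	Add(obj interface{}) error
	Update(obj interface{}) error
	Delete(obj interface{}) error
	List() []interface{}
	ListKeys() []string
	Get(obj interface{}) (item interface{}, exists bool, err error)
	GetByKey(key string) (item interface{}, exists bool, err error)

	// Replace deletes the contents of the store and uses the given list instead;
	// this is the method liggitt calls out as awkward for a multi-cluster cache.
	Replace([]interface{}, string) error
	Resync() error
}
```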

@vincepri
Member Author

vincepri commented Dec 14, 2022

Things consuming informers as implemented by client-go can currently assume name/namespace are sufficient to distinguish one object in the cache from another.

That makes sense and it's a good point; I'm wondering, though, whether the Transform function poses similar risks?

Making the key function indeterminate breaks that assumption.

Just a small clarification here: the defaults would stay the same; it'd be up to implementers to potentially override the behavior of informers used in specific clients, if they want to.

As mentioned in the PR description, I'm happy to consider different paths/approaches and ultimately come together on an outcome that helps the community at large. Some other ideas could be:

  • Add a different constructor altogether that supports custom KeyFunc.
  • Add a new interface (e.g. KeyedSharedIndexInformer, or similar) that expands the SharedIndexInformer interface with another method for implementations that support custom key functions.

These would guarantee that existing implementations remain untouched in terms of key-function internals while still exposing the ability to customize it, should subprojects find that necessary. A rough sketch of the second option follows. Does that sound like a good path forward?
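A hypothetical sketch of that second alternative (neither the interface nor the constructor below exists in client-go; the names and signature are illustrative only):

```go
package keyedinformer

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/cache"
)

// KeyedSharedIndexInformer is a hypothetical extension of SharedIndexInformer
// for implementations that support a caller-supplied key function.
type KeyedSharedIndexInformer interface {
	cache.SharedIndexInformer
	// KeyFunc returns the key function used by the informer's store.
	KeyFunc() cache.KeyFunc
}

// NewKeyedSharedIndexInformer mirrors NewSharedIndexInformer but accepts a
// custom KeyFunc, which an implementation would wrap with deletion handling.
// This constructor does not exist in client-go; the signature is a sketch.
func NewKeyedSharedIndexInformer(
	lw cache.ListerWatcher,
	exampleObject runtime.Object,
	defaultEventHandlerResyncPeriod time.Duration,
	indexers cache.Indexers,
	keyFunc cache.KeyFunc,
) KeyedSharedIndexInformer {
	panic("sketch only: implementation intentionally omitted")
}
```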

I don't think it's a good idea to expand this implementation to be able to emit objects that aren't sufficiently identified by name/namespace, but it's worth discussing. If that use case was desired, I'd look for an approach that protected against confused controller bugs, and also probably reconsidered some of the Store interface methods (like Replace(list []interface{}, resourceVersion string), which is pretty weird/inefficient for a cache containing objects across clusters being populated by independent list/watches).

A possible approach we can consider is the complete decoupling of Controller Runtime from client-go, although to achieve that we'd still need changes within this library to avoid duplicating the entire client-go package or repository. What I'm personally trying to avoid is creating division and confusion within Kubernetes and in how people build controllers.

Revisiting the Store interface and potentially exposing other internals of the shared informers would lead to more code sharing; is that something we can explore as part of this effort?

@vincepri
Member Author

/close

@k8s-ci-robot
Contributor

@vincepri: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vincepri vincepri deleted the shared-informer-keyfn branch December 14, 2022 22:30