-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/admission webhook improvements #577
Feature/admission webhook improvements #577
Conversation
Hey @kesselborn thanks for putting this together. Something like this was definitely on my roadmap. This will probably take some time to review, but I do have some questions. The thing that originally prevented me from doing something like this in the moose example is that it requires the operator itself to have RBAC permissions to make these configuration changes, which I don't believe is best practice. Does this require those elevated permissions? What are your thoughts on this? I had planned to implement some sort of tooling like a Cargo plugin to deploy the operator and make this configuration. |
yes: if you want to do it the way I added it to the moose example, you would need elevated permissions ... but as it is only an example, I think that's ok + you can see how you can use those functions. That's why I added the Type With this approach, you have total flexibility when you provide an operator:
... the quick win is that working code and documentation don't diverge. While I like cargo plugins, this would make Rust a dependency for people who want to use the operator and I think Rust is not (yet) very common for many people. As an Operator-Provider, I would like people to be able to use the operator without the need to use something other than the normal Kubernetes-Tooling |
I am currently thinking, whether it would make sense to extract the derive macro into it's own little crate. It's not really related to krator but could be used independently as well. I'll think about it and probably adjust the pull request accordingly |
4c5e48d
to
1a03183
Compare
This macro provides convinience methods for easily creating necessary Kubernetes resources for running a admission webhook: - a mutating webhook configuration - a service specification for the webhook - a secret containing a certificate and a private key for the webhook - add krator_derive tests to justfile
- in admission.rs: add new type WebhookResources that provides some convinience methods for webhook-resources-tuple (Service, Secret, MutatingWebhookConfiguration) that gets returned by the derive macro - add new method async fn admission_hook_tls(&self) -> anyhow::Result<krator::admission::AdmissionTLS> to the operator trait. This method has to be implemented when the admission-webhook feature is enabled and needs to return an AdmissionTLS struct containing a certificate and a private key for the webhook web server. This way, passing this information must not rely on files whose path is given via env vars. Furthermore, add convenience methods that allow to convert a Kubernetes-Secret to a AdmissionTLS struct, so saving the TLS information in a Kubernetes-Secret is an easy use case - add brief README to the moose example - adjust Moose example so that it uses the new admission-webhook macro from `krator_derive`; add functionality to automatically install the CRD and the necessary admission webhook resources on startup - add build and compilation steps from krator to `justfile` so breaking changes in theses crates make the build fail
1a03183
to
7c47872
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good, albeit very complex. IMO the best practice for this would be to emit manifests / kustomize + Makefile in a similar fashion to kubebuilder
, with auto-apply being an opt-in feature flag. I do think this gets us closer to that so I'm in favor of merging after some of my comments are addressed.
@thomastaylor312 I think you should take a look at this since it touches a lot of stuff in krator-derive
. I'm not sure how common it is to have so many dependencies in a derive crate but it seems like an antipattern.
@kflansburg sorry for the late reploy and thanks for taking the time to review this rather big PR -- very much appreciated. Before fixing the comments I would like to discuss again, if it makes sense to make this part of I think extracting this as an individual crate is probably cleaner? What do you think? Regarding pulling in a lot of additional dependencies via a proc macro: I was wondering this myself as well, and would be interested in @thomastaylor312 opinion here as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good. Thank you for adding tests and cleaning up the derive stuff.! Insofar as I understand, I don't think we need to import any dependencies
for the derive macros to work, but I could be wrong
/// the certificate from the given service and the service of the given service as configuration | ||
/// | ||
/// Example | ||
/// ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great example! The one problem (like in a previous comment) that this code cannot depend on krator
or we get a weird circular dependency issue. If we can get it so that isn't needed in dev-dependencies
, this will work fine. If it does need a dependency on krator
, we'll want to move this example there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really a problem in dev-dependencies
as well? I got problems when having circular dependencies in dependencies
but no complains in dev-dependencies
... but that's perhaps just a cheap works for me
;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are ok for now. I'll do some double checking later
@kesselborn Given that |
@kflansburg regarding the splitting out into a separate crate: the transition macro creates a impl krator::TransitionTo block ... doesn't that make it kind of bound to krator? |
I was under the assumption that it made sense to require dependencies of the code that gets produced by the macro so that they are available as transient dependencies
To clarify, it doesn't depend on krator in the Cargo dependency sense. I think my comment here explains that a bit more. However, I also think for now this works here, but as people start wanting to use it for things outside of krator, we could split it out to a new crate |
Adds three possible attributes for the admission-webhook feature: - secret - service - admission_webhook_config This way, users can opt-out if they don't need all functions. All features are documented and the resulting dependencies documented
@kflansburg / @thomastaylor312 : can we move this forward somehow? I know, it's quite a lot of comments and therefore some work to review. I resolved all comments apart from the ones, where I had a follow-up |
Apologies, I've been very busy this last week. I will try to take a final look at this soon. |
Hey so I'm trying to run the examples, but I'm running into an issue with the admission webhook being unreachable:
I'm trying to follow the guide as directed in the logs:
This is with a KinD cluster, in case that matters. I'll try with AKS and see if my results differ. |
Oddly I receive a different result with AKS:
|
Things work just fine without the admission webhook feature enabled.
|
hey @bacongobbler : do you run the moose example outside the cluster or do you deploy it into the cluster? If you run it outside the cluster, you need to do some "hacking", so that the example can process the mutation requests ... if you look further up in the log it should have respective instructions:
I tested locally using kind as well ... if you test it with AKS, you would have to make sure, that the Kubernets-Cluster can talk to your process |
so, just to give more context: these admission-webhooks are really just that: webhooks. So, in order to provide a webhook, the example app provides an https-endpoint where Kubernetes will post If you use AKS, you should deploy the moose example into the cluster setting the appropriate labels (they show up in the log as well), as this script will most probably not work. |
Got it. Thanks for the context. However, even with the script it still doesn't seem to be working. I'm not sure if the admission webhook is picking up on the new endpoint.
Even if it did work, I don't think hard-coding to 172.17.0.1 will work for WSL2 as the network bridge from KinD to the host network is different for that operating system. We documented this process in the HOWTO guide for Krustlet if that helps. For reference: the patch succeeded, and the endpoint exists.
|
As a note: I don't think my comments here are blocking this from going through. I'm just providing my (relatively naive) feedback here to inform you how someone new to the feature is experimenting. Doc/tooling fixes shouldn't block this from going through IMO. That being said, the concerns raised earlier from @kflansburg about requiring escalated privileges to register an admission webhook seems concerning. That seems like a potential blocker. If one were to find a security flaw in an operator's admission webhook, the attacker now would have escalated cluster privileges. That seems like a very concerning attack vector. How do other operator frameworks register admission webhooks? Are there ways to avoid the need for privilege escalation? If not, are there ways to mitigate an attack? |
yeah ... testing this outside the cluster is a work around to speed up development and there will never be a really good solution for it. I just tried to reach parity with the existing scripts in the current example code. The correct way to test this is to deploy the moose operator into a Kubernetes cluster. Regarding the security concerns: this is an example operator which is optimized for ease of use. You only need more privileges if you want to install necessary resources with the operator binary -- but that's not necessary. There are actually specific functions that will print out the manifests for you, so you can install the resources with Letting the operator produce these manifests is something I like as the operator and the documentation are not diverging -- you can always create the manifests that fit your operator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kesselborn We have discussed and believe that it is not a good idea to include this security vulnerability as part of the demo, as it will encourage bad behavior. If you can change it to generate the YAML for the user to manually apply, and update the README, then I will approve. If you really want to keep this functionality, then I would ask that this specific behavior is gated behind a specific feature with danger
in the name.
@bacongobbler Were you able to get the Endpoint working? Was the issue with the gateway being used and WSL?
@thomastaylor312 You has also requested changes. Can you indicate whether those have been resolved?
Thanks for the ping. In the end I was able to get it to work - just a quick misunderstanding of the networking setup on my end. |
9da7a4d
to
cd74b85
Compare
@kflansburg I do agree that this sets a bad precedent. I adjusted the example accordingly and added a comment to to the apply functions to note, that they should not be called in the operator. |
Providing some additional context as discussed offline... We (Taylor and I) are maintainers on several large open source projects. Helm being one of them. In Helm 2, our stance was to provide a permissive default configuration. This allowed first-time users to start experimenting with Helm and Kubernetes without having to dive headfirst into the security controls. Unfortunately, this permissive configuration could grant a user a broad range of permissions they weren't intended to have. And for the most part users ignored the security documentation, instead choosing to criticize the project for "not doing the right thing" by default. Learning from our mistakes, it's best to demonstrate "best practices" as the default rather than skewing towards ease-of-use. Especially when "ease-of-use" means skirting certain security controls. Another great example of providing insecure defaults can be shown in this article for MongoDB instances being hacked. The attackers hacked instances that did not have password-protected admin accounts. As in, they attacked MongoDB instances using the default settings. As the Moose demonstration is the de-facto example for krator, I would not be surprised that many users will copy the example and apply it to their projects. Hence the concern. |
All comments were addressed, leaving final review for @kflansburg
@bacongobbler thanks for the explanation -- I absolutely understand those points and do agree that this would have set a bad example :)
|
as discussed: this doesn't reflect best practice due to necessary permissions
cd74b85
to
ac54528
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for your patience!
This PR adds some conveniences around the admission webhook -- everytime I needed to implement a webhook, I was quite annoyed about the many moving parts that have to be coordinated.
This looks like a lot of code but there is quite a bit of documentation and test code included + the derive code is more or less trivial but annoying to implement for each project individually.
It's probably easiest to look at each commit indivually as they are split up between
krator
andkrator_derive
.This is still blocked until #574 (or the bug it that one tries to fix) is fixed, but perhaps people interested can already have a look at it.
/cc @kflansburg