Conversation
Force-pushed from ddfc224 to 302a840
Overall looks pretty good. Some comments inline.
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
charts/spire here isn't just spire. It's a whole bunch of k8s-specific stuff, so this chart should probably be called something like:
kube-spire-stack
Isn't a Helm chart k8s-specific anyway? I find it weird to name a Helm chart that is specifically for Kubernetes a "Kubernetes Helm chart".
I could rename it to spire-stack, not sure if that makes it better.
The reason for using subcharts is to make maintenance a lot easier. In the end there are still lots of interdependencies which would cause difficulties if we had them as root-level independent charts, e.g. first having to release a new spiffe-csi driver to be able to update its usage in the other charts. I would only move the charts to root level at the point where we want to support all those exotic use cases that can't work with this more opinionated basic chart, where we can and will add more configurability like what existed in the HPE chart.
No, a Helm deployment isn't necessarily k8s-specific. spire-server could be used for agents not running in Kubernetes at all. I do have intentions to use it that way (think dedicated extra spire-servers for HPC, edge computing, etc.). It's reasonable to use Helm to deploy a server stack (spire-server, OIDC stuff) without it being tightly integrated into the Kubernetes cluster hosting it. In some cases, multiple spire-server stack instances per single k8s instance. But a super easy to deploy, k8s-integrated stack is useful too. Hence the recommendation to have the moniker be something like kube-spire-stack, parallel in principle to https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack.
So something like this makes sense to me:

kube-spire-stack:
  csi-driver
  spire-agent
  spire-server-stack:
    spire-server
    oidc

Or maybe there is no reason to have a server-stack and we just combine all that stuff into the spire-server chart; not sure.
But deploying spire-server without the rest is desirable too, and shouldn't require downloading kube-spire-stack and picking it apart to get to the subchart.
Seeing the later comment, I think the spire-server-stack thing may not make sense after all; maybe just collapse that part into spire-server. The rest of the comment I think is still OK. More discussion in a following comment.
Are you using Helm charts to deploy Kubernetes resources on non-Kubernetes environments? Is that what you are saying?
I'm confused. Is the oidc service using the server or the agent socket?
Give me a few... I'm going to re-review that code. I think I missed something.
Ok. I think I found the source of my confusion. Commented in the other thread about it.
I still think breaking the chart up into individual charts with deps would help folks that don't want to deploy the whole thing (like me). It would make releases harder, so we could delay that breakup until later. I'd be willing to work on that after merge.
You can choose not to deploy it; no need to move the charts. As said, there are flags to disable it. And the subcharts exist because it is much easier to release that way.
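For illustration only, the kind of per-subchart toggle being referred to might look like the sketch below; the subchart names and `enabled` keys are assumptions, not necessarily the flags this chart actually exposes today.

```yaml
# Hypothetical values.yaml overrides; key names are assumptions for illustration.
spire-agent:
  enabled: false                     # skip the agent DaemonSet
spiffe-csi-driver:
  enabled: false                     # skip the CSI driver
spiffe-oidc-discovery-provider:
  enabled: false                     # skip the OIDC discovery provider
```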
Not all the k8s integration stuff is currently disableable. It also makes it difficult for those wanting to deploy just spire, and not the whole k8s spire stack. Lots of variables need to be customized.
Yes, it's easier to release one uberchart. No, I don't think it suits all the user needs we have. I'm willing to help with that part.
{{- end }}

{{- define "spiffe-csi-driver.image" -}}
The way this is working, it would be hard to override image.registry across all the charts in a release since there is no global.image.repository. k8s spire on an isolated network would be quite useful.
This is the downside of having separate charts; however, with core YAML capabilities this can be easily worked around:

spire-agent:
  image:
    repository: &repo ghcr.io/marcofranssen
spire-server:
  image:
    repository: *repo
spiffe-csi:
  image:
    repository: *repo
That can work, but most users of YAML don't know those tricks. There's an easier way, though.
With separate charts, Helm globals can be used for stuff like the repository server:
https://helm.sh/docs/chart_template_guide/subcharts_and_globals/#global-chart-values

Top-level values.yaml:

global:
  image:
    # set this to override the default repository used for all containers
    # repository: xxx

and in the templates (a la http://masterminds.github.io/sprig/defaults.html):

{{ default .Values.image.repository .Values.global.image.repository }}
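As a usage sketch under that assumption (the release name, chart path and registry host below are placeholders), one global override would then cover every subchart in the release:

```sh
# Hypothetical install; chart path and registry host are placeholders.
helm install spire ./charts/spire \
  --set global.image.repository=registry.internal.example.com/spiffe
```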
Will look into those globals. I don't want to give up on subcharts; that would complicate things a lot from a release-management perspective.
I have a project need, right now, to deploy a spire-server in k8s without the k8s integration. If we can get this merged as-is, I can then contribute bits to break it up into subcharts with deps; I'm OK working on that in a different PR. I need the capability though.
charts/spire/charts/spiffe-oidc-discovery-provider/templates/tests/test-connection.yaml
- name: spire-oidc-sockets
  hostPath:
    path: /run/spire/oidc-sockets
    type: DirectoryOrCreate
Could this use the CSI driver instead?
Maybe /run/spire/oidc-sockets should be configurable if not?
Haven't found an example, but that would be really awesome. @faisal-memon any idea if that would be feasible?
I'm not sure either. We can open that as an issue and investigate it after this merges.
Almost agree, but anytime hostPath is used when not really needed, it gives me pause.
Looking through the examples here:
https://github.com/spiffe/spire-tutorials/tree/main/k8s/oidc-vault/k8s
it looks like it must use the server socket, not the agent socket like I was hoping, so it probably wouldn't work with the CSI driver. That also implies the oidc-discovery-provider and spire-server pods must be co-scheduled on a node...
And, in the interest of reducing the need for more permissive Pod Security Admission control, I'd recommend we move spiffe-oidc-discovery-provider to be an optional sidecar in the spire-server pod, like they do in the above spire-tutorial and like is done with the k8s controller manager. That way, an emptyDir can be used to pass the socket.
Not if you integrate the OIDC provider via the agent socket, which we specifically chose to do for scaling reasons.
I misread this part. I was thinking this was a socket to talk to the spire-server, not the communication between nginx and the OIDC server. That being the case:
Shouldn't this just be an emptyDir?
Is there any reason to push it to the host? I could see it causing problems with multiple replicas of the same pod on the same host too, unless emptyDir is used.
If switched to emptyDir, I don't think it needs to be configurable anymore, and Pod Security Admission could probably support pod-security.kubernetes.io/restricted.
emptyDir causes problems with the CSI driver and restarts: https://github.com/spiffe/spiffe-csi#:~:text=An%20emptyDir%20volume%20cannot%20be%20used%20since%20the%20backing%20directory%20would%20be%20removed%20if%20the%20SPIFFE%20CSI%20Driver%20pod%20is%20restarted%2C%20invalidating%20the%20mount%20into%20workload%20containers.
Something like that may be the case here and why we have to use hostPath.
Not applicable here. The comment there:
> An emptyDir volume cannot be used since the backing directory would be removed if the SPIFFE CSI Driver pod is restarted, invalidating the mount into workload containers.
is talking about trying to share a directory between the CSI driver pod and the workload pod. You can't do that with emptyDir; emptyDir follows the lifecycle of the pod it's attached to.
In this case, the unix socket is shared strictly within the pod. I've never seen that not work between containers in the same pod. I've run pods like that in production routinely for years without issue.
I'd be very curious if you can find a reason why it's hostPath. I'm not aware of any. Otherwise, I think it was probably done by mistake.
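For illustration, a minimal sketch of the emptyDir variant being argued for here; the container names and images are placeholders, while the volume name and path follow the snippet above.

```yaml
# Hypothetical pod spec fragment: share the OIDC socket between two containers
# in the same pod via an emptyDir instead of a hostPath.
spec:
  containers:
    - name: spiffe-oidc-discovery-provider
      image: ghcr.io/spiffe/oidc-discovery-provider:nightly   # placeholder tag
      volumeMounts:
        - name: spire-oidc-sockets
          mountPath: /run/spire/oidc-sockets
    - name: nginx
      image: nginx:stable
      volumeMounts:
        - name: spire-oidc-sockets
          mountPath: /run/spire/oidc-sockets
          readOnly: true
  volumes:
    - name: spire-oidc-sockets
      emptyDir: {}   # lives and dies with the pod, which is fine for an intra-pod socket
```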
@@ -0,0 +1,96 @@
---
The CRDs are k8s-specific, not specific to a spire-server. Some users may want to install spire-server without k8s support, or even multiple instances. This might still work with this arrangement, but maybe the CRDs better belong with the k8s-specific charts, i.e. where they are actually used?
The CRDs are specifically for the spire-controller-manager, which is part of this Helm chart; that is why they need to be in this chart. spire-controller-manager can currently only be installed in the same pod as spire-server due to the server socket that needs to be accessed. Also see the spire-controller-manager migration guide.
See previous comment about wanting to deploy multiple spire-servers without the k8s integrations on all of them. Just one.
I really wonder how you would install a Helm chart, which is about Kubernetes resources, in a non-Kubernetes environment. During the last SPIRE community meeting we also agreed we would focus this chart on basic Kubernetes. Maybe it makes sense if you can explain how you use a Helm chart in a non-Kubernetes environment; I haven't seen that before. Is it even possible to deploy Kubernetes manifests on a non-Kubernetes system?
Not quite what I meant. SPIRE has a client/server model. You can use Helm to deploy a server into a k8s cluster and use it entirely from clients outside of the cluster. This is even beneficial to some projects.
I still don't get how that relates to this Helm chart. Those clients will not be installed via this Helm chart, right?
This Helm chart does specific things for scaling reasons, as explained earlier. If you don't want agents on every node you can still do that with a node selector, to only deploy them on the node that has the OIDC component.
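As a sketch of that node-selector approach (assuming the agent subchart exposes a `nodeSelector` value; the label below is a placeholder):

```yaml
# Hypothetical override: pin the agent DaemonSet to specifically labelled nodes.
spire-agent:
  nodeSelector:
    example.org/run-spire-agent: "true"   # label applied to the node(s) running the OIDC component
```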
Right. I want a way to deploy one or more instances of the spire-server chart, without any k8s integration bits, on a k8s cluster and expose it out of the cluster for use with agents installed on hosts that are not part of the k8s cluster.
It's basically the spire-server chart and the OIDC chart bits, but without the agent, the CSI driver and the k8s controller manager.
@@ -0,0 +1,59 @@
{{- if eq (.Values.k8sWorkloadRegistrar.enabled | toString) "true" }}
Ah, the earlier comment about CRDs is because of this. These are probably the missing files that should be up in the k8s-workload-registrar chart.
I'll remove the workload registrar, which will not get any updates anyway. That will simplify this chart so we don't have to manage the deprecated components anymore. We don't have any users of the chart yet, so we don't have to worry about backwards compatibility at the moment.
That chart shouldn't be in there; it went wrong with my cherry-picking. It is corrected now. The k8s workload registrar has to run in the same pod as spire-server for the same reasons as spire-controller-manager.
Oh, I didn't realize the spire-controller-manager and spire-server had to be in the same pod... hmm.
OK, how about this:

kube-spire-stack:
  crds
  csi
  agent
  spire-server

- spire-server's values default spireServer.k8sControllerManager.enabled to off
- kube-spire-stack's values set spireServer.k8sControllerManager.enabled to on

Then, when deploying spire-server without integrations, it does the right thing, and when you want the nicely-integrated-into-k8s option, it also does the right thing, and users by and large wouldn't even know that it exists.
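A minimal sketch of how that umbrella could be wired up with Helm dependency conditions; the chart names, versions and value keys are assumptions for illustration, not the actual layout of this PR.

```yaml
# Hypothetical kube-spire-stack/Chart.yaml
apiVersion: v2
name: kube-spire-stack
version: 0.1.0
dependencies:
  - name: spire-crds
    version: 0.1.0
    condition: spire-crds.enabled
  - name: spiffe-csi-driver
    version: 0.1.0
    condition: spiffe-csi-driver.enabled
  - name: spire-agent
    version: 0.1.0
    condition: spire-agent.enabled
  - name: spire-server
    version: 0.1.0
    # the umbrella's values.yaml would then default the controller manager
    # to enabled for the spire-server subchart
```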
I still prefer the CRDs in the chart that actually depends on them or can deal with them, which is common practice. If there is no controller manager it doesn't make sense to install CRDs.
CRDs slow down k8s API calls. We have clusters with Crossplane that have hundreds of CRDs; once you feel that pain you will understand you don't want to install CRDs just for the sake of installing them.
CRDs in Helm IMO are really, really broken in general. The behavior has changed over time, and there have been bugs related to them off and on over the years. I personally don't trust Helm to manage them, so I'm always looking for a way to turn that off.
I've seen lots of projects choose one of:
- a crds/ subdir
- templated CRDs with a disable value
- a Helm job hook that loads the CRDs
Each has benefits/drawbacks :/
Some drawbacks (not exhaustive):
- never gets upgrades; not controllable via values.
- can be messy with multiple charts thinking they own it; helm delete can be very dangerous; can accidentally downgrade.
- CRDs become an opaque thing, so hard to audit; complicated config; jobs can have issues such as slowing down pipelines, failing for other reasons, needing customization to specify resource needs on a prod cluster, etc.
If the spire-server chart optionally contains all the other k8s server stuff, then having CRDs in it kind of makes sense. But it also seems odd that if you install spire-server in non-k8s mode you still get CRDs by default. CRD deployment option 1 can't really handle that; options 2 and 3 could with appropriate values.
I don't really have a good path-forward suggestion, just a bunch of suboptimal paths. Others should weigh in.
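For reference, a minimal sketch of option 2 in the list above (templated CRDs behind a disable value); the `crds.install` flag and file path are hypothetical and the schema is trimmed to a placeholder.

```yaml
# templates/crds/clusterspiffeid.yaml (hypothetical path and flag)
{{- if .Values.crds.install }}
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusterspiffeids.spire.spiffe.io
spec:
  group: spire.spiffe.io
  names:
    kind: ClusterSPIFFEID
    plural: clusterspiffeids
    singular: clusterspiffeid
  scope: Cluster
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          # real schema omitted; placeholder only
          x-kubernetes-preserve-unknown-fields: true
{{- end }}
```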
Force-pushed from 302a840 to ffb6e46
Force-pushed from 6cd4502 to 1f9f770
To enable Projected Service Account Tokens on Docker for Mac/Windows run the following
command to SSH into the Docker Desktop K8s VM.
I think this is enabled by default now
Someone able to confirm this?
spiffeIDTemplate: spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}
podSelector: {}
# matchLabels:
#   spiffe.io/spiffe-id: "true"
namespaceSelector: {}
# matchLabels:
#   spiffe.io/spiffe-id: "true"
dnsNameTemplates: []
# - '{{ index .PodMeta.Labels "app.kubernetes.io/name" }}.{{ .PodMeta.Namespace }}.svc.cluster.local'
I like this layout. We can have multiple cluster SPIFFE IDs, so we may need to change this to an array later.
Force-pushed from 1444bb3 to 3b84f78
key: "" | ||
bundle: "" | ||
|
||
controllerManager: |
I'm tempted to write a controller-manager-like thing for syncing other things to a SPIRE. Should we update this to be:
k8sControllerManager:
to leave room for other things?
This is the spire-controller-manager; to not duplicate naming within the chart it is just controllerManager.
Yours could hopefully live outside of the spire-server chart as a separate chart. Hopefully this one as well, as soon as the spire-controller-manager no longer depends on the spire-server socket.
Force-pushed from 3b84f78 to 9f8c67e
Resolves #19
Force-pushed from f824da3 to 3322d37
Force-pushed from 3322d37 to cb3f940
Looks good to merge to me; we can take care of enhancements post-merge of this PR. Thanks @marcofranssen and everyone else that contributed.
This PR includes the entire philips-labs history. Very useful when using git blame to find the reasoning on why certain things are as they are in this chart.