This repository has been archived by the owner on Mar 22, 2024. It is now read-only.

Base chart as starting point #14

Merged
merged 97 commits into main from base-chart
Feb 20, 2023

Conversation

Contributor

@marcofranssen marcofranssen commented Feb 13, 2023

This PR includes the entire philips-labs history, which is very useful when using git blame to find the reasoning behind why certain things are the way they are in this chart.

Contributor

@kfox1111 kfox1111 left a comment

Overall looks pretty good. Some comments inline.

charts/spire/charts/k8s-workload-registrar/README.md (outdated)
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
Contributor

charts/spire here isn't just spire; it's a whole bunch of k8s-specific stuff. So this chart should probably be called something like:
kube-spire-stack

Contributor Author

Isn't a Helm chart k8s-specific anyhow? I find it weird to name a Helm chart that is specifically for Kubernetes a kubernetes helm-chart.

I could rename it to spire-stack; not sure if that makes it better.

The reason it uses subcharts is to make maintenance a lot easier. In the end there are still lots of interdependencies which cause difficulties if we had them as root-level independent charts, e.g. having to first release a new spiffe-csi driver to be able to update its usage in the other charts. I would only move the charts to root level at the point where we want to support all those exotic use cases that can't work with this more opinionated basic chart, to which we can and will add more configurability like existed in the HPE chart.

Contributor

No, a Helm deployment isn't k8s specific. spire-server could be used for agents not running in Kubernetes at all; I do intend to use it that way (think dedicated extra spire-servers for HPC, edge computing, etc.). It's reasonable to use Helm to deploy a server stack (spire-server, oidc stuff) without it being tightly integrated into the Kubernetes cluster hosting it; in some cases, multiple spire-server stack instances per single k8s cluster. But a super-easy-to-deploy, k8s-integrated stack is useful too. Hence the recommendation to give it a moniker like kube-spire-stack, parallel in principle to https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack.

So something like this makes sense to me:

kube-spire-stack:
  csi-driver
  spire-agent
  spire-server-stack:
    spire-server
    oidc

Or maybe there is no reason to have a server-stack and we just combine all that stuff into the spire-server chart; not sure.

But deploying spire-server without the rest is desirable too, and shouldn't require downloading kube-spire-stack and picking it apart to get to the subchart.
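
For illustration only, a minimal sketch of how such an umbrella chart could declare its pieces as conditional dependencies; chart names, versions, and the enabled flags are hypothetical and not what this PR implements:

apiVersion: v2
name: kube-spire-stack
version: 0.1.0
dependencies:
  # each piece is a subchart vendored under charts/ and can be switched
  # off per release via its enabled flag
  - name: spiffe-csi-driver
    version: 0.1.0
    condition: spiffe-csi-driver.enabled
  - name: spire-agent
    version: 0.1.0
    condition: spire-agent.enabled
  - name: spire-server
    version: 0.1.0
    condition: spire-server.enabled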

Contributor

Seeing a later comment, I think the spire-server-stack thing may not make sense after all; maybe just collapse that part into spire-server. The rest of the comment I think is still OK. More discussion in a following comment.

Contributor Author

Are you using Helm charts to deploy Kubernetes resources on non-Kubernetes environments? Is that what you are saying?

Contributor

I'm confused. Is the oidc service using the server or the agent socket?

Contributor

Give me a few... I'm going to re-review that code. I think I missed something.

Contributor

OK. I think I found the source of my confusion; I commented in the other thread about it.

I still think breaking the chart up into individual charts with dependencies would help folks that don't want to deploy the whole thing (like me). It would make releasing harder, so we could delay that breakup until later. I'd be willing to work on that after the merge.

Contributor Author

You can choose not to deploy it; no need to move the charts. As said, there are flags to disable it. And they are subcharts because that makes releasing much easier.

Contributor

Not all of the k8s integration stuff can currently be disabled. Also, it makes it difficult for those wanting to deploy just spire, and not the whole k8s spire stack, to do so; lots of variables need to be customized.

Yes, it's easier to release one uberchart. No, I don't think it suits all the user needs we have. I'm willing to help with that part.

{{- end }}


{{- define "spiffe-csi-driver.image" -}}
Contributor

The way this is working, it would be hard to override image.registry across all the charts in a release, since there is no global.image.repository. Running k8s spire on an isolated network would be quite useful.

Contributor Author

@marcofranssen marcofranssen Feb 14, 2023

This is the downside of having separate charts; however, with core YAML capabilities (anchors and aliases) this can easily be worked around.

spire-agent:
  image:
    repository: &repo ghcr.io/marcofranssen

spire-server:
  image:
    repository: *repo

spiffe-csi:
  image:
    repository: *repo

Contributor

@kfox1111 kfox1111 Feb 14, 2023

That can work, but most users of YAML don't know those tricks. There's an easier way, though.

With separate charts, Helm globals can be used for things like the repository server:
https://helm.sh/docs/chart_template_guide/subcharts_and_globals/#global-chart-values

In the top-level values.yaml:

global:
  image:
    # set this to override the default repository used for all containers
    # repository: xxx

and in the templates (a la http://masterminds.github.io/sprig/defaults.html):

{{ default .Values.image.repository .Values.global.image.repository }}
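
For illustration, a minimal sketch of such a fallback as a subchart helper that tolerates the global values being absent; the helper name and value paths are hypothetical:

{{/* Prefer a release-wide global.image.registry when set, otherwise
     fall back to this chart's own image.registry. */}}
{{- define "spire-agent.image-registry" -}}
{{- $global := .Values.global | default dict -}}
{{- $globalImage := $global.image | default dict -}}
{{- default .Values.image.registry $globalImage.registry -}}
{{- end -}}

A template could then build the full image reference from that helper plus the chart's own image.repository and image.tag values.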

Contributor Author

Will look into those globals. I don't want to give up on subcharts, though; that would complicate things a lot from a release-management perspective.

Contributor

I have a project need, right now, to deploy a spire-server in k8s without k8s integration. If we can get this merged as is, I can then contribute bits to break it up into subcharts with dependencies; I'm OK working on that in a different PR. I need the capability though.

- name: spire-oidc-sockets
  hostPath:
    path: /run/spire/oidc-sockets
    type: DirectoryOrCreate
Contributor

Could this use the csi driver instead?

If not, maybe /run/spire/oidc-sockets should be configurable?

Contributor Author

I haven't found an example, but that would be really awesome. @faisal-memon any idea if that would be feasible?

Contributor

I'm not sure either. We can open an issue and investigate it after this merges.

Contributor

Almost agree, but anytime hostPath is used when not really needed, it gives me pause.

Looking through the examples here:
https://github.com/spiffe/spire-tutorials/tree/main/k8s/oidc-vault/k8s

it looks like it must use the server socket, not the agent socket like I was hoping, so it probably wouldn't work with the csi driver. That also implies the oidc-discovery-provider and spire-server pods must be co-scheduled on a node...

And, in the interest of reducing the need for more permissive Pod Security Admission control, I'd recommend we make the spiffe-oidc-discovery-provider an optional sidecar of the spire-server pod, like they do in the above spire-tutorial and like is done with the k8s controller manager. That way, an emptyDir can be used to pass the socket.

Contributor Author

Not if you integrate the oidc provider via the agent socket, which we specifically chose to do for scaling reasons.

Contributor

I misread this part. I was thinking this was a socket to talk to the spire-server, not the communication between nginx and the oidc server. That being the case, shouldn't this just be an emptyDir?

Is there any reason to push it to the host? I could see it causing problems with multiple replicas of the same pod on the same host too, unless it's an emptyDir.

If switched to emptyDir, I don't think it needs to be configurable anymore, and Pod Security Admission could probably support pod-security.kubernetes.io/restricted.

Contributor

Not applicable here. The comment there:

"An emptyDir volume cannot be used since the backing directory would be removed if the SPIFFE CSI Driver pod is restarted, invalidating the mount into workload containers."

is talking about trying to share a directory between the csi-driver pod and the workload pod; you can't do that with emptyDir, since emptyDir follows the lifecycle of the pod it's attached to.

In this case, the unix socket is shared strictly within the pod. I've never seen that not work between containers in the same pod; I've run pods like that in production routinely for years without issue.

I'd be very curious if you can find a reason why it's hostPath. I'm not aware of any. Otherwise, I think it was probably done by mistake.
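
For illustration, a minimal sketch of what that could look like; the container names, image tags, and paths are hypothetical and not taken from this chart:

# hypothetical fragment of the oidc-discovery-provider pod spec: the
# provider and its nginx front end share the UNIX socket through an
# emptyDir scoped to the pod, instead of a hostPath on the node
spec:
  containers:
    - name: spiffe-oidc-discovery-provider
      image: ghcr.io/spiffe/oidc-discovery-provider:1.5.4   # tag illustrative
      volumeMounts:
        - name: spire-oidc-sockets
          mountPath: /run/spire/oidc-sockets
    - name: nginx
      image: nginx:stable                                   # tag illustrative
      volumeMounts:
        - name: spire-oidc-sockets
          mountPath: /run/spire/oidc-sockets
          readOnly: true
  volumes:
    - name: spire-oidc-sockets
      emptyDir: {}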

@@ -0,0 +1,96 @@
---
Contributor

The CRDs are k8s specific, not specific to a spire-server. Some users may want to install spire-server without k8s support, or even multiple instances. This might still work with this arrangement, but maybe they belong better with the k8s-specific charts, i.e. where these CRDs are actually used?

Contributor Author

The CRDs are specifically for the spire-controller-manager, which is part of this Helm chart; that is why they need to be in this chart. The spire-controller-manager can currently only be installed in the same pod as spire-server, due to the server socket that needs to be accessed. Also see the spire-controller-manager migration guide.

Contributor

See previous comment about wanting to deploy multiple spire-servers without the k8s integrations on all of them. Just one.

Contributor Author

@marcofranssen marcofranssen Feb 14, 2023

I really wonder how you install a Helm chart, which is all about Kubernetes resources, on a non-Kubernetes environment? During the last SPIRE community meeting we also agreed to focus this chart on basic Kubernetes. Maybe it would make sense for you to explain how you use a Helm chart on a non-Kubernetes environment; I haven't seen that before. Is it even possible to deploy Kubernetes manifests on a non-Kubernetes system?

Contributor

Not quite what I meant. SPIRE has a client/server model. You can use Helm to deploy a server into a k8s cluster and use it entirely from clients outside of the cluster. This is even beneficial to some projects.

Contributor Author

I still don't get how that relates to this Helm chart.

Those clients will not be installed via this Helm chart, right?

This Helm chart does specific things for scaling reasons, as explained earlier. If you don't want agents on every node you can still achieve that with a node selector, e.g. to only deploy the agent on the node that has the oidc component.
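
For illustration, a minimal values override along those lines, assuming the agent subchart exposes a nodeSelector value; the label key is hypothetical:

spire-agent:
  # schedule agents only on nodes carrying this (hypothetical) label
  # instead of every node in the cluster
  nodeSelector:
    example.org/run-spire-agent: "true"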

Contributor

Right. I want a way to deploy one or more instances of the spire-server chart, without any k8s integration bits, on a k8s cluster and expose it out of the cluster for use with agents installed on hosts that are not part of the k8s cluster.

It's basically the spire-server chart and the oidc chart bits, but without the agent, the csi driver, and the k8s controller manager.

@@ -0,0 +1,59 @@
{{- if eq (.Values.k8sWorkloadRegistrar.enabled | toString) "true" }}
Contributor

Ah, my earlier comment about CRDs is because of this. These are probably the missing files that should be up in the k8s-workload-registrar chart.

Contributor Author

I'll remove the workload registrar, which will not get any updates anyway. That will simplify this chart so we don't have to manage the deprecated component anymore. We don't have any users of the chart yet, so we don't have to worry about backwards compatibility at the moment.

Contributor Author

That chart shouldn't be in there; something went wrong with my cherry-picking. It is corrected now. The k8s-workload-registrar has to run in the same pod as spire-server for the same reasons as the spire-controller-manager.

Contributor

Oh, I didn't realize the spire-controller-manager and spire-server had to be in the same pod... hmm.

OK, how about this:

kube-spire-stack:
  crds
  csi
  agent
  spire-server

The spire-server chart's values would default spireServer.k8sControllerManager.enabled to off; the kube-spire-stack values would set spireServer.k8sControllerManager.enabled to on by default.

Then, when deploying spire-server without integrations, it would do the right thing, and when you want a nicely k8s-integrated option, it also does the right thing, and users by and large wouldn't even know that it exists.
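
For illustration, a minimal sketch of how those defaults could be layered; the key names follow the proposal above, spelled from each chart's own values root, and are hypothetical:

# spire-server/values.yaml -- standalone install: k8s integration off by default
k8sControllerManager:
  enabled: false

# kube-spire-stack/values.yaml -- the umbrella chart turns it back on
spire-server:
  k8sControllerManager:
    enabled: true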

Contributor Author

I still prefer the CRDs in the chart that actually depends on them or can deal with them, which is common practice. If there is no controller manager, it doesn't make sense to install CRDs.

CRDs slow down k8s API calls. We have clusters with Crossplane that have hundreds of CRDs; once you feel that pain you will understand that you don't want to install CRDs just for the sake of installing them.

Contributor

CRDs in helm IMO are really, really broken in general. They have changed behavior over time, and have had bugs related to them off and on over the years. I personally don't trust helm to manage them, so I am always looking for a way to turn it off.
I've seen lots of projects choose one of:

  1. crds subdir
  2. templated with a disable value
  3. helm job hook that loads the crds

Each has benefits/drawbacks :/
Some drawbacks (not exhaustive):

  1. doesn't ever get upgrades; not controllable via values.
  2. can be messy with multiple charts thinking they own it; helm delete can be very dangerous; can accidentally downgrade.
  3. crds become an opaque thing, so hard to audit; complicated config; jobs can have issues such as slowing down pipelines, failing for other reasons, needing customization to specify resource needs on a prod cluster, etc.

If the spire-server chart optionally carries all the other k8s server stuff, then having crds in it kind of makes sense, but it also seems odd that installing spire-server in non-k8s mode still gets you crds by default. CRD deployment type 1 can't really handle that; 2 and 3 could with appropriate values.

I don't really have a good path-forward suggestion, just a bunch of sub-optimal paths. Others should weigh in.
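
For illustration, a minimal sketch of option 2 from the list above: a CRD rendered as a template behind a value. The installCRDs value name and file path are hypothetical, and the schema is abbreviated:

# templates/crds/clusterspiffeid.yaml (hypothetical path)
{{- if .Values.installCRDs }}
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusterspiffeids.spire.spiffe.io
  annotations:
    # keep the CRD (and its custom resources) if the release is deleted
    helm.sh/resource-policy: keep
spec:
  group: spire.spiffe.io
  scope: Cluster
  names:
    kind: ClusterSPIFFEID
    listKind: ClusterSPIFFEIDList
    plural: clusterspiffeids
    singular: clusterspiffeid
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
{{- end }}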

Comment on lines +15 to +16
To enable Projected Service Account Tokens on Docker for Mac/Windows run the following
command to SSH into the Docker Desktop K8s VM.
Contributor

I think this is enabled by default now.

Contributor Author

Is someone able to confirm this?

Comment on lines +153 to +161
spiffeIDTemplate: spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}
podSelector: {}
# matchLabels:
#   spiffe.io/spiffe-id: "true"
namespaceSelector: {}
# matchLabels:
#   spiffe.io/spiffe-id: "true"
dnsNameTemplates: []
# - '{{ index .PodMeta.Labels "app.kubernetes.io/name" }}.{{ .PodMeta.Namespace }}.svc.cluster.local'
Contributor

I like this layout. We can have multiple ClusterSPIFFEIDs, so we may need to change this to an array later.
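
For illustration, a minimal sketch of what an array-based values layout could look like later; the structure and entries are hypothetical:

clusterSPIFFEIDs:
  # one entry per ClusterSPIFFEID custom resource to create
  - name: default
    spiffeIDTemplate: spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}
    podSelector: {}
    namespaceSelector: {}
    dnsNameTemplates: []
  - name: labelled-workloads
    spiffeIDTemplate: spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}
    podSelector:
      matchLabels:
        spiffe.io/spiffe-id: "true"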

key: ""
bundle: ""

controllerManager:
Contributor

I'm tempted to write a controller-manager-like thing for syncing other things to a spire. Should we update this to be:
k8sControllerManager:

to leave room for other things?

Contributor Author

This is the spire-controller-manager; to not duplicate naming within the chart it is called controllerManager.

Yours could hopefully live outside of the spire-server chart as a separate chart. Hopefully this one as well, as soon as the spire-controller-manager no longer depends on the spire-server socket.

marcofranssen and others added 20 commits February 18, 2023 13:04
Contributor

@faisal-memon faisal-memon left a comment

Looks good to merge to me; we can take care of enhancements after this PR merges. Thanks @marcofranssen and everyone else that contributed.

@marcofranssen marcofranssen merged commit 9dee09e into main Feb 20, 2023
@marcofranssen marcofranssen deleted the base-chart branch February 20, 2023 09:59
@marcofranssen marcofranssen added this to the Initial release milestone Mar 2, 2023