mTLS enablement, SPIRE server and agent installation #23806

Closed
Tracked by #22215
youngnick opened this issue Feb 16, 2023 · 2 comments · Fixed by #24765
Labels
area/cli Impacts the command line interface of any command in the repository. area/helm Impacts helm charts and user deployment experience area/servicemesh GH issues or PRs regarding servicemesh kind/feature This introduces new functionality. sig/agent Cilium agent related.

Comments

@youngnick
Contributor

youngnick commented Feb 16, 2023

This issue covers figuring out how to handle enabling mTLS, and then installing the SPIRE server and per-node agents.

We'll need to support both Helm and cilium-cli, and make sure that mTLS can't be turned on in cases where it would conflict with other features.

@youngnick youngnick self-assigned this Feb 16, 2023
@youngnick youngnick added kind/feature This introduces new functionality. area/cli Impacts the command line interface of any command in the repository. area/helm Impacts helm charts and user deployment experience sig/agent Cilium agent related. area/servicemesh GH issues or PRs regarding servicemesh labels Feb 16, 2023
@youngnick youngnick changed the title SPIRE server and agent installation mTLS enablement, SPIRE server and agent installation Feb 16, 2023
@youngnick
Contributor Author

youngnick commented Apr 4, 2023

Current thoughts on this:

The semi-official SPIRE Helm charts will need a lot of modification for our use case, so I think it's better to maintain our own chart based on their principles. Because of how we are using the DelegatedIdentity API, our install will necessarily be different from the standard ones.

We also need to ensure that we're not stopping ourselves from being able to work with bring-your-own-SPIRE in the future (although that is not in scope for the initial release.)

I think we should do a two-pronged approach:

  • Firstly, we provide a chart that sets up everything we need, in the way we need it, and allow config of lots of details. Practically, this will most likely need to be a subchart of the Cilium chart.
  • Secondly, we explain what we're doing differently to a standard SPIRE installation, and how the Cilium install must be configured to work like we expect. (Including expected identities etc.) Probably the biggest difference is that we don't need the SPIRE controller-manager to automatically create Service Account identities - we're having Cilium handle that instead. I don't think it will hurt to have the controller-manager running though.

Things that we can be pretty sure we need to have be configurable:

  • trust-domain: the domain part of the SPIFFE URL. Default: spiffe.cilium - it shouldn't be an actual domain by default.
  • socket paths: There are the agent and agent admin sockets. They should be bound into a host directory that's also shared with the Cilium Agent.
    • agent-socket: Default /run/spire/sockets/agent/agent.sock
    • admin-socket: Default /run/spire/sockets/admin.sock (The Admin socket can't be in a subdirectory of the Agent one.)
  • Default identities (Open question: should we allow configuring the selectors for these? Also, we could specify these using either pieces of the URL (just the path) or with a template that includes the trustdomain or something.):
    • spiffe://spiffe.cilium.io/ns/spire/sa/spire-agent: Node identity for the Spire agent, so it can parent other identities.
    • spiffe://spiffe.cilium.io/cilium-agent: Cilium Agent identity. This is plumbed into the SPIRE agent so that the Cilium Agent will have access to the admin socket (required for the DelegatedIdentity API). (Configured in the authorized_delegates field in the agent field of the SPIRE agent config file.)
    • spiffe://spiffe.cilium.io/cilium-operator: Cilium Operator identity. This is configured in the SPIRE server config file, so that the cilium-operator can create SPIRE entries across the network. (Configured in the admin_ids section inside the server section of the SPIRE server config file.)
  • SPIRE infra installation namespace.
  • SPIRE Server Service name inside the infra namespace.
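
As a rough illustration of how the socket paths and default identities above could land in the SPIRE config files, here is a minimal sketch. The ConfigMap names are hypothetical, the namespace is borrowed from the proposed Helm values, and the trust domain follows the identity URLs listed above. Everything not discussed in this issue (server address, data directories, plugins) is left out, so these are fragments rather than complete configs.

```yaml
# Sketch only: ConfigMap names are assumptions, and required SPIRE
# settings not discussed in this issue are omitted.
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-agent          # hypothetical name
  namespace: cilium-spire    # install namespace from the proposed Helm values
data:
  agent.conf: |
    agent {
      trust_domain = "spiffe.cilium.io"
      socket_path = "/run/spire/sockets/agent/agent.sock"
      # The admin socket must not sit in a subdirectory of the agent socket's directory.
      admin_socket_path = "/run/spire/sockets/admin.sock"
      # Gives the Cilium Agent access to the DelegatedIdentity API.
      authorized_delegates = ["spiffe://spiffe.cilium.io/cilium-agent"]
    }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-server         # hypothetical name
  namespace: cilium-spire
data:
  server.conf: |
    server {
      trust_domain = "spiffe.cilium.io"
      # Lets the Cilium Operator create SPIRE entries across the network.
      admin_ids = ["spiffe://spiffe.cilium.io/cilium-operator"]
    }
```

Both socket paths would also need to be mounted from a host directory that the Cilium Agent pod shares, as noted in the list above.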

Things that we expect but probably don't need to have configurable:

  • k8s_psat node attestor, using the spire:spire-agent service account for registering new nodes.
  • k8s workload attestor.
  • spire-agent daemonset needs a projected service account token added as a volume.
  • We don't run the SPIRE controller-manager; the Cilium operator will do that job instead.
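
To make the projected-token requirement concrete, here is a sketch of the relevant part of the spire-agent DaemonSet pod spec. The volume name, mount path, audience, and expiry are assumptions modelled on the usual SPIRE Kubernetes setup for k8s_psat, not settings decided in this issue.

```yaml
# Fragment of the spire-agent DaemonSet (spec.template.spec); sketch only.
serviceAccountName: spire-agent
containers:
  - name: spire-agent
    volumeMounts:
      - name: spire-token                    # hypothetical volume name
        mountPath: /var/run/secrets/tokens   # assumed; must match the k8s_psat token path
volumes:
  - name: spire-token
    projected:
      sources:
        - serviceAccountToken:
            path: spire-agent
            expirationSeconds: 7200          # assumed value
            audience: spire-server           # assumed; must match what the server expects
```

The k8s_psat node attestor on the agent reads this token and presents it to the SPIRE server when registering a new node.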

@youngnick
Contributor Author

For the Helm chart, I think we should move from the existing config:

auth:
  mTLS:
    # -- Enable mtls-spiffe authentication method in CiliumNetworkPolicy
    enabled: false
    # -- SPIRE socket path where the SPIRE delegated api agent is listening
    spireAdminSocketPath: /run/spire/sockets/admin.sock
    # -- SPIFFE trust domain to use for fetching certificates
    spiffeTrustDomain: spiffe.cilium.io
    # -- port on the agent which is used to mTLS handshakes on
    port: 4250

To this one:

auth:
  mTLS:
    # -- Enable mtls authentication method in CiliumNetworkPolicy
    enabled: true
    # -- port on the agent which is used to mTLS handshakes on
    port: 4250
    # Settings for SPIRE
    spire:
      # Settings to control the SPIRE installation subchart
      install:
        # Note that this will only take effect if auth.mTLS.enabled is true _and_ authType is spire
        enabled: false
        namespace: cilium-spire
      # -- SPIFFE trust domain to use for fetching certificates
      spiffeTrustDomain: spiffe.cilium.io
      # -- SPIRE socket path where the SPIRE delegated api agent is listening
      adminSocketPath: /run/spire/sockets/admin.sock
      # -- SPIRE socket path where the SPIRE workload agent is listening
      # Applies to both the Cilium Agent and Operator
      agentSocketPath: /run/spire/sockets/agent/agent.sock
      # Identity paths for the SPIFFE URLs, without a leading /
      # SPIFFE URLs will look like:
      # spiffe://trustdomain/identity
      identities:
        cilium-agent: "cilium-agent"
        cilium-operator: "cilium-operator"

Notable things here:

  • mTLS authentication will default to "on", so the fields in network policy will do something. This will let devs or users test that the datapath is working with the "always allow" and "always deny" auth methods.
  • mtls-spiffe won't work without a SPIRE install; setting auth.mTLS.spire.install.enabled: true will have Cilium install the SPIRE server for you.
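
For example, a values override that turns on the bundled SPIRE install under the proposed layout might look like the sketch below. The keys are taken from the proposal above and may well change before anything ships.

```yaml
# values.yaml override; sketch against the proposed (not final) schema.
auth:
  mTLS:
    enabled: true
    spire:
      install:
        # Only takes effect when auth.mTLS.enabled is also true.
        enabled: true
        namespace: cilium-spire
```

This could then be applied with something like helm upgrade --install cilium cilium/cilium --namespace kube-system -f values.yaml, assuming the Cilium chart repository has been added under the name cilium.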

We need to speak to other folks to decide whether a subchart or just a directory of templates is the better option for the Cilium SPIRE install. The upstream chart is not usable without a lot of editing, so we would effectively be maintaining our own chart anyway. My preference is probably a subchart stored within the Cilium chart folder, which allows some segmentation of this (pretty substantial) install away from the bits that are Cilium proper. However, Cilium has previously had subcharts and moved away from them, so we should check why that was and whether the reasons for that change still apply here and now (Helm 3 has changed a bunch of things).
