Service mesh: add mTLS auth method #24263
How to test this (I know you cannot wait ;) ):
Enable it via Helm:
Install SPIRE: https://github.com/meyskens/cilium-spiffe-poc/tree/meyskens/cilium-mtls
Deploy a policy to use auth. I used the connectivity test pods for this:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: auth-egress
  namespace: cilium-test
spec:
  endpointSelector:
    matchLabels:
      kind: client
  egress:
  - toEndpoints:
    - matchLabels:
        kind: echo
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
    auth:
      type: "mtls-spiffe"
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": "kube-system"
        "k8s:k8s-app": "kube-dns"
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
Then set up a connection between two pods and scan the logs of the Cilium agent for logs like
Create a SPIFFE ID for the identities:
(this will not be needed once #23802 is built)
Try the connection again and enjoy a brand new mTLS experience <3
Nice work!
One major concern I have is that I don't see any tests. I realize from reviewing that this seems to be more of integration code so unit testing probably wasn't obvious. However, I can imagine a test that can spin off two TLS clients / servers and validate the code paths without too much complication.
Otherwise, I only have code nits.
@christarazi I agree on the testing side. I have been thinking about this for a bit; it would be good to cover it with some kind of integration test between the mTLS and SPIFFE packages. We still need to discuss, however, whether to make it part of this PR or a separate work item.
Amazing work @meyskens! I'm really impressed with how quickly you've pulled functional code together.
However, I'm in a similar boat to @christarazi here: the changes look great, but I'd like to see unit tests, particularly for `mtls_authhandler.go` and `certificate_provider.go`. I think tests that just exercise the code paths inside the `if` statements are a great place to start, even for the internal functions. Let's lock in tests while this is a small change, and then we can have greater confidence as the changesets and associated testing get bigger.
Thanks for the update @meyskens!
Couple of comments about context handling and locking, but overall LGTM.
looks great 🎉 just some small suggestions
Looks great to me!
LGTM for API changes
Thanks @meyskens! Couple of nitpicking comments but the patch LGTM 🎉
@nathanjsweet question: will the socket mount shared between the SPIRE agent and the Cilium agent be a problem in the context of OpenShift?
The SPIRE installation finalization will be part of #23806
/test
Job 'Cilium-PR-K8s-1.16-kernel-4.19' failed.
This adds an mTLS auth handler to the Service Mesh auth package. It listens on a given port and performs a mutual TLS handshake with the SPIFFE IDs it received. This assures that both sides hold the needed certificates. To integrate with the datapath tables, it also improves the SPIFFE interface to use the Cilium numeric identities and to convert them from and to valid SNI fields. It also implements code to validate the URI SANs inside the certificates. Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com>
Remove the debug line that reports which SVID was received. The delegate API sends the state of the world on every sync; in very large deployments this would produce a lot of debug logs. Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com>
This adds the ability to enable mtls-spiffe support in the Helm chart. It sets the required config flags to their defaults and mounts the SPIRE socket into the agent pods. Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com>
LGTM, nice work @meyskens !
CI failures seem to be flakes or unrelated to this PR
/mlh new-flake Cilium-PR-K8s-1.24-kernel-5.4
👍 created #24453
This adds an mTLS auth handler to the Service Mesh auth package.
It listens on a given port and performs a mutual TLS handshake with the
SPIFFE IDs it received. This assures that both sides hold the needed
certificates.
To integrate with the datapath tables, it also improves the SPIFFE
interface to use the Cilium numeric identities and to convert them from
and to valid SNI fields. It also implements code to validate the URI SANs
inside the certificates.
This can be enabled in the Helm chart.
Fixes: #23807