When an operator wants to migrate from the self-signed Citadel CA to SPIRE-backed workload identities, they will encounter multiple difficulties that appear to force restarting all cross-communicating workloads at once. This makes the migration path difficult without a blue/green setup or a full cluster upgrade.
To better illustrate our findings about the current mTLS behavior, here is an experiment we ran. First, install standard Istio and Bookinfo, then apply the SPIRE quickstart.
(Since there is no ingress gateway, we have a simpler config than the standard quickstart.)
Notice we chose the exact same trust domain, cluster.local, for the SPIRE trust domain.
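For reference, the trust domain is set in the SPIRE server configuration. A minimal, illustrative fragment (the surrounding bind settings are placeholders, not the exact quickstart values):

```hcl
# spire-server.conf (illustrative fragment; other settings omitted)
server {
  bind_address = "0.0.0.0"
  bind_port    = "8081"
  # Matches the Istio mesh trust domain so SPIFFE IDs line up
  trust_domain = "cluster.local"
}
```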
Next we make the details pod fetch a SPIRE certificate:
# These are the Bookinfo manifests, with the SPIRE pod selector labels and template annotations added
kubectl apply -f spire-backed/bookinfo-spired.yaml -l app=details,version=v1
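For context, Istio's SPIRE integration has workloads opt in via a label on the pod template, which the SPIRE registrar's pod selector matches. A sketch of what the added labels look like (the deployment shown is from Bookinfo; everything except the `spiffe.io/spire-managed-identity` label is standard):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: details-v1
spec:
  template:
    metadata:
      labels:
        app: details
        version: v1
        # Opts this pod into SPIRE-issued identities, per Istio's
        # SPIRE integration docs
        spiffe.io/spire-managed-identity: "true"
```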
We confirmed that the details pod gets issued a SPIRE-backed certificate successfully and connects to the control plane.
We see that communication from productpage -> details is now broken due to a TLS error.
This is because productpage only has the old self-signed root in the istio-root-ca-configmap trust store, and this shows in istioctl pc secret.
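The failure mode can be reproduced outside the cluster: a peer that trusts only one root rejects certificates issued by the other, while a bundle containing both roots accepts them. A self-contained sketch with two locally generated CAs standing in for the Citadel and SPIRE roots:

```shell
set -e
dir=$(mktemp -d); cd "$dir"

# Two independent roots: ca1 plays the old Citadel self-signed root,
# ca2 plays the SPIRE root.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout ca1.key -out ca1.crt -subj "/CN=citadel-root"
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout ca2.key -out ca2.crt -subj "/CN=spire-root"

# Issue a "details" workload certificate from ca2.
openssl req -newkey rsa:2048 -nodes -keyout wl.key -out wl.csr -subj "/CN=details"
openssl x509 -req -in wl.csr -CA ca2.crt -CAkey ca2.key -CAcreateserial \
  -out wl.crt -days 1

# A peer trusting only ca1 (like productpage here) rejects the cert...
openssl verify -CAfile ca1.crt wl.crt || echo "rejected, as on the cluster"

# ...while a bundle containing both roots accepts it.
cat ca1.crt ca2.crt > both.crt
openssl verify -CAfile both.crt wl.crt
```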
If we now make productpage use SPIRE:
# Make productpage use SPIRE
kubectl apply -f spire-backed/bookinfo-spired.yaml -l app=productpage,version=v1
Now productpage reads details fine, but ratings/reviews are broken (they use the old CA and are no longer trusted).
What's even more confusing is that the connection to the control plane works correctly even though the control plane uses the old CA.
In case you were not aware, the Istiod control plane does not yet support workload identity through SPIRE, as shown in this issue: #49087
The reason this works is that the XDS proxy on each sidecar, which connects to Istiod on behalf of Envoy, sets up a separate root of trust for workloads, including SPIRE-backed ones, using the istio-root-ca-configmap:
istio/pkg/istio-agent/agent.go (line 623 at f30859e)
https://github.com/istio/api/blob/68cdbb256ce1d970fa0a2fb4397057d165ee4732/mesh/v1alpha1/proxy.proto#L593
This is inconsistent and very surprising. In my view, there should be a way to include all the old roots for all workloads, allowing SPIRE to non-SPIRE communication, if only to enable a gradual rollout.
Describe alternatives you've considered
Install SPIRE on all workloads and do a full cluster upgrade at once in a maintenance window, or, if available, use a blue/green setup.
Perhaps force the trust bundle for workloads to always contain both the SPIRE root and the control plane's self-signed root; it is unclear whether current mechanisms allow this.
This would still require workload restarts to take effect at present, but it would at least allow a gradual rollout.
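One candidate mechanism is the caCertificatesPem field on ProxyConfig (in mesh/v1alpha1/proxy.proto), which is described as appending extra root certificates for workload-to-workload communication. Whether it composes with SPIRE-backed SDS is exactly what is unclear; a hedged sketch, with PEM bodies elided:

```yaml
# Sketch only: assumes ProxyConfig.caCertificatesPem merges these roots
# into every workload's trust bundle; untested with SPIRE-backed SDS.
meshConfig:
  defaultConfig:
    caCertificatesPem:
    - |
      -----BEGIN CERTIFICATE-----
      <old Citadel self-signed root, PEM body elided>
      -----END CERTIFICATE-----
    - |
      -----BEGIN CERTIFICATE-----
      <SPIRE trust bundle root, PEM body elided>
      -----END CERTIFICATE-----
```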
Affected product area (please put an X in all that apply)
[ ] Ambient
[ ] Docs
[ ] Dual Stack
[x] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[x] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure
Affected features (please put an X in all that apply)
[ ] Multi Cluster
[ ] Virtual Machine
[ ] Multi Control Plane
Additional context
Tested on Istio 1.18.2