Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.6 istio-proxy SDS not updating certificates SSLV3_ALERT_CERTIFICATE_EXPIRED #28050

Closed
Stono opened this issue Oct 17, 2020 · 15 comments
Closed

Comments

@Stono
Copy link
Contributor

Stono commented Oct 17, 2020

Bug description
In our 1.6 qualifying environment we have an application that is rejecting requests with upstream connect error or disconnect/reset before headers. reset reason: connection failure.

We have three (almost) identical test apps:

  • istio-test-app-1
  • istio-test-app-2
  • istio-test-app-3 (this is the broken one)

We continually send load that goes 1 -> 2 -> 3.

We rolling restart istio-test-app-1 every 10 minutes to create config churn

Screenshot 2020-10-17 at 16 09 18

The pod is 25 hours old, however as you can see here something happened at 14:35 which broke istio-test-app-3 (1 and 2 remain fine).

The istio proxy logs contain lots of sds noise:

{"level":"info","time":"2020-10-17T14:39:40.757245Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:39:40.757366Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:39:40.757625Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:39:40.757772Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:39:40.875744Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:39:40.875865Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:39:40.876142Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:39:40.876315Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:44:40.680854Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:44:40.680939Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:44:40.681104Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-44"
2020-10-17T14:44:40.681167Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-45"
{"level":"info","time":"2020-10-17T14:44:40.989664Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:44:40.989763Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:44:40.989996Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:44:40.990130Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:44:41.101797Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:44:41.101917Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:44:41.102142Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:44:41.102384Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:49:40.680826Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:49:40.680922Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:49:40.681068Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-46"
2020-10-17T14:49:40.681109Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-47"
{"level":"info","time":"2020-10-17T14:49:40.853145Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:49:40.853301Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:49:40.853587Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:49:40.853768Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:49:40.917526Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:49:40.917640Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:49:40.917876Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:49:40.918032Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:54:40.680797Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:54:40.680866Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:54:40.680941Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-48"
2020-10-17T14:54:40.681040Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-49"
{"level":"info","time":"2020-10-17T14:54:40.758052Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:54:40.758157Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:54:40.758426Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:54:40.758584Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:54:41.127609Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:54:41.127721Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:54:41.127964Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:54:41.128115Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:59:40.680900Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:59:40.680941Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:59:40.681091Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-50"
2020-10-17T14:59:40.681147Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-51"
{"level":"info","time":"2020-10-17T14:59:40.795179Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:59:40.795280Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:59:40.795539Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:59:40.795689Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:59:41.114830Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:59:41.114952Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:59:41.115164Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:59:41.115318Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:04:40.680881Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T15:04:40.680970Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T15:04:40.681071Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-52"
2020-10-17T15:04:40.681126Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-53"
{"level":"info","time":"2020-10-17T15:04:40.980455Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:04:40.980561Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:04:40.980783Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:04:40.980906Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:04:41.180954Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:04:41.181074Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:04:41.181312Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:04:41.181612Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:09:40.680961Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T15:09:40.681152Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-54"
2020-10-17T15:09:40.681253Z	warning	envoy config	[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-55"
{"level":"info","time":"2020-10-17T15:09:40.681058Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T15:09:40.842526Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:09:40.842666Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:09:40.842911Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:09:40.843082Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:09:40.893759Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:09:40.893894Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:09:40.894106Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:09:40.894256Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}

The only difference that istio-test-app-3 has vs 1 and 2 is that it uses:

    - name: OUTPUT_CERTS
      value: /etc/certs

As per this documentation.

To write certs to this volumeMount:

  - emptyDir:
      medium: Memory
    name: istio-certs

In order to write the certs to disk (as I was experimenting with for prometheus)

Looking at the logs I see:

2020-10-17T15:15:40.611175Z     debug   envoy conn_handler      [external/envoy/source/server/connection_handler_impl.cc:411] [C59921] new connection
2020-10-17T15:15:40.612284Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59921] handshake expecting read
2020-10-17T15:15:40.612317Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59921] handshake expecting read
2020-10-17T15:15:40.612704Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:198] [C59921] handshake error: 1
2020-10-17T15:15:40.612717Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:226] [C59921] TLS error: 268436501:SSL routines:OPENSSL_internal:SSLV3_ALERT_CERTIFICATE_EXPIRED
2020-10-17T15:15:40.612723Z     debug   envoy connection        [external/envoy/source/common/network/connection_impl.cc:200] [C59921] closing socket: 0
2020-10-17T15:15:40.612752Z     debug   envoy conn_handler      [external/envoy/source/server/connection_handler_impl.cc:111] [C59921] adding to cleanup list
2020-10-17T15:15:40.679479Z     debug   envoy filter    [external/envoy/source/extensions/filters/listener/original_dst/original_dst.cc:18] original_dst: New connection accepted
2020-10-17T15:15:40.679512Z     debug   envoy filter    [external/envoy/source/extensions/filters/listener/tls_inspector/tls_inspector.cc:78] tls inspector: new connection accepted
2020-10-17T15:15:40.679541Z     debug   envoy filter    [external/envoy/source/extensions/filters/listener/tls_inspector/tls_inspector.cc:148] tls:onServerName(), requestedServerName: outbound_.80_._.app.istio-test-app-3.svc.cluster.local
2020-10-17T15:15:40.679607Z     debug   envoy conn_handler      [external/envoy/source/server/connection_handler_impl.cc:411] [C59922] new connection
2020-10-17T15:15:40.680782Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59922] handshake expecting read
2020-10-17T15:15:40.680811Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59922] handshake expecting read
2020-10-17T15:15:40.681162Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:198] [C59922] handshake error: 1
2020-10-17T15:15:40.681181Z     debug   envoy connection        [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:226] [C59922] TLS error: 268436501:SSL routines:OPENSSL_internal:SSLV3_ALERT_CERTIFICATE_EXPIRED

Which makes me check:

❯ ./src/1.6.12/bin/istioctl pc secret istio-test-app-3-69d695bf8b-b256m.istio-test-app-3
RESOURCE NAME     TYPE           STATUS     VALID CERT     SERIAL NUMBER                               NOT AFTER                NOT BEFORE
default           Cert Chain     ACTIVE     true           331844299171782252960640586806792969930     2020-10-17T13:34:41Z     2020-10-16T13:34:41Z
default           Cert Chain     ACTIVE     true           331844299171782252960640586806792969930     2020-10-17T13:34:41Z     2020-10-16T13:34:41Z
ROOTCA            CA             ACTIVE     true           2623683368870835277565802163895395946       2030-10-07T07:55:53Z     2020-10-09T07:55:53Z

Which shows the certificates are indeed out of date - and expired exactly when we started seeing issues.

This made me check another app where we mount certificates in this way (prometheus) and sure enough I see its certs are out of date too:

❯ ./src/1.6.12/bin/istioctl pc secret prometheus-0.app-metrics
RESOURCE NAME     TYPE           STATUS     VALID CERT     SERIAL NUMBER                              NOT AFTER                NOT BEFORE
default           Cert Chain     ACTIVE     true           88593072017091452019065727082748990393     2020-10-16T18:43:59Z     2020-10-15T18:43:59Z
default           Cert Chain     ACTIVE     true           88593072017091452019065727082748990393     2020-10-16T18:43:59Z     2020-10-15T18:43:59Z
ROOTCA            CA             ACTIVE     true           2623683368870835277565802163895395946      2030-10-07T07:55:53Z     2020-10-09T07:55:53Z

Restarting the pod renewed the certificates:

❯ ./src/1.6.12/bin/istioctl pc secret istio-test-app-3-69d695bf8b-87x6j.istio-test-app-3
RESOURCE NAME     TYPE           STATUS     VALID CERT     SERIAL NUMBER                               NOT AFTER                NOT BEFORE
default           Cert Chain     ACTIVE     true           168785524709398884220511848353340100457     2020-10-18T15:21:26Z     2020-10-17T15:21:26Z
default           Cert Chain     ACTIVE     true           168785524709398884220511848353340100457     2020-10-18T15:21:26Z     2020-10-17T15:21:26Z
ROOTCA            CA             ACTIVE     true           2623683368870835277565802163895395946       2030-10-07T07:55:53Z     2020-10-09T07:55:53Z

[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[x] Security
[ ] Test and Release
[x] User Experience
[ ] Developer Infrastructure

Expected behavior
SDS certs to be updated correctly when using OUTPUT_CERTS

Steps to reproduce the bug
No idea

Version (include the output of istioctl version --remote and kubectl version --short and helm version if you used Helm)
1.6.12

How was Istio installed?
Helm

Environment where bug was observed (cloud vendor, OS, etc)
GKE

@Stono
Copy link
Contributor Author

Stono commented Oct 17, 2020

cc. @howardjohn @mandarjog 🤷

@mandarjog
Copy link
Contributor

@JimmyCYJ @myidpt

@mandarjog
Copy link
Contributor

@Stono 1.6 updates will EOL when 1.8 releases in mid November. I am wondering if you considered moving directly to 1.7 instead to stay in the window?

@Stono
Copy link
Contributor Author

Stono commented Oct 17, 2020

Definitely not! If we can't do sequential version updates (1.5 -> 1.6 -> 1.7) which is the approved upgrade route maintaining backwards compatibility, I have no confidence in an untrodden and untested path of 1.5 -> 1.7 working.

We decided to stay at least one minor version behind because of the painful track record of issues and breaking changes with new releases. After the nightmare experience of getting onto 1.5 (telemetry v2 etc), we decided to leave it a little longer for 1.6 to stabilise.

I really believe Istio are going to alienate any serious users with this overly rapid release process and T-1 support policy. There needs to be a focus on stability where you are right now and ensuring people are actually able to upgrade, not releasing more enhancements. 12 patch versions into 1.6 and I wouldn't describe it as anything other than broken and for us, actually unusable:

Two weeks of testing it and we we've not been able to get it past our most basic qualifying environments:

I've always been a massive advocate of Istio. But I'm finding it harder and harder to continue to recommend it.

1.5 -> 1.6 has easily been the worst experience yet.

@howardjohn
Copy link
Member

I am not super familiar with this option without looking into it but from the logs it looks like we are refreshing the cert from the file rather than istiod. The doc suggests this option is for VMs where the user is providing the (long expiry) cert which would explain this behavior although I am not sure it's expected since I think this is how we recommended Prometheus getting certs.

@bianpengyuan
Copy link
Contributor

Looks like sds-agent uses existence of /etc/certs as a signal of rotating cert based on file, this does not make sense to me and it should be controlled by an env var, or at least should be based on a less general path.

Could you mount /etc/istio-certs or something even specific like /etc/istio-prom-certs and try again?

But as far as I can tell, there's nothing for the OUTPUT_CERTS.

You probably want to try proxyMetadata in proxyConfig.

@howardjohn
Copy link
Member

I think I recall now the /etc/certs mounting is controlled by .Values.global.mountMtlsCerts which is for legacy citadel secret mounting or integration with external CAs that load certs via the same mechanisms. I think there is a collision with OUTPUT_CERTS being the same dir as @bianpengyuan mentioned

@Stono
Copy link
Contributor Author

Stono commented Oct 19, 2020

FYI we have .Values.global.mountMtlsCerts=false in our configuration. It might be worth driving the behaviour around rotation of certs from that setting instead rather than the existence of a directory?

And then as extra belt and braces, not allowing folks to use /etc/certs in OUTPUT_CERTS (and documentation update)?

@bianpengyuan re ISTIO_META_, where is the documentation explaining how that works? We only want it on a single proxy, not them all.

I've updated our mount as you have suggested and will update here if it worked.

@Stono
Copy link
Contributor Author

Stono commented Oct 20, 2020

I can confirm changing the directory means we're now getting certificate rotation, however that seems to have highlighted a cli bug: #28099

Do you want me to keep this issue open to track UX improvements as I think this path needs to be documented clearly as others are going to need to do it? I've documented what I did in https://karlstoney.com/2020/10/15/istio-upgrades-prometheus-sds/index.html

@howardjohn
Copy link
Member

@Stono you can do it with:

      annotations:
        proxy.istio.io/config: |
          proxyMetadata:
            OUTPUT_CERTS: /tmp/certs

I think we need to update some of the mesh config docs for this annotation though, it was new

@howardjohn
Copy link
Member

Basically anything under meshConfig.defaultConfig can be overriden per-pod

@bianpengyuan
Copy link
Contributor

@Stono Yes, please keep this issue open. I am going to add the document and will borrow some content from your blog.

FYI we have .Values.global.mountMtlsCerts=false in our configuration. It might be worth driving the behaviour around rotation of certs from that setting instead rather than the existence of a directory?

And then as extra belt and braces, not allowing folks to use /etc/certs in OUTPUT_CERTS (and documentation update)?

I will read through the logic and see how to make it not depend on presence of the cert dir.

@bianpengyuan re ISTIO_META_, where is the documentation explaining how that works? We only want it on a single proxy, not them all.

John beat me on it, the option is still hidden from doc, we should update that.

@howardjohn
Copy link
Member

I will update the doc for mesh config

@Stono
Copy link
Contributor Author

Stono commented Oct 20, 2020

Can confirm this combo of annotations does the job:

        proxy.istio.io/config: |
          proxyMetadata:
            OUTPUT_CERTS: /etc/istio-output-certs
        sidecar.istio.io/userVolume: '[{"name": "istio-certs", "emptyDir": {"medium":"Memory"}}]'
        sidecar.istio.io/userVolumeMount: '[{"name": "istio-certs", "mountPath": "/etc/istio-output-certs"}]'

@Stono
Copy link
Contributor Author

Stono commented Nov 5, 2020

Closing this issue as there's a PR merged to document it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

5 participants