Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS parameters cannot be configured with EnvoyFilter #28996

Closed
laek3 opened this issue Nov 18, 2020 · 9 comments · Fixed by #29216
Closed

TLS parameters cannot be configured with EnvoyFilter #28996

laek3 opened this issue Nov 18, 2020 · 9 comments · Fixed by #29216

Comments

@laek3
Copy link
Contributor

laek3 commented Nov 18, 2020

Bug description
We are unable to configure TLS parameters (to control the TLS version, or set TLS ciphersuites, curves etc.) using an Envoy Filter.
Specifically, we would like to force the use of TLSv1_3 and of the CECPQ2 curve, a Post-Quantum key exchange available in BoringSSL, which works fine in stand-alone Envoy.
For that, we wrote a configuration patch in an EnvoyFilter. It should be sufficient to then apply it, but the behaviour of the function used to merge the configurations: proto.merge, is not correct. The resulting configuration is broken and essentially ignores the patch.

What we observed:
Instead of merging our config with the main one:
For the clusters: it adds it after as a new item and is not taken into account.
For the listeners: it erases part of the main config, where the merge should have been done.
(Checked using $ istioctl proxy-config cluster/listener).

We wrote a fix and tested it, both for cluster and listener’s patch in the EnvoyFilter handling Go code.

We would like to open a PR after some community discussion.

Our fix can be used to configure any TLS parameters, and can be duplicated to be used for other configurations.

[ ] Docs
[ ] Installation
[X] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[X] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure
[ ] Upgrade

Steps to reproduce the bug
We have 2 services, Service1 and Service2, deployed with sidecars. Service1 sends a simple http get request to Service2. The request goes to (the egress port of) the sidecar of Service1, which initiates a TLS connection to (the ingress port of) the sidecar of Service2. We want this TLS connection to be TLS1.3 with curve CECPQ2.

Service1 ---(http)--> SidecarOfService1 ---(https)--> SidecarOfService2 ---(http)--> Service2

The communication, as described above, is established between one sidecar to the other one with:
$ kubectl exec -it <pod name of service1> -c service1 -- curl -v http://service2.default:8050/myservice/2

Our EnvoyFilter:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: my-protocol
  namespace: default
spec:
  configPatches:
  - applyTo: FILTER_CHAIN
    match:
      listener:
        filterChain:
          filter:
            name: "envoy.transport_sockets.tls"
    patch:
      operation: MERGE
      value:
        transport_socket:
          name: envoy.transport_sockets.tls
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
            common_tls_context:
              tls_params:
                tls_maximum_protocol_version: TLSv1_3
                tls_minimum_protocol_version: TLSv1_3
                ecdh_curves:
                - "CECPQ2"
  - applyTo: CLUSTER
    match:
      context: SIDECAR_OUTBOUND
    patch:
      operation: MERGE
      value:
        transport_socket:
          name: envoy.transport_sockets.tls
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
            common_tls_context:
              tls_params:
                tls_maximum_protocol_version: TLSv1_3
                tls_minimum_protocol_version: TLSv1_3
                ecdh_curves:
                - "CECPQ2"

Version
Output of istioctl version --remote
client version: 1.7.0
control plane version: 1.7.0
data plane version: 1.7.0 (4 proxies)

Output of kubectl version --short
Client Version: v1.18.6
Server Version: v1.17.5

Related issues
#13138 (and PR #27500) allows to set a baseline TLS configuration for the entire Mesh. With our fix, the user can be very flexible on the choice of the TLS parameters, including the TLS version, the cipher suites, the curves. In addition, these parameters can be applied to a smaller scope, like specific proxies. As one example, this will allow people to require TLS 1.3 to be used everywhere, for compliance reasons

#18169 #24330
Same root cause as us: the function proto.merge

@liwenhao0810
Copy link
Contributor

@istio/wg-networking-maintainers
Please take a look at this issue to gather some thoughts on the issue and proposed fixes.

@howardjohn
Copy link
Member

For the clusters: it adds it after as a new item and is not taken into account.

Is this just the transport_socket_matches? It seems like it should work without auto-mtls. Not suggesting that as the solution, just want to make sure I understand correctly.

@howardjohn
Copy link
Member

For the listeners: it erases part of the main config, where the merge should have been done.

Looks like proto.Merge doesn't recursively merge the Any marshalled. We do have some special logic for networkfilters for this:

. The problem goes beyond just transport socket though, any field configured by Any will have the same, which is fairly common. Maybe we can look into a merge method that merges through any as well. That would solve the listener level at least. It will also probably break some people that rely on this not merging 🙂

Our backup plan is we can always add a mesh config setting for tls version/cipher suite

@ramaraochavali
Copy link
Contributor

Our backup plan is we can always add a mesh config setting for tls version/cipher suite

I am good with this if every one agrees adding an API is fine. This assumes though that it is controlled at mesh level. But in the PR @laek3 seem to indicate they want to control it per service?

@laek3
Copy link
Contributor Author

laek3 commented Dec 10, 2020

Is this just the transport_socket_matches? It seems like it should work without auto-mtls. Not suggesting that as the solution, just want to make sure I understand correctly.

Yes this config patch is written in the transport_socket_matches. Can you detail a bit more how it would be without auto-mtls?

Looks like proto.Merge doesn't recursively merge the Any marshalled

Yes, that is the idea of the PR I wrote. I work with the Any type to make the merge working

I am good with this if every one agrees adding an API is fine. This assumes though that it is controlled at mesh level. But in the PR @laek3 seem to indicate they want to control it per service?

Yes we would like to have a control at the service level

@howardjohn
Copy link
Member

Can you detail a bit more how it would be without auto-mtls?

Without automtls it is just in transport_socket which is not a list so it can be patched ok (although you probably end up with the same problem of listener where it replaces not merge)

@yaronf
Copy link

yaronf commented Dec 11, 2020

Our backup plan is we can always add a mesh config setting for tls version/cipher suite

I am good with this if every one agrees adding an API is fine. This assumes though that it is controlled at mesh level. But in the PR @laek3 seem to indicate they want to control it per service?

As @laek3 mentioned, yes, we at Intuit would like to control this setting at the service level. But for clarity, if we go with the "backup plan", would we be able to use all Envoy TLS parameters out of the box (TLS version, curves, ciphersuites etc.) or would we need to duplicate any such parameter in Istio, if we want it supported? For example, if a new ECDH curve is added in Envoy, would we need to explicitly add it to Istio?

@howardjohn
Copy link
Member

@yaronf I think that depends on the API we choose. Probably the cipher_suites would be a plain string list, so it would be future proof. We do have a similar API for gateway: https://istio.io/latest/docs/reference/config/networking/gateway/#ServerTLSSettings. The one thing missing is ecdh_curves which seems important to you

@yaronf
Copy link

yaronf commented Dec 11, 2020

Thanks for clarifying @howardjohn! ECDH curves is the most important parameter we are using for the post-quantum network hardening we are working on. Specifically CECPQ2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

7 participants